Serverless execution engines such as Google Cloud Workflows can automate and orchestrate large language model (LLM) use cases.
Summarising a brief document is as simple as feeding its entire content into the context window of an LLM as a prompt.
For longer documents that exceed the context window, Google Cloud Workflows can split the document into chunks and assess each chunk individually, much like the map/reduce method.
When a new text document is added to a Cloud Storage bucket, the workflow is triggered.
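The triggering event carries the location of the new document. Assuming the workflow receives a Cloud Storage object-finalized event, the bucket and object name can be extracted roughly like this (a minimal Python sketch rather than the Workflows YAML the product itself uses; the field names follow the Cloud Storage event payload, and the bucket and file names are illustrative):

```python
def extract_document_location(event: dict) -> tuple[str, str]:
    """Pull the bucket and object name out of a Cloud Storage
    object-finalized event payload."""
    # CloudEvents-style deliveries wrap the payload in a "data" field.
    data = event.get("data", event)
    return data["bucket"], data["name"]

# Example (abridged) event payload for a newly uploaded text file.
event = {"data": {"bucket": "docs-to-summarise", "name": "report.txt"}}
bucket, name = extract_document_location(event)
print(bucket, name)  # docs-to-summarise report.txt
```

In a real deployment the event is delivered to the workflow by the trigger, and the bucket and name are then used to read the document's text before chunking.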
Within a subworkflow, all calls to the Gemini 1.0 Pro model are made.
For each chunk, the generate chunk summary subworkflow is invoked, which uses the Gemini model to summarise that portion of the document.
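The map/reduce flow described above can be sketched in Python. Here `summarise` is a hypothetical stand-in for a call to the Gemini 1.0 Pro model (not the actual Vertex AI client), and the chunk size is an illustrative placeholder, not a value from the workflow itself:

```python
CHUNK_SIZE = 2000  # characters per chunk; illustrative only


def split_into_chunks(text: str, size: int = CHUNK_SIZE) -> list[str]:
    """Map step: break the document into fixed-size pieces that each
    fit comfortably inside the model's context window."""
    return [text[i:i + size] for i in range(0, len(text), size)]


def summarise(prompt: str) -> str:
    """Stand-in for a Gemini 1.0 Pro call; in the real workflow this is
    where the model endpoint is invoked."""
    return prompt[:60]  # placeholder "summary" for illustration


def summarise_document(text: str) -> str:
    # Map: summarise each chunk independently (the per-chunk subworkflow).
    chunk_summaries = [
        summarise(f"Summarise this text:\n{chunk}")
        for chunk in split_into_chunks(text)
    ]
    # Reduce: combine the chunk summaries into a single final summary.
    return summarise("Combine these summaries:\n" + "\n".join(chunk_summaries))
```

The reduce step mirrors the map step: the concatenated chunk summaries are themselves short enough to be summarised in one final model call.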
Google Cloud's more recent large language model, Gemini 1.5, accepts up to one million tokens as input and can summarise a lengthy document in a single run.
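With a context window that large, the map and reduce stages collapse into one call. A minimal sketch, where `model_call` is a placeholder for whatever client invokes Gemini 1.5:

```python
def summarise_document_single_pass(text: str, model_call) -> str:
    """With a large-context model (e.g. Gemini 1.5's one-million-token
    input), the whole document fits in one prompt and no chunking,
    per-chunk summaries, or reduce step are needed."""
    return model_call(f"Summarise the following document:\n{text}")
```

The trade-off is simplicity versus cost and latency per call: the chunked workflow can parallelise small requests, while the single-pass approach sends one large request.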