Templates

Templates are pre-built, reusable, open-source Apache Beam pipelines that are ready to deploy. Each one performs a specific data task, such as classification, summarization, or retrieval-augmented generation (RAG). A template abstracts the complexity of building and managing a pipeline, and can be executed directly on runners such as Google Cloud Dataflow, Apache Flink, or Apache Spark with minimal configuration.

📄️ LLM batch processor

LLM Batch Processor is a pre-built Apache Beam pipeline that processes a batch of text inputs with an LLM (OpenAI models) and saves the results to a Google Cloud Storage (GCS) path. You provide an instruction prompt that tells the model what to do with the input data. The pipeline uses the model to transform the data and writes the final output to a GCS file.
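
The data flow described above (instruction prompt plus input text, passed to a model, with results written to a file) can be sketched in plain Python. This is only an illustration of the idea, not the template's actual code: it does not use Apache Beam or the OpenAI client, `build_prompt`, `process_batch`, and `fake_model` are hypothetical names, the model is a local stand-in function, a local temporary file stands in for the GCS path, and the JSON Lines output format is an assumption.

```python
import json
import tempfile
from pathlib import Path
from typing import Callable, Iterable, List


def build_prompt(instruction: str, text: str) -> str:
    """Combine the instruction prompt with one input record (hypothetical layout)."""
    return f"{instruction}\n\nInput:\n{text}"


def process_batch(
    inputs: Iterable[str],
    instruction: str,
    model: Callable[[str], str],
    output_path: Path,
) -> List[dict]:
    """Apply the model to every input and write one JSON object per line."""
    results = []
    for text in inputs:
        prompt = build_prompt(instruction, text)
        results.append({"input": text, "output": model(prompt)})
    # A local file stands in for the GCS output path here.
    with output_path.open("w", encoding="utf-8") as f:
        for row in results:
            f.write(json.dumps(row) + "\n")
    return results


def fake_model(prompt: str) -> str:
    """Stand-in for an LLM call: upper-cases the input portion of the prompt."""
    return prompt.split("Input:\n", 1)[1].upper()


if __name__ == "__main__":
    out = Path(tempfile.gettempdir()) / "llm_batch_output.jsonl"
    rows = process_batch(["hello", "beam"], "Shout the text.", fake_model, out)
    print(rows)
```

Passing the model in as a callable keeps the sketch independent of any particular client library; in the real template, that step would be a call to an OpenAI model inside the Beam pipeline.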