📄️ Kafka to HelixDB
The Kafka to HelixDB pipeline is a pre-built Apache Beam streaming pipeline that consumes real-time text data from Kafka topics, generates embeddings using OpenAI models, and stores the vectors in HelixDB nodes for similarity search and retrieval. The pipeline automatically handles windowing, embedding generation, and upserts to the HelixDB endpoint.
📄️ Kafka to Pinecone
The Kafka to Pinecone pipeline is a pre-built Apache Beam streaming pipeline that consumes real-time text data from Kafka topics, generates embeddings using OpenAI models, and stores the vectors in Pinecone for similarity search and retrieval. The pipeline automatically handles windowing, embedding generation, and upserts to the Pinecone vector database, turning live Kafka streams into searchable vectors.
📄️ LLM batch processor
LLM Batch Processor is a pre-built Apache Beam pipeline that processes a batch of text inputs using an LLM (OpenAI models) and saves the results to a Google Cloud Storage (GCS) path. You provide an instruction prompt that tells the model what to do with each input. The pipeline applies the model to the data and writes the final output to a GCS file.
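To make the batch flow described above more concrete, here is a minimal plain-Python sketch of the idea: combine an instruction prompt with each input, call a model, and collect the outputs. It is not the pipeline's actual code and does not use Apache Beam or the OpenAI API. The names `build_prompt`, `process_batch`, and `fake_model` are hypothetical, the prompt layout is an assumption, and the model call is replaced by a local stub so the example runs offline.

```python
from typing import Callable, Iterable, List


def build_prompt(instruction: str, text: str) -> str:
    """Combine the instruction prompt with one input record (assumed layout)."""
    return f"{instruction}\n\nInput:\n{text}"


def process_batch(
    inputs: Iterable[str],
    instruction: str,
    model_fn: Callable[[str], str],
) -> List[str]:
    """Apply the model to every non-empty input and return results in order."""
    results = []
    for text in inputs:
        text = text.strip()
        if not text:
            continue  # skip blank records instead of sending them to the model
        results.append(model_fn(build_prompt(instruction, text)))
    return results


def fake_model(prompt: str) -> str:
    """Stand-in for a real LLM call: upper-cases the last line of the prompt."""
    return prompt.splitlines()[-1].upper()


if __name__ == "__main__":
    outputs = process_batch(
        ["hello world", "", "beam pipelines"],
        "Shout the input.",
        fake_model,
    )
    # The real pipeline writes to a GCS path; here the results are just printed.
    print("\n".join(outputs))
```

In an actual deployment, `fake_model` would be replaced by a call to the configured OpenAI model, and the read and write steps would be handled by the pipeline's own input and GCS output stages.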