Overview
This project is a reference implementation of a Retrieval-Augmented Generation (RAG) pipeline built with Python, LangChain, and an OpenAI-compatible LLM. Feed it your documentation, blog posts, or code, then ask natural-language questions. The model is prompted to answer from passages retrieved from your content rather than from its general training data.
Architecture
- Ingestion — parse Markdown/HTML docs, chunk by heading
- Embedding — embed chunks with Sentence Transformers
- Retrieval — pgvector for ANN search (HNSW index)
- Generation — Claude / GPT-4 with retrieved context injected into system prompt
- Evaluation — RAGAS metrics for faithfulness and context recall
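To make the data flow between the stages above concrete, here is a minimal, dependency-free sketch of ingestion, retrieval, and prompt assembly. It is an illustration under stated assumptions, not this project's actual code: the function names (`chunk_by_heading`, `toy_embed`, `retrieve`, `build_system_prompt`) are hypothetical, a word-count vector stands in for Sentence Transformers embeddings, an exact brute-force cosine ranking stands in for the pgvector HNSW index, and no LLM is called.

```python
import math
import re
from collections import Counter


def chunk_by_heading(markdown: str) -> list[dict]:
    """Split Markdown into one chunk per ATX heading (#, ##, ...).

    Simplification: headings inside fenced code blocks are not special-cased.
    """
    chunks = []
    title, body = "(preamble)", []
    for line in markdown.splitlines():
        match = re.match(r"^(#{1,6})\s+(.*)$", line)
        if match:
            # Close the previous chunk if it has any non-blank content.
            if any(part.strip() for part in body):
                chunks.append({"title": title, "text": "\n".join(body).strip()})
            title, body = match.group(2).strip(), []
        else:
            body.append(line)
    if any(part.strip() for part in body):
        chunks.append({"title": title, "text": "\n".join(body).strip()})
    return chunks


def toy_embed(text: str) -> Counter:
    """ASSUMPTION: stand-in for a real embedding model (lowercase word counts)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def retrieve(question: str, chunks: list[dict], k: int = 2) -> list[dict]:
    """Exact brute-force top-k; a real pipeline would query an ANN index instead."""
    q = toy_embed(question)
    ranked = sorted(
        chunks,
        key=lambda c: cosine(q, toy_embed(c["title"] + " " + c["text"])),
        reverse=True,
    )
    return ranked[:k]


def build_system_prompt(context_chunks: list[dict]) -> str:
    """Inject the retrieved chunks into a system prompt for the generator."""
    context = "\n\n".join(f"[{c['title']}]\n{c['text']}" for c in context_chunks)
    return (
        "Answer using only the context below. "
        "If the answer is not there, say so.\n\n" + context
    )


# Hypothetical sample document, for illustration only.
DOC = """# Install
Run pip install to set up the package.

# Usage
Call the ask function with a question string.

# License
Released under the MIT license.
"""

chunks = chunk_by_heading(DOC)
top = retrieve("How do I install the package?", chunks, k=1)
prompt = build_system_prompt(top)
print(prompt)
```

In a fuller version, `toy_embed` would be replaced by a call to an embedding model and `retrieve` by a nearest-neighbour query against the vector store. The chunk dictionaries and the prompt assembly step could stay roughly as they are.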
Use Cases
- Personal knowledge base assistant
- Company docs chatbot
- Codebase Q&A