Backend Architecture
Offline RAG Pipeline
A 100% local, privacy-first Retrieval-Augmented Generation (RAG) system. Built with LangChain LCEL, ChromaDB, and Ollama (Phi-3) to enable secure chat with technical PDFs without internet access or data leakage.
Pipeline Architecture
1. Ingestion
PyPDFLoader extracts the text, which is split into 500-character chunks with a 50-character overlap.
2. Vectorization
HuggingFace 'all-MiniLM-L6-v2' embeds chunks into a local ChromaDB.
3. LCEL Inference
The retriever fetches the top 5 most relevant chunks, and Ollama (Phi-3) synthesizes a response entirely on the local machine.
Core Implementation
from langchain_community.document_loaders import PyPDFLoader
from langchain_text_splitters import RecursiveCharacterTextSplitter
from langchain_chroma import Chroma
from langchain_community.embeddings import HuggingFaceEmbeddings

# 1. Settings
PDF_PATH = "data.pdf"
DB_PATH = "my_vector_db"

print("--- STARTED: Loading PDF ---")

# 2. Load the PDF
loader = PyPDFLoader(PDF_PATH)
docs = loader.load()

# 3. Split text into chunks
text_splitter = RecursiveCharacterTextSplitter(
    chunk_size=500,    # characters per chunk
    chunk_overlap=50   # characters shared between neighboring chunks
)
splits = text_splitter.split_documents(docs)

# 4. Create the vector database
embedding_function = HuggingFaceEmbeddings(model_name="all-MiniLM-L6-v2")
vector_db = Chroma.from_documents(
    documents=splits,
    embedding=embedding_function,
    persist_directory=DB_PATH
)