Backend Architecture
Offline RAG Pipeline
A 100% local, privacy-first Retrieval-Augmented Generation (RAG) system. Built with LangChain LCEL, ChromaDB, and Ollama (Phi-3) to enable secure chat with technical PDFs without internet access or data leakage.
Pipeline Architecture
1. Ingestion
PyPDFLoader extracts the text, which is split into 500-character chunks with a 50-character overlap.
2. Vectorization
HuggingFace 'all-MiniLM-L6-v2' embeds chunks into a local ChromaDB.
3. LCEL Inference
The retriever fetches the top 5 most relevant chunks, and Ollama (Phi-3) synthesizes a response entirely on the local machine.
Core Implementation
from langchain_community.document_loaders import PyPDFLoader
from langchain_text_splitters import RecursiveCharacterTextSplitter
from langchain_chroma import Chroma
from langchain_community.embeddings import HuggingFaceEmbeddings

# 1. Settings
PDF_PATH = "data.pdf"
DB_PATH = "my_vector_db"

print("--- STARTED: Loading PDF ---")

# 2. Load the PDF
loader = PyPDFLoader(PDF_PATH)
docs = loader.load()

# 3. Split text into chunks
text_splitter = RecursiveCharacterTextSplitter(
    chunk_size=500,    # characters per chunk
    chunk_overlap=50   # characters shared between neighboring chunks
)
splits = text_splitter.split_documents(docs)

# 4. Create the vector database
embedding_function = HuggingFaceEmbeddings(model_name="all-MiniLM-L6-v2")
vector_db = Chroma.from_documents(
    documents=splits,
    embedding=embedding_function,
    persist_directory=DB_PATH
)