This project solves the problem of summarizing and querying private documents securely and efficiently using a Retrieval-Augmented Generation (RAG) architecture.
Instead of uploading sensitive documents to public LLM services, this solution:
- Stores & indexes documents locally.
- Converts them into embeddings for efficient search.
- Retrieves only relevant information chunks.
- Sends only small, relevant snippets to the cloud-based IBM Watsonx.ai LLM for question answering.
Ideal use case: onboarding at a company with a large volume of private policies, guidelines, or project documents that you need to understand quickly but can't risk uploading publicly.
| Technology | Purpose |
|---|---|
| LangChain | Document loading, splitting, RAG pipeline |
| Chroma DB | Local vector storage for embeddings |
| Hugging Face Embeddings | Convert documents to numerical vectors |
| IBM Watsonx.ai LLM | Language model for summarization and QA |
| dotenv | Securely load API keys and configs |
| Python 3.8+ | Language |
```
rag_summarizer/
├── data/                # Private documents
│   └── companyPolicies.txt
├── config/              # Environment configs
│   └── .env
├── embeddings/          # Embedding-related functions
│   └── embedder.py
├── retriever/           # Retriever setup
│   └── retriever.py
├── llm/                 # Watsonx LLM setup
│   └── watsonx_llm.py
├── chain/               # Building the RAG conversation chain
│   └── qa_chain.py
├── utils/               # Helpers (optional)
├── main.py              # Entry point
├── README.md            # Project documentation
└── requirements.txt     # Dependencies
```
- Clone the repository:

```shell
git clone https://github.com/yourusername/Knowledge-Augmented-Gen.git
cd Knowledge-Augmented-Gen
```

- Install dependencies:

```shell
pip install -r requirements.txt
```

- Set up the `.env` file inside `/config/.env`:
```
IBM_URL=your_ibm_url
IBM_API_KEY=your_ibm_api_key
IBM_PROJECT_ID=your_project_id
```
📝 Ensure your IBM Watsonx credentials are secured here.
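At startup these values are pulled into the environment via `dotenv`. The snippet below is a minimal, stdlib-only sketch of what that loading amounts to; the `load_env` helper is hypothetical, and the real project would simply call `load_dotenv("config/.env")` from the `python-dotenv` package.

```python
import os

def load_env(path):
    """Minimal stand-in for dotenv's load_dotenv: parse KEY=VALUE lines
    and export them, without overwriting variables already set."""
    with open(path) as f:
        for line in f:
            line = line.strip()
            if not line or line.startswith("#") or "=" not in line:
                continue  # skip blank lines, comments, and malformed lines
            key, _, value = line.partition("=")
            os.environ.setdefault(key.strip(), value.strip())
```

Keeping credentials in `.env` (and out of version control) is what lets the Watsonx keys stay secret while the rest of the code remains shareable.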
- Load documents: `.txt` files (company policies, guidelines, manuals, etc.).
- Split into chunks: using `CharacterTextSplitter` for manageable processing.
- Convert to embeddings: using Hugging Face models.
- Store locally: in Chroma DB for fast vector-based retrieval.
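The splitting step can be illustrated with a simplified sliding-window splitter. This is a toy sketch of the idea, not LangChain's actual `CharacterTextSplitter` implementation (which splits on a separator before windowing):

```python
def split_text(text, chunk_size=1000, chunk_overlap=200):
    """Slide a fixed-size window over the text, overlapping consecutive
    chunks so context at chunk boundaries isn't lost."""
    if chunk_overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk size")
    step = chunk_size - chunk_overlap
    chunks = []
    for start in range(0, len(text), step):
        chunk = text[start:start + chunk_size]
        if chunk:
            chunks.append(chunk)
        if start + chunk_size >= len(text):
            break
    return chunks
```

In the real pipeline the equivalent call is `CharacterTextSplitter(chunk_size=1000, chunk_overlap=200).split_documents(docs)`; the overlap is what lets a sentence cut at a boundary still be retrieved whole.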
- Query initiated by user.
- Relevant text chunks retrieved locally.
- Only relevant snippets are sent to IBM Watsonx.ai LLM.
- Returns accurate, contextual, and private responses.
- Retains multi-turn conversation history, allowing context-aware follow-up questions.
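The local retrieval step that happens before anything leaves the machine boils down to nearest-neighbour search over embedding vectors. Below is a toy, stdlib-only sketch of that idea (in the real project Chroma DB performs this search over Hugging Face embeddings; `cosine` and `retrieve` are hypothetical helpers):

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

def retrieve(query_vec, chunk_vecs, chunks, k=2):
    """Return the k chunks whose embeddings are most similar to the query;
    only these top-k snippets would then be sent to the LLM."""
    ranked = sorted(zip(chunks, chunk_vecs),
                    key=lambda pair: cosine(query_vec, pair[1]),
                    reverse=True)
    return [chunk for chunk, _ in ranked[:k]]
```

Because only the top-k snippets are forwarded to Watsonx.ai, the bulk of the document corpus never leaves local storage.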
Run the project:

```shell
python main.py
```

You'll enter an interactive prompt:

```
Welcome! Ask your questions (type 'exit' to quit):
Question: What is the mobile policy?
Answer: [LLM-generated answer]
Question: What is the aim of it?
Answer: [Context-aware response]
```

Type `exit` to quit.
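The interactive loop in `main.py` can be sketched as follows. The `qa_chain` argument is a stand-in for the real RAG conversation chain, and `run_repl` is a hypothetical helper; the key detail is that the running history is passed back on every turn, which is what makes follow-up questions context-aware:

```python
def run_repl(qa_chain, input_fn=input, output_fn=print):
    """Prompt in a loop, delegating each question plus the accumulated
    conversation history to the chain, until the user types 'exit'."""
    output_fn("Welcome! Ask your questions (type 'exit' to quit):")
    history = []  # (question, answer) pairs accumulated across turns
    while True:
        question = input_fn("Question: ").strip()
        if question.lower() == "exit":
            break
        answer = qa_chain(question, history)
        history.append((question, answer))
        output_fn(f"Answer: {answer}")
```

Injecting `input_fn`/`output_fn` instead of calling `input`/`print` directly keeps the loop testable without a terminal.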
- Replace IBM Watsonx with local LLMs (Mistral, Llama 2, GPT4All).
- Add support for PDF/CSV documents.
- Add Flask or Streamlit UI.
- Integrate logging and analytics for usage tracking.
- Keeps private documents secure & local.
- Minimizes data exposure.
- Leverages state-of-the-art RAG architecture.
- Easily extensible and production-ready.
MIT License – feel free to fork, modify, and build upon!
- Md Yeasin Arafath (Developer, Architect)