
🚀 LuminaDoc: Your Offline Document Intelligence Hub 🔍

LuminaDoc is a powerful, privacy-focused document analysis tool that brings RAG (Retrieval-Augmented Generation) capabilities to your local environment. Process, analyze, and interact with your documents using advanced LLMs—all without requiring an internet connection.


Demo

📚 Key Features

  • 100% Offline Processing: Your documents never leave your machine.
  • Built-in RAG Pipeline: Advanced retrieval and generation workflows.
  • Local Vector Storage: Supports ChromaDB for efficient retrieval.
  • Format Support: PDF, DOCX, TXT, CSV, XLSX.
  • Intelligent Chunking: Precise document splitting for accurate semantic search.
  • Conversational Interface: Ask questions, get contextual answers.
  • Custom Knowledge Base: Build and manage your knowledge repositories.
  • Multi-file Upload: Upload and process multiple documents at once.
  • Simple Streamlit UI: Clean, native Streamlit interface—no HTML required.

💡 Perfect For:

  • 📊 Researchers: Analyze private datasets securely.
  • ⚖️ Legal Professionals: Handle confidential legal documents locally.
  • 💼 Business Analysts: Process sensitive data without cloud dependencies.
  • 🧑‍💻 Developers: Build offline-first AI-powered applications.
  • 🔐 Privacy Advocates: Ensure your data stays secure and private.

🛠️ Tech Stack

  • Backend: Python 3.9+
  • Frontend: Streamlit
  • Database: Local Vector DB (ChromaDB)
  • Embedding Model: Ollama (nomic-embed-text)
  • LLM Support: Local LLMs (e.g., llama3.2:latest)
  • Cross-Encoder: Sentence Transformers (ms-marco-MiniLM-L-6-v2)

🔒 Privacy First

  • No Cloud Services: All processing happens locally.
  • No API Keys Required: Run without signing up for any third-party services.
  • No Data Sharing: Your documents remain secure on your machine.

📦 Installation Guide

Follow these steps to install and run LuminaDoc locally:

Step 1: Clone the Repository

Open your terminal and clone the LuminaDoc repository:

git clone https://github.com/rzeta-10/LuminaDoc.git
cd LuminaDoc

Step 2: Create a Virtual Environment

Create an isolated Python environment:

python3 -m venv venv
source venv/bin/activate  # On Linux/MacOS
venv\Scripts\activate     # On Windows

Step 3: Install Dependencies

Install the required Python packages:

pip install --upgrade pip
pip install -r requirements.txt

Ensure requirements.txt includes the following packages:

  • streamlit
  • langchain-ollama
  • chromadb
  • sentence-transformers
  • PyMuPDF
  • httpx
  • langchain-community
  • langchain
  • ollama

Step 4: Install Ollama and Pull Models

If Ollama is not installed, run the following commands:

For macOS:

brew install ollama

For Linux:

curl -fsSL https://ollama.com/install.sh | sh

Verify the installation:

ollama --version

Pull Required Models:

Download the embedding and chat models that LuminaDoc uses:

ollama pull nomic-embed-text
ollama pull llama3.2:latest

Start the Ollama server:

ollama serve

Verify the models:

curl http://localhost:11434/api/tags

Ensure the nomic-embed-text and llama3.2 models are listed.
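
You can also verify programmatically with httpx (already in the dependency list). A minimal sketch, assuming the default server address; /api/tags is Ollama's standard model-listing endpoint:

# check_models.py: confirm the required models are available locally.
import httpx

response = httpx.get("http://localhost:11434/api/tags")
names = [model["name"] for model in response.json().get("models", [])]
print("Installed models:", names)

# LuminaDoc needs both the embedding model and the chat model.
missing = [m for m in ("nomic-embed-text", "llama3.2")
           if not any(name.startswith(m) for name in names)]
if missing:
    print("Run `ollama pull` for:", ", ".join(missing))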


Step 5: Start LuminaDoc

Run the Streamlit app locally:

streamlit run app.py

The app will start, and you’ll see a URL in your terminal, usually:

Local URL: http://localhost:8501

Open this in your browser.


🚀 Using LuminaDoc

  1. Upload Documents:

    • In the sidebar, upload one or more documents (PDF, DOCX, TXT, CSV, XLSX).
    • Click on the Process Document(s) button.
    • Need another format? Request it in the feedback!
  2. Ask Questions:

    • Enter your question in the Ask a Question text area.
    • Click Ask to retrieve relevant information.
  3. View Results:

    • The system will show the retrieved context and the AI-generated answer.
    • Explore retrieved documents and relevant text via expandable sections.

Environment Variables (Optional)

If you want to customize configurations:

Create a .env file:

OLLAMA_SERVER_URL=http://localhost:11434
VECTOR_DB_PATH=./rag-chroma

Update your code to load the .env file if needed, as sketched below.
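
A minimal sketch using the python-dotenv package (install it separately with pip install python-dotenv; it is not in the dependency list above). The variable names match the .env file shown:

# config.py: read optional overrides from .env, falling back to the defaults.
import os

from dotenv import load_dotenv  # from the python-dotenv package

load_dotenv()  # reads .env from the current working directory, if present

OLLAMA_SERVER_URL = os.getenv("OLLAMA_SERVER_URL", "http://localhost:11434")
VECTOR_DB_PATH = os.getenv("VECTOR_DB_PATH", "./rag-chroma")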


Troubleshooting

  1. Port Conflicts: Ensure Ollama is running on http://localhost:11434 and that no other service is bound to that port.

  2. Dependency Issues: Run pip install -r requirements.txt again.

  3. Streamlit Errors: Clear the cache:

    streamlit cache clear

  4. Logs: Check the Ollama server logs for errors; enable debug output with:

    OLLAMA_DEBUG=1 ollama serve

🧠 How It Works

  1. Document Upload & Processing:

    • Documents are uploaded and split into smaller semantic chunks.
    • Chunks are embedded using nomic-embed-text and stored in ChromaDB.
  2. User Query:

    • Your query searches the vector database for relevant chunks.
    • Results are re-ranked using a cross-encoder (ms-marco-MiniLM-L-6-v2).
  3. LLM Response:

    • The retrieved context is passed to the LLM (e.g., llama3.2) for final answer generation.
    • The answer is displayed in the Streamlit interface.
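
For developers, here is a minimal end-to-end sketch of those three steps built from the tech stack above. It is illustrative rather than a copy of app.py: the file name, chunk sizes, and prompt are assumptions.

# rag_sketch.py: illustrative pipeline, not the actual app.py code.
import fitz  # PyMuPDF
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain_community.vectorstores import Chroma
from langchain_ollama import ChatOllama, OllamaEmbeddings
from sentence_transformers import CrossEncoder

# 1. Document upload & processing: extract text, split into semantic chunks,
#    embed with nomic-embed-text, and persist the chunks in ChromaDB.
raw_text = "".join(page.get_text() for page in fitz.open("example.pdf"))
splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=100)
chunks = splitter.split_text(raw_text)
embeddings = OllamaEmbeddings(model="nomic-embed-text")
store = Chroma.from_texts(chunks, embeddings, persist_directory="./rag-chroma")

# 2. User query: similarity search, then re-rank with the cross-encoder.
query = "What is this document about?"
candidates = store.similarity_search(query, k=10)
reranker = CrossEncoder("cross-encoder/ms-marco-MiniLM-L-6-v2")
scores = reranker.predict([(query, d.page_content) for d in candidates])
ranked = sorted(zip(scores, candidates), key=lambda pair: pair[0], reverse=True)
context = "\n\n".join(d.page_content for _, d in ranked[:3])

# 3. LLM response: pass the re-ranked context to the local LLM.
llm = ChatOllama(model="llama3.2:latest")
answer = llm.invoke(f"Answer using only this context:\n{context}\n\nQuestion: {query}")
print(answer.content)

The cross-encoder pass adds a little latency per query but tends to improve the ordering of retrieved chunks before they reach the LLM.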

🔗 Next Steps

  • Explore advanced features like custom pipelines.
  • Experiment with different local LLM models.
  • Contribute to the project on GitHub!

Illuminate your documents with AI-powered insights—locally and securely. 🚀📚
