# Ollama OpenRouter LangChain RAG System
A sophisticated Retrieval-Augmented Generation (RAG) system built with FastAPI, LangChain, and support for both local Ollama models and cloud-based OpenRouter models. This system enables intelligent document analysis and question-answering across multiple specialized knowledge domains.
## 🚀 Features
- Multi-Service Architecture: Pre-configured RAG services for 11 different academic subjects
- Dual LLM Support: Works with both local Ollama models and cloud OpenRouter models
- Document Processing: Advanced PDF processing with relevance filtering and metadata standardization
- Streaming Responses: Real-time streaming for both Q&A and summarization
- Vector Database: Powered by Chroma with multilingual embeddings
- RESTful API: Complete FastAPI-based REST API with automatic documentation
- Document Relevance Filtering: AI-powered document filtering based on topic relevance
- Batch Processing: Efficient document processing in configurable batches
## 📋 Pre-configured Services

The system comes with 11 pre-configured academic services:
- Macroeconomia - Macroeconomics
- Microeconomia - Microeconomics
- Diritto Commerciale - Commercial Law
- Bilancio - Accounting/Balance Sheet
- Diritto Tributario - Tax Law
- Diritto Privato - Private Law
- Storia Economica - Economic History
- Statistica I - Statistics I
- Informatica - Computer Science
- Metodi Statistici - Statistical Methods
- Marketing - Marketing
## 🛠 Installation

### Prerequisites

- Python 3.8+
- Either Ollama (for local models) or an OpenRouter API key (for cloud models)
### Setup

1. Clone the repository:

   ```bash
   git clone <repository-url>
   cd ollama_openrouter_langchain_rag
   ```

2. Install dependencies:

   ```bash
   pip install fastapi uvicorn langchain langchain-community langchain-huggingface langchain-openai chromadb pymupdf sentence-transformers
   ```

3. Configure environment variables.

   For Ollama (local models):

   ```bash
   export OLLAMA_BASE_URL="http://localhost:11434"  # Single or comma-separated URLs
   export USE_OPENROUTER="False"
   ```

   For OpenRouter (cloud models):

   ```bash
   export USE_OPENROUTER="True"
   export OPENROUTER_API_KEY="your-openrouter-api-key"
   ```

   Optional:

   ```bash
   export ASSEMBLYAI_API_KEY="your-assemblyai-key"  # For audio transcription
   ```
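Boolean flags such as `USE_OPENROUTER` are string-valued environment variables, so they need explicit parsing. Below is an illustrative sketch of one way to read them; `parse_flag` and `env_flag` are hypothetical names, not functions from this codebase:

```python
import os

def parse_flag(value: str) -> bool:
    """Interpret common truthy strings ("True", "1", "yes") as True."""
    return value.strip().lower() in {"1", "true", "yes"}

def env_flag(name: str, default: str = "False") -> bool:
    """Read an environment variable and parse it as a boolean flag."""
    return parse_flag(os.environ.get(name, default))

use_openrouter = env_flag("USE_OPENROUTER")
```

Treating only an explicit allow-list of strings as true avoids surprises like `"False"` being truthy as a non-empty string.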
## 🚀 Usage

### Starting the Server

```bash
cd rag_logic
python rag_app.py
```

The server starts on http://localhost:8000, with automatic API documentation available at http://localhost:8000/docs.
### API Endpoints

#### Service Management

- `GET /services/status` - Get status of all services
- `GET /services/{service_id}/status` - Get status of a specific service
- `POST /services/{service_id}/initialize` - Initialize a service
- `PATCH /services/{service_id}/topic` - Update a service's topic

#### Document Management

- `POST /services/{service_id}/upload-documents` - Upload PDF documents

  ```bash
  curl -X POST "http://localhost:8000/services/1/upload-documents" \
    -F "files=@document.pdf"
  ```

#### Question & Answer

- `POST /services/{service_id}/ask` - Ask a question (blocking)

  ```bash
  curl -X POST "http://localhost:8000/services/1/ask" \
    -H "Content-Type: application/json" \
    -d '{"question": "What is macroeconomics?"}'
  ```

- `POST /services/{service_id}/stream-ask` - Ask a question (streaming)

#### Summarization

- `POST /services/{service_id}/summarize` - Get a document summary (blocking)
- `POST /services/{service_id}/stream-summarize` - Get a document summary (streaming)
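The streaming endpoints can be consumed incrementally from Python. The sketch below uses only the standard library and assumes the endpoint emits plain text chunks; check the actual response format at `/docs` before relying on it:

```python
import json
from urllib import request

def stream_ask(base_url, service_id, question):
    """Yield response chunks from the stream-ask endpoint as they arrive."""
    url = f"{base_url}/services/{service_id}/stream-ask"
    data = json.dumps({"question": question}).encode("utf-8")
    req = request.Request(url, data=data,
                          headers={"Content-Type": "application/json"})
    with request.urlopen(req) as resp:
        while True:
            chunk = resp.read(1024)  # read in small pieces for low latency
            if not chunk:
                break
            yield chunk.decode("utf-8", errors="replace")

# Usage (requires a running server):
# for piece in stream_ask("http://localhost:8000", 1, "What is GDP?"):
#     print(piece, end="", flush=True)
```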
### Python Client Example

```python
import requests

# Initialize the service
response = requests.post("http://localhost:8000/services/1/initialize")

# Upload documents (use a context manager so the file handle is closed)
with open("document.pdf", "rb") as f:
    response = requests.post(
        "http://localhost:8000/services/1/upload-documents",
        files={"files": f},
    )

# Ask a question
question_data = {"question": "Explain the concept of GDP"}
response = requests.post("http://localhost:8000/services/1/ask", json=question_data)
print(response.json())
```
## 🏗 Architecture

### Core Components

- **RAGService** (`rag_service.py`): Core RAG logic with document processing, vector storage, and LLM integration
- **FastAPI Application** (`rag_api.py`): RESTful API endpoints with async support
- **Application Runner** (`rag_app.py`): Main application entry point
### Key Features
- Document Processing: Automatic PDF text extraction with metadata standardization
- Relevance Filtering: AI-powered filtering to include only relevant documents
- Vector Storage: Persistent Chroma database with multilingual embeddings
- Streaming Support: Real-time response streaming for better user experience
- Error Handling: Comprehensive error handling and graceful degradation
### Model Support

- **Ollama Models**: Support for local models like `llama3.2:1b`
- **OpenRouter Models**: Cloud-based models like `deepseek/deepseek-r1:free`
- **Multiple Endpoints**: Support for multiple Ollama instances via comma-separated URLs
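The comma-separated `OLLAMA_BASE_URL` convention can be handled with a small helper. This is an illustrative sketch, not the project's actual parsing code; `parse_ollama_urls` is a hypothetical name:

```python
import itertools
import os

def parse_ollama_urls(raw=None):
    """Split OLLAMA_BASE_URL into a list of endpoint URLs."""
    if raw is None:
        raw = os.environ.get("OLLAMA_BASE_URL", "http://localhost:11434")
    return [u.strip() for u in raw.split(",") if u.strip()]

# Simple round-robin over the configured endpoints:
urls = parse_ollama_urls("http://host-a:11434, http://host-b:11434")
next_endpoint = itertools.cycle(urls)
```

`itertools.cycle` gives a trivial round-robin scheduler; a production setup would also want health checks and failover.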
## ⚙️ Configuration

### Environment Variables

| Variable | Description | Default |
|---|---|---|
| `USE_OPENROUTER` | Use OpenRouter instead of Ollama | `False` |
| `OPENROUTER_API_KEY` | OpenRouter API key | Required if using OpenRouter |
| `OLLAMA_BASE_URL` | Ollama base URL(s) | `http://localhost:11434` |
| `ASSEMBLYAI_API_KEY` | AssemblyAI API key for audio transcription | Optional |
### Service Configuration

Services are configured in `DEFAULT_SERVICE_CONFIGS` with:

- **Service ID**: Numeric identifier (1-11)
- **Topic**: Subject matter for the service
- **Model**: LLM model to use

### Document Processing

- **Batch Size**: 100 documents per batch
- **Relevance Threshold**: 0.9 (configurable)
- **Embedding Model**: `sentence-transformers/paraphrase-multilingual-mpnet-base-v2`
- **Vector Database**: Persistent Chroma storage
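The batch-and-filter behavior described above can be sketched in a few lines. The function names are illustrative; the real implementation lives in `RAGService`:

```python
def batched(docs, batch_size=100):
    """Yield successive batches of at most batch_size documents."""
    for i in range(0, len(docs), batch_size):
        yield docs[i:i + batch_size]

def filter_relevant(scored_docs, threshold=0.9):
    """Keep only documents whose relevance score meets the threshold."""
    return [doc for doc, score in scored_docs if score >= threshold]

# 250 documents split into batches of 100, 100, and 50:
batches = list(batched(list(range(250)), batch_size=100))
```

Batching keeps memory bounded during embedding, and a high threshold like 0.9 trades recall for precision when deciding which documents enter the vector store.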
## 📁 Project Structure

```
ollama_openrouter_langchain_rag/
├── README.md
└── rag_logic/
    ├── rag_app.py            # Main application entry point
    └── lib/
        ├── __init__.py
        ├── rag_api.py        # FastAPI application and endpoints
        └── rag_service.py    # Core RAG service implementation
```
## 🔧 Development

### Adding New Services

1. Add a configuration entry to `DEFAULT_SERVICE_CONFIGS` in `rag_service.py`
2. Restart the application
3. The new service will be automatically available

### Customizing Models

Edit the model configuration in `DEFAULT_SERVICE_CONFIGS`:

```python
"12": {"topic": "New Subject", "model": "your-model-name"}
```
### Extending Document Types

The system supports PDF documents by default. To add support for other formats, extend the document loading logic in `RAGService`.
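One common pattern for this is a suffix-to-loader registry. The sketch below is hypothetical; `register_loader` and `load_document` are not part of the current codebase, and `load_pdf` merely stands in for the existing PyMuPDF-based logic:

```python
from pathlib import Path

def load_pdf(path):
    """Placeholder for the existing PyMuPDF-based PDF loading logic."""
    raise NotImplementedError

# Map file extensions to loader functions.
LOADERS = {".pdf": load_pdf}

def register_loader(suffix, loader):
    """Register a loader function for a new file extension."""
    LOADERS[suffix.lower()] = loader

def load_document(path):
    """Dispatch to the loader registered for the file's extension."""
    loader = LOADERS.get(Path(path).suffix.lower())
    if loader is None:
        raise ValueError(f"Unsupported document type: {path}")
    return loader(path)
```

New formats then plug in without touching the dispatch logic, e.g. `register_loader(".txt", my_text_loader)`.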
## 📝 API Documentation

Once the server is running, visit:

- **Swagger UI**: http://localhost:8000/docs
- **ReDoc**: http://localhost:8000/redoc
## 🤝 Contributing

1. Fork the repository
2. Create a feature branch
3. Make your changes
4. Add tests if applicable
5. Submit a pull request
## 📄 License

This project is open source. Please check the license file for details.
## 🆘 Troubleshooting

### Common Issues

- **Model not found**: Ensure Ollama is running and the model has been pulled
- **OpenRouter authentication**: Verify your API key is correct
- **Document upload fails**: Check file permissions and available disk space
- **Slow responses**: Consider using smaller models or increasing batch sizes
### Performance Tips
- Use local Ollama models for faster responses
- Adjust batch sizes based on available memory
- Pre-initialize services at startup for better performance
- Use SSD storage for vector database persistence
## 📞 Support

For issues and questions:

- Check the API documentation at `/docs`
- Review the troubleshooting section
- Create an issue in the repository