A simple RAG system with a React interface, powered by Ollama and OpenRouter through the LangChain library.

Ollama OpenRouter LangChain RAG System

A sophisticated Retrieval-Augmented Generation (RAG) system built with FastAPI, LangChain, and support for both local Ollama models and cloud-based OpenRouter models. This system enables intelligent document analysis and question-answering across multiple specialized knowledge domains.

🚀 Features

  • Multi-Service Architecture: Pre-configured RAG services for 11 different academic subjects
  • Dual LLM Support: Works with both local Ollama models and cloud OpenRouter models
  • Document Processing: Advanced PDF processing with relevance filtering and metadata standardization
  • Streaming Responses: Real-time streaming for both Q&A and summarization
  • Vector Database: Powered by Chroma with multilingual embeddings
  • RESTful API: Complete FastAPI-based REST API with automatic documentation
  • Document Relevance Filtering: AI-powered document filtering based on topic relevance
  • Batch Processing: Efficient document processing in configurable batches

📋 Pre-configured Services

The system comes with 11 pre-configured academic services:

  1. Macroeconomia - Macroeconomics
  2. Microeconomia - Microeconomics
  3. Diritto Commerciale - Commercial Law
  4. Bilancio - Accounting/Balance Sheet
  5. Diritto Tributario - Tax Law
  6. Diritto Privato - Private Law
  7. Storia Economica - Economic History
  8. Statistica I - Statistics I
  9. Informatica - Computer Science
  10. Metodi Statistici - Statistical Methods
  11. Marketing - Marketing

🛠 Installation

Prerequisites

  • Python 3.8+
  • Either Ollama (for local models) or OpenRouter API key (for cloud models)

Setup

  1. Clone the repository:
git clone <repository-url>
cd ollama_openrouter_langchain_rag
  2. Install dependencies:
pip install fastapi uvicorn langchain langchain-community langchain-huggingface langchain-openai chromadb pymupdf sentence-transformers
  3. Configure environment variables:

For Ollama (local models):

export OLLAMA_BASE_URL="http://localhost:11434"  # Single or comma-separated URLs
export USE_OPENROUTER="False"

For OpenRouter (cloud models):

export USE_OPENROUTER="True"
export OPENROUTER_API_KEY="your-openrouter-api-key"

Optional:

export ASSEMBLYAI_API_KEY="your-assemblyai-key"  # For audio transcription

🚀 Usage

Starting the Server

cd rag_logic
python rag_app.py

The server will start on http://localhost:8000 with automatic API documentation available at http://localhost:8000/docs.

API Endpoints

Service Management

  • GET /services/status - Get status of all services
  • GET /services/{service_id}/status - Get status of specific service
  • POST /services/{service_id}/initialize - Initialize a service
  • PATCH /services/{service_id}/topic - Update service topic

Document Management

  • POST /services/{service_id}/upload-documents - Upload PDF documents
    curl -X POST "http://localhost:8000/services/1/upload-documents" \
         -F "files=@document.pdf"
    

Question & Answer

  • POST /services/{service_id}/ask - Ask a question (blocking)

    curl -X POST "http://localhost:8000/services/1/ask" \
         -H "Content-Type: application/json" \
         -d '{"question": "What is macroeconomics?"}'
    
  • POST /services/{service_id}/stream-ask - Ask a question (streaming)

Summarization

  • POST /services/{service_id}/summarize - Get document summary (blocking)
  • POST /services/{service_id}/stream-summarize - Get document summary (streaming)

Python Client Example

import requests

# Initialize service
response = requests.post("http://localhost:8000/services/1/initialize")

# Upload documents (use a context manager so the file handle is closed)
with open("document.pdf", "rb") as f:
    response = requests.post(
        "http://localhost:8000/services/1/upload-documents",
        files={"files": f},
    )

# Ask a question
question_data = {"question": "Explain the concept of GDP"}
response = requests.post("http://localhost:8000/services/1/ask", json=question_data)
print(response.json())

🏗 Architecture

Core Components

  1. RAGService (rag_service.py): Core RAG logic with document processing, vector storage, and LLM integration
  2. FastAPI Application (rag_api.py): RESTful API endpoints with async support
  3. Application Runner (rag_app.py): Main application entry point

Key Features

  • Document Processing: Automatic PDF text extraction with metadata standardization
  • Relevance Filtering: AI-powered filtering to include only relevant documents
  • Vector Storage: Persistent Chroma database with multilingual embeddings
  • Streaming Support: Real-time response streaming for better user experience
  • Error Handling: Comprehensive error handling and graceful degradation

Model Support

  • Ollama Models: Support for local models like llama3.2:1b
  • OpenRouter Models: Cloud-based models like deepseek/deepseek-r1:free
  • Multiple Endpoints: Support for multiple Ollama instances via comma-separated URLs
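As a sketch of how the comma-separated `OLLAMA_BASE_URL` value could be handled — the actual selection strategy inside `RAGService` is not documented here, and round-robin is only one plausible choice:

```python
import itertools
import os

def load_ollama_urls(env_value=None):
    """Split OLLAMA_BASE_URL into a list of endpoint URLs."""
    raw = env_value or os.getenv("OLLAMA_BASE_URL", "http://localhost:11434")
    return [url.strip() for url in raw.split(",") if url.strip()]

# Round-robin over the configured endpoints (one plausible strategy):
endpoints = load_ollama_urls("http://host-a:11434, http://host-b:11434")
picker = itertools.cycle(endpoints)
```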

⚙️ Configuration

Environment Variables

| Variable | Description | Default |
|---|---|---|
| USE_OPENROUTER | Use OpenRouter instead of Ollama | False |
| OPENROUTER_API_KEY | OpenRouter API key | Required if using OpenRouter |
| OLLAMA_BASE_URL | Ollama base URL(s) | http://localhost:11434 |
| ASSEMBLYAI_API_KEY | AssemblyAI API key for audio | Optional |

Service Configuration

Services are configured in DEFAULT_SERVICE_CONFIGS with:

  • Service ID: Numeric identifier (1-11)
  • Topic: Subject matter for the service
  • Model: LLM model to use
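The shape sketched below is an assumption based on the fields listed above; the real `DEFAULT_SERVICE_CONFIGS` lives in `rag_service.py` and may carry extra fields, and the model names shown are the ones mentioned elsewhere in this README:

```python
# Hypothetical shape of DEFAULT_SERVICE_CONFIGS (keys are string IDs):
DEFAULT_SERVICE_CONFIGS = {
    "1": {"topic": "Macroeconomia", "model": "llama3.2:1b"},
    "2": {"topic": "Microeconomia", "model": "llama3.2:1b"},
    # ... entries up to "11"
}

def get_service_config(service_id):
    """Look up a service by its numeric ID (keys are strings)."""
    config = DEFAULT_SERVICE_CONFIGS.get(str(service_id))
    if config is None:
        raise KeyError(f"Unknown service ID: {service_id}")
    return config
```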

Document Processing

  • Batch Size: 100 documents per batch
  • Relevance Threshold: 0.9 (configurable)
  • Embedding Model: sentence-transformers/paraphrase-multilingual-mpnet-base-v2
  • Vector Database: Persistent Chroma storage
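The batching and threshold behavior above can be sketched as two small helpers. This is illustrative only: how `RAGService` actually scores document relevance (the AI-powered check) is internal to it, so `scored_docs` here is a hypothetical list of (document, score) pairs.

```python
def batched(items, batch_size=100):
    """Yield successive batches of at most batch_size items."""
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]

def filter_relevant(scored_docs, threshold=0.9):
    """Keep only documents whose relevance score meets the threshold.

    scored_docs: list of (document, score) pairs; how the scores are
    produced is internal to RAGService.
    """
    return [doc for doc, score in scored_docs if score >= threshold]
```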

📁 Project Structure

ollama_openrouter_langchain_rag/
├── README.md
└── rag_logic/
    ├── rag_app.py              # Main application entry point
    └── lib/
        ├── __init__.py
        ├── rag_api.py          # FastAPI application and endpoints
        └── rag_service.py      # Core RAG service implementation

🔧 Development

Adding New Services

  1. Add configuration to DEFAULT_SERVICE_CONFIGS in rag_service.py
  2. Restart the application
  3. The new service will be automatically available

Customizing Models

Edit the model configuration in DEFAULT_SERVICE_CONFIGS:

"12": {"topic": "New Subject", "model": "your-model-name"}

Extending Document Types

The system supports PDF documents by default. To add support for other formats, extend the document loading logic in RAGService.
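One way to structure such an extension is a registry keyed by file extension. The names below (`LOADERS`, `load_document`) are hypothetical and not part of the existing codebase; the PDF branch would delegate to the existing pipeline in `RAGService`.

```python
from pathlib import Path

def load_pdf(path):
    """Placeholder: delegate to the existing PDF pipeline in RAGService."""
    raise NotImplementedError("delegate to the existing PDF pipeline")

def load_txt(path):
    """Example of a new format: plain-text files."""
    return Path(path).read_text(encoding="utf-8")

# Hypothetical registry mapping file extensions to loader callables:
LOADERS = {".pdf": load_pdf, ".txt": load_txt}

def load_document(path):
    """Dispatch to the loader registered for the file's extension."""
    suffix = Path(path).suffix.lower()
    try:
        return LOADERS[suffix](path)
    except KeyError:
        raise ValueError(f"Unsupported document type: {suffix}")
```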

📝 API Documentation

Once the server is running, visit:

  • Swagger UI: http://localhost:8000/docs
  • ReDoc: http://localhost:8000/redoc

🤝 Contributing

  1. Fork the repository
  2. Create a feature branch
  3. Make your changes
  4. Add tests if applicable
  5. Submit a pull request

📄 License

This project is open source. Please check the license file for details.

🆘 Troubleshooting

Common Issues

  1. Model not found: Ensure Ollama is running and the model is pulled
  2. OpenRouter authentication: Verify your API key is correct
  3. Document upload fails: Check file permissions and available disk space
  4. Slow responses: Consider using smaller models or increasing batch sizes

Performance Tips

  • Use local Ollama models for faster responses
  • Adjust batch sizes based on available memory
  • Pre-initialize services at startup for better performance
  • Use SSD storage for vector database persistence

📞 Support

For issues and questions:

  1. Check the API documentation at /docs
  2. Review the troubleshooting section
  3. Create an issue in the repository