A modern chatbot application built with Streamlit that leverages Large Language Models via OpenAI-compatible APIs. Features include document upload, RAG (Retrieval-Augmented Generation) capabilities, and a clean, interactive UI.
- 🤖 OpenAI-Compatible API Integration - Works with local LLM servers like vLLM
- 📄 Document Upload & Processing - Support for various file types with intelligent parsing
- 🔍 RAG (Retrieval-Augmented Generation) - Vector search with Qdrant for enhanced responses
- 🎨 Clean Streamlit UI - Modern, responsive chat interface
- ⚙️ Configurable - Extensive configuration options via environment variables
- 🐳 Docker Support - Ready-to-use Docker Compose setup for LLM servers
- Python 3.10+
- Docker (optional, for local LLM servers)
- NVIDIA GPU (optional, for local LLM inference)
- Clone the repository:

  ```bash
  git clone <repository-url>
  cd llm-chat
  ```

- Install the package:

  ```bash
  # Basic installation
  pip install -e .

  # Or install all dependencies (recommended for development)
  pip install -e ".[all]"

  # Using make targets (see Requirements section below)
  make install-all
  ```

- Set up environment variables:

  ```bash
  cp .env.example .env
  # Edit .env with your preferred settings
  ```

- Run the chatbot:

  ```bash
  chatbot
  ```
The application uses environment variables for configuration. Copy `.env.example` to `.env` and customize:

```bash
CHATBOT_OPENAI_API_BASE=http://localhost:8000/v1  # LLM API endpoint
CHATBOT_LLM_MODEL_NAME=Qwen/Qwen2.5-Coder-7B-Instruct-AWQ
CHATBOT_TEMPERATURE=0.7
CHATBOT_MAX_TOKENS=2000
CHATBOT_HOST=127.0.0.1
CHATBOT_PORT=8080
CHATBOT_QDRANT__URL=http://localhost:6333  # Vector database URL
CHATBOT_QDRANT__COLLECTION_NAME=chatbot    # Collection name
CHATBOT_RAG__EMBEDDING_MODEL=BAAI/bge-small-en-v1.5
CHATBOT_RAG__CHUNK_SIZE=1024
CHATBOT_RAG__TOP_K=5
```
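The double-underscore names (e.g. `CHATBOT_QDRANT__URL`) map to nested settings groups. Below is a minimal sketch of how such variables can be loaded with pydantic-settings; the real field names and defaults live in `chatbot/settings.py`, so treat this as illustrative rather than the project's actual settings class.

```python
# Illustrative only: a pydantic-settings model that would pick up the
# CHATBOT_* variables above. The real class lives in chatbot/settings.py
# and may differ in names and structure.
from pydantic import BaseModel
from pydantic_settings import BaseSettings, SettingsConfigDict


class QdrantSettings(BaseModel):
    url: str = "http://localhost:6333"
    collection_name: str = "chatbot"


class RagSettings(BaseModel):
    embedding_model: str = "BAAI/bge-small-en-v1.5"
    chunk_size: int = 1024
    top_k: int = 5


class Settings(BaseSettings):
    # CHATBOT_ prefix and "__" nesting match the variable names in .env.example
    model_config = SettingsConfigDict(
        env_prefix="CHATBOT_", env_nested_delimiter="__", env_file=".env"
    )

    openai_api_base: str = "http://localhost:8000/v1"
    llm_model_name: str = "Qwen/Qwen2.5-Coder-7B-Instruct-AWQ"
    temperature: float = 0.7
    max_tokens: int = 2000
    host: str = "127.0.0.1"
    port: int = 8080
    qdrant: QdrantSettings = QdrantSettings()
    rag: RagSettings = RagSettings()


settings = Settings()  # e.g. CHATBOT_QDRANT__URL overrides settings.qdrant.url
```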
Start a local LLM server with vLLM:

```bash
# For 7B model (recommended for most users)
docker-compose up vllm-qwen-2.5-coder-7b-instruct-awq

# For 14B model (requires more VRAM)
docker-compose up vllm-qwen-2.5-coder-14b-instruct-awq

# Start vector database
docker-compose up qdrant
```

Note: Update the model paths in `docker-compose.yaml` to match your local model storage.
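Once the containers are up, you can sanity-check both endpoints from Python. This is an optional check rather than part of the application; it assumes the default URLs from `.env.example` and requires the `openai` and `qdrant-client` packages.

```python
# Quick connectivity check for the local vLLM and Qdrant containers.
# Assumes the default URLs from .env.example; not part of the chatbot itself.
from openai import OpenAI
from qdrant_client import QdrantClient

# vLLM exposes an OpenAI-compatible API; the key is unused locally but required by the client.
llm = OpenAI(base_url="http://localhost:8000/v1", api_key="EMPTY")
reply = llm.chat.completions.create(
    model="Qwen/Qwen2.5-Coder-7B-Instruct-AWQ",
    messages=[{"role": "user", "content": "Say hello in one short sentence."}],
    max_tokens=32,
)
print(reply.choices[0].message.content)

# Qdrant: list existing collections to confirm the vector database is reachable.
qdrant = QdrantClient(url="http://localhost:6333")
print(qdrant.get_collections())
```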
The project provides flexible dependency installation through make targets:
```bash
# Install all dependencies (default, recommended)
make install-all

# Install only base/production dependencies
make install-base

# Install development dependencies only
make install-dev

# Install profiling dependencies only
make install-profiling

# Install uv package manager for faster installs
make install-uv

# Show all available requirements targets
make -f .makefiles/requirements.mk help
```

The dependency groups are:

- Base: Core application dependencies (streamlit, openai, llama-index, etc.)
- Dev: Testing and linting tools (pytest, ruff, mypy, bandit)
- Profiling: Performance analysis tools (matplotlib, plotly, py-spy)
- All: Combines all dependency groups above
Set up the development environment:

```bash
# Install all dependencies (includes dev and profiling tools)
make install-all

# Set up pre-commit hooks and complete dev environment
make dev

# Install uv package manager for faster dependency management
make install-uv
```

Run linting, formatting, type checks, and security scans:

```bash
# Run all linting and formatting
make lint-all

# Individual tools
make format      # Format with ruff
make lint        # Check with ruff
make type-check  # Type check with mypy
make security    # Security scan with bandit

# Quick validation (no formatting changes)
make check
```

Run the test suite:

```bash
# Run all tests
make test

# Run with coverage
make test-coverage

# Unit tests only
make test-unit
```

The application follows a modular design with clear separation of concerns:
- UI Layer (`chatbot/utils/ui.py`) - Streamlit interface components
- Chat Logic (`chatbot/utils/chat.py`) - LLM interaction and conversation management
- RAG System (`chatbot/rag.py`) - Document processing and vector search
- Configuration (`chatbot/settings.py`) - Centralized settings management
- Web Processing (`chatbot/utils/web.py`) - URL handling and content fetching
- Resources (`chatbot/resources.py`) - Resource management and initialization
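To make the separation of concerns concrete, the sketch below shows how a RAG-augmented chat turn could flow through these layers. The function names (`retrieve_chunks`, `answer`) are hypothetical and chosen purely for illustration; the actual interfaces are defined in the modules listed above.

```python
# Hypothetical illustration of a RAG-augmented chat turn; the real logic
# lives in chatbot/rag.py and chatbot/utils/chat.py and may differ.
from openai import OpenAI


def retrieve_chunks(question: str, top_k: int = 5) -> list[str]:
    """Stand-in for the RAG system: return the top_k most relevant text chunks."""
    # In the real app this would embed the question and query Qdrant.
    return ["<retrieved document chunk>"]


def answer(question: str, client: OpenAI, model: str) -> str:
    """Stand-in for the chat logic: combine retrieved context with the user question."""
    context = "\n\n".join(retrieve_chunks(question))
    messages = [
        {"role": "system", "content": f"Answer using this context:\n{context}"},
        {"role": "user", "content": question},
    ]
    response = client.chat.completions.create(model=model, messages=messages)
    return response.choices[0].message.content


# The UI layer would call answer() for each user message and render the result.
```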
The application works with any OpenAI-compatible API. Tested models include:
- Qwen2.5-Coder series (7B, 14B) - Optimized for code generation
- Custom models via vLLM server
- OpenAI GPT models (with API key)
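To use a hosted OpenAI model instead of a local vLLM server, point the API base at OpenAI and supply a key. The exact variable the application reads for the key is not shown in `.env.example`; `OPENAI_API_KEY` below is the `openai` SDK's conventional name, so treat it as an assumption and confirm against `chatbot/settings.py`.

```python
# Illustrative: switch the chatbot from a local vLLM server to OpenAI's hosted API.
# OPENAI_API_KEY is the openai SDK's default variable, assumed here for illustration.
import os

os.environ["CHATBOT_OPENAI_API_BASE"] = "https://api.openai.com/v1"
os.environ["CHATBOT_LLM_MODEL_NAME"] = "gpt-4o-mini"  # any chat-capable OpenAI model
os.environ["OPENAI_API_KEY"] = "sk-..."               # your real API key
```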