TiMemory is an advanced AI chatbot system that maintains persistent, contextual memory across conversations using TiDB's vector database capabilities. The system features intelligent memory extraction, consolidation, and retrieval, providing users with a truly personalized conversational experience.
- Persistent Memory System: Advanced AI-powered memory extraction and consolidation using vector embeddings
- Real-time Streaming: Live chat responses via Server-Sent Events with incremental message updates
- Session Management: Organize conversations into sessions with AI-generated titles
- Memory Consolidation: Intelligent merging and conflict resolution for memories to prevent duplication
- Vector Database: TiDB-powered similarity search for contextual memory retrieval
- Terminal-style Interface: Clean, focused chat UI with retro terminal aesthetics
- User Isolation: Complete data separation and privacy between users
- Memory Forgetting: Human-like memory updates that preserve context rather than deleting information
The heart of the system is a sophisticated memory processing pipeline:
- Memory Extraction: AI-powered fact extraction from conversations with semantic categorization
- Similarity Search: Vector-based matching against existing memories using 1536-dimensional embeddings
- Memory Consolidation: LLM-driven conflict resolution and intelligent memory merging
- Vector Storage: Persistent storage in TiDB with user isolation and temporal tracking
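To make the matching step concrete, here is a minimal sketch of cosine-similarity search over 1536-dimensional embeddings. This is illustrative NumPy code, not TiMemory's actual implementation, which runs the search inside TiDB:

```python
import numpy as np

def top_k_similar(query: np.ndarray, stored: np.ndarray, k: int = 10) -> np.ndarray:
    """Return indices of the k stored embeddings closest to the query (cosine)."""
    scores = stored @ query / (np.linalg.norm(stored, axis=1) * np.linalg.norm(query))
    return np.argsort(scores)[::-1][:k]

query = np.random.rand(1536)        # embedding of a newly extracted fact
stored = np.random.rand(200, 1536)  # embeddings of existing memories
print(top_k_similar(query, stored))
```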
Memory Categories:
- Personal (relationships, important dates)
- Preferences (likes, dislikes, opinions)
- Activities (events, plans, behaviors)
- Professional (career, work details)
- Health (dietary restrictions, fitness)
- Miscellaneous (entertainment, interests)
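These categories could be modeled as a simple enum; the names below mirror the list above but are illustrative, not TiMemory's actual schema:

```python
from enum import Enum

class MemoryCategory(str, Enum):
    PERSONAL = "personal"            # relationships, important dates
    PREFERENCES = "preferences"      # likes, dislikes, opinions
    ACTIVITIES = "activities"        # events, plans, behaviors
    PROFESSIONAL = "professional"    # career, work details
    HEALTH = "health"                # dietary restrictions, fitness
    MISCELLANEOUS = "miscellaneous"  # entertainment, interests
```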
- API Layer: RESTful endpoints with FastAPI, comprehensive error handling
- Services Layer: Business logic for chat, memory, user, and session management
- Memory Integration: TiMemory system wrapper for web API integration
- Authentication: Simple username-based authentication with user data isolation
- Middleware: Request ID tracking, CORS handling, structured exception responses
- Observability: Logfire integration for monitoring, logging, and performance tracking
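As one example of the middleware layer, request ID tracking in FastAPI takes only a few lines. This is a generic sketch; the header name and attachment point are assumptions, not the project's exact code:

```python
import uuid
from fastapi import FastAPI, Request

app = FastAPI()

@app.middleware("http")
async def add_request_id(request: Request, call_next):
    # Attach a correlation ID so logs and error responses can be traced.
    request.state.request_id = str(uuid.uuid4())
    response = await call_next(request)
    response.headers["X-Request-ID"] = request.state.request_id
    return response
```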
- Real-time Chat: Streaming chat interface with Server-Sent Events
- Session Management: Create, switch, and delete conversation sessions
- Terminal UI: Monospace, dark-themed interface optimized for chat
- Local Persistence: Username storage via localStorage
- Auto-scroll & Focus: Smart UX with automatic message scrolling and input focus
- Python 3.9+ with the `uv` package manager
- Node.js 16+ with npm
- TiDB Cloud account or local TiDB instance with vector search enabled
- OpenAI API key
1. Install the TiMemory core system:

   ```bash
   cd TiMemory
   uv sync
   ```

2. Configure the TiMemory environment:

   ```bash
   cp .env.example .env
   # Edit .env with TiDB and OpenAI credentials
   ```

3. Set up and run the backend:

   ```bash
   cd ../backend
   uv sync
   cp .env.example .env
   # Edit .env with your configuration

   # Initialize database tables
   uv run python -c "from app.db.database import create_tables; create_tables()"

   # Start development server
   uv run uvicorn app.main:app --host 0.0.0.0 --port 8000 --reload
   ```

4. Install and run the frontend:

   ```bash
   cd frontend
   npm install
   npm run dev
   ```

5. Access the application at http://localhost:5173
Docker (alternative):

```bash
# Run the entire stack
docker-compose up --build

# Access frontend: http://localhost:3000
# Backend API: http://localhost:8000
```

Project Structure:

```
TiMemory/
├── TiMemory/              # Core memory engine (separate package)
│   ├── timemory.py        # Main memory processing pipeline
│   ├── tidb_vector.py     # Vector database operations
│   ├── models/            # Data models (Memory, Session, etc.)
│   ├── embedding/         # OpenAI embedding integration
│   ├── llms/              # LLM wrappers and integrations
│   ├── config/            # Configuration management
│   └── MEMORY.md          # Detailed technical documentation
├── backend/               # FastAPI web application
│   ├── app/
│   │   ├── api/v1/        # API endpoints (chat, admin)
│   │   ├── services/      # Business logic layer
│   │   ├── models/        # SQLAlchemy database models
│   │   ├── schemas/       # Pydantic request/response schemas
│   │   ├── dependencies/  # FastAPI dependency injection
│   │   ├── middleware/    # Request processing middleware
│   │   ├── core/          # Configuration and exceptions
│   │   └── db/            # Database connection management
│   └── CLAUDE.md          # Backend development guide
├── frontend/              # React application
│   ├── src/
│   │   ├── components/    # React components (Chat, Login, etc.)
│   │   ├── contexts/      # React context providers
│   │   ├── router/        # Client-side routing
│   │   └── styles/        # CSS styling
│   └── CLAUDE.md          # Frontend development guide
└── docker-compose.yml     # Multi-service deployment
```
TiMemory Core:

```bash
cd TiMemory
uv run python -c "from timemory import TiMemory; # test core functionality"
```

Backend:

```bash
cd backend
uv run uvicorn app.main:app --host 0.0.0.0 --port 8000 --reload  # Dev server
uv run python app/main.py  # Direct run
uv add <package>           # Add dependency
uv sync                    # Sync environment
```

Frontend:

```bash
cd frontend
npm run dev      # Development server (http://localhost:5173)
npm run build    # Production build
npm run preview  # Preview production build
```

Core TiMemory (.env in TiMemory/):
```bash
# OpenAI Configuration
OPENAI_API_KEY=your_openai_api_key
MODEL_CHOICE=gpt-4o-mini
EMBEDDING_MODEL=text-embedding-3-small
EMBEDDING_MODEL_DIMS=1536

# TiDB Configuration
TIDB_HOST=your_tidb_host
TIDB_PORT=4000
TIDB_USER=your_username
TIDB_PASSWORD=your_password
TIDB_DB_NAME=your_database_name
TIDB_USE_SSL=true

# Memory System Settings
MEMORY_COLLECTION_NAME=memories
MEMORY_SEARCH_LIMIT=10
MESSAGE_LIMIT=20
SUMMARY_THRESHOLD=10
```

Backend API (.env in backend/):

```bash
# Inherits TiMemory config plus:
LOGFIRE_TOKEN=your_logfire_token_for_observability
```

TiDB Setup:
- Create TiDB Cloud cluster with vector search enabled
- Configure SSL certificates for secure connections
- Create database and enable vector operations
- Update connection parameters in environment files
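For reference, here is a minimal sketch of a vector-backed memories table and a k-NN query in TiDB via pymysql, assuming TiDB's `VECTOR` column type and `VEC_COSINE_DISTANCE` function; the schema is illustrative, not TiMemory's actual DDL:

```python
import pymysql

# Connection parameters mirror the environment variables above;
# enable TLS for TiDB Cloud per your cluster's certificate setup.
conn = pymysql.connect(
    host="your_tidb_host", port=4000,
    user="your_username", password="your_password",
    database="your_database_name",
)

with conn.cursor() as cur:
    # VECTOR(1536) matches EMBEDDING_MODEL_DIMS above.
    cur.execute("""
        CREATE TABLE IF NOT EXISTS memories (
            id BIGINT PRIMARY KEY AUTO_INCREMENT,
            user_id VARCHAR(64) NOT NULL,
            content TEXT,
            embedding VECTOR(1536),
            KEY idx_user (user_id)
        )
    """)
    # k-NN query: TiDB accepts query vectors as JSON-style string literals.
    query_vec = "[" + ",".join(["0.0"] * 1536) + "]"
    cur.execute(
        "SELECT id, content, VEC_COSINE_DISTANCE(embedding, %s) AS dist "
        "FROM memories WHERE user_id = %s ORDER BY dist LIMIT 10",
        (query_vec, "demo-user"),
    )
    print(cur.fetchall())
```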
- Conversation Analysis: User-AI message pairs are analyzed for factual content
- Memory Extraction: AI extracts structured facts and categorizes them by type
- Similarity Search: New memories are compared against existing ones using vector embeddings
- Consolidation: Related memories are intelligently merged, with conflicts resolved through LLM reasoning
- Storage: Consolidated memories are stored with vector embeddings for future similarity search
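The flow above can be sketched in a few lines of Python. Function and type names here are illustrative; the real TiMemory pipeline adds TiDB vector search and LLM-driven consolidation between extraction and storage:

```python
from dataclasses import dataclass
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

@dataclass
class Memory:
    text: str
    embedding: list[float]

def extract_facts(user_msg: str, assistant_msg: str) -> list[str]:
    # Step 1: one LLM call that returns durable facts, one per line.
    resp = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[
            {"role": "system",
             "content": "Extract durable facts about the user, one per line."},
            {"role": "user",
             "content": f"User: {user_msg}\nAssistant: {assistant_msg}"},
        ],
    )
    lines = resp.choices[0].message.content.splitlines()
    return [line.lstrip("- ").strip() for line in lines if line.strip()]

def process_exchange(user_msg: str, assistant_msg: str) -> list[Memory]:
    memories = []
    for fact in extract_facts(user_msg, assistant_msg):
        # Step 2: embed each fact (1536 dims for text-embedding-3-small).
        vec = client.embeddings.create(
            model="text-embedding-3-small", input=fact
        ).data[0].embedding
        # Steps 3-5 in the real system: search similar memories in TiDB,
        # consolidate conflicts with the LLM, then store with the embedding.
        memories.append(Memory(text=fact, embedding=vec))
    return memories
```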
Memory Forgetting vs. Deletion:
- Memories are marked as "outdated" rather than deleted
- Preserves context for understanding preference changes
- Enables nuanced AI responses based on memory evolution
- Maintains audit trail of user interactions
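At the storage layer, "forgetting" amounts to an UPDATE rather than a DELETE. A minimal sketch, with illustrative column names:

```python
from datetime import datetime, timezone

def forget_memory(cursor, memory_id: int) -> None:
    # Mark the memory outdated instead of deleting the row, preserving
    # history so the model can reason about how preferences changed.
    cursor.execute(
        "UPDATE memories SET outdated = TRUE, outdated_at = %s WHERE id = %s",
        (datetime.now(timezone.utc), memory_id),
    )
```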
Cross-Referential Inference:
- Extracts implied facts from multi-turn conversations
- Handles complex references to previous assistant responses
- Infers preferences from contextual cues and behavioral patterns
- Resolves temporal references and implicit connections
POST /{user_id}/new - Create new chat session
- Content negotiation: JSON response or streaming via Server-Sent Events
- Creates session with AI-generated title
- Processes first message with memory context
POST /{user_id}/{session_id} - Continue existing session
- Streams AI responses with memory integration
- Updates conversation context and extracts new memories
GET /{user_id}/sessions - List user's chat sessions
GET /{user_id}/sessions/{session_id} - Get specific session details
PUT /{user_id}/sessions/{session_id} - Update session metadata
DELETE /{user_id}/sessions/{session_id} - Delete session and messages
Memory Operations:
- POST /{user_id}/memories/search - Search user memories by query
- GET /{user_id}/memories/summary - Get AI-generated memory summary
- DELETE /{user_id}/memories - Clear all user memories
GET /health - System health check with component status
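For example, the streaming endpoints can be consumed from Python with httpx; the request payload shape below is an assumption, not the documented schema:

```python
import httpx

def stream_chat(user_id: str, message: str,
                base: str = "http://localhost:8000") -> None:
    # POST /{user_id}/new with an SSE Accept header streams the reply.
    with httpx.stream(
        "POST", f"{base}/{user_id}/new",
        json={"message": message},
        headers={"Accept": "text/event-stream"},
        timeout=None,
    ) as resp:
        for line in resp.iter_lines():
            if line.startswith("data: "):
                print(line[len("data: "):], flush=True)

stream_chat("alice", "Hi! I'm vegetarian and I love hiking.")
```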
- SSL-enabled database connections with certificate verification
- User data isolation at database level with indexed partitioning
- Input validation using Pydantic schemas across all endpoints
- Comprehensive error handling with sanitized responses
- Request tracking with UUID correlation across services
- Memory attribution with immutable user associations
Optimization Features:
- Vector indexing for O(log n) similarity search
- Connection pooling for database efficiency
- Deferred loading of vector columns in search results
- User-isolated queries with B-tree indexing
- Batch embedding generation for cost efficiency
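Batch embedding generation is straightforward with the OpenAI client: a single request embeds many texts at once, which is cheaper and faster than one call per memory (the texts below are illustrative):

```python
from openai import OpenAI

client = OpenAI()

texts = ["User is vegetarian", "User plans a trip to Japan in May"]
resp = client.embeddings.create(model="text-embedding-3-small", input=texts)
vectors = [d.embedding for d in resp.data]  # one 1536-dim vector per input text
```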
Resource Usage:
- ~$0.00002 per 1K tokens for embeddings (OpenAI pricing)
- ~$0.00015 per 1K tokens for LLM processing (gpt-4o-mini)
- ~1KB per memory plus ~6KB of vector storage in the database (1536 float32 dimensions × 4 bytes ≈ 6KB)
- ~10MB vector index overhead per 1K memories
- Use environment variables for all secrets and configuration
- Enable SSL/TLS for all external connections
- Configure proper database connection pooling and timeouts
- Set up monitoring with Logfire or similar observability platform
- Use process managers (PM2, systemd) for backend service management
- Serve frontend through CDN or optimized static hosting
```bash
# Multi-service deployment
docker-compose up --build

# Individual services
docker build -t timemory-backend ./backend
docker build -t timemory-frontend ./frontend
```

Additional documentation:
- TiMemory/MEMORY.md - Detailed technical architecture documentation
- backend/CLAUDE.md - Backend API development guide with examples
- frontend/CLAUDE.md - Frontend development guide and component architecture
- Fork the repository
- Create a feature branch (`git checkout -b feature/amazing-feature`)
- Make changes with appropriate tests
- Update documentation as needed
- Submit a pull request with a detailed description
This project is licensed under the MIT License - see the LICENSE file for details.
- TiDB - Vector database and distributed SQL capabilities
- OpenAI - Language models and embedding services
- FastAPI - High-performance Python web framework
- React & Vite - Modern frontend development tools
- Logfire - Python-native observability platform
Built with ❤️ using TiDB Vector Database, OpenAI, and modern web technologies