A production-ready document search service built with FastAPI, Elasticsearch, and Redis. Features multi-tenant isolation, rate limiting, caching, and comprehensive error handling.
- Full-Text Search: Powered by Elasticsearch with support for fuzzy matching and multi-field search
- Multi-Tenant Isolation: Complete data isolation between tenants using tenant IDs
- Redis Caching: Fast search result caching with automatic invalidation
- Rate Limiting: Per-tenant rate limiting to prevent abuse
- Pagination: Efficient pagination for large result sets
- Error Handling: Comprehensive error handling with retry logic
- Health Checks: Real-time monitoring of dependencies
- Logging: Structured logging for debugging and monitoring
- Input Validation: Pydantic models for request/response validation
- Docker Support: Full Docker Compose setup for local development
├── app/
│ ├── __init__.py
│ ├── main.py # FastAPI application and endpoints
│ ├── config.py # Configuration management
│ ├── logger.py # Logging setup
│ ├── models.py # Pydantic models
│ ├── search_client.py # Elasticsearch client
│ └── cache.py # Redis cache and rate limiting
├── docker-compose.yml # Docker services configuration
├── Dockerfile # Application container
├── requirements.txt # Python dependencies
└── .env.example # Environment variables template
- Docker and Docker Compose
- Python 3.11+ (for local development)
- Clone and navigate to the project:
  cd search_service
- Create environment file:
  cp .env.example .env
- Start all services:
  docker-compose up -d
- Check service health:
  curl http://localhost:8000/health
- Install dependencies:
  pip install -r requirements.txt
- Start Elasticsearch and Redis:
  docker-compose up -d elasticsearch redis
- Set environment variables:
  export ELASTICSEARCH_URL=http://localhost:9200
  export REDIS_URL=redis://localhost:6379
- Run the application:
  uvicorn app.main:app --reload
Once running, access the interactive API docs at:
- Swagger UI: http://localhost:8000/docs
- ReDoc: http://localhost:8000/redoc
GET /health

POST /documents
Headers: X-Tenant-ID: tenant-123
Body: {
  "title": "My Document",
  "content": "Document content here",
  "tags": ["tag1", "tag2"],
  "metadata": {"key": "value"}
}

GET /search?q=query&page=1&size=10&fuzzy=false
Headers: X-Tenant-ID: tenant-123

Query Parameters:
- q: Search query (required)
- page: Page number (default: 1)
- size: Results per page (default: 10, max: 100)
- fields: Comma-separated fields to search (default: title,content)
- fuzzy: Enable fuzzy matching (default: false)
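
As a rough illustration of how these query parameters could map onto an Elasticsearch request, the sketch below builds a search body with a `multi_match` query, optional fuzziness, a tenant filter, and offset-based pagination. The function name `build_search_body` and the choice of a `term` filter on `tenant_id` are assumptions made for illustration; the actual logic in `app/search_client.py` may differ.

```python
from typing import Any


def build_search_body(
    tenant_id: str,
    q: str,
    page: int = 1,
    size: int = 10,
    fields: str = "title,content",
    fuzzy: bool = False,
) -> dict[str, Any]:
    """Translate the documented query parameters into a search body (hypothetical helper)."""
    if not q:
        raise ValueError("q must not be empty")
    if page < 1 or not 1 <= size <= 100:
        raise ValueError("page must be >= 1 and size must be between 1 and 100")

    multi_match: dict[str, Any] = {
        "query": q,
        # "title,content" -> ["title", "content"]
        "fields": [name.strip() for name in fields.split(",") if name.strip()],
    }
    if fuzzy:
        # "AUTO" lets Elasticsearch choose the edit distance from the term length.
        multi_match["fuzziness"] = "AUTO"

    return {
        "query": {
            "bool": {
                "must": [{"multi_match": multi_match}],
                # Assumption: tenant isolation is enforced with a term filter.
                "filter": [{"term": {"tenant_id": tenant_id}}],
            }
        },
        # Page numbers are 1-based, Elasticsearch offsets are 0-based.
        "from": (page - 1) * size,
        "size": size,
    }
```

Placing the tenant restriction in `filter` rather than `must` keeps it out of relevance scoring, so it only narrows the result set.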
GET /documents/{doc_id}
Headers: X-Tenant-ID: tenant-123

DELETE /documents/{doc_id}
Headers: X-Tenant-ID: tenant-123

All configuration is managed through environment variables. See .env.example for available options:
| Variable | Default | Description |
|---|---|---|
| ELASTICSEARCH_URL | http://elasticsearch:9200 | Elasticsearch connection URL |
| REDIS_URL | redis://redis:6379 | Redis connection URL |
| RATE_LIMIT_REQUESTS | 100 | Max requests per window |
| RATE_LIMIT_WINDOW | 60 | Rate limit window in seconds |
| REDIS_CACHE_TTL | 300 | Cache TTL in seconds |
| SEARCH_DEFAULT_SIZE | 10 | Default search results per page |
| LOG_LEVEL | INFO | Logging level |
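
The table above can be read as a set of environment lookups with fallbacks. The following is a minimal sketch using only the standard library; the `Settings` class and `load_settings` function are hypothetical names, and the real `app/config.py` may use a different mechanism.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class Settings:
    # Defaults mirror the table above.
    elasticsearch_url: str = "http://elasticsearch:9200"
    redis_url: str = "redis://redis:6379"
    rate_limit_requests: int = 100
    rate_limit_window: int = 60
    redis_cache_ttl: int = 300
    search_default_size: int = 10
    log_level: str = "INFO"


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Read each variable from env, falling back to the documented default."""
    d = Settings()
    return Settings(
        elasticsearch_url=env.get("ELASTICSEARCH_URL", d.elasticsearch_url),
        redis_url=env.get("REDIS_URL", d.redis_url),
        rate_limit_requests=int(env.get("RATE_LIMIT_REQUESTS", d.rate_limit_requests)),
        rate_limit_window=int(env.get("RATE_LIMIT_WINDOW", d.rate_limit_window)),
        redis_cache_ttl=int(env.get("REDIS_CACHE_TTL", d.redis_cache_ttl)),
        search_default_size=int(env.get("SEARCH_DEFAULT_SIZE", d.search_default_size)),
        log_level=env.get("LOG_LEVEL", d.log_level),
    )
```

Passing a plain dictionary instead of `os.environ` makes the loader easy to exercise in tests, for example `load_settings({"REDIS_CACHE_TTL": "600"})`.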
- Multi-Tenant Isolation: All operations enforce tenant boundaries
- Rate Limiting: Prevents abuse with configurable per-tenant limits
- Input Validation: Pydantic models validate all inputs
- Error Handling: Secure error messages without information leakage
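
One way tenant boundaries can extend to the cache is to include the tenant ID in every cache key. The sketch below is an assumption about how such a key might be built, not a description of `app/cache.py`; the `search_cache_key` name and the `search:` prefix are hypothetical.

```python
import hashlib
import json


def search_cache_key(
    tenant_id: str,
    q: str,
    page: int = 1,
    size: int = 10,
    fields: str = "title,content",
    fuzzy: bool = False,
) -> str:
    """Build a cache key that is unique per tenant and per search request."""
    params = {"q": q, "page": page, "size": size, "fields": fields, "fuzzy": fuzzy}
    # sort_keys gives a stable serialization, so equal requests hash identically.
    payload = json.dumps(params, sort_keys=True).encode("utf-8")
    digest = hashlib.sha256(payload).hexdigest()
    # The tenant ID stays readable in the key so one tenant's entries share a prefix.
    return f"search:{tenant_id}:{digest}"
```

With this layout, two tenants issuing the same query never share a cached result, and a tenant's entries can be located by the `search:<tenant_id>:` prefix when they need to be invalidated.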
curl http://localhost:8000/health

Returns status of all dependencies:
{
  "status": "healthy",
  "dependencies": {
    "elasticsearch": "up",
    "redis": "up"
  }
}

View application logs:
docker-compose logs -f app

curl http://localhost:8000/health

Response:
{
  "status": "healthy",
  "dependencies": {
    "elasticsearch": "up",
    "redis": "up"
  }
}

curl -X POST http://localhost:8000/documents \
  -H "Content-Type: application/json" \
  -H "X-Tenant-ID: tenant-123" \
  -d '{
    "title": "Test Document",
    "content": "This is a test document with some content",
    "tags": ["test", "example"],
    "metadata": {"author": "John Doe", "category": "testing"}
  }'

Response:
{
  "id": "550e8400-e29b-41d4-a716-446655440000",
  "title": "Test Document",
  "content": "This is a test document with some content",
  "tags": ["test", "example"],
  "metadata": {"author": "John Doe", "category": "testing"},
  "tenant_id": "tenant-123",
  "created_at": "2026-01-23T10:30:00Z"
}

curl "http://localhost:8000/search?q=test&page=1&size=10" \
  -H "X-Tenant-ID: tenant-123"

Response:
{
  "results": [
    {
      "id": "550e8400-e29b-41d4-a716-446655440000",
      "score": 2.45,
      "document": {
        "title": "Test Document",
        "content": "This is a test document with some content",
        "tags": ["test", "example"],
        "tenant_id": "tenant-123",
        "created_at": "2026-01-23T10:30:00Z"
      }
    }
  ],
  "total": 1,
  "page": 1,
  "size": 10,
  "total_pages": 1
}

curl "http://localhost:8000/search?q=docment&fuzzy=true&page=1&size=10" \
  -H "X-Tenant-ID: tenant-123"

curl "http://localhost:8000/search?q=test&fields=title&page=1&size=10" \
  -H "X-Tenant-ID: tenant-123"

# Replace {doc_id} with actual document ID from previous responses
curl http://localhost:8000/documents/550e8400-e29b-41d4-a716-446655440000 \
  -H "X-Tenant-ID: tenant-123"

Response:
{
  "id": "550e8400-e29b-41d4-a716-446655440000",
  "title": "Test Document",
  "content": "This is a test document with some content",
  "tags": ["test", "example"],
  "metadata": {"author": "John Doe", "category": "testing"},
  "tenant_id": "tenant-123",
  "created_at": "2026-01-23T10:30:00Z"
}

# Replace {doc_id} with actual document ID
curl -X DELETE http://localhost:8000/documents/550e8400-e29b-41d4-a716-446655440000 \
  -H "X-Tenant-ID: tenant-123"

Response:
{
  "status": "deleted",
  "id": "550e8400-e29b-41d4-a716-446655440000"
}

Missing Tenant ID:
curl -X POST http://localhost:8000/documents \
  -H "Content-Type: application/json" \
  -d '{"title": "Test", "content": "Content"}'

Response (403):
{
  "detail": "X-Tenant-ID header is required"
}

Invalid Search Query:
curl "http://localhost:8000/search?q=&page=1&size=10" \
  -H "X-Tenant-ID: tenant-123"

Response (422):
{
  "detail": [
    {
      "loc": ["query", "q"],
      "msg": "ensure this value has at least 1 characters",
      "type": "value_error.any_str.min_length"
    }
  ]
}

- Ensure Elasticsearch is running: `docker-compose ps`
- Check Elasticsearch logs: `docker-compose logs elasticsearch`
- Wait for Elasticsearch to be ready (can take 30-60 seconds on first start)
- Ensure Redis is running: `docker-compose ps`
- Check Redis logs: `docker-compose logs redis`
- Check rate limit configuration in `.env`
- Disable rate limiting: `RATE_LIMIT_ENABLED=false`
- Elasticsearch:
  - Adjust heap size: `ES_JAVA_OPTS=-Xms1g -Xmx1g`
  - Increase shards for large datasets
- Redis:
  - Increase max connections: `REDIS_MAX_CONNECTIONS=100`
  - Adjust cache TTL: `REDIS_CACHE_TTL=600`
- Rate Limiting:
  - Adjust limits per tenant needs
  - Use sliding window for smoother rate limiting
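
To make the sliding-window suggestion concrete, here is a minimal in-memory sketch of a per-tenant sliding-window limiter. It is an illustration only: the service keeps rate-limit state in Redis, whereas this stand-in uses a local dictionary, and the `SlidingWindowLimiter` class with its injectable `clock` parameter is a hypothetical design.

```python
import time
from collections import defaultdict, deque
from typing import Callable


class SlidingWindowLimiter:
    """Allow at most max_requests per tenant within any window_seconds span."""

    def __init__(
        self,
        max_requests: int = 100,
        window_seconds: float = 60.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._clock = clock
        # One queue of request timestamps per tenant, oldest first.
        self._hits: dict[str, deque[float]] = defaultdict(deque)

    def allow(self, tenant_id: str) -> bool:
        now = self._clock()
        hits = self._hits[tenant_id]
        # Drop timestamps that have slid out of the window.
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()
        if len(hits) >= self.max_requests:
            return False
        hits.append(now)
        return True
```

Unlike a fixed window, which resets all counters at a boundary and can admit up to twice the limit around that boundary, this approach counts only the requests in the trailing window, at the cost of storing one timestamp per admitted request.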
For production deployment, refer to the comprehensive production readiness analysis:
- Architecture: `ARCHITECTURE.md` - Complete technical architecture
- Production Readiness: `ARCHITECTURE.md` Section 9 - Detailed production considerations
- Quick Reference: `ARCHITECTURE_SUMMARY.md` - At-a-glance production summary
Scalability:
- Horizontal scaling to handle 100x growth
- Multi-AZ deployment across 3 availability zones
- Auto-scaling policies based on CPU and request rate
Resilience:
- Circuit breakers and retry mechanisms
- Multi-region failover (RTO: 15 min, RPO: 5 min)
- Graceful degradation when dependencies fail
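
As an illustration of the retry mechanism mentioned above, the sketch below retries a callable with exponential backoff and a small random jitter. The function name `retry_with_backoff`, its defaults, and the choice of retryable exceptions are assumptions for this example and are not taken from the service code.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def retry_with_backoff(
    operation: Callable[[], T],
    max_attempts: int = 3,
    base_delay: float = 0.2,
    retry_on: tuple[type[BaseException], ...] = (ConnectionError, TimeoutError),
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call operation, retrying transient failures with exponential backoff."""
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except retry_on:
            if attempt == max_attempts:
                # Out of attempts: surface the last error to the caller.
                raise
            # Delay doubles each attempt: base, 2*base, 4*base, ...
            delay = base_delay * 2 ** (attempt - 1)
            # Up to 10% jitter spreads out retries from concurrent callers.
            sleep(delay + random.uniform(0, delay * 0.1))
    raise ValueError("max_attempts must be at least 1")
```

Only errors listed in `retry_on` are retried; anything else propagates immediately, which avoids repeating requests that failed for non-transient reasons such as invalid input.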
Security:
- JWT-based authentication (OAuth2/OIDC)
- End-to-end encryption (TLS 1.3, mTLS)
- Compliance ready (GDPR, SOC 2, HIPAA)
Observability:
- Full metrics stack (Prometheus + Grafana)
- Distributed tracing (OpenTelemetry + Jaeger)
- Centralized logging (ELK/CloudWatch)
- 24/7 alerting with on-call rotation
SLA:
- 99.95% availability target (~22 min downtime/month)
- p95 latency < 500ms
- Error rate < 0.1%
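
The downtime figure follows directly from the availability target. A short sketch of the arithmetic, assuming a 30-day month (the helper name `downtime_budget_minutes` is hypothetical):

```python
def downtime_budget_minutes(availability_percent: float, days: int = 30) -> float:
    """Minutes of allowed downtime for a given availability target."""
    total_minutes = days * 24 * 60  # 43,200 minutes in a 30-day month
    return total_minutes * (1 - availability_percent / 100)


print(round(downtime_budget_minutes(99.95), 1))  # 21.6
```

At 99.95% the budget is about 21.6 minutes per 30-day month, which rounds to the ~22 minutes quoted above.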
Kubernetes (Recommended):
# Deploy with Helm
helm install search-service ./helm-chart \
--set replicaCount=3 \
--set elasticsearch.nodes=3 \
--set redis.sentinel.enabled=true
# Blue-green deployment
kubectl apply -f k8s/blue-green/
# Canary deployment (progressive)
kubectl apply -f k8s/canary/

Docker Compose (Development Only):
docker-compose -f docker-compose.prod.yml up -d

Health Endpoints:
- Liveness: `GET /health/live`
- Readiness: `GET /health/ready`

Metrics Endpoint:
- Prometheus: `GET /metrics`
Key Metrics to Monitor:
- Request rate, error rate, latency (RED)
- Cache hit rate (target: >70%)
- Elasticsearch query latency (target: <200ms p95)
- Circuit breaker state
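
To show how two of these targets could be checked against raw numbers, the sketch below computes a cache hit rate and a nearest-rank percentile over latency samples. In practice these values would normally come from the metrics backend; the helper names `cache_hit_rate` and `percentile` are hypothetical.

```python
def cache_hit_rate(hits: int, misses: int) -> float:
    """Fraction of lookups served from cache; 0.0 when there were no lookups."""
    total = hits + misses
    return hits / total if total else 0.0


def percentile(samples: list[float], pct: int) -> float:
    """Nearest-rank percentile: the smallest sample with at least pct% at or below it."""
    if not samples:
        raise ValueError("samples must not be empty")
    if not 0 < pct <= 100:
        raise ValueError("pct must be between 1 and 100")
    ordered = sorted(samples)
    # ceil(pct / 100 * n) computed in integer arithmetic to avoid float rounding.
    rank = (pct * len(ordered) + 99) // 100
    return ordered[rank - 1]
```

For example, `cache_hit_rate(70, 30)` gives `0.7`, which sits exactly at the 70% target, and `percentile(latencies_ms, 95)` can be compared against the 200 ms target for Elasticsearch queries.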
Automated Backups:
- Elasticsearch snapshots every 6 hours to S3
- Redis RDB + AOF persistence
- 30-day retention policy
Disaster Recovery:
- Monthly DR drills
- Documented runbooks
- Multi-region replication
Production deployment (100x scale):
- ~$21,000/month for infrastructure
- See `ARCHITECTURE_SUMMARY.md` for detailed breakdown
MIT License