A FastAPI-based mock LLM API server that simulates multiple Large Language Model API providers.
Supported backends:
| Backend | Endpoints |
|---|---|
| vLLM | • /v1/chat/completions • /v1/models • /health |
| Mistral | • /v1/chat/completions • /v1/models • /v1/embeddings |
| Text Embeddings Inference | • /v1/embeddings • /health • /info • /rerank |
- Install the package:
pip install git+https://github.com/etalab-ia/openmockllm.git
- Run the server:
openmockllm
| Argument | Type | Default | Description |
|---|---|---|---|
| --backend | str | vllm | Backend to use: vllm, mistral, or tei |
| --port | int | 8000 | Port to run the server on |
| --max-context | int | 128000 | Maximum context length |
| --owned-by | str | OpenMockLLM | Owner of the API |
| --model-name | str | openmockllm | Model name to return in responses |
| --embedding-dimension | int | 1024 | Embedding dimension |
| --api-key | str | None | API key for authentication |
| --tiktoken-encoder | str | cl100k_base | Tiktoken encoder |
| --faker-langage | str | fr_FR | Language used for generating prompt responses |
| --faker-seed | str | None | Seed for Faker generation |
| --simulate-latency | flag | False | Simulate latency |
| --reference-tps | int | 100 | Reference tokens per second for latency simulation |
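--simulate-latency paired with --reference-tps presumably delays responses in proportion to the number of generated tokens. A minimal sketch of that relationship (the function name and formula are assumptions, not the project's implementation):

```python
def simulated_delay(completion_tokens: int, reference_tps: int = 100) -> float:
    """Seconds to wait so the mock appears to generate at ~reference_tps tokens/s."""
    if reference_tps <= 0:
        raise ValueError("reference_tps must be positive")
    return completion_tokens / reference_tps

# 200 tokens at the default 100 tokens/s take about 2 seconds.
print(simulated_delay(200))  # 2.0
```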
| Argument | Type | Default | Description |
|---|---|---|---|
| --payload-limit | int | 2000000 | Payload size limit in bytes (2 MB) |
| --max-client-batch-size | int | 32 | Maximum number of inputs per request |
| --auto-truncate | flag | False | Automatically truncate inputs longer than the maximum size |
| --max-batch-tokens | int | 16384 | Maximum total tokens in a batch |
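These TEI-style limits amount to simple checks on each request. A sketch of plausible validation logic (function names and error messages are assumptions, not the project's code):

```python
def validate_batch(token_counts: list[int],
                   max_batch_tokens: int = 16384,
                   max_client_batch_size: int = 32) -> None:
    # Cap both the number of inputs per request and the total tokens in a batch.
    if len(token_counts) > max_client_batch_size:
        raise ValueError(f"batch of {len(token_counts)} inputs exceeds max_client_batch_size")
    total = sum(token_counts)
    if total > max_batch_tokens:
        raise ValueError(f"{total} tokens exceed max_batch_tokens")

def truncate(tokens: list[int], max_context: int, auto_truncate: bool) -> list[int]:
    # With --auto-truncate, over-long inputs are clipped instead of rejected.
    if len(tokens) <= max_context:
        return tokens
    if auto_truncate:
        return tokens[:max_context]
    raise ValueError("input longer than max context and auto_truncate is off")
```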
- Streaming response:
curl -N -X POST http://localhost:8000/v1/chat/completions \
-H "Content-Type: application/json" \
-d '{ "model": "openmockllm", "messages": [{"role": "user", "content": "Bonjour"}], "stream": true }'
- Non-streaming response:
curl -X POST http://localhost:8000/v1/chat/completions \
-H "Content-Type: application/json" \
-d '{ "model": "openmockllm", "messages": [{"role": "user", "content": "Bonjour"}], "stream": false }'
# Generate embeddings
curl -X POST http://localhost:8002/v1/embeddings \
-H "Content-Type: application/json" \
-d '{ "input": "Hello, world!", "model": "openmockllm" }'
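A mock embeddings endpoint has to return vectors of the configured --embedding-dimension; one common trick is to derive them deterministically from the input text so the same input always yields the same vector. A sketch of such a generator (an illustration, not the server's actual algorithm):

```python
import hashlib
import struct

def mock_embedding(text: str, dimension: int = 1024) -> list[float]:
    # Deterministic pseudo-embedding: hash the input with a counter,
    # then map each 4-byte chunk of the digest to a float in [-1, 1).
    out: list[float] = []
    counter = 0
    while len(out) < dimension:
        digest = hashlib.sha256(f"{text}:{counter}".encode()).digest()
        for i in range(0, len(digest) - 3, 4):
            (n,) = struct.unpack("<I", digest[i:i + 4])
            out.append(n / 2**31 - 1.0)
            if len(out) == dimension:
                break
        counter += 1
    return out
```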
# Get model info
curl http://localhost:8002/info
# Rerank documents
curl -X POST http://localhost:8002/rerank \
-H "Content-Type: application/json" \
-d '{ "query": "What is Deep Learning?", "texts": ["Deep Learning is...", "Machine Learning is..."] }'
Contributions are welcome! Please feel free to submit a Pull Request.
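For reference, a /rerank-style response pairs each document index with a relevance score, sorted by score. A toy scorer illustrating that shape (word overlap is an arbitrary stand-in, not the server's scoring):

```python
def mock_rerank(query: str, texts: list[str]) -> list[dict]:
    # Toy relevance: fraction of query words that appear in each text.
    query_words = set(query.lower().split())
    results = []
    for index, text in enumerate(texts):
        overlap = sum(w in text.lower() for w in query_words)
        results.append({"index": index, "score": overlap / max(len(query_words), 1)})
    return sorted(results, key=lambda r: r["score"], reverse=True)
```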
- Install the dependencies:
pip install -e ".[dev]"
- Run a server:
python -m openmockllm.main --reload --backend mistral
- From openapi.json file:
BACKEND=tei
datamodel-codegen --input docs/${BACKEND} --input-file-type openapi --output openmockllm/${BACKEND}/schemas.py --output-model-type pydantic_v2.BaseModel --strict-nullable
Another recommended method is to use the official SDK of the backend.
- Install the dependencies:
pip install -e ".[dev]"
- Run a server:
python -m openmockllm.main --reload --backend mistral
- Run the tests:
pytest tests/test_mistral