
dtlvnative

Provides pre-built native dependencies for the Datalevin database, by packaging the compiled native libraries and JavaCPP JNI library files in platform-specific JAR files.

In addition to JavaCPP's JNI library, these native libraries are included:

  • dlmdb, a fork of the LMDB key-value storage library.
  • usearch, a vector indexing and similarity search library, exposed directly to callers.
  • llama.cpp, built as a CPU-only GGUF runtime for embeddings and prompt-based text generation.
  • dtlv, which wraps dlmdb and implements Datalevin iterators, counters, and samplers.

The following platforms are currently supported:

  • macosx-arm64
  • freebsd-x86_64
  • linux-arm64
  • linux-x86_64
  • windows-x86_64

The released JAR is named org.clojars.huahaiy/dtlvnative-PLATFORM, where PLATFORM is one of the above.
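For example, a deps.edn entry for the linux-x86_64 artifact might look like the following sketch (the version string is a placeholder; check the Clojars page for the current release):

```clojure
{:deps
 {org.clojars.huahaiy/dtlvnative-linux-x86_64 {:mvn/version "..."}}}
```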

Vector support using usearch on Windows is experimental.

llama.cpp text + embedding

dtlvnative packages the CPU backend of llama.cpp with OpenMP enabled. The packaged native API now supports embedding models, decoder-only text models for prompt-based generation, and multimodal OCR with PaddleOCR-VL GGUF models.

Embedding API

| Function | Description |
|----------|-------------|
| dtlv_llama_embedder_create | Load a GGUF model and create an embedder |
| dtlv_llama_embedder_n_embd | Return the embedding dimension |
| dtlv_llama_embedder_n_ctx | Return the context size (max tokens) |
| dtlv_llama_token_count | Count tokens for a string without allocating |
| dtlv_llama_tokenize | Tokenize a string into a caller-owned int[] buffer |
| dtlv_llama_detokenize | Convert tokens back to a UTF-8 string |
| dtlv_llama_embed | Compute an embedding for a single string |
| dtlv_llama_embed_batch | Compute embeddings for multiple strings in one call |
| dtlv_llama_embedder_destroy | Free the embedder |

The model must be a GGUF embedding model. The current smoke test uses multilingual-e5-small-Q8_0.gguf.

dtlv_llama_embedder_create takes model_path, n_ctx, n_batch, n_threads, and normalize. Pass 0 for n_ctx and n_batch to use model defaults. A non-zero normalize returns L2-normalized embeddings.

Single embedding

DTLV.dtlv_llama_embedder embedder = new DTLV.dtlv_llama_embedder();
int rc = DTLV.dtlv_llama_embedder_create(
        embedder,
        "multilingual-e5-small-Q8_0.gguf",
        0, 0, 4, 1);

int nEmbd = DTLV.dtlv_llama_embedder_n_embd(embedder);
float[] output = new float[nEmbd];
rc = DTLV.dtlv_llama_embed(embedder, "query: hello world", output, nEmbd);

DTLV.dtlv_llama_embedder_destroy(embedder);

Token counting and tokenization

// check token count before embedding
int nTokens = DTLV.dtlv_llama_token_count(embedder, text);
int maxTokens = DTLV.dtlv_llama_embedder_n_ctx(embedder);

// tokenize, truncate, detokenize
int[] tokens = new int[maxTokens];
int actual = DTLV.dtlv_llama_tokenize(embedder, text, tokens, maxTokens);
if (actual > maxTokens) {
    // truncate to fit
    actual = maxTokens;
}
byte[] buf = new byte[text.length() * 4];
int len = DTLV.dtlv_llama_detokenize(embedder, tokens, actual, buf, buf.length);
String truncated = new String(buf, 0, len, StandardCharsets.UTF_8);

Batch embedding

PointerPointer texts = new PointerPointer("query: hello", "query: world");
int nTexts = 2;
float[] output = new float[nTexts * nEmbd];
rc = DTLV.dtlv_llama_embed_batch(embedder, texts, nTexts, output, output.length);
// output[0..nEmbd-1] = embedding for "query: hello"
// output[nEmbd..2*nEmbd-1] = embedding for "query: world"

The Java test in src/java/datalevin/dtlvnative/Test.java uses target/embedding-models/multilingual-e5-small-Q8_0.gguf if present, falls back to a copy at the repository root, and otherwise downloads the model from Hugging Face before running the embedding smoke test.

Text generation API

The text-generation API is aimed at decoder-only instruction models such as Qwen 3.5 0.8B Instruct in GGUF format.

| Function | Description |
|----------|-------------|
| dtlv_llama_generator_create | Load a GGUF decoder-only text model |
| dtlv_llama_generator_n_ctx | Return the context size |
| dtlv_llama_generator_token_count | Count tokens for a prompt/document |
| dtlv_llama_generate | Generate text for a raw prompt |
| dtlv_llama_summarize | Build a summarization prompt and generate a summary |
| dtlv_llama_generator_destroy | Free the generator |

dtlv_llama_generate and dtlv_llama_summarize return the number of UTF-8 bytes written to the caller-owned output buffer. When n_predict <= 0, they default to a 128-token generation budget. Prompt text that exceeds the context size is automatically truncated to the leading tokens that fit.

DTLV.dtlv_llama_generator generator = new DTLV.dtlv_llama_generator();
int rc = DTLV.dtlv_llama_generator_create(
        generator,
        "Qwen3.5-0.8B-Instruct-Q4_K_M.gguf",
        2048, 0, 4);

byte[] output = new byte[8192];
int len = DTLV.dtlv_llama_summarize(
        generator,
        "Datalevin embeds data locally and can pair vector search with LMDB-backed storage.",
        128,
        output,
        output.length);

String summary = new String(output, 0, len, StandardCharsets.UTF_8);
DTLV.dtlv_llama_generator_destroy(generator);

If you want to supply your own instruction prompt instead of the built-in summary helper, call dtlv_llama_generate directly.
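A minimal sketch of that, reusing the pattern from the summarization example above. The prompt string and buffer size are illustrative, and the argument shape of dtlv_llama_generate (prompt, n_predict, output buffer, buffer length) is assumed to mirror dtlv_llama_summarize; verify it against the generated DTLV bindings:

```java
DTLV.dtlv_llama_generator generator = new DTLV.dtlv_llama_generator();
int rc = DTLV.dtlv_llama_generator_create(
        generator,
        "Qwen3.5-0.8B-Instruct-Q4_K_M.gguf",
        2048, 0, 4);

// build your own instruction prompt instead of using the summary helper
String prompt = "Answer in one sentence: what is a key-value store?";

byte[] output = new byte[8192];
int len = DTLV.dtlv_llama_generate(
        generator,
        prompt,
        128,            // n_predict; <= 0 falls back to the 128-token default
        output,
        output.length);

String answer = new String(output, 0, len, StandardCharsets.UTF_8);
DTLV.dtlv_llama_generator_destroy(generator);
```

As with the summarization helper, a prompt longer than the context size is truncated to the leading tokens that fit.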

Vision / OCR API

The vision API is aimed at multimodal GGUF models with a matching projector GGUF, such as PaddleOCR-VL-1.5-GGUF.

| Function | Description |
|----------|-------------|
| dtlv_llama_vision_generator_create | Load a multimodal text GGUF and matching mmproj GGUF |
| dtlv_llama_vision_generator_n_ctx | Return the context size |
| dtlv_llama_vision_generate | Generate text for a single image plus prompt |
| dtlv_llama_ocr | Run OCR with the built-in OCR: prompt |
| dtlv_llama_vision_generator_destroy | Free the vision generator |

dtlv_llama_vision_generator_create takes model_path, mmproj_path, n_ctx, n_batch, n_threads, image_min_tokens, and image_max_tokens. Pass 0 for the numeric tuning parameters to keep the model defaults. The runtime is CPU-only in this package.

dtlv_llama_vision_generate and dtlv_llama_ocr return the number of UTF-8 bytes written to the caller-owned output buffer. Each call processes a single image. If the prompt passed to dtlv_llama_vision_generate does not contain the multimodal marker, the native layer prepends it automatically.

DTLV.dtlv_llama_vision_generator generator = new DTLV.dtlv_llama_vision_generator();
int rc = DTLV.dtlv_llama_vision_generator_create(
        generator,
        "PaddleOCR-VL-1.5.gguf",
        "PaddleOCR-VL-1.5-mmproj.gguf",
        0, 0, 4, 0, 0);

byte[] output = new byte[8192];
int len = DTLV.dtlv_llama_ocr(
        generator,
        "page.png",
        16,
        output,
        output.length);

String text = new String(output, 0, len, StandardCharsets.UTF_8);
DTLV.dtlv_llama_vision_generator_destroy(generator);
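If you want a custom prompt instead of the built-in OCR: prompt, call dtlv_llama_vision_generate on a vision generator created as above (before destroying it). A sketch, assuming the prompt is passed between the image path and n_predict; verify the exact parameter order against the generated DTLV bindings:

```java
byte[] buf = new byte[8192];
int n = DTLV.dtlv_llama_vision_generate(
        generator,
        "page.png",
        "Describe the layout of this page.",  // multimodal marker is prepended if missing
        64,                                   // n_predict
        buf,
        buf.length);
String description = new String(buf, 0, n, StandardCharsets.UTF_8);
```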

Local llama smoke test

To refresh the JavaCPP platform libraries and run the local llama smoke tests with a real decoder model:

script/test-llama-summarization --text-model=target/text-models/qwen2.5-0.5b-instruct-q5_k_m.gguf

The script runs Test.java with --llama-only, so it covers both the llama embedding smoke test and the summarization flow. If you prefer, set DTLV_TEXT_MODEL_PATH=/abs/path/model.gguf instead of passing --text-model.

OCR smoke test

To refresh the JavaCPP platform libraries and run only the PaddleOCR-VL smoke test:

script/test-llama-ocr \
  --vision-model=/path/to/PaddleOCR-VL-1.5.gguf \
  --vision-mmproj=/path/to/PaddleOCR-VL-1.5-mmproj.gguf \
  --ocr-image=/path/to/image.png \
  --ocr-n-predict=16

The OCR script runs Test.java with --ocr-only, so it skips LMDB, usearch, embedding, and summarization. It also prints the extracted OCR text. You can set DTLV_VISION_MODEL_PATH, DTLV_VISION_MMPROJ_PATH, DTLV_OCR_IMAGE_PATH, and DTLV_OCR_N_PREDICT instead of passing the flags explicitly.

For CPU-only smoke tests, keep --ocr-n-predict small. 16 is a practical default for checking that OCR works end to end. Large document images are much slower than small or resized inputs, so for quick validation it helps to reduce the longest edge to around 512 pixels first.

Additional dependencies

Right now, the included shared libraries depend on a few system libraries:

  • libc
  • libmvec
  • libomp or libgomp

We bundle libomp in the JAR. However, on systems where the bundled library does not work, or where libc is not available, you will have to install these yourself. For example, on Ubuntu/Debian: apt install libgomp1, or apt install gcc-12 g++-12; on macOS: brew install libomp libllvm.

License

Copyright © 2021-2026 Huahai Yang

This program and the accompanying materials are made available under the terms of the Eclipse Public License 2.0 which is available at http://www.eclipse.org/legal/epl-2.0.

This Source Code may also be made available under the following Secondary Licenses when the conditions for such availability set forth in the Eclipse Public License, v. 2.0 are satisfied: GNU General Public License as published by the Free Software Foundation, either version 2 of the License, or (at your option) any later version, with the GNU Classpath Exception which is available at https://www.gnu.org/software/classpath/license.html.
