[Tool Submission] An OCR tool based on vision large models #5846

[ollama-ocr](https://github.com/dwqs/ollama-ocr) is an OCR tool built on vision large language models.

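## Key Features
- **Multiple models**: currently supports `LLaVA 13B` and `Llama 3.2 Vision 11B`; more vision models will be added later
- **Multiple output formats**: supports `Markdown`, `JSON`, `Plain Text`, and other output formats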
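
To make the two features above concrete, here is a minimal sketch of how a client might ask a local Ollama server to OCR an image in a chosen output format. It is not taken from ollama-ocr's source. The model tags (`llava:13b`, `llama3.2-vision:11b`), the prompt wording, and the helper names `build_ocr_request` and `run_ocr` are assumptions made for illustration; the request shape follows Ollama's `/api/generate` endpoint.

```python
import base64
import json
import urllib.request

# Assumed Ollama tags for the two models named above; verify with `ollama list`.
MODELS = {
    "LLaVA 13B": "llava:13b",
    "Llama 3.2 Vision 11B": "llama3.2-vision:11b",
}

# Hypothetical prompts, one per output format (not the prompts ollama-ocr uses).
PROMPTS = {
    "markdown": "Transcribe all text in this image as Markdown, keeping headings, lists and tables.",
    "json": "Extract all text in this image and return it as a JSON object.",
    "text": "Transcribe all text in this image as plain text.",
}


def build_ocr_request(image_bytes: bytes, model: str, output_format: str) -> dict:
    """Build the JSON body for a non-streaming /api/generate call."""
    if output_format not in PROMPTS:
        raise ValueError(f"unsupported output format: {output_format}")
    payload = {
        "model": model,
        "prompt": PROMPTS[output_format],
        # Ollama expects images as a list of base64-encoded strings.
        "images": [base64.b64encode(image_bytes).decode("ascii")],
        "stream": False,
    }
    if output_format == "json":
        # Ask Ollama to constrain the reply to valid JSON.
        payload["format"] = "json"
    return payload


def run_ocr(payload: dict, host: str = "http://localhost:11434") -> str:
    """Send the request to a running Ollama server and return the model's text."""
    req = urllib.request.Request(
        f"{host}/api/generate",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]
```

With a model already pulled and Ollama listening on its default port, something like `run_ocr(build_ocr_request(open("page.png", "rb").read(), MODELS["LLaVA 13B"], "markdown"))` would return the recognized text; the file name here is a placeholder.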

## Quick Start
Follow the steps in the [quick-start guide](https://github.com/dwqs/ollama-ocr#quick-start), or search Docker for `debounce/ollama-ocr` to run the demo quickly. The tech stack is mainly Vue 3 + Vite.

## Examples
#### Input Image 1

![input-image](https://image-static.segmentfault.com/149/814/1498143911-677575ecd6977_fix732)

#### Output Markdown

![output-markdown.png](https://image-static.segmentfault.com/338/339/3383395719-67757691e9b37_fix732)

#### Input Image 2

![input-image](https://image-static.segmentfault.com/257/222/2572220334-677579c2747c7_fix732)

#### Output JSON

![output-json.png](https://image-static.segmentfault.com/104/188/1041885248-677579f517f02_fix732)
