PaddleOCR-VL-1.6 Reaches 96.33% Document Parsing Accuracy on OmniDocBench v1.6

PaddleOCR-VL-1.6 is a compact document parsing model that reaches a new state-of-the-art accuracy of 96.33% on the OmniDocBench v1.6 benchmark. It boosts recognition of text, formulas, tables, ancient documents, rare characters, seals, and charts under challenging real‑world conditions. The release upgrades PaddleOCR-VL-1.5 with a region‑aware data optimization framework and progressive post‑training.
PaddlePaddle built this version to fix under-optimized regions that limited earlier models, applying targeted data enhancement and reinforcement learning across multiple post-training stages. According to the release, the result outperforms both open-source and closed-source alternatives on the reported benchmarks while keeping the same 0.9B-parameter size. Because the architecture is fully compatible with version 1.5, existing users can swap it in with zero code changes.
New accuracy benchmarks and enhanced capabilities
- 96.33% SOTA on OmniDocBench v1.6.
- Zero‑cost plug‑and‑play migration from v1.5.
- Stronger table, ancient document, and rare character recognition.
- Region‑aware data optimization for weak areas.
- Progressive post‑training with reinforcement learning.
- Multi-task: OCR, formula, chart, and seal recognition, plus text spotting.
This lightweight vision-language model suits users who run AI locally on consumer GPUs, where its compact 0.9B footprint keeps inference fast. Small agencies and privacy-focused professionals can extract complex document structures without sending data to the cloud. Simple upgrading and strong benchmark results make it a practical pick for anyone who needs reliable, in-the-wild document parsing.
Developer notes and practical limits
The official PaddleOCR pipeline handles full page-level parsing, while the provided transformers script currently supports only element-level recognition and text spotting. Users can enable FlashAttention-2 (flash-attn2) to cut memory usage and speed up inference on local machines. The model was also evaluated on Real5-OmniDocBench, a harder benchmark that includes distortions such as warping, screen photography, and skewed documents.
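As a rough illustration of the FlashAttention-2 option, the sketch below builds the keyword arguments one might pass to a transformers `from_pretrained` call. It requests `attn_implementation="flash_attention_2"` only when the `flash_attn` package is importable, and otherwise leaves the library default in place. The helper name `build_load_kwargs` and its parameters are hypothetical and are not part of the release. The model identifier and loader class are left out on purpose, since they should be taken from the official model card.

```python
import importlib.util
from typing import Optional


def build_load_kwargs(prefer_flash_attention: bool = True,
                      flash_available: Optional[bool] = None) -> dict:
    """Build keyword arguments for a transformers `from_pretrained` call.

    Hypothetical helper for illustration only; it is not part of PaddleOCR
    or of the script shipped with the model.
    """
    if flash_available is None:
        # Detect the optional flash-attn package without importing it.
        flash_available = importlib.util.find_spec("flash_attn") is not None

    kwargs = {}
    if prefer_flash_attention and flash_available:
        # `attn_implementation` is a standard transformers loading argument.
        kwargs["attn_implementation"] = "flash_attention_2"
    # Otherwise pass nothing, so transformers picks its default attention.
    return kwargs


if __name__ == "__main__":
    # Intended use (assumption: take the loader class and model id from the
    # official model card):
    #   model = SomeAutoClass.from_pretrained(model_id, **build_load_kwargs())
    print(build_load_kwargs())
```

Falling back to the library default avoids a hard failure on machines where flash-attn is not installed. FlashAttention-2 generally also requires a supported GPU and half-precision weights.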
"PaddleOCR-VL-1.6 achieves a new state-of-the-art score of 96.33% on OmniDocBench v1.6, sets new records on OmniDocBench v1.5 and Real5-OmniDocBench as well, and demonstrates strong competitiveness against top-tier VLMs." — Source: Hugging Face