Someone just released an OCR model that is embarrassingly outperforming models hundreds of times larger.
It's called Unlimited OCR.
Despite having only 3B parameters and activating just 500M during inference, it achieved:
→ 93.23 on OmniDocBench v1.5
→ 93.92 on OmniDocBench v1.6
Both scores set a new end-to-end SOTA.
For comparison:
• Qwen3-VL (235B) scored 89.15
• Qwen2.5-VL (72B) scored 87.02
• Gemini 2.5 Pro scored 88.03
A model activating a fraction of the parameters of these systems still managed to beat them all.
But the benchmark numbers aren't the most interesting part.
Unlimited OCR can read and understand 40+ page documents in a single pass without losing context or slowing down. Instead of treating pages independently, it maintains document-level understanding from the first page to the last.
That's something OCR models have struggled with for years.
Another reminder that smarter design often beats brute-force scale.
The model and code are now open source on GitHub and Hugging Face.
Definitely worth bookmarking.
Repo👇
Release: LichtFeld Studio v0.5.3 is out!
With 316 commits merged into master, this release is a huge step forward for LichtFeld Studio.
What's new in v0.5.3
• Vulkan viewer/rendering migration: New Vulkan viewport pipeline, pass graph, VkSplat renderer, Vulkan point-cloud renderer, 3DGUT/VkSplat support, improved alpha/depth composition, tighter CUDA/Vulkan interoperability, and device matching on multi-GPU systems.
• RAD + LOD workflow: Added RAD file export/import, RAD LOD viewer, Spark-style GPU LOD selection, GPU-driven page prefetching, a bounded VRAM pool, out-of-core PLY-to-RAD LOD conversion, and RAD import/export speedups of approximately 3–5×.
• HiGS / macro-tile inference: Added a macro-tile inference path for the Vulkan viewer, including macro sorting, batched rasterization, composition, and capacity management.
• Asset Manager: Added and significantly enhanced the Asset Manager with thumbnails, SH information, faster synchronization, import-from-URL support, docked mode, data-loading popup integration, and general UI cleanup.
• Viewport export: Integrated viewport export directly into the application as a toolbar/overlay tool, added fast render_view_u8-style readback paths, fixed high-resolution clipping issues, improved orthographic export parity, resolved 32K image/video export problems, and added post-export GPU resource cleanup.
• Selection and tooling: Added and reworked selection toolbar controls, the Select menu, ring selection, color eyedropper, distance-from-center selection, faster point-cloud and zoomed-out selection paths, Vulkan measurement tool fixes, and drag-and-drop scene graph improvements.
• UI/RmlUi platform work: Major RmlUi redesign efforts, hot reloading for RML/RCSS/Python UI files, reactive UI/store integration, viewport toolbar flyouts, improved histogram interactions, input settings enhancements, custom TRS gizmos, and numerous panel, tooltip, and localization fixes.
• Windowing and UX: Added borderless window support, title bar drag/maximize/restore behavior, work-area-aware maximize functionality, resize responsiveness and performance improvements, and DPI/UI scaling fixes.
• Training and data features: Added adaptive depth loss and depth gradients for the EWA rasterizer, mask loading/application fixes, a new combined Ignore+Segment mask mode, --add-splat, --freeze, improved checkpoint and training state handling, and training speed and VRAM optimizations.
• COLMAP/equirectangular support: Added SPHERICAL/equirectangular camera model support and canonical EQUIRECTANGULAR handling, along with fixes for undistortion and camera export.
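For readers unfamiliar with the equirectangular camera model mentioned in the last item, here is a minimal sketch of the projection it implies. The axis convention (+z forward, +x right, +y down) and the function name are assumptions for illustration only; they are not taken from LichtFeld Studio or COLMAP, and real implementations may differ in sign and pixel-center conventions.

```python
import math


def direction_to_equirect(x, y, z, width, height):
    """Map a 3D viewing direction to pixel coordinates in an equirectangular image.

    Assumed convention (hypothetical): +z forward, +x right, +y down.
    """
    norm = math.sqrt(x * x + y * y + z * z)
    if norm == 0:
        raise ValueError("direction must be non-zero")
    lon = math.atan2(x, z)     # longitude in [-pi, pi], 0 = forward
    lat = math.asin(y / norm)  # latitude in [-pi/2, pi/2], positive = down
    u = (lon / (2.0 * math.pi) + 0.5) * width
    v = (lat / math.pi + 0.5) * height
    return u, v
```

Under this convention the forward direction lands at the image center, and a full 360° sweep in longitude covers the full image width.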
This release will be available to all supporters as a Windows binary via https://t.co/mdGITFOGVQ in about an hour.
At the same time, LichtFeld Studio remains committed to being free and open source under GPLv3 and can also be built directly from source.
Please consider supporting the ongoing development of LichtFeld Studio through a donation via the portal or the supporters page.
Thank you to everyone who supports this project financially, contributes code, reports bugs, provides datasets, helps with the website, and contributes in countless other ways.
A special thank you to our foundational sponsor Core11 and our Gold Sponsor Volinga, whose support has helped make the current state of the software possible. Thank you as well to every donor and to all of our new Bronze Sponsors.
Looking ahead to v0.6
For the next major release, work will focus primarily on stability and user experience. This includes improved cleanup workflows and the ability to modify training parameters while training is in progress. I would also like to introduce a native .licht project format that allows users to save and restore their complete editor state.
You can find links to our main sponsors below. Please also visit our website to discover all our Bronze Sponsors.
Hint: We do not yet have a Silver or Platinum Sponsor 😉
Unreal Engine 5.8 is now live!
Build your immersive worlds with advanced terrain tools, real-time vegetation workflows, and simplified lighting⛰️
Create characters and animations faster by capturing high-fidelity digital humans, and speed up content creation through the MCP plugin🧍
Take a look at what’s new: https://t.co/cDITLWWv2F
Want to catch bad images in your SFM reconstruction?
Introducing ColmapView v0.7 with Gaussian Splatting QA (and more).
- Detects all PLY files in your dataset and URL.
- Overlays images exactly on the 3DGS reconstruction.
- Computes PSNR and SSIM in your browser.
- Sorts your images by reconstruction quality.
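The PSNR-and-sort idea in the list above can be sketched in a few lines. This is a generic illustration, not ColmapView's code: the function names are hypothetical, images are assumed to be flattened lists of 8-bit pixel values, and SSIM is omitted for brevity.

```python
import math


def psnr(reference, rendered, max_value=255.0):
    """PSNR in dB between two equally sized, flattened pixel sequences."""
    if len(reference) != len(rendered) or len(reference) == 0:
        raise ValueError("inputs must be non-empty and the same length")
    mse = sum((a - b) ** 2 for a, b in zip(reference, rendered)) / len(reference)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)


def worst_first(scores):
    """Return image names ordered from lowest to highest PSNR."""
    return sorted(scores, key=scores.get)
```

Sorting worst-first puts the images the reconstruction explains least well at the top, which is where blurry frames or bad poses tend to show up.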
Plug and play at:
https://t.co/j9Ebve1jSA
Smart Splat Decimator is now live in SplatBox!
Up to 50% removal, ~5× smaller!
Scores every splat by visual importance (opacity, size, shape) and removes the junk splats first, not the detail. ✍️
Built for big scenes on Android VR. Preview → decimate → save compressed.
Less memory. Faster frames. Same scene — smarter.🧐
#GaussianSplatting #VR #Quest #Unity #GameDev #SplatBox
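As a rough illustration of importance-based decimation like the Smart Splat Decimator describes, the sketch below scores each splat from opacity, size, and shape, then keeps the top-ranked fraction. The scoring formula, the function names, and the `(opacity, (sx, sy, sz))` tuple layout are all assumptions made for this example; SplatBox's actual heuristic is not published in the post.

```python
def importance(opacity, scales):
    """Hypothetical score: opaque, large, well-rounded splats rank highest."""
    smallest, middle, largest = sorted(scales)
    size = (smallest * middle * largest) ** (1.0 / 3.0)  # geometric mean of the axes
    shape = smallest / largest  # 1.0 for a sphere, near 0 for thin needles
    return opacity * size * shape


def decimate(splats, keep_fraction=0.5):
    """Keep the highest-scoring fraction of splats, preserving original order."""
    keep = max(1, int(len(splats) * keep_fraction))
    ranked = sorted(
        range(len(splats)),
        key=lambda i: importance(*splats[i]),
        reverse=True,
    )
    return [splats[i] for i in sorted(ranked[:keep])]
```

With `keep_fraction=0.5`, a nearly transparent splat and an extreme needle would be dropped before an opaque, compact one. A production tool would likely also weigh screen coverage or view dependence.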
Introducing D4RT: A unified AI model for 4D scene reconstruction and tracking across space and time. 🎯 Catch the demo with Skanda Koppula at 12 pm at our #CVPR2026 Google booth kiosk! https://t.co/p6SclNe1zi @GoogleDeepMind
The ArtiFixer code and model weights are now released! Links to both on our project page: https://t.co/Q0WnpcyyQj
Per-scene methods like 3DGS look great on captured views but collapse off-trajectory. We repair those artifacts and beat prior SOTA by 1–3 dB PSNR. 🧵(1/6)
Production-quality detail, down to trees and puddle reflections.
XYN Spatial Scan generates 3DCG assets that hold up to the demands of real production pipelines.
Please watch the forest scene in this post 🌿
--------
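Faithfully reproduces the fine detail of the trees and the puddle reflections spread across the ground.
Please watch this video, which adds camera work to a forest scene generated with XYN Spatial Scan, which was well received at NAB 📷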
#XYN #3DCG #3DGS #SpatialContent #VirtualProduction
Tired of spending weeks on SFM reconstruction?
Try the VGGT-horseshoe plugin in @lichtfeldstudio to get poses, a dense point cloud, and sky segmentation in 20 seconds.
Drone in. 3D out.📍
Up to 10,000 images, 5K resolution, 100M splats.
Entire neighborhoods, terrains & infrastructure -- reconstructed in full 3D automatically via the cloud.
https://t.co/HXaCuHpQfw