PaddleOCR 3.5 is here!
Introducing PaddleOCR 3.5, now with browser-based OCR, document-to-Markdown conversion, and Transformers backend integration.
Key Highlights:
🔸 PaddleOCR.js for direct browser deployment
🔸 One-click conversion: Word/Excel/PPT to Markdown
🔸 Unified inference engine supporting Transformers
🔸 WebGPU & Wasm acceleration support
Run OCR directly in your browser, keep data private, and integrate seamlessly with the Hugging Face ecosystem.
✨ Built for the AI agent era. Ready for production.
Read more: https://t.co/oNOfB6hbSY
#PaddlePaddle #PaddleOCR #AI #ComputerVision
Surprise: free to use. Give it a try!
Baidu CoBuddy is now live on @novita_labs: a powerful code-focused model for developers and AI agents
✨ 131K context · 65K output · native tool calling
Baidu CoBuddy is now available on Novita AI.
A code-focused model built for developers, coding agents, and complex software engineering workflows.
CoBuddy brings:
• Long-context coding support
131K context window and up to 65K output tokens
• Agent-ready capabilities
Native tool calling and reasoning support for AI agent workflows
• Fast, production-ready inference
High-throughput, low-latency access through Novita's serverless API
Excited to bring Baidu CoBuddy to developers on Novita.
@ErnieforDevs @PaddlePaddle
Official Launch: PaddleOCR & ERNIE Image Are Now Available as Dify Plugins, Powering the Next Wave of AI Agents
We're excited to see PaddleOCR and ERNIE Image now available as official Dify plugins, bringing document intelligence and image generation directly into Dify's agentic AI workflows.
A big thank you to @dify_ai, our valued PaddleOCR OCEAN Alliance member, for the continued ecosystem collaboration.
1️⃣ PaddleOCR: Document AI for agent workflows
Turn images, scanned PDFs, and multilingual documents into clean structured data.
🔹 Powered by PP-OCRv5, PP-StructureV3 & PaddleOCR-VL
🔹 High-precision OCR and document parsing
🔹 Structured outputs for downstream chunking, vectorization, and RAG
🔹 Supports private/on-prem deployment for enterprise scenarios
https://t.co/pSZFGXAjfD
2️⃣ ERNIE-Image: Image generation inside Dify
Generate visual assets directly in your Dify workflows.
🔹 Free image generation
🔹 Turbo mode with 8-step inference
🔹 OpenAI-style API for easy integration
🔹 Built for posters, social media visuals, and creative production
https://t.co/LWAcCDux1U
No extra deployment. Add the plugin node, connect your workflow, and start building. ✨
Learn more:
PaddleOCR: https://t.co/oNOfB6hbSY
ERNIE Image: https://t.co/oVUs1SBo8n
Dify: https://t.co/hxxdDXIzpD
#PaddleOCR #ERNIEImage #Dify #DocumentAI #AIGC #AgenticAI #OpenSource
PaddleOCR-VL 1.6 Officially Released!
We are thrilled to announce the official release of PaddleOCR-VL 1.6. This version sets a new SOTA record of 96.33% on OmniDocBench, outperforming both open-source and proprietary solutions in text, formula, and table recognition.
Key Highlights:
🔸 Ranked #1 on both OmniDocBench v1.5 and Real5-OmniDocBench
🔸 Significant improvements in table, classic text, and rare character recognition
🔸 Enhanced seal, spotting, and chart recognition
🔸 Fully compatible with the v1.5 architecture: zero migration, plug-and-play
From financial contracts and legal documents to research reports and historical archives, empower your document intelligence workflows by providing high-quality data to large language models (LLMs) and retrieval-augmented generation (RAG) systems.
✨ Industry-leading accuracy. Zero migration. Plug-and-play.
Read more: https://t.co/oNOfB6hbSY
#PaddlePaddle #PaddleOCR #AI #ComputerVision
Congratulations to NabuOCR!
Thanks, Zack, for trusting us and building something truly meaningful with PaddleOCR.
From ancient cuneiform tablets to AI-powered text recognition, technology's journey across millennia is what makes it so inspiring.
Zack Williams brought one of the world's oldest writing systems to the ERNIE AI Developer Challenge: ancient cuneiform tablets.
Using PaddleOCR, he built NabuOCR to help read cuneiform from tablet images.
See the story behind @BoatbomberRBLX's winning project
US vs China update. Stanford's AI Index put the US-China gap at 2.7%. Here's what two years of real-world use from the Text Arena shows.
Gap three years ago: +278. Today: +29.
@AnthropicAI's Claude Opus 4.6 Thinking vs. Baidu's @ErnieforDevs Ernie 5.1 at the top.
The US has never lost #1, but the gap keeps closing.
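One way to read those gap numbers, sketched under an assumption the post does not state: if the gaps are Elo-style rating points on the usual 400-point logistic scale, a rating gap converts to an expected head-to-head win rate. The helper name `expected_win_rate` and the `scale` default are illustrative, not taken from the leaderboard itself.

```python
def expected_win_rate(gap: float, scale: float = 400.0) -> float:
    """Expected win rate of the higher-rated model for a given rating gap.

    Assumes the standard Elo logistic curve; the 400-point scale is an
    assumption, since the post does not say how the scores are scaled.
    """
    return 1.0 / (1.0 + 10.0 ** (-gap / scale))


# Gaps quoted in the post: +278 three years ago, +29 today.
win_rate_then = expected_win_rate(278)
win_rate_now = expected_win_rate(29)

print(f"gap +278 -> expected win rate {win_rate_then:.1%}")
print(f"gap  +29 -> expected win rate {win_rate_now:.1%}")
```

Under that assumption, a +278 gap corresponds to roughly an 83% expected win rate for the leader, while +29 is roughly 54%, which is close to a coin flip.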
PaddleOCR 3.5: Transformers Backend Support Now Live!
We're excited to share that PaddleOCR 3.5 now supports Hugging Face Transformers as an inference backend. Run PP-OCRv5 and PaddleOCR-VL 1.5 models directly within the Transformers ecosystem.
✨ What you get:
🔸 Run OCR models with engine="transformers"
🔸 Seamless integration for RAG & Document AI apps
🔸 Less friction and a more natural fit with the HF stack
🔸 Same PaddleOCR pipeline, your favorite backend
Big thanks to the @huggingface team for the collaboration!
Read the blog: https://t.co/TDYkcTvcx5
#PaddlePaddle #PaddleOCR #HuggingFace #Transformers #OCR
✨ Big update from @Baidu_Inc @PaddlePaddle
Baidu's PaddleOCR now supports Transformers as an inference backend. Really cool to see it becoming easier to use within the @huggingface ecosystem!
Here is a quick addition to your metrics vocabulary: DAA.
Short for Daily Active Agents, it is the agent era's equivalent of DAU. Where tokenomics tracks cost, DAA tracks output: how much work agents are actually getting done.
See the full comparison
What does evolution look like in the AI era?
To kick off Baidu Create, our CEO, Robin Li, laid out a new theory of evolution across three layers:
> AI agents moving beyond passive response to active execution
> Individuals becoming AI-empowered builders
> Enterprises organizing around hybrid teams of people and agents
OpenClaw brought that first layer into wider view. It marked the first time agents took center stage, following the rise of models.
Robin proposed Daily Active Agents (DAA) as a defining metric for the agent era, a counterpart to DAU in the mobile internet era.
While token consumption reflects cost more than value, DAA brings the conversation back to output.
As Robin noted, to measure the health of a platform or ecosystem, more attention should be paid to the DAA metric: the number of agents actively working and delivering results.
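A minimal sketch of how DAA could be counted, by analogy with DAU. The posts give no formal definition, so everything here is an assumption: the event shape `(day, agent_id, completed)`, the rule that an agent counts as active only if it completed at least one task that day, and the helper name `daily_active_agents`.

```python
from collections import defaultdict
from datetime import date


def daily_active_agents(events):
    """Count distinct agents that delivered at least one result per day.

    `events` is an iterable of (day, agent_id, completed) tuples. This
    schema is hypothetical; a real platform would define its own.
    """
    active = defaultdict(set)
    for day, agent_id, completed in events:
        # Only completed work counts, mirroring "delivering results".
        if completed:
            active[day].add(agent_id)
    return {day: len(agents) for day, agents in sorted(active.items())}


events = [
    (date(2026, 5, 1), "agent-a", True),
    (date(2026, 5, 1), "agent-a", True),   # same agent, counted once
    (date(2026, 5, 1), "agent-b", False),  # ran but delivered nothing
    (date(2026, 5, 2), "agent-b", True),
    (date(2026, 5, 2), "agent-c", True),
]

daa = daily_active_agents(events)
for day, count in daa.items():
    print(day.isoformat(), count)
```

Counting distinct agents rather than task events is what separates this from a usage or token counter: an agent that runs many tasks in a day still counts once, and one that consumes tokens without finishing anything does not count at all.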
ERNIE 5.1 ranked No. 4 globally on @arena's Search Leaderboard, with a score of 1,223.
That ranking reflects stronger multi-source retrieval and synthesis, helping generate more consistent, reliable answers for content generation, AI assistants, enterprise knowledge management, and agent applications.
ERNIE 5.1 just dropped.
Built on ERNIE 5.0's pre-training foundation, our latest foundation model upgrades search, reasoning, knowledge Q&A, creative writing, and agentic capabilities, while using only around 6% of the pre-training cost of comparable models.
More in the thread 🧵
Congratulations to our ERNIE team on the release of ERNIE 5.1!
This release demonstrates a strong balance between model capability and training efficiency, achieving competitive performance across reasoning, agentic tasks, search, and world knowledge, while significantly reducing parameter scale and pretraining cost.
Particularly notable:
🔹 Strong results on τ3-bench and SpreadsheetBench
🔹 99.6 on AIME26 with tools
🔹 #1 among Chinese models on Arena Search
🔹 Competitive GPQA and MMLU-Pro performance
A meaningful step forward for efficient large-model engineering and practical deployment. Looking forward to seeing more developers explore ERNIE 5.1.
#LLM #ERNIE
ERNIE 5.1 is here
ERNIE 5.1 compresses total parameters to ~1/3 and activated parameters to ~1/2, and uses only ~6% of the pretraining cost of models at similar scale, while achieving leading performance in its class.
Key highlights:
1/ Strong agentic performance approaching leading frontier models. ERNIE 5.1 surpasses DeepSeek-V4-Pro on both τ3-bench and SpreadsheetBench-Verified.
2/ Strong world knowledge and creative writing capabilities, with GPQA and MMLU-Pro performance approaching leading closed-source models, and creative writing ability nearing Gemini 3.1 Pro.
3/ Frontier-level reasoning performance. ERNIE 5.1 scores 99.6 on the challenging AIME26 benchmark with tools, second only to Gemini 3.1 Pro.
4/ Deep search capability. On May 9, ERNIE 5.1 ranked #4 globally and #1 among Chinese models on the Arena Search leaderboard with a score of 1223.
ERNIE 5.1 is now available on ERNIE and the Baidu AI Studio Model Playground:
https://t.co/qhd67Lg3B4
https://t.co/AaQSqDmVGU
https://t.co/uCNiypIu1q
Agents are moving from buzzword to real-world scale.
Join us at Baidu Create 2026 next Wednesday as our CEO Robin Li unpacks what "Agents at Scale" really means: for the agents themselves, for the people working alongside them, and for the organizations evolving with them.
Ernie-5.1 by Baidu's @ErnieforDevs has landed at #4 in the Search Arena! This makes Baidu a top 3 lab in Search performance, and the only Chinese model in the top 10 overall.
Congrats to the @ErnieforDevs team on this accomplishment!
Huge congratulations to our partner and PaddleOCR OCEAN Alliance member, RAGFlow, on reaching an incredible 80K GitHub stars! ⭐
We're proud to build together with the RAGFlow team through deep PaddleOCR integration and document AI collaboration. More to come!
RAGFlow just reached 80K GitHub stars! ⭐
Huge thanks to our contributors, users, and community.
From RAG to agentic AI systems, it's amazing to see what people are building with RAGFlow.
80K is just the beginning.
#OpenSource #AI #RAG #Agents #LLM
Another update from @arena
ERNIE 5.1 is now ranked #4 in Search Arena, making ERNIE one of the top-performing labs in Search and currently the only Chinese model in the Top 10.
Official release coming very soon
Baidu Create 2026 is coming up fast, and the agenda is packed!
Beyond the main forum, our flagship developer conference returns with special forums on AI infrastructure, agent development, real-world applications, and more, plus plenty to explore on site.
Take a look
ERNIE 5.1 Preview just went live
With a lighter, more efficient architecture, it delivers strong performance at its scale.
And this is just the start: more ERNIE model updates to come at Baidu Create 2026.
Proud to support @StarHistoryHQ and grateful to @github for powering the open-source momentum behind projects like PaddleOCR!
PaddleOCR is now the #1 most-starred OCR project on GitHub, built by the community, for developers turning PDFs, images, screenshots, and complex documents into structured, AI-ready data.
Try it, star it, and build with us! ⭐
GitHub: https://t.co/oNOfB6hbSY
Website: https://t.co/wci7AWgokj
#PaddlePaddle #PaddleOCR #Github
Star History is proudly sponsored by PaddleOCR
Turn PDFs and image documents into structured data for AI with a powerful, lightweight OCR toolkit that bridges images, PDFs, and LLMs.
https://t.co/jjLedh9rGJ
Thanks for supporting open source! ⭐