DeepSeek V4 Flash is on par with GPT 5.4 (high), and the best part is that it's much more affordable at scale:
GPT 5.4 pro vs DeepSeek V4 Flash:
Input: $30/M vs $0.14/M (214x cost difference)
Output: $180/M vs $0.28/M (643x cost difference)
Both offer a million-token context, so DeepSeek V4 Flash is really a bargain for intelligence.
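For anyone who wants to check the math, here is a quick sketch that recomputes the ratios from the per-million-token prices quoted above. The helper names and the example workload (10M input tokens, 2M output tokens) are made up for illustration, not taken from any pricing page.

```python
# Per-million-token prices in USD, as quoted in the post.
PRICES = {
    "GPT 5.4 pro": {"input": 30.00, "output": 180.00},
    "DeepSeek V4 Flash": {"input": 0.14, "output": 0.28},
}


def cost_ratio(kind: str) -> float:
    """How many times more GPT 5.4 pro costs per token of the given kind."""
    return PRICES["GPT 5.4 pro"][kind] / PRICES["DeepSeek V4 Flash"][kind]


def workload_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Total USD cost of a workload at the quoted prices."""
    price = PRICES[model]
    total = input_tokens * price["input"] + output_tokens * price["output"]
    return total / 1_000_000


print(round(cost_ratio("input")))   # 214
print(round(cost_ratio("output")))  # 643

# Hypothetical workload: 10M input tokens and 2M output tokens.
for name in PRICES:
    print(name, round(workload_cost(name, 10_000_000, 2_000_000), 2))
```

At that hypothetical volume the bill comes to roughly $660 versus about $1.96.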
🔬 lab works & stuff
come by to see some of the team's recent research work.
📍 location: builder hub
🗓️ time: april 25th 1pm UTC
./ 🥼 coat on @Gradient_HQ
Awesome to see @tryParallax's distributed framework for heterogeneous machines being implemented and serving up inferences!
Build and customize your own clusters for AI like never before
./ LFG @Gradient_HQ
Gradient Live Knowledge Trivia!
Come join us for a 20-question trivia! See how your knowledge stacks up among Grads!
📍 Location: Builder Hub
🗓️ Time: April 11th 1PM UTC
./ @Gradient_HQ memory engine on 🧠
Our cofounder @0xEricYang sat down with @yacinelearning to walk through Echo-2's distributed RL architecture.
Dive in to learn about async RL with distributed infra, and how we are scaling this for businesses to win in the agentic era.
for those interested in distributed reinforcement learning I just finished a ~1h tutorial on the echo2 framework by @Gradient_HQ
we check:
- how to do async RL
- infra split between rollout workers and centralized learner
- interview with gradient cofounder eric yang himself!
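To make the rollout-worker / centralized-learner split concrete, here is a toy sketch using plain Python threads and a queue. It is not Echo-2's code or API: the function names, the one-step "environment", and the update rule are all assumptions chosen to keep the example tiny. The point is only the shape: workers generate rollouts from whatever policy snapshot they last read, and the learner consumes them asynchronously, so some samples arrive stale.

```python
import queue
import random
import threading


def rollout_worker(worker_id, policy, out_queue, n_episodes, seed):
    """Generate rollouts from a snapshot of the shared policy (toy setup)."""
    rng = random.Random(seed)
    for _ in range(n_episodes):
        # Snapshot the policy; it may be stale by the time the learner sees it.
        with policy["lock"]:
            version, bias = policy["version"], policy["bias"]
        action = 1 if rng.random() < bias else 0
        reward = float(action)  # toy environment: action 1 is always rewarded
        out_queue.put({"worker": worker_id, "version": version, "reward": reward})


def learner(policy, in_queue, total_samples, lr=0.05):
    """Centralized learner: consume rollouts as they arrive and update the policy."""
    stale = 0
    for _ in range(total_samples):
        sample = in_queue.get()
        with policy["lock"]:
            if sample["version"] < policy["version"]:
                stale += 1  # rollout came from an older policy version
            # Toy update (not a real policy gradient): move toward the rewarded action.
            policy["bias"] += lr * sample["reward"] * (1.0 - policy["bias"])
            policy["version"] += 1
    return stale


def run_demo(n_workers=4, n_episodes=50):
    policy = {"lock": threading.Lock(), "version": 0, "bias": 0.5}
    rollouts = queue.Queue()
    workers = [
        threading.Thread(target=rollout_worker, args=(i, policy, rollouts, n_episodes, i))
        for i in range(n_workers)
    ]
    for w in workers:
        w.start()
    stale = learner(policy, rollouts, n_workers * n_episodes)
    for w in workers:
        w.join()
    return policy, stale


if __name__ == "__main__":
    final_policy, stale_count = run_demo()
    print("updates:", final_policy["version"], "stale rollouts:", stale_count)
```

Real systems have to decide what to do with stale rollouts (drop them, reweight them, or bound the lag); this sketch just counts them.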
china's AI giants just launched a price war over AI coding plans.
zhipu, minimax, kimi, alibaba, bytedance, tencent all competing for the same developer wallet.
entry price: ¥7.9/mo (~$1.10) for the first month.
I tested a local OCR setup using:
- mlx-community/GLM-OCR-bf16
- mlx-vlm
- glm-ocr
Tested Inputs
- test.jpeg: Traditional Chinese overtime form photo
- eng-test.png: clean printed English document photo
- hello-123.png: synthetic image with simple text
How It Was Tested
I tested both:
- through the glm-ocr pipeline
- directly with mlx_vlm.generate
Result
The model did not return usable OCR text.
Instead, outputs were mostly:
- placeholder tokens like <|begin_of_image|>
- empty markdown blocks
- or a wrong single token (a stray CJK character)
This happened even on:
- a clean printed English image
- a simple synthetic text image (HELLO 123)
Conclusion
The current local OCR setup is not working correctly for actual text extraction.
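If you want to catch this automatically when re-running the test, a small sanity check over the raw model output could look like the sketch below. It only encodes the three failure modes listed above; the function name, the regexes, and the two-character threshold are my own assumptions, not part of glm-ocr or mlx-vlm.

```python
import re

# Matches <|...|> style special tokens. Only <|begin_of_image|> was observed
# in the test; treating every <|...|> token as a placeholder is an assumption.
SPECIAL_TOKEN = re.compile(r"<\|[^|>]*\|>")

# Markdown code-fence markers with an optional language tag.
FENCE = "`" * 3
CODE_FENCE = re.compile(re.escape(FENCE) + r"[a-zA-Z]*")


def is_usable_ocr(text: str, min_chars: int = 2) -> bool:
    """Return False for placeholder-only, empty-block, or single-token output."""
    cleaned = SPECIAL_TOKEN.sub("", text)
    cleaned = CODE_FENCE.sub("", cleaned)
    cleaned = cleaned.strip()
    return len(cleaned) >= min_chars


samples = {
    "placeholder": "<|begin_of_image|>",
    "empty block": FENCE + "markdown\n" + FENCE,
    "single token": "x",
    "real text": "HELLO 123",
}
for label, output in samples.items():
    print(label, is_usable_ocr(output))
```

Only the last sample passes; the first three mirror what the setup actually returned.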
We keep scaling pre-training and model size; it's time to explore post-training.
Just like how we humans learn from experience: we learn as we do, on the go, every day!
Benchmarks that test what models have memorized are saturating fast. ARC-AGI-3 is asking a harder question: can AI actually learn something new on the fly?
One direction we've been exploring: multi-agent orchestration. In our study, coordinating four frontier LLMs across multiple turns consistently matched or outperformed the strongest single model, even on tasks none of them could solve alone.
The gap between "best single model" and "best coordination of models" is where a lot of the real progress is hiding.
More on our multi-turn, multi-agent orchestration study: https://t.co/LIwRoeggBB