The interesting part isn’t just open-vocabulary detection. It’s treating perception as an iterative process—looking closer when needed, building evidence, and adapting to the task instead of relying on a fixed detector.
Frontier perceptive models are finally cheap enough to call dozens of times on a single task. This means visual agents are now becoming feasible and new capabilities will emerge.
1/ Today we're shipping Perceptron Agentic Detection. Describe what you want in natural language, or show one example crop, and an agent grounds it in the image. No fine-tuning, no fixed class list.
We built a new VLM detection endpoint for finding objects in images using natural language categories, examples, or the image itself.
Demo: https://t.co/Xp5z09270N
Docs: https://t.co/hKlEFHVuHq
We're hiring our first DevRel Engineer (aka Hacker-in-Residence) at Perceptron.
Demand for our model Mk1 is outpacing what our research-first team can serve. You'll help us close the loop between our models and the people building on them.
NVIDIA has done the impossible and nobody's talking about it.
They trained a 12 BILLION parameter LLM in 4-bit precision on 10 trillion tokens.
For years, the AI industry has been stuck.
If you wanted to train a world-class AI, you had to use 16-bit or 8-bit precision. Going lower, to 4-bit, was a death sentence for the model. It would become unstable, "hallucinate" its own math, and eventually collapse.
But NVIDIA proved that "impossible" was just a math problem.
They used a new format called NVFP4.
Instead of a standard, rigid structure, NVFP4 uses "micro-scaling." It groups numbers into tiny blocks and applies an individual scaling factor to each block. It’s like giving the AI a pair of high-definition glasses for its own data, allowing it to see fine details even with 75% less memory than 16-bit storage.
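To make the micro-scaling idea concrete, here is a minimal pure-Python sketch of block-wise 4-bit quantization. It is an illustration under stated assumptions, not NVIDIA's implementation: the block size of 16 and the E2M1 value grid follow NVIDIA's public description of NVFP4, but the per-block scale is kept as an ordinary Python float (the real format stores it in 8 bits and adds a per-tensor scale), and every function name here is made up for the example.

```python
# Magnitudes representable by a 4-bit E2M1 float (the sign is a separate bit).
FP4_LEVELS = [0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0]
BLOCK_SIZE = 16  # assumption: 16 values share one scale factor


def quantize_block(block):
    """Return (scale, codes): one scale per block, one signed FP4 level per value."""
    amax = max(abs(x) for x in block)
    # Map the largest magnitude in the block onto the largest FP4 level.
    scale = amax / FP4_LEVELS[-1] if amax > 0 else 1.0
    codes = []
    for x in block:
        target = abs(x) / scale
        level = min(FP4_LEVELS, key=lambda q: abs(q - target))  # round to nearest
        codes.append(level if x >= 0 else -level)
    return scale, codes


def quantize(values, block_size=BLOCK_SIZE):
    """Split values into blocks and quantize each block with its own scale."""
    return [quantize_block(values[i:i + block_size])
            for i in range(0, len(values), block_size)]


def dequantize(blocks):
    """Rebuild approximate values from (scale, codes) pairs."""
    out = []
    for scale, codes in blocks:
        out.extend(scale * c for c in codes)
    return out


# One block of small values followed by one block of large values.
small = [0.01 * (i + 1) for i in range(16)]
large = [10.0 * (i + 1) for i in range(16)]
data = small + large

blockwise = dequantize(quantize(data, block_size=16))  # one scale per block
single = dequantize(quantize(data, block_size=32))     # one scale for everything

print(blockwise[:4])
print(single[:4])  # all zeros: the large values dictated the shared scale
```

With one scale per 16-value block, the small values survive the round trip. With a single scale shared across all 32 values, the large values set the scale and every small value rounds to zero, which is the failure mode that per-block scaling is meant to avoid.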
The result is a total paradigm shift:
- 2× to 3× faster arithmetic.
- 50% reduction in memory usage relative to 8-bit.
- Near-zero loss in intelligence.
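The 75% and 50% memory figures in this thread look contradictory but are consistent with two different baselines (16-bit and 8-bit). A quick back-of-the-envelope check, assuming 4-bit elements and, for the block-scaled case, one 8-bit scale per 16 values (per NVIDIA's public description of NVFP4; the helper names are hypothetical and the tiny per-tensor scale is ignored):

```python
def bits_per_value(element_bits, scale_bits=0, block_size=1):
    """Average storage per value, amortizing one scale over each block."""
    return element_bits + scale_bits / block_size


def saving(new_bits, old_bits):
    """Fraction of memory saved when moving from old_bits to new_bits per value."""
    return 1 - new_bits / old_bits


raw_fp4 = bits_per_value(4)                                 # 4.0 bits
block_scaled = bits_per_value(4, scale_bits=8, block_size=16)  # 4.5 bits

print(saving(raw_fp4, 16))       # 0.75    (vs 16-bit)
print(saving(raw_fp4, 8))        # 0.5     (vs 8-bit)
print(saving(block_scaled, 16))  # 0.71875 (vs 16-bit, scales included)
print(saving(block_scaled, 8))   # 0.4375  (vs 8-bit, scales included)
```

So the headline numbers hold for the 4-bit elements alone. Once the per-block scales are counted, the savings under these assumptions are slightly smaller.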
The researchers compared the 4-bit model against a massive 8-bit baseline. The curves are identical. On MMLU, GSM8K, and coding benchmarks, the "tiny" 4-bit version performed within 0.1% of the more expensive model.
This is an economic earthquake.
Training a frontier model used to require tens of thousands of GPUs and months of time. NVIDIA just showed we can get the same results with half the hardware and a fraction of the electricity.
@TimTrautman @kvickart @perceptroninc What exactly were you looking for in the docs? Perhaps I can link you directly to it or shoot you a DM. Right now, only video frames. Audio coming very soon.