Today, we are formally announcing @mirendil. Mirendil exists to be a straight shot at solving bottlenecks to step-change acceleration across all areas of science and technology. If you are excited about this mission, feel free to reach out.
we are launching Auto Endpoints ! here are some features supported out of the box, only on @modal:
> sub-5ms high-perf proxy with auth support
> inference-level metrics built into your Modal dashboard (ttft, itl, accept length, etc)
> spec dec (dflash!) and optimized hardware based on the model you're running
> perf preview before deployment, with benchmarks targeting your use case (real-time gen, agentic multi-turn)
> and more !
this is just the beginning -- more autoinference and optimizations are coming soon. time to own your inference !
p.s. every Auto Endpoint user gets $100 in credits. give it a try and let us know what you think !
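For anyone new to the metric abbreviations in the feature list: ttft is time to first token and itl is inter-token latency. Below is a minimal sketch of how the two could be computed from token arrival timestamps. The function name and inputs are hypothetical, and this is not Modal's implementation.

```python
from statistics import mean


def latency_metrics(request_time, token_times):
    """Return (ttft, mean_itl) in the same time unit as the inputs.

    request_time: when the request was sent.
    token_times:  arrival time of each generated token, in order.
    """
    if not token_times:
        raise ValueError("need at least one token timestamp")
    # Time to first token: delay between the request and the first token.
    ttft = token_times[0] - request_time
    # Inter-token latency: average gap between consecutive tokens.
    gaps = [later - earlier for earlier, later in zip(token_times, token_times[1:])]
    itl = mean(gaps) if gaps else 0.0
    return ttft, itl


ttft, itl = latency_metrics(0.0, [0.5, 0.75, 1.0])
```

A dashboard would typically report percentiles of these over many requests rather than a single mean.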
Excited to release cho-embedding-0.8b - a Pareto frontier 0.8B vision-language embedder competitive with some of the most widely-used models up to 4x larger.
Built on Qwen3.5-0.8B and post-trained with multi-stage contrastive + distillation objectives, the model achieves 60.7 on MMEB-Image, the best score among <1B models.
• General-purpose embeddings for text, images, video frames, documents, and document screenshots.
• State-of-the-art performance on MMEB-Image among academic models at this scale (some of which are highly specialized to the benchmark).
• Drop-in integration for existing Qwen3-VL-Embedding-2/8B pipelines with significantly lower compute and serving cost.
• Fully open under Apache 2.0 for commercial use.
Download model: https://t.co/iP1sFsppMz
PS. The model name is an inside joke and also an acronym for the "Contrastive Hard-negatives Objective" used. 6720 NVIDIA H100 GPU hours were spent on mining, training, and ablations. Feedback is welcome :)
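The post does not spell out the exact form of the "Contrastive Hard-negatives Objective". As a rough illustration of the general idea only, here is a minimal sketch of an InfoNCE-style contrastive loss with explicit negatives. The function name, the temperature value, and the use of cosine similarity are all assumptions, not the model's actual training recipe.

```python
import math


def _cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def info_nce(query, positive, negatives, temperature=0.05):
    """InfoNCE loss for one query: -log softmax score of the positive.

    negatives is a list of embeddings; "hard" negatives are those that
    sit close to the query and therefore contribute most to the loss.
    The temperature default is an arbitrary illustrative value.
    """
    logits = [_cosine(query, positive) / temperature]
    logits += [_cosine(query, n) / temperature for n in negatives]
    # Numerically stable log-sum-exp over positive and negatives.
    peak = max(logits)
    log_denominator = peak + math.log(sum(math.exp(l - peak) for l in logits))
    return log_denominator - logits[0]


query, positive = [1.0, 0.0], [1.0, 0.0]
loss_hard = info_nce(query, positive, negatives=[[0.9, 0.1]])   # near the query
loss_easy = info_nce(query, positive, negatives=[[-1.0, 0.0]])  # far from the query
```

In this toy setup the near-duplicate negative yields a larger loss than the distant one, which is the usual motivation for mining hard negatives.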
Ultrasound tomography… but with piezoelectric MEMS semiconductors instead of crystals. Presumably imaging algorithms / hardware systems by Midjourney, device design by Butterfly Network, manufactured by TSMC… super cool!
Been fun working on this over the past few weeks with the team.
We wake up every day looking for people who are still unknown to the world but won’t be forever.
The founders we back today will one day have biographies of their own.
@novaholdings
"With that, we reframed multimodal generation as structured text/code generation"
Text is ambiguous but code is not. Would love to see more results on having LLMs natively think as if they were coding.
1/ Our new @reve image model is now #2 on the @arena text-to-image leaderboard — behind only GPT Image 2, ahead of Nano Banana Pro, Microsoft, xAI and everyone else.
And it's a 125 point jump over Reve 1.5 from just 3 months ago.
The research story behind it 🧵👇
Glad to see this -- renderers are a foundational component of the LLM stack. Renderers map between tokens and messages, which are invariant to tokenizer and formatting details. Most APIs, datasets, and RL environments are defined in terms of messages.
Getting the details wrong leads to train-test mismatches, caching inefficiencies, and prompt injection vulnerabilities. We included a renderers module in Tinker Cookbook, but it makes sense as a standalone library.