@LusciousPear 💯 I think the biggest unlock is allowing rapid creation & deployment of robot models. Working on giving post training superpowers to every robot @mixtrainai
Why do you need your own model?
"We’ve implemented new interventions that limit Claude’s effectiveness for requests targeting frontier LLM development (for example, on building pretraining pipelines, distributed training infrastructure, or ML accelerator design"
Introducing Claude Fable 5: a Mythos-class model that we’ve made safe for general use.
Its capabilities exceed those of any model we’ve ever made generally available.
@bernhardsson I think it makes sense if they consider owning the entire stack worthwhile as muscle in the long run (and not just about bps). They have done things in programming languages, compilers, etc. for the same reason, I believe.
@lukas_m_ziegler Post training platform for robotics, so more and more robotics companies can ship task specific models for every use case x embodiment, while still leveraging the power & economics of generalized learning in large VLA/WAM/VAM.
@eatpraydiehard @madiator This is largely because post training *platforms* aren’t widespread yet. With the right platform partners, post training is not a lot more complicated than building a harness around a single model.
@mixtrainai You can't rely on any lab's post training offering, because it's not their core focus. You need a platform that is independent/agnostic of a model, and brings frontier grade infra to your problem domains (custom kernels, data curation, eval infra, ...)
@mark_k @OpenAI Your post training strategy should be model provider agnostic. Partner with @mixtrainai to get a frontier grade platform for your custom models
https://t.co/EavaavgaTY
If you were looking for a platform (and company) dedicated to post training, we at @mixtrainai are fully focused on helping you build the best model for your task+constraint+evaluation...
OpenAI has announced they will be winding down fine tuning. I got the email today. Existing active @OpenAI customers can keep running fine-tuning jobs until January 6, 2027, but after that no new training jobs can be created. Existing fine-tuned models will still run, but only until the underlying base model is eventually deprecated.
I get the argument that newer models follow instructions much better, and that prompts plus RAG cover more use cases than before. But not all of them.
@willbitsky @oyhsu @JacobZietek Agree with all 10, and it's great to see that more than 5 are bottlenecked on "engineering/deployment" and not on research any more. A few more to add:
1. Latent vs pixel space prediction
2. Long horizon context management between autoregressive backbone and diffusion heads
@willbitsky @oyhsu @JacobZietek 3. Task specific models optimized for both inference compute and latency
4. Efficient training env + kernels when co-training video/world models with diffusion heads
5. Data infra for curation and evaluation for physical AI