Btw, this doesn’t just make the model less useful; it will nerf your code and tell you it’s not. Like, you legitimately cannot use this. And how are we to know whether it touches inference optimization or even harness engineering if we’re not alerted?
Today, we're launching shift. We're starting by cleaning your apartment in New York City, for free.
Here's how it works. Book a shift cleaning. A vetted shift operator comes to your home wearing one of our devices. They clean. They leave. You pay nothing.
In exchange, we record the cleaning. Robotics is being built on data about how people do daily tasks, and the value of that recording is what funds the service. Anything personal in it is anonymized before the recording is processed.
By now, you have heard about the shift to AI more times than you can count. About the shift toward you, the part where you actually feel it, you have heard almost nothing. Shift is what starts to make it concrete, in specific cities, with specific services.
Today, cleaning in New York. Soon, handymen, repairs, and errands across the globe. And this is just one side of shift, with more on the way.
Comment “shift” and we’ll send you an early access link.
@skalskip92 You might not believe it, but I simply manually annotated over 2,000,000 human body parts with ultra-precise detail. Probably no one else could do that.
Ever wished we had fewer X-training hyphenates? Pre, mid, post etc. Why not just Training?
Trying to bridge the divides (and get all our friends into one team again), we introduce *Introspective X Training*, an offline-RL-inspired method that scales effectively across any LLM stage by annotating your data with a thinking-reward-generated language critique!
Up to 2.8x FLOP efficiency + 5-10 point score gains (esp. with math and code) at any stage from scratch to 24T tokens on 8B (active) sized models!! We burned much compute ablating so you wouldn't have to.
Moral of the story is‼️don't throw out any data via filtering, just feedback condition it‼️
You can spend FLOPs up front on inference to *classify* data quality, then train on that feedback so that tokens aren't all treated equally, starting early in training itself. Right now they're really only separated out much later, during mid/post training.
This improves overall compute efficiency and gives us benchmark perf not possible with just baseline methods!
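A rough sketch of the "feedback condition instead of filter" idea, not the paper's actual method: it assumes each document already carries a quality score from some upstream classifier, and it swaps the language critique for a simple tag prefix. The `Doc` class, the tag strings, and the thresholds are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Doc:
    text: str
    quality: float  # hypothetical score in [0, 1] from an upstream classifier


def quality_tag(score: float, thresholds: tuple = (0.33, 0.66)) -> str:
    """Map a score to a conditioning tag (tag names are made up for this sketch)."""
    low, high = thresholds
    if score < low:
        return "<|quality:low|>"
    if score < high:
        return "<|quality:mid|>"
    return "<|quality:high|>"


def feedback_condition(docs: list) -> list:
    """Keep every document and prefix it with its quality tag."""
    return [f"{quality_tag(d.quality)} {d.text}" for d in docs]


def filter_baseline(docs: list, cutoff: float = 0.66) -> list:
    """The usual alternative: drop everything below the cutoff."""
    return [d.text for d in docs if d.quality >= cutoff]


docs = [Doc("noisy scrape", 0.1), Doc("ok tutorial", 0.5), Doc("clean proof", 0.9)]
conditioned = feedback_condition(docs)  # all 3 docs survive, each tagged
filtered = filter_baseline(docs)        # only the top doc survives
```

Under this reading, the model sees all the data during training but can tell the quality bands apart, and at inference you would prompt with the high-quality tag.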
Paper here: https://t.co/9oSYwQEpbi
Thanks to @BrandoCui and @GXiming for leading this w/ @__SyedaAkter @davidjesusacu @hyunw_kim @jaehunjung_com Yuxiao Qu @shrimai_ @YejinChoinka
Twice in the past five years, @ArmenAgha has authored a paper that became the field's default architecture 2-3 years before the industry caught up.
Intrinsic Dimensionality (2020) → motivated LoRA and the parameter-efficient fine-tuning industry.
Chameleon (2024) → defined the early fusion approach both GPT-4o and Gemini converged on.
He left @Meta in Nov 2024 to start @perceptroninc and to bring learnings from prior architectures, mixed with theses on new ones, into production for physical AI.
Today the company shipped the third installment. Flagship platform: Perceptron Mk1 for cloud + Isaac 0.2 for edge + unified Python SDK. Mk1 brings grounded perception to long-form video. Isaac is 1B/2B edge models matching or beating systems 10× larger.
What we love about @ArmenAgha is that he has always been tied to one mission, and has been pursuing it in stealth in Seattle along with @AkshatS07.
Now it's time for the world to know about their secrets and the architectural advancements that are coming to life.
My favorite model release I've worked on - over the past few months, 📈 performance on high number and in-context pointing reinforced why ML is a rewarding field to work in. The ROI of thinking creatively about data is very high; qualitative skills and taste are enormous differentiators.