@micoolcho It's impressive, yet likely different RL policies for each config.. so RL does not need to know that the robot can be reconfigured, as long as the correct policy is active.. so it should be an almost regular approach to making these run -- very clever mechanical engineering tho
@scaling01 @GaryMarcus Nvidia will not have a moat like it has now (PyTorch, vLLM etc. already don't need CUDA). It's peak Nvidia today.
No technical reason why Google and others can't match Anthropic/OAI at coding, and soon (look at Cursor already) - also coding is the only killer app atm, yet other uses will emerge.
@chamath also it's over 60% "software assistant" -- a market that MSFT and Google will eat up once they get their poop together (as Cursor recently did)
@GaryMarcus There is a positive: all the paid-for and deployed GPUs (spent capex) will make AI more affordable --- like all the Sun servers, ISPs and fiber broadband investment in the original dot-com era: they didn't go to waste after the bubble burst, they were well utilized and grew later
@chris_j_paxton Clever hack! Yet not ideal.. even a paint job would help (zebra pattern etc). --- the new "Everything's Computer" is "Everything is Image" lol -- why add more dimensions to your VLA when ya can just cram force in as an image lol
@GaryMarcus we don't talk about this :) please don't pop the.. let's call it "AI dot com" bubble -- the last time this happened it took the Nasdaq about 15 years to recover back to 5000
@PTrubey @SakanaAILabs extreme example (if this works): u buy a regular 96GB GPU and train DeepSeekV4 at home, if you just wait long enough -- atm this is not possible: the min hardware needed is a GPU like the B200 to "fit" the entire model in VRAM, and atm that means at min one DGX (8xB200) server, about $500k
@DavidSHolz Power/electricity is why... also likely why OpenAI paused Sora.. it's something like 10x to 20x the # of concurrent users that can be supported per GPU for LLMs vs videogen -- LLMs are efficient on compute vs diffusion (albeit diffusion-type models are currently irreplaceable)
For over a decade, we’ve accepted that end-to-end backprop is the only way to train deep networks. But holding the entire network in memory all at once is why AI training is hitting a resource wall.
We found a new way to break the network into blocks and train them independently. The trick? Treating the network’s forward pass like a diffusion model denoising a signal.
This reinterpretation slashes the memory needed to train deep models. In our #ICLR2026 paper (https://t.co/PK5h0mqQSo), we matched end-to-end performance across ViTs, DiTs, and LLMs. We did this while training just one isolated block at a time.
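To make the memory argument concrete, here is a rough back-of-the-envelope sketch, not the paper's method or its measurements. It assumes every block stores one equally sized set of activations for its backward pass and ignores weights, optimizer state, and tricks like activation checkpointing. The function name `peak_activation_units` and the 24-block depth are hypothetical, chosen only for illustration.

```python
def peak_activation_units(num_blocks: int, blockwise: bool) -> int:
    """Peak number of per-block activation sets held for the backward pass.

    Assumption (illustrative only): each block stores one equally sized
    set of activations; weights and optimizer state are ignored.
    """
    if num_blocks < 1:
        raise ValueError("num_blocks must be >= 1")
    # End-to-end backprop keeps the activations of every block until the
    # backward pass runs; training one isolated block keeps only its own.
    return 1 if blockwise else num_blocks


# Hypothetical 24-block network.
end_to_end = peak_activation_units(24, blockwise=False)
one_block = peak_activation_units(24, blockwise=True)
print(end_to_end, one_block)
```

Under these assumptions the activation footprint of end-to-end training grows linearly with depth, while the one-block-at-a-time footprint stays constant. Real savings depend on the architecture and on what else sits in memory.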
@JayKapoorNYC @startupjag Hey: #1 robot backflips are not staged or teleop #2 VLAs are not used for most humanoid robot locomotion, like dancing, jumps, etc.
There are people doing useful work in all those domains, and many years of work left, yet that's no reason to misinform and conflate facts..
@Scobleizer yep, 1) LLMs are not good enough out of the box for most enterprise cases. 2) post-training is hard, requires both ML skills & clean domain-specific data -- enterprises and their consultants do not have the skills to implement this yet, so for the next while vertical startups will do well