Geosh @Geoshh - Twitter Profile

Pinned Tweet

Geosh @Geoshh

8 months ago

Gonna try to pin a few favorite posts that linger in mind over time:

Derya Unutmaz, MD

@DeryaTR_

9 months ago

Amusing how 99% of people using their own brains forget how it works: The brain is an advanced probability machine. It keeps predicting the next most likely thought, word, or action based on incoming signals and past learning. Under the hood, billions of neurons are doing weighted sums and activations, essentially giant matrix-like math, and the output is often eerily good.

37

507

42

79

48K

5

4

0

2K

Geoshh retweeted

Jon Barron

@jon_barron

about 10 hours ago

My advice to PhD students in 2026: 1) If your advisor hasn't logged >100 hours in a modern agentic IDE, stop listening to your advisor. 2) Write your next paper *inside your codebase* as a .tex (with 80 char word-wrap). Force your advisor to read and make edits inside the IDE.

34

769

28

476

123K

Geoshh retweeted

Bernt Bornich

@BerntBornich

about 9 hours ago

We’re going all in on World Models.

Today we’re launching the 1X World Model Lab.

The bet is simple:

You can’t fine-tune your way to AGI.

And you definitely can’t fine-tune your way to robots that can operate in the physical world.

General-purpose humanoids need models that understand space, motion, objects, causality, affordances, physics, and action before they ever see a specific task.

The frontier is not better VLA wrappers.

The frontier is embodied world models.

The 1X World Model Lab will focus on large-scale embodied world model pretraining: building the most generalizable foundation model for humanoid robots from the ground up.

The next frontier in AI requires scaling:

web-scale media + egocentric human videos + sim + dexterous remote operated robot data + on-policy NEO data → real-world deployment for robot data collection and RL → abundance of data → physical AI

The robot collects data.
The model gets better.
The robot gets better.
Repeat.

To lead this, we brought in one of the best for the mission: @_sam_sinha_ , as Head of World Models.

Sam was a founding research scientist at Luma AI and has been at the frontier of scaling multimodal generative video models his whole career.

If you’re the best in the world at large-scale pretraining, video models, robotics, RL, infra, or data — and you want your models to move atoms, not just pixels — join us.

Send background + evidence of exceptional ability to:

wmlab@1x.tech

We’re building the model that makes autonomous labor real.

83

2K

121

618

124K

Geoshh retweeted

Elon Musk

@elonmusk

about 18 hours ago

Hadamard thought in image space

2K

17K

3K

4K

6M

Geoshh retweeted

Jinyu Hou

@jinyuhou0

13 days ago

On popular benchmarks, our 30B model matches systems 20-30x its size (gpt-5.4-xhigh, DeepSeek-V3.2, Kimi-K2.5), while using up to 95% fewer reasoning tokens than comparable 30/32B agentic LLMs.

The trick: don't just reason less, reason about the right things. A learned configurator decides when to simulate, how far ahead, and when to skip planning entirely.

Efficient reasoning is an allocation problem, not a compression problem.

Model and code are openly available.

4

247

26

221

24K

Geoshh retweeted

Mingkai Deng

@mdeng34

1 day ago

We agree that the world model should be a simulator that supports decision-making, not rendering beautiful images/videos. Our difference is in how the world state should be represented. Should the world be anchored in Gaussian splats and physics engines for program-as-simulator? Or in learned representations for model-as-simulator? We believe the latter is a more scalable, bitter-lesson-pilled approach. More in our position paper "Critiques of World Models" coauthored with Prof. @ericxing and @jinyuhou0 https://t.co/NqnxGtKNBL

6

50

11

39

6K

Geoshh retweeted

Fei-Fei Li

@drfeifei

1 day ago

https://t.co/Kt50ttQRMJ

113

4K

711

5K

618K

Geoshh retweeted

Neel Nanda

@NeelNanda5

1 day ago

I had a lot of fun working on this paper - we found an elegant story for why subliminal learning happens! A key intuition in interpretability is that basically every interesting phenomenon in LLMs boils down to adding a steering vector. Subliminal learning is no exception!

10

315

22

226

34K

Geoshh retweeted

Michael Tschannen @mtschannen

1 day ago

For the past years my research focus was on unifying models and training paradigms across modalities. Today I'm excited that we're releasing our latest model aligned with this theme:

Gemma 4 12B, a dense encoder-free model which processes raw text, image, and audio inputs!

1/ https://t.co/4J2JKCtzU5

24

1K

123

496

97K

Geoshh retweeted

Jacky Kwok

@jackyk02

1 day ago

Excited to share that CoVer-VLA has been selected as the Best Paper Finalist at the CVPR 2026 Scalable Robot Learning Workshop 🤖 I’ll be talking about how verification can be scaled 🚀 for robots—both during training and test-time! 📍 Denver Convention Center, Room 610 🕔 June 4, 5:00 PM 🌐 https://t.co/OOsSlmS5dr Happy to catch up and chat if you’re at CVPR!

4

82

12

69

12K

Geoshh retweeted

Keya Hu

@HuLillian39250

1 day ago

Excited to share our CVPR 2026 paper, ARC Is a Vision Problem! 🖼️

The Abstraction and Reasoning Corpus (ARC) is often approached as a language reasoning problem, despite being an inherently visual puzzle for humans.

🧩Introducing Vision ARC (VARC)🧩: we reframe abstract reasoning as an image-to-image translation problem, solved by a plain Vision Transformer.

5

288

39

142

30K

Geoshh retweeted

Sudo su

@sudoingX

1 day ago

this is the first chart i ever made on this project. that sharp spike dropping out of the noise is a real planet crossing its star, a confirmed world blocking just under 1% of the light, recorded by a NASA telescope and sitting in a public file for over a decade.

i pulled four years of raw starlight and folded it back on itself until the thing that was invisible showed up on my screen.

then i built a neural net from scratch, and taught it to find these in the noise the way the real pipelines do.

you've got the most capable machine ever built within arm's reach and you're using it to reword emails. i pointed mine at the actual sky.

8

156

10

44

33K

Geoshh retweeted

Prince Canuma

@Prince_Canuma

1 day ago

🚀 Gemma 4 12B is here!

We partnered with @GoogleDeepMind to bring and optimize their new dense and unified multimodal model for Apple Silicon.

◈ 12B dense · 256K context
◈ Thinking mode (built-in reasoning)
◈ Vision: dynamic res, OCR, UI + charts
◈ Native audio: ASR + speech translation
◈ Function calling for agents
◈ Text + image + audio, interleaved

Runs local. Get started now ⚡

> uv pip install -U mlx-vlm

https://t.co/7BvnEuzKvj

50

1K

144

918

152K

Geoshh retweeted

Google Gemma

@googlegemma

1 day ago

Meet Gemma 4 12B!

A unified, encoder-free multimodal model designed to bring high-performance intelligence directly to your laptop, and released under an Apache 2.0 license.

Bridging the gap between edge efficiency and advanced reasoning. Here is what’s new with Gemma 4 12B: 👇 https://t.co/gf4FZv0WZb

356

12K

2K

5K

3M

Geoshh retweeted

Hanna Hajishirzi

@HannaHajishirzi

2 days ago

MAI-Thinking-1 is out!

Excited to share what we are building and how climbing from scratch (no distillation) actually works: simple recipes, rigorous science, self-distillation, patience, and great infra.

Check out our tech report, which has the full story of our RL climbs.
https://t.co/aLW40sWz4d

23

856

124

370

112K

Geoshh retweeted

Neo Research @NeoResearchAI

3 days ago

Read the full report at https://t.co/Jv0wT4xVON. We're hiring research scientists and engineers globally. (5/5)

1

54

1

11

5K

Geoshh retweeted

SciTech Era

@SciTechera

3 days ago

Wow. This is crazy. A developer trained an AI agent in simulation and deployed it onto a real robotic air hockey table using reinforcement learning. This robot can track the puck with millimeter-level accuracy and react in roughly 20 milliseconds, fast enough to challenge even skilled human players. We’re moving from robots that follow programmed rules to machines that learn strategies in simulation and execute them in the physical world.

200

3K

484

1K

2M

Geoshh retweeted

Mustafa Suleyman

@mustafasuleyman

2 days ago

Proud that we’re collaborating with Mayo Clinic to build a frontier AI model for healthcare. Both our organizations exist to serve people at scale – and we believe this could be nothing short of transformative for global healthcare. https://t.co/bWt2fP6HKy