Manifold Research

about 21 hours ago

The team at @figbrains, along with our friends at @manifoldrg, took three of the best computer-use models and, surprisingly, broke all of them with very simple perturbations like changing zoom or colors. Read on to understand our research, including a new SoTA evaluation dataset for browser-use models and a new kind of interactive data sandbox!

ManifoldRG retweeted

5 days ago

At @figbrains, we’re testing frontier models (Fable, Kimi, etc.) on simple web tasks that should be solvable. They failed in ways that wouldn't stump a human (we think). Results are coming in a few days, but first we want to see how good humans are: which change causes the most failures?

7 days ago

The Software Control research team at Manifold has been working with @figbrains on advancing new frontiers in long-horizon computer control & grounding. Check out some of our early research below, with more to come soon!

7 days ago

Computer Control models can score 90%+ on standard benchmarks, but will fail when you set page zoom to 70%. We've built GUI-DR, an OS pipeline that can restyle, reposition, and remove DOM elements on real webpages to reveal model weaknesses that fixed-scene benchmarks miss.
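To make the perturbation types named above concrete, here is a minimal sketch of how zoom, restyle, reposition, and remove operations could be expressed as JavaScript snippets to run in a page. This is not GUI-DR's actual code or API: the `Perturbation` type and every function name below are hypothetical, and the zoom helper assumes a browser that supports the CSS `zoom` property.

```python
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class Perturbation:
    """A named page perturbation as a JavaScript snippet (hypothetical type)."""
    name: str
    script: str


def _for_each(selector: str, body: str) -> str:
    # json.dumps turns the selector into a valid JavaScript string literal.
    return f"document.querySelectorAll({json.dumps(selector)}).forEach(el => {{ {body} }});"


def set_zoom(factor: float) -> Perturbation:
    # Assumption: the browser honors the CSS `zoom` property on the root element.
    value = json.dumps(str(factor))
    return Perturbation("zoom", f"document.documentElement.style.zoom = {value};")


def restyle(selector: str, prop: str, value: str) -> Perturbation:
    # Override one CSS property (for example a color) on every matching element.
    body = f"el.style.setProperty({json.dumps(prop)}, {json.dumps(value)});"
    return Perturbation("restyle", _for_each(selector, body))


def reposition(selector: str, dx: int, dy: int) -> Perturbation:
    # Shift matching elements by a pixel offset using a CSS transform.
    transform = json.dumps(f"translate({dx}px, {dy}px)")
    return Perturbation("reposition", _for_each(selector, f"el.style.transform = {transform};"))


def remove(selector: str) -> Perturbation:
    # Delete matching elements from the DOM.
    return Perturbation("remove", _for_each(selector, "el.remove();"))


def build_script(perturbations: list[Perturbation]) -> str:
    # Concatenate the snippets in order, one per line.
    return "\n".join(p.script for p in perturbations)


if __name__ == "__main__":
    print(build_script([set_zoom(0.7), restyle("button", "color", "#777"), remove("#banner")]))
```

The resulting string could be handed to whatever script-evaluation call a browser-automation tool provides before the model under test sees the page; keeping each perturbation as data makes it easy to log exactly which change preceded a failure.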

18 days ago

Foundation models assume capabilities transfer. MultiNet tests that: what happens when a multimodal model leaves its training domain and has to operate somewhere else? Excited to see this work presented at CVPR 2026, developed with the @figbrains team!

18 days ago

This week at #CVPR2026 we presented MultiNet v1.0 at the MMFM workshop. It is a benchmark built around a question most evaluations skip: what happens to a multimodal model when you take it out of the one domain it was trained for and ask it to handle everything at once?

ManifoldRG retweeted

19 days ago

Loved @pliang279’s #CVPR2026 talk on AI modalities beyond vision/language: touch, smell, etc. The vision-tactile retrieval work reinforces that good representations make hard-to-observe signals queryable. We’re applying a similar lens to trajectories at @figbrains. More soon!

ManifoldRG retweeted

20 days ago

I’ll be at CVPR in Denver, along w/ some brilliant colleagues 🚀 If you’re around anytime over the next few days and interested in computer control or long horizon robotics, please reach out - the @figbrains team is around! We’d love to give a sneak peek at what we’re building.

20 days ago

Members of the GOLEM team at Manifold will be presenting work today at CVPR’s MMFM workshop. Come by to learn more about MultiNet, a next-gen benchmark for frontier action systems! More details on room and time below ⬇️

20 days ago

We built MultiNet v1.0 to test how well frontier models generalize across domains from text to robotics to gameplay and found surprising patterns of failure. We're presenting at the #CVPR2026 MMFM workshop @ 3PM, room Four Seasons 4. Come hear where & how they break!

ManifoldRG retweeted

21 days ago

Headed to #CVPR2026! I'll be there on behalf of @figbrains and @ManifoldRG, presenting our research on next-generation multimodal models and evaluation systems. If you're into multimodal models, VLAs, or how we actually evaluate them, come say hi - I'd love to talk!

22 days ago

What will it take to build the next generation of AI systems and frontier technologies? The Manifold team will attend both CVPR 2026 and Vision Weekend UK this week! If you’ll be there, come say hello! We’d love to meet folks interested in ambitious science and technology.

Foresight Institute

@foresightinst

3 months ago

Join us for our first ever Vision Weekend in the UK! 2026 marks 40 years of Foresight. Over three days, we will gather leading researchers, builders, and funders to look forward: exploring what scientific and technological frontiers will shape the coming decades, and how to make them reality.
June 5–7 | London
Confirmed speakers include:
• Ed Boyden (MIT) on biologically accurate brain simulation
• Greg Wayne (Google DeepMind) on universal AI assistants
• Jano Costard (SPRIND) on challenges as a tool for breakthrough innovation
• Christine Peterson (Foresight Institute) on Foresight, 40 years later
• Dorothy Chou (Google DeepMind) on capital for the long game: financing durable innovation in an age of hype
• Irina Rish (Mila) on beyond scaling: toward continual and adaptive intelligence
• Chris Rozell (Georgia Tech) on closed-loop neuroengineering: algorithms that learn from the brain in real time
• Lee Cronin (University of Glasgow)
• Mehmet Fisek (Meridial) on Focused Research Organisation mission and setup
• Zoë Brammer (Google DeepMind) on AI for science 2030
• João Pedro de Magalhães (University of Birmingham) on hacking aging biology
and many more.
Get your tickets: https://t.co/nrK9PKN0ES
Powered by: @apolloaievals @ARIA_research @e184media @CUHPartners @RenPhilanthropy @SPRIND @andnowstudio

ManifoldRG retweeted

Foresight Institute

@foresightinst

2 months ago

Meet Sidh Sikka, PhD researcher in orbital robotics, co-founder of Manifold Research, and Foresight Fellow 2026. @SikkaSidh is working toward autonomous robotic swarms, capable of assembling and managing large-scale infrastructure in orbit: the foundational layer for a sustainable industrial economy in space. His R&D institute @ManifoldRG is currently seeking technical collaborators across several research projects, including their autonomous assembly project. Learn more: https://t.co/PtvLOIiMBJ Sidh and his team are also seeking funding to grow this work. Reach out at sid [at] sidhsikka [dot] com

2 months ago

Manifold Research Group works on high-impact problems that fall between academia and industry. Small teams, real systems, published results. Learn more about what we’re building at Manifold: https://t.co/pKzAwX6fqO

2 months ago

What does it actually take to build in space at scale? The next phase of the space economy will depend on our ability to build and service systems at scale, directly in orbit. We recently gave a talk on this with @foresightinst. Link to the talk in the thread below 🧵

2 months ago

We are building toward coordinated, autonomous systems that can enable large scale construction in orbit, turning this from concept into deployable capability. If you want to work with us on this, check out: https://t.co/5Y19ZLR0Yw

3 months ago

Manifold Research Group works on high-impact problems that fall between academia and industry. Small teams, real systems, published results. All open roles → https://t.co/5Y19ZLR0Yw

3 months ago

Can a multimodal model that reasons well in language also do so in a grid world? In a 3D sim? On a different task? MultiNet tests whether models really generalize across tasks, across modalities. We're building it at @ManifoldRG, and we want researchers to join. 🧵 Roles below. https://t.co/hyJSNPAv03
