Turing Community

3 days ago

Hugging Face: https://t.co/A1Uvpcg2Do

turingcomdev retweeted

3 days ago

To view this and additional datasets: https://t.co/3lRHD7SunH

turingcomdev retweeted

Our mission is to accelerate superintelligence to drive real economic progress.

7 days ago

The models are already extraordinary. That's not the hard part anymore. The hard part is letting them touch reality.

Real workflows. Real data. Real stakes.

The next decade belongs to whoever solves deployment, not whoever builds the best benchmark score.

I've been making that bet for seven years. I'm more convinced than ever. Link below.

turingcomdev retweeted

7 days ago

https://t.co/qUWOrt454F

turingcomdev retweeted

This Week in AI

@ThisWeeknAI

14 days ago

Who's actually building AI? 3 months and 14 episodes into This Week in AI, @Jason has sat down with founders and operators across infra, models, dev tools, consumer, creative, robotics, healthcare, and more.

INFRA & COMPUTE
- Chase Lochmiller (Crusoe) @ChaseLochmiller
- Lin Qiao (Fireworks AI) @lqiao
- Chris Lattner (Modular) @clattner_llvm
- Nick Harris (Lightmatter) @theanalognick
- Mitesh Agrawal (Positron AI) @mitesh711
- Alex Cheema (EXO Labs) @alexocheema
- Philip Johnston (Starcloud) @PhilipJohnston
- Naveen Rao (Unconventional AI) @NaveenGRao
- Russ d'Sa (LiveKit) @dsa

FOUNDATION MODELS & RESEARCH
- Kanjun Qiu (Imbue) @kanjun
- Carina Hong (Axiom Math) @CarinaLHong
- Jeremy Fraenkel (Fundamental) @fraenkelj

EVALS & BENCHMARKS
- Anastasios Angelopoulos (Arena) @ml_angelopoulos

DEV TOOLS, CODING & AUTOMATION
- Karri Saarinen (Linear) @karrisaarinen
- Matan Grinberg (Factory) @matanSF
- Spiros Xanthos (Resolve AI) @spirosx
- Wade Foster (Zapier) @wadefoster

CONSUMER & SEARCH
- Aravind Srinivas (Perplexity) @AravSrinivas
- Richard Socher (youdotcom & Recursive) @RichardSocher
- Tanay Kothari (Wispr Flow) @tankots
- Steven Berlin Johnson (NotebookLM) @stevenbjohnson

CREATIVE & MEDIA
- Demi Guo (Pika) @demi_guo_
- Victor Riparbelli (Synthesia) @vriparbelli
- Mikey Shulman (Suno) @MikeyShulman
- Grant Lee (Gamma) @thisisgrantlee

ROBOTICS
- Jake Loosararian (Gecko Robotics) @jakeloosy
- Boris Sofman (Bedrock Robotics) @bsofman

HEALTHCARE
- Shiv Rao (Abridge) @ShivdevRao
- Trey Holterman (Tennr) @TreyHolterman

ENTERPRISE, VERTICAL & DATA
- George Sivulka (Hebbia) @gsivulka
- Kashif Ali (TaxGPT) @ChKashifAli
- Alex Elias (Qloo) @ape

TALENT & WORKFORCE
- Ali Ansari (micro1) @aliansarinik
- Jonathan Siddharth (Turing) @jonsidd

Thank you all for joining! Episode 14 out now: https://t.co/oaDPn5WfT8

turingcomdev retweeted

8 days ago

Excited to share that @Turingcom Co-founder @krishnanvijay will join an industry panel at #CVPR2026 with @ravisujith (GVP, @OracleAI), @Kenneth_Marino (@UUtah), and @MingHsuanYang (@ucmerced @GoogleDeepMind). The morning will be dedicated to solving the hardest parts of Agentic Systems and bridging the gap between Computer Vision, NLP, and Information Retrieval.

turingcomdev retweeted

15 days ago

Last week we released the Open MM-RL Dataset.

A PhD-level multimodal STEM benchmark built for verifiable reasoning across physics, chemistry, biology, and math.

Four STEM domains, one dataset:

- Physics: Quantum and Particle Physics, Condensed Matter and Materials, Electromagnetism, Photonics, and Plasma Systems, Astrophysics and Space Physics
- Mathematics: Algebra and Structure, Discrete Mathematics, Analysis and Continuous Mathematics, Probability and Geometry
- Biology: Evolutionary Systems, Molecular Mechanisms, Cellular Processes and Neural Biology
- Chemistry: Chemical Structure, Reaction Mechanisms, Synthesis, Spectroscopy and Properties

The bar is raised. Download below.

turingcomdev retweeted

15 days ago

https://t.co/kjLOjaN4nM

turingcomdev retweeted

21 days ago

Now trending at #1 on @huggingface

turingcomdev retweeted

22 days ago

Open MM-RL Dataset is trending on @huggingface.

We built something I've wanted for a long time.

- PhD-level STEM reasoning across physics, math, biology & chemistry
- 100% verifiable, auto-gradable answers
- Single-image, multi-panel & multi-image formats
- Two-round expert review on every problem
- RL-ready reward structure out of the box

Most multimodal datasets test perception. This one tests reasoning. The kind that doesn't break under scrutiny.

Built by PhD SMEs.
Validated for frontier models.
Open to the community.

Website & Dataset below.

turingcomdev retweeted

22 days ago

Website: https://t.co/5u6duCCm18

turingcomdev retweeted

22 days ago

Most browser agent benchmarks are already solved. We built ones that aren't. 500+ tasks. 100+ templates. 50%+ model-breaking difficulty at delivery. Full case study below.

turingcomdev retweeted

22 days ago

https://t.co/wZYMau60GS

turingcomdev retweeted

Turing Community @turingcomdev

23 days ago

Open-MM-RL is trending at #3 on @huggingface! This is a strong signal that the community wants harder, cleaner datasets for frontier model evaluation and training, and that it is actively looking for datasets that make multimodal evaluation more rigorous. Take a look and tell us what you think below.

24 days ago

@turingcom https://t.co/ybjdGaBqpV

turingcomdev retweeted

24 days ago

Introducing the Open MM-RL Dataset.

A PhD-level multimodal STEM benchmark built for verifiable reasoning across physics, chemistry, biology, and math.

Four STEM domains, one dataset:

- Physics: Quantum and Particle Physics, Condensed Matter and Materials, Electromagnetism, Photonics, and Plasma Systems, Astrophysics and Space Physics
- Mathematics: Algebra and Structure, Discrete Mathematics, Analysis and Continuous Mathematics, Probability and Geometry
- Biology: Evolutionary Systems, Molecular Mechanisms, Cellular Processes and Neural Biology
- Chemistry: Chemical Structure, Reaction Mechanisms, Synthesis, Spectroscopy and Properties

We're raising the bar.

turingcomdev retweeted

24 days ago

On @huggingface https://t.co/kjLOjaN4nM

turingcomdev retweeted

24 days ago

Most AI systems break when documents stop being clean and predictable. We built a dataset to fix that:

- 15,000+ OCR, summarization, and translation tasks
- 10+ document types (handwriting, scans, financial reports, more)
- 10+ languages
- 95%+ summarization accuracy

The hard part wasn't scale. It was realism. Multi-page docs. Rotated text. Tables, math, and messy layouts. Strict no-hallucination summaries. Plus a multi-layer QA system combining automated review and human validation to catch even subtle errors.

This is what it takes to train AI for real-world document understanding. Full case study below.

turingcomdev retweeted