Hot Take: the best #CVPR conversations happen off the conference floor.
Turing is hosting a Happy Hour in Denver for researchers and enterprise AI leaders. Drinks, hors d'oeuvres, and real talk on LLMs and the future of AI.
DM us for details. Spots are limited.
Exactly.
Without good data, you cannot have deployment.
Deployment gets you more high-quality data.
This is the real-world AI feedback loop, and @turingcom is the leader.
AI is not replacing the human element of HR. It is amplifying it.
Our SVP @TaylorFromHR sat down with @GoogleWorkspace to share how Turing's People team is transforming HR with AI:
- 33% faster help desk response times
- 80% of support tickets handled by AI assistants
- A lean team scaling a highly complex global talent strategy
The real story behind it? Failing, iterating, and pushing forward.
Watch more below:
The AI tips you love, now from the people putting them into practice every day.
Introducing Customer AI Boost Bites: a new video series featuring real business leaders sharing how they use Gemini, NotebookLM, Gems, and more to solve challenges and save time.
Start with Taylor Bradley, VP of People at @turingcom, and learn how to build a Strategic Challenger Gem to pressure-test ideas in minutes. 💡 https://t.co/hQEQlWpwHk
Our CEO @jonsidd recently spoke with @politico about what's actually driving progress in AI, and what most people get wrong about synthetic data.
Short version: real-world deployment is where models get better. Human-AI hybrid data pipelines beat pure synthetic. And we need to rethink education before superintelligence arrives.
Full interview below.
The energy at ICLR in Rio was incredible!
From researchers pushing the boundaries of AI to conversations about what's coming next, every interaction reminded us why this community matters.
Next stop: @ICMLConf in Seoul. 🇰🇷
We're excited to keep the conversation going.
See you there!
MMLU is saturated. HLE is getting there.
We built Multimodal STEM HLE++ for what comes next, and the top frontier labs publishing SOTA models are already using it.
1,100 PhD-level multimodal STEM problems that break Opus 4.6. Around 20% pass@1 on SOTA. Hard enough to expose reasoning failures. Solvable enough to generate real RL signal.
Every problem requires joint reasoning over images and text, has a deterministic ground-truth answer, and was authored by a PhD-level domain specialist.
50-task public sample on @HuggingFace.
Full pack available now. Links below.
The models are already extraordinary.
That's not the hard part anymore.
The hard part is letting them touch reality.
Real workflows. Real data. Real stakes.
The next decade belongs to whoever solves deployment, not whoever builds the best benchmark score.
I've been making that bet for seven years.
I'm more convinced than ever. Link below.
Who's actually building AI?
3 months and 14 episodes into This Week in AI, @Jason has sat down with founders and operators across infra, models, dev tools, consumer, creative, robotics, healthcare, and more.
INFRA & COMPUTE
Chase Lochmiller (Crusoe) @ChaseLochmiller
Lin Qiao (Fireworks AI) @lqiao
Chris Lattner (Modular) @clattner_llvm
Nick Harris (Lightmatter) @theanalognick
Mitesh Agrawal (Positron AI) @mitesh711
Alex Cheema (EXO Labs) @alexocheema
Philip Johnston (Starcloud) @PhilipJohnston
Naveen Rao (Unconventional AI) @NaveenGRao
Russ d'Sa (LiveKit) @dsa
FOUNDATION MODELS & RESEARCH
Kanjun Qiu (Imbue) @kanjun
Carina Hong (Axiom Math) @CarinaLHong
Jeremy Fraenkel (Fundamental) @fraenkelj
EVALS & BENCHMARKS
Anastasios Angelopoulos (Arena) @ml_angelopoulos
DEV TOOLS, CODING & AUTOMATION
Karri Saarinen (Linear) @karrisaarinen
Matan Grinberg (Factory) @matanSF
Spiros Xanthos (Resolve AI) @spirosx
Wade Foster (Zapier) @wadefoster
CONSUMER & SEARCH
Aravind Srinivas (Perplexity) @AravSrinivas
Richard Socher (youdotcom & Recursive) @RichardSocher
Tanay Kothari (Wispr Flow) @tankots
Steven Berlin Johnson (NotebookLM) @stevenbjohnson
CREATIVE & MEDIA
Demi Guo (Pika) @demi_guo_
Victor Riparbelli (Synthesia) @vriparbelli
Mikey Shulman (Suno) @MikeyShulman
Grant Lee (Gamma) @thisisgrantlee
ROBOTICS
Jake Loosararian (Gecko Robotics) @jakeloosy
Boris Sofman (Bedrock Robotics) @bsofman
HEALTHCARE
Shiv Rao (Abridge) @ShivdevRao
Trey Holterman (Tennr) @TreyHolterman
ENTERPRISE, VERTICAL & DATA
George Sivulka (Hebbia) @gsivulka
Kashif Ali (TaxGPT) @ChKashifAli
Alex Elias (Qloo) @ape
TALENT & WORKFORCE
Ali Ansari (micro1) @aliansarinik
Jonathan Siddharth (Turing) @jonsidd
Thank you all for joining!
Episode 14 out now: https://t.co/oaDPn5WfT8
Last week we released the Open MM-RL Dataset.
A PhD-level multimodal STEM benchmark built for verifiable reasoning across physics, chemistry, biology, and math. Four STEM domains, one dataset:
- Physics: Quantum and Particle Physics; Condensed Matter and Materials; Electromagnetism, Photonics, and Plasma Systems; Astrophysics and Space Physics
- Mathematics: Algebra and Structure, Discrete Mathematics, Analysis and Continuous Mathematics, Probability and Geometry
- Biology: Evolutionary Systems, Molecular Mechanisms, Cellular Processes and Neural Biology
- Chemistry: Chemical Structure, Reaction Mechanisms, Synthesis, Spectroscopy and Properties
The bar is raised. Download below.