Himanshu Tyagi @hstyagi - Twitter Profile

about 1 month ago

Evolving skills to hillclimb against benchmarks is a key module for self-evolving agents. Very excited for this new open repo from Sentient.

Sentient

@SentientAGI

about 1 month ago

https://t.co/l1I41XVbFk

45

157

29

65

78K

2

13

1

2

2K

hstyagi retweeted

Oleg Golev

@oleg_golev

3 months ago

This is precisely why I'm excited about https://t.co/YggekdElgL. The goal is to crowdsource as many different solutions as possible for the hardest AI reasoning challenges. The solution space is so vast nowadays that we have to pursue large volume and evolutionary algorithms to help us explore in parallel.

4

44

5

1

6K

hstyagi retweeted

Sentient

@SentientAGI

3 months ago

Test EvoSkill on your own benchmarks: 👉 https://t.co/D2hTgFIunf
Read the full technical report: 👉 https://t.co/JTTLSixVVc 👉 https://t.co/yiVjNEa1be
Read our technical blog authored by @salahalzubi401: 👉 https://t.co/euN6SEY0it

2

45

5

16

7K

hstyagi retweeted

Sentient

@SentientAGI

3 months ago

Applications are now live! Cohort 0 starts March 13th in Presidio with OpenHands, OpenRouter, alphaXiv, Fireworks, Dedalus Labs, Franklin Templeton, Founders Fund and Pantera.
→ $25K+ in prizes
→ 3 weeks building state-of-the-art AI agents
→ Many more surprises
Apply below 👇

565

733

102

167

138K

Himanshu Tyagi

@hstyagi

3 months ago

@tripathi_neil Wake me up when Claude can make better tikz images

0

4

0

2K

hstyagi retweeted

Sentient

@SentientAGI

3 months ago

Today we are launching the next phase of AI reasoning development with Founders Fund, Franklin Templeton, Pantera Capital, Fireworks AI, OpenRouter, OpenHands, Dedalus Labs, alphaXiv, and more. AI is advancing at a relentless pace, but there are many reasoning capabilities we have yet to discover. Announcing Arena—an evaluation-driven platform for ideation, prototyping, and high-quality data generation—with top AI developers advancing SOTA performance on real-world enterprise reasoning tasks.

110

434

85

82

272K

Himanshu Tyagi

@hstyagi

3 months ago

@tripathi_neil Reasoning, by definition, is whatever is out of distribution for a model.

1

2

0

591

hstyagi retweeted

Sentient

@SentientAGI

5 months ago

A quick and nostalgic look at our work in 2025. See you all in 2026: the year of open-source reasoning.

231

674

90

16

86K

Himanshu Tyagi

@hstyagi

6 months ago

There is more where this is coming from @iiscbangalore @artparkindia

South Park Commons India

@spc_india

6 months ago

The first-ever deeptech demo night at SPC Bangalore was stacked with some seriously cool builds! Here's a glimpse of how people are solving hard problems in hard-tech, from India. 🧵 https://t.co/GUpDYG9yTR

9

982

124

222

99K

95

80

2

1

6K

hstyagi retweeted

Oleg Golev

@oleg_golev

6 months ago

Building a general-purpose AI agent with only open-source models is hard. Making it consistent, reliable, and fast enough for production usage is even harder. We at @SentientAGI have been optimizing both 👇
Today we’re revealing SERA (Semantic Embeddings & Reasoning Agent): the AI architecture behind SERA-Crypto, our state-of-the-art agent for token research, DeFi analysis, and on-chain reasoning, combining 50+ APIs into market insights.
👉 #1 open-source agent on DMind, ahead of Perplexity Finance & Gemini, within ~2% of GPT-5 Medium on Web3 reasoning
👉 #1 on our live crypto benchmark (198 real user queries across 11 categories), beating GPT-5, Grok 4, Gemini 2.5 Pro, and Perplexity Finance
More in 🧵

79

164

14

7

8K

Himanshu Tyagi

@hstyagi

6 months ago

When you want fast reasoning, good old semantic similarity is not bad. Use it to set up your prompts dynamically, all the way to the right tool call. This is what we use for our live crypto knowledge agent, which integrates search and about 10 different structured data APIs.

Sentient

@SentientAGI

6 months ago

Announcing SERA-Crypto (Semantic Embedding & Reasoning Agent): our new reasoning architecture built for SOTA crypto research.
#1 open-source agent on DMind
#1 on our live crypto benchmark
Outperforms GPT-5, Grok 4, Gemini 2.5 Pro, and Perplexity Finance…all under 45 seconds. https://t.co/zWG2VmKeOw

297

867

130

44

284K

39

109

3

1

4K
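The semantic-similarity routing mentioned in the post above (pick the right tool call by similarity, then build the prompt around it) can be sketched roughly as follows. This is only an illustrative sketch and not SERA's actual implementation: the tool names and descriptions are hypothetical, and a toy bag-of-words vector stands in for a real embedding model.

```python
import math
import re
from collections import Counter

# Hypothetical tool registry; names and descriptions are illustrative only.
TOOLS = {
    "token_price": "current price and market cap of a token",
    "defi_yield": "yield and apy of defi lending pools",
    "web_search": "general news and web search",
}


def embed(text: str) -> Counter:
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def route(query: str) -> str:
    """Return the name of the tool whose description is most similar to the query."""
    q = embed(query)
    return max(TOOLS, key=lambda name: cosine(q, embed(TOOLS[name])))


def build_prompt(query: str) -> str:
    """Assemble the prompt dynamically around the selected tool."""
    tool = route(query)
    return f"Use the {tool} tool ({TOOLS[tool]}) to answer: {query}"
```

In a real system the `embed` function would presumably call a sentence-embedding model and the tool descriptions would be embedded once up front; the routing logic itself stays a single nearest-neighbor lookup, which is what makes it fast.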

Himanshu Tyagi

@hstyagi

6 months ago

@bdguan It is such a beautiful subject. Only friends and parents have the patience to indulge in it. Schools are busy teaching.

0

2

0

510

Himanshu Tyagi

@hstyagi

7 months ago

@deedydas Yann was pretty famous in 2010 :D And yes Soumith is a legend!

0

4

0

6K

Himanshu Tyagi

@hstyagi

7 months ago

If diffusion models drive all creative arts, we will learn that humans are not more creative than a kettle dissipating heat to boil water. A bit sad...

240

223

11

2

8K

Himanshu Tyagi

@hstyagi

7 months ago

@abeirami It is a blessing and a burden! You keep on wishing that heuristics derived from beautiful, beautiful geometric insights give the best algorithms :)

0

1

0

375

Himanshu Tyagi

@hstyagi

7 months ago

ROMA is a very simple and versatile architecture that recursively breaks complex queries into simpler ones. This method of coordinating multiple agents/tools/models is apt for deep research, long-horizon tasks, and boosting the power of models. This is emerging as an important primitive for multi-agent reasoning systems across industries. This new version of the repo is more builder-friendly and comes with the prompt-optimizer capabilities of DSPy. You can build a lot of stuff on it!

Salah Alzu'bi

@salahalzubi401

7 months ago

[1/8] 🧵 🚀 ROMA (Recursive Open Meta Agents) v0.2.0 is here! Many exciting features have been added to streamline research/production threads: for better reliability and a builder-friendly ecosystem for high-performance recursive multi-agent systems. Stay tuned for the upcoming paper with some exciting results! We've completely rebuilt our framework using @DSPyOSS. In this thread: the motivation and technical details behind ROMA, exciting research directions we're exploring, and our vision for recursive agents going forward. https://t.co/qVol7xA15A

93

321

34

86

71K

256

385

21

19

43K
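The recursive pattern described in the ROMA posts above (break a complex query into simpler ones, solve each, then combine the results) can be sketched in a few lines. This is a hedged illustration of the general idea only, not ROMA's API: `solve`, `demo`, and the four callbacks are hypothetical names, and the toy callbacks split on the word "and" where a real system would call a model.

```python
from typing import Callable, List


def solve(
    query: str,
    is_atomic: Callable[[str], bool],
    decompose: Callable[[str], List[str]],
    execute: Callable[[str], str],
    aggregate: Callable[[str, List[str]], str],
    depth: int = 0,
    max_depth: int = 3,
) -> str:
    """Recursively split a query until it is atomic, then merge the answers."""
    # Base case: the query is simple enough, or the depth budget is exhausted.
    if depth >= max_depth or is_atomic(query):
        return execute(query)
    # Recursive case: plan subtasks, solve each one, and combine the results.
    subtasks = decompose(query)
    results = [
        solve(s, is_atomic, decompose, execute, aggregate, depth + 1, max_depth)
        for s in subtasks
    ]
    return aggregate(query, results)


def demo(query: str) -> str:
    """Toy wiring: split on the first ' and ', wrap leaves in brackets."""
    return solve(
        query,
        is_atomic=lambda q: " and " not in q,
        decompose=lambda q: [p.strip() for p in q.split(" and ", 1)],
        execute=lambda q: f"[{q}]",
        aggregate=lambda q, rs: " + ".join(rs),
    )


print(demo("fetch prices and summarize news and draft report"))
```

The depth cap is a design choice in this sketch: it guarantees termination even if a decomposer keeps producing non-atomic subtasks.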

hstyagi retweeted

Sentient

@SentientAGI

8 months ago

We’re excited to announce that @NeurIPSConf, the biggest AI conference in the world, has accepted 4 of our papers across various categories. Some might even call it “full-stack excellence” 😁
Here’s a sneak peek at our work that’s been recognized for its breakthroughs:
➡️ OML 1.0 (Main Track): scalable LLM fingerprinting, a hundredfold improvement on legacy fingerprinting attempts for open models, injecting 24,576 persistent prints while the previous max was ~100 fingerprints…without any drop in model performance.
➡️ LiveCodeBenchPro (Data & Benchmark Track): our customized benchmark focusing on programming ability, illustrating the true capabilities of models’ coding performance. On this benchmark, we were able to create models 10x smaller, using 20% of the data, to achieve comparable results to competing models.
➡️ MindGames Arena (Competition Track): selected by NeurIPS to run an AI competition for agents to improve themselves through social games. The next paradigm of AI improvement comes through self-optimization, and we’re extremely excited to be hosting this first-of-its-kind competition to create self-improving AI.
➡️ OML (Workshops & Tutorials, Lock-LLMs): our work established the challenge and solution around model security: a primitive that lets builders develop open models with verifiable, cryptographically enforced control under white-box access.
Stay tuned for deep-dive threads throughout the week!

948

2K

305

84

617K