This is precisely why I'm excited about https://t.co/YggekdElgL. The goal is to crowdsource as many different solutions as possible for the hardest AI reasoning challenges.
The solution space is now so vast that we need large volume and evolutionary algorithms to help us explore it in parallel.
Test EvoSkill on your own benchmarks:
👉 https://t.co/D2hTgFIunf
Read the full technical report:
👉 https://t.co/JTTLSixVVc
👉 https://t.co/yiVjNEa1be
Read our technical blog authored by @salahalzubi401:
👉 https://t.co/euN6SEY0it
Applications are now live!
Cohort 0 starts March 13th in Presidio with OpenHands, OpenRouter, alphaXiv, Fireworks, Dedalus Labs, Franklin Templeton, Founders Fund and Pantera.
→ $25K+ in prizes
→ 3 weeks building state-of-the-art AI agents
→ Many more surprises
Apply below 👇
Today we are launching the next phase of AI reasoning development with Founders Fund, Franklin Templeton, Pantera Capital, Fireworks AI, OpenRouter, OpenHands, Dedalus Labs, alphaXiv, and more.
AI is advancing at a relentless pace, but there are many reasoning capabilities we have yet to discover.
Announcing Arena—an evaluation-driven platform for ideation, prototyping, and high-quality data generation—with top AI developers advancing SOTA performance on real-world enterprise reasoning tasks.
The first-ever deeptech demo night at SPC Bangalore was stacked with some seriously cool builds!
Here's a glimpse of how people are solving hard problems in hard-tech, from India. 🧵
Building a general-purpose AI agent with only open-source models is hard. Making it consistent, reliable, and fast enough for production usage is even harder.
We at @SentientAGI have been optimizing both👇
Today we’re revealing SERA (Semantic Embeddings & Reasoning Agent): the AI architecture behind SERA-Crypto, our state-of-the-art agent for token research, DeFi analysis, and on-chain reasoning, combining 50+ APIs into market insights.
👉 #1 open-source agent on DMind, ahead of Perplexity Finance & Gemini, within ~2% of GPT-5 Medium on Web3 reasoning
👉 #1 on our live crypto benchmark (198 real user queries across 11 categories), beating GPT-5, Grok 4, Gemini 2.5 Pro, and Perplexity Finance
More in 🧵
When you want fast reasoning, good old semantic similarity is not bad. Use it to set up your prompts dynamically, all the way to the right tool call. This is what we use for our live crypto knowledge agent, which integrates search and about 10 different structured data APIs.
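A minimal sketch of the idea above: score the query against a short description of each tool, pick the closest one, and build the prompt around it. Everything here is an assumption for illustration, not the agent's actual code: the tool names and descriptions are hypothetical, and `embed` is a toy bag-of-words stand-in for a real embedding model.

```python
import math
import re
from collections import Counter

# Hypothetical tool registry; names and descriptions are illustrative only.
TOOLS = {
    "token_price": "current price market cap and trading volume of a token",
    "defi_yield": "lending rates staking yield and liquidity pool returns",
    "web_search": "recent news announcements and general web search",
}


def embed(text):
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def route(query):
    """Return the best-matching tool name and the score for every tool."""
    q = embed(query)
    scores = {name: cosine(q, embed(desc)) for name, desc in TOOLS.items()}
    best = max(scores, key=scores.get)
    return best, scores


def build_prompt(query):
    """Assemble the prompt dynamically around the routed tool."""
    tool, _ = route(query)
    return f"Use the `{tool}` tool ({TOOLS[tool]}) to answer: {query}"


if __name__ == "__main__":
    print(build_prompt("What is the price of this token?"))
```

Routing this way costs one similarity lookup instead of an extra model call, which is where the speed comes from. In practice the toy `embed` would be swapped for a real embedding model and the tool vectors would be precomputed.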
Announcing SERA-Crypto (Semantic Embedding & Reasoning Agent): our new reasoning architecture built for SOTA crypto research.
#1 open-source agent on DMind
#1 on our live crypto benchmark
Outperforms GPT-5, Grok 4, Gemini 2.5 Pro, and Perplexity Finance…all under 45 seconds.
If diffusion models drive all creative arts, we will learn that humans are not more creative than a kettle dissipating heat to boil water. A bit sad...
@abeirami It is a blessing and a burden! You keep on wishing that heuristics derived from beautiful, beautiful geometric insights give the best algorithms :)
ROMA is a very simple and versatile architecture that recursively breaks complex queries into simpler ones. This method of coordinating multiple agents/tools/models is apt for deep research, long horizon tasks and boosting the power of models. This is emerging as an important primitive for multiagent reasoning systems across industries.
This new version of the repo is more builder friendly and comes with prompt optimizer capabilities of DSPy. You can build a lot of stuff on it!
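A rough sketch of the recursive pattern described above: decide whether a task is simple enough to run directly, otherwise split it, solve the parts recursively, and merge the results. This is not ROMA's actual API. The function names (`is_atomic`, `plan`, `execute`, `aggregate`, `solve`) are hypothetical, and splitting on the word "and" is a toy placeholder for a model-driven planner.

```python
from typing import List


def is_atomic(task: str) -> bool:
    """Toy check: a task with no ' and ' is simple enough to run directly."""
    return " and " not in task


def plan(task: str) -> List[str]:
    """Toy planner: split off the first clause; the rest may be split again later."""
    return [part.strip() for part in task.split(" and ", 1) if part.strip()]


def execute(task: str) -> str:
    """Placeholder for a call to an agent, tool, or model."""
    return f"[answer to: {task}]"


def aggregate(task: str, results: List[str]) -> str:
    """Placeholder for merging sub-answers into one answer for the parent task."""
    return " ".join(results)


def solve(task: str, depth: int = 0, max_depth: int = 3) -> str:
    """Recursively break a task down until it is atomic or the depth cap is hit."""
    if depth >= max_depth or is_atomic(task):
        return execute(task)
    subtasks = plan(task)
    results = [solve(sub, depth + 1, max_depth) for sub in subtasks]
    return aggregate(task, results)


if __name__ == "__main__":
    print(solve("compare token A and summarize token B and list risks"))
```

The depth cap keeps the recursion bounded even if the planner keeps producing non-atomic subtasks. Each of the four helpers is a natural place to plug in a different agent, tool, or model.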
[1/8] 🧵
🚀 ROMA (Recursive Open Meta Agents) v0.2.0 is here! Many exciting features have been added to streamline research/production threads:
We've completely rebuilt our framework using @DSPyOSS for better reliability and a builder-friendly ecosystem for high-performance recursive multi-agent systems. Stay tuned for the upcoming paper with some exciting results!
In this thread: the motivation and technical details behind ROMA, exciting research directions we're exploring, and our vision for recursive agents going forward
https://t.co/qVol7xA15A
We’re excited to announce that @NeurIPSConf—the biggest AI conference in the world—has accepted 4 of our papers across various categories. Some might even call it “full-stack excellence” 😁
Here’s a sneak peek at our work that’s been recognized for its breakthroughs:
➡️ OML 1.0 (Main Track): scalable LLM fingerprinting—a hundredfold improvement on legacy fingerprinting attempts for open models, injecting 24,576 persistent prints while the previous max was ~100 fingerprints…without any drop in model performance.
➡️ LiveCodeBenchPro (Data & Benchmark Track): our custom benchmark of programming ability, built to show models’ true coding performance. On this benchmark, we were able to create models 10x smaller, using 20% of the data, that achieve results comparable to competing models.
➡️ MindGames Arena (Competition Track): selected by NeurIPS to run an AI competition for agents to improve themselves through social games. The next paradigm of AI improvement comes through self-optimization, and we’re extremely excited to be hosting this first-of-its-kind competition to create self-improving AI.
➡️ OML (Workshops & Tutorials—Lock-LLMs): our work established the challenge and solution around model security: a primitive that lets builders develop open models with verifiable, cryptographically enforced control under white-box access.
Stay tuned for deep-dive threads throughout the week!