Deepak Choudhary @deepak201001 - Twitter Profile

deepak201001 retweeted

6 months ago

Chinese scientists have developed the best shortest-path algorithm in 41 years! A team from Tsinghua University has broken Dijkstra's "sorting barrier", the first improvement since 1984. Just use for a world-map 🤯 Paper - https://t.co/0AhR5O7vl4 https://t.co/a9KMVRuYGx
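For context, the baseline the tweet refers to is Dijkstra's algorithm, which settles vertices in nondecreasing order of distance; that implicit ordering is what the "sorting barrier" is about. Below is a minimal sketch of the classic binary-heap version, not the new Tsinghua algorithm. The `roads` graph and all names are made up for illustration.

```python
import heapq

def dijkstra(graph, source):
    """Classic heap-based Dijkstra. graph maps node -> list of (neighbor, weight)."""
    dist = {source: 0}
    heap = [(0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry; a shorter path to u was already found
        for v, w in graph.get(u, []):
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist

# Hypothetical toy road network (non-negative edge weights).
roads = {
    "A": [("B", 4), ("C", 1)],
    "B": [("D", 3)],
    "C": [("B", 2), ("D", 7)],
    "D": [],
}
distances = dijkstra(roads, "A")
```

The heap always pops the closest unsettled vertex next, so the vertices come out effectively sorted by distance.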

473

29K

3K

18K

3M

deepak201001 retweeted

Rohan Paul

@rohanpaul_ai

7 months ago

New @GoogleDeepMind paper builds a new benchmark and agent design so language models can actually learn from their own experience.

Right now most language model agents only keep chat logs or facts, so they remember what happened but not how to solve similar tasks better, and the authors call this conversational recall versus experience reuse.

Evo Memory turns existing benchmarks into streams of tasks arriving 1 after another, and forces agents to search past experiences, use them, then update memory each time.

The simple baseline ExpRAG stores each solved task as a short text record, retrieves a few similar ones for a new task, and inserts them into the prompt.

ReMem goes further by letting the agent choose at each step to think, act, or refine memory, actively pulling useful experiences and pruning or rewriting unhelpful ones.

Across math, question answering, tool use, and interactive environments, these self evolving memories, especially ReMem and even simple ExpRAG, boost accuracy, need fewer steps, and make smaller models behave much stronger without any retraining.

----

Paper Link – arxiv.org/abs/2511.20857

Paper Title: "Evo-Memory: Benchmarking LLM Agent Test-time Learning with Self-Evolving Memory"
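The ExpRAG idea described above (store each solved task as a short text record, retrieve a few similar ones, insert them into the prompt) can be sketched roughly as follows. This is an illustration under stated assumptions, not the paper's implementation: the class and function names are hypothetical, and word-overlap (Jaccard) similarity stands in for whatever retriever the authors actually use.

```python
from dataclasses import dataclass

@dataclass
class Experience:
    task: str
    solution: str

def similarity(a, b):
    """Jaccard overlap of lowercase word sets (a stand-in for a real retriever)."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    union = ta | tb
    return len(ta & tb) / len(union) if union else 0.0

class ExperienceMemory:
    def __init__(self):
        self.records = []

    def add(self, task, solution):
        # Store each solved task as a short text record.
        self.records.append(Experience(task, solution))

    def retrieve(self, task, k=2):
        # Return the k stored records most similar to the new task.
        ranked = sorted(self.records, key=lambda r: similarity(task, r.task), reverse=True)
        return ranked[:k]

def build_prompt(memory, task, k=2):
    # Insert the retrieved experiences ahead of the new task.
    parts = [f"Past task: {e.task}\nWhat worked: {e.solution}" for e in memory.retrieve(task, k)]
    parts.append(f"New task: {task}")
    return "\n\n".join(parts)

memory = ExperienceMemory()
memory.add("solve the quadratic equation x^2 - 5x + 6 = 0", "factor into (x - 2)(x - 3)")
memory.add("book a flight to Paris", "call the search tool, then the booking tool")
memory.add("solve the quadratic equation x^2 - 4 = 0", "difference of squares")
prompt = build_prompt(memory, "solve the quadratic equation x^2 - 7x + 12 = 0", k=2)
```

After each new task is solved, calling `memory.add(...)` again gives the "update memory each time" step of the task stream.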

27

621

108

512

49K

deepak201001 retweeted

Maryam Miradi, PhD

@MaryamMiradi

7 months ago

🔥 Andrew Ng just quietly revolutionized AI agents.

And nobody's talking about the real impact.

》Let me explain what just changed.

I've been building 300+ production AI agents.

Healthcare systems.
Financial pipelines.
Aviation platforms.

And I can tell you:
document extraction has ALWAYS been the bottleneck that kills agent workflows.

Until yesterday.

》 The Problem We've All Been Ignoring

Your AI agent can orchestrate complex workflows. It can reason through multi-step problems. It can call APIs and make decisions.

But show it a scanned medical report with a messy table?
A financial statement with merged cells?
A handwritten construction log?

It hallucinates.
Shifts cells.
Loses data.
Fails silently.

And suddenly your brilliant agent architecture is worthless because it can't extract the INPUT DATA correctly.

I've watched hundreds of my students hit this wall.
Their LangGraph flows are perfect.
Their CrewAI orchestration is elegant.

But the document extraction?
That's where production breaks.

》 What Actually Changed

Andrew Ng's team just released DPT (Document Pre-trained Transformer), and here's why it matters for agent builders:

✸ The model breaks tables into structure FIRST
✸ Then extracts data from isolated sections
✸ Parallel processing makes it actually fast
✸ Three lines of code. Seriously.

The breakthrough isn't just accuracy. It's that the model was trained to use an agentic workflow internally.

Think about that.

》 Why This Makes Your Agents Unstoppable

Here's what I'm building this week:

✸ Healthcare agents that extract lab results from any hospital format
✸ Financial agents that pull data from messy quarterly reports into live dashboards
✸ Construction agents that digitize handwritten logs in real-time

The pattern?

Agentic document extraction becomes the FIRST node in your LangGraph. Your agent starts with clean, structured data instead of garbage.

No more "the AI got confused by the PDF" conversations with stakeholders.

DPT solves this.
For free.
With three lines of code.

》 What I'm Testing Right Now

→ Chaining DPT with PydanticAI for validated extraction

→ Using it as a tool in OpenAI Swarm multi-agent systems

→ Building MCP servers that expose DPT extraction to any agent framework

The SDK is stupid simple.
The accuracy is production-ready.
The speed makes real-time workflows possible.

》 Bottom Line

If you're building AI agents that touch documents (and you probably are), this changes your architecture.

Your agents just got significantly more powerful.
Not because they got smarter.
Because they can finally SEE the data correctly.

https://t.co/NUImGAmRmp

≣≣≣≣≣≣≣≣≣≣≣≣≣≣≣≣≣≣≣≣≣≣≣≣≣

ꆛ 𝗙𝗥𝗘𝗘 𝗧𝗥𝗔𝗜𝗡𝗜𝗡𝗚:
(𝗹𝗶𝗺𝗶𝘁𝗲𝗱 𝘁𝗶𝗺𝗲)

⫸ 𝗛𝗼𝘄 𝘁𝗼 𝗕𝘂𝗶𝗹𝗱 𝗥𝗲𝗮𝗹-𝗪𝗼𝗿𝗹𝗱 𝗔𝗜 𝗔𝗴𝗲𝗻𝘁𝘀 𝗶𝗻 𝟯𝟬 𝗠𝗶𝗻𝘂𝘁𝗲𝘀 (𝗭𝗲𝗿𝗼 𝘁𝗼 𝗛𝗲𝗿𝗼)

+ 𝗕𝗢𝗡𝗨𝗦: 49-page guide with battle-tested strategies

👉 𝗚𝗘𝗧 𝗜𝗡𝗦𝗧𝗔𝗡𝗧 𝗔𝗖𝗖𝗘𝗦𝗦: https://t.co/yFzdKeytmp

20

828

130

1K

75K

deepak201001 retweeted

DeepSeek

@deepseek_ai

7 months ago

🚀 Launching DeepSeek-V3.2 & DeepSeek-V3.2-Speciale — Reasoning-first models built for agents!

🔹 DeepSeek-V3.2: Official successor to V3.2-Exp. Now live on App, Web & API.
🔹 DeepSeek-V3.2-Speciale: Pushing the boundaries of reasoning capabilities. API-only for now.

📄 Tech report: https://t.co/7EyydyNuG0

1/n

953

16K

2K

3K

5M

deepak201001 retweeted

Nicolas Neubert

@iamneubert

7 months ago

Introducing Whisper Thunder aka Gen-4.5. Today, we are excited to share our new frontier model. Gen-4.5 was built by a team that fits onto two school buses and decided to take on the largest companies in the world. We are David and we’ve brought one hell of a slingshot.

288

4K

382

2K

702K

deepak201001 retweeted

Daniel San

@dani_avila7

7 months ago

Almost 12k stars on GitHub 👏

Thanks orderly-jame for being star #11,815 on claude-code-templates

This CLI tool helps configure and monitor Claude Code. 27 contributors now, being used across 125 countries.

Repo: https://t.co/Rtpze8ZaEW https://t.co/HP02OvI4cQ

8

306

26

439

37K

deepak201001 retweeted

Anthropic

@AnthropicAI

8 months ago

We believe this is the first documented case of a large-scale AI cyberattack executed without substantial human intervention. It has significant implications for cybersecurity in the age of AI agents. Read more: https://t.co/VxqERnPQRJ

327

12K

2K

8K

8M

deepak201001 retweeted

Surya Ganguli

@SuryaGanguli

8 months ago

Using AI to help you on your homework is like using a robot to help you lift weights at the gym.

107

1K

176

109

92K

deepak201001 retweeted

VraserX e/acc

@VraserX

9 months ago

A 7 million parameter model from Samsung just outperformed DeepSeek-R1, Gemini 2.5 Pro, and o3-mini on reasoning benchmarks like ARC-AGI.

Let that sink in.
It’s 10,000x smaller yet smarter.

The secret is recursion.
Instead of brute-forcing answers like giant LLMs, it drafts a full solution, then “thinks” about it, revising, self-critiquing, and improving up to 16 times.

It literally learns to reason like a mind that pauses, reflects, and corrects itself.

This could be the first real step toward thinking architectures instead of just scaling architectures.

Less compute, more thought.
Less size, more intelligence.

The future of AI might not be bigger.
It might be recursive.
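The draft-then-revise loop described above, with its cap of 16 passes, has roughly the following control flow. This is only a toy sketch of the loop shape, not the Samsung model: `refine`, `improve`, and `is_good_enough` are hypothetical names, and Newton's update for a square root stands in for a learned revision step.

```python
def refine(draft, improve, is_good_enough, max_steps=16):
    """Revise a complete draft answer repeatedly, stopping early once it passes the check."""
    steps = 0
    while steps < max_steps and not is_good_enough(draft):
        draft = improve(draft)  # one "think and revise" pass over the whole draft
        steps += 1
    return draft, steps

# Toy stand-in: refine a guess for sqrt(2) using Newton's update.
target = 2.0
answer, steps_used = refine(
    draft=1.0,
    improve=lambda x: 0.5 * (x + target / x),
    is_good_enough=lambda x: abs(x * x - target) < 1e-9,
)
```

The same small update is reused on every pass, so extra refinement costs more compute at inference time without adding parameters.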

55

2K

251

972

135K

Deepak Choudhary @deepak201001

10 months ago

@deedydas Been building in the Fintech AI space working with one of the top 5 wealth management firms in North America as a design partner. Would love to get in touch.

0

35

Deepak Choudhary @deepak201001

11 months ago

@rauchg Making an AI tool for speeding up M&A deals

0

6

deepak201001 retweeted

Haider.

@haider1

11 months ago

Sam Altman asks how the world is supposed to think about GPT-6 discovering new science -- a milestone within reach

The breakthroughs could cure disease, but the risks could create new biosecurity threats.

"humanity will adapt, as it always does, until the extraordinary becomes normal"

213

1K

133

412

570K

deepak201001 retweeted

Michael Bargury

@mbrg0

11 months ago

we hijacked microsoft's copilot studio agents and got them to spill out their private knowledge, reveal their tools and let us use them to dump full crm records

these are autonomous agents.. no human in the loop

#DEFCON #BHUSA @tamirishaysh https://t.co/H9Dk8IVtJt

101

8K

854

6K

1M

deepak201001 retweeted

Greg Burnham @GregHBurnham

11 months ago

Careful not to cut yourself on the jagged frontier

261

6K

109

945

882K

deepak201001 retweeted

Claude

@claudeai

11 months ago

We just shipped automated security reviews in Claude Code.

Catch vulnerabilities before they ship with two new features:

- /security-review slash command for ad-hoc security reviews
- GitHub Actions integration for automatic reviews on every PR

163

7K

721

3K

1M

deepak201001 retweeted

Angry Tom

@AngryTomtweets

11 months ago

This is crazy... Alibaba just dropped Wan 2.2, the world's first open-source MoE-architecture video model with cinematic control! A major upgrade in cinematic quality, smoother movements, and prompt following. 10 mind-blowing examples:

65

1K

159

1K

379K

Deepak Choudhary @deepak201001

11 months ago

@Lauramaywendel Would like to try this!

0

10

Deepak Choudhary @deepak201001

11 months ago

@0xstrongs @paulg @grok @grok Could also be construction

1

0

2K

Deepak Choudhary @deepak201001

11 months ago

“Agentic Web” – a visionary paper on how AI agents powered by LLMs are evolving the internet into an autonomous, collaborative ecosystem. From history to risks & future directions. Must-read for AI/Web enthusiasts! #AI #AgenticAI #LLM https://t.co/z2aYb6fYxO

0

21
