Chinese scientists have developed the best shortest-path algorithm in 41 years!
A team from Tsinghua University has broken Dijkstra's "sorting barrier" - the first improvement since 1984.
Just imagine using it on a world map 🤯
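For context, here is a minimal Python sketch of the classic baseline, Dijkstra's algorithm with a binary heap. It is not the new algorithm from the paper, and the toy graph and function name are made up for illustration. The heap hands vertices back in order of distance, and that implicit sorting is the "barrier" the post refers to.

```python
import heapq

def dijkstra(graph, source):
    """Classic Dijkstra. `graph` maps node -> list of (neighbor, non-negative weight)."""
    dist = {source: 0}
    heap = [(0, source)]
    while heap:
        d, u = heapq.heappop(heap)          # closest unsettled vertex comes out first
        if d > dist.get(u, float("inf")):
            continue                         # stale heap entry, a shorter path was found
        for v, w in graph.get(u, []):
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                heapq.heappush(heap, (nd, v))
    return dist

# Hypothetical toy graph, for illustration only.
toy_graph = {
    "A": [("B", 4), ("C", 1)],
    "B": [("D", 1)],
    "C": [("B", 2), ("D", 7)],
    "D": [],
}
distances = dijkstra(toy_graph, "A")
```

Because every vertex passes through the heap, the running time carries a logarithmic factor per vertex, which is the cost the new result is reported to beat.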
Paper - https://t.co/0AhR5O7vl4
https://t.co/a9KMVRuYGx
New @GoogleDeepMind paper builds a new benchmark and agent design so language models can actually learn from their own experience.
Right now most language model agents only keep chat logs or facts, so they remember what happened but not how to solve similar tasks better. The authors call this the gap between conversational recall and experience reuse.
Evo-Memory turns existing benchmarks into streams of tasks arriving one after another, and forces agents to search past experiences, use them, then update memory each time.
The simple baseline ExpRAG stores each solved task as a short text record, retrieves a few similar ones for a new task, and inserts them into the prompt.
ReMem goes further by letting the agent choose at each step to think, act, or refine memory, actively pulling useful experiences and pruning or rewriting unhelpful ones.
Across math, question answering, tool use, and interactive environments, these self-evolving memories, especially ReMem and even simple ExpRAG, boost accuracy, need fewer steps, and make smaller models perform much better without any retraining.
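The ExpRAG-style loop described above (search, use, update) can be sketched in a few lines of Python. This is a rough illustration, not the paper's code: the word-overlap similarity, the class and function names, and the `solve` callback standing in for a model call are all assumptions.

```python
from dataclasses import dataclass, field

@dataclass
class Experience:
    task: str
    outcome: str  # short text record of how the task was solved

def similarity(a, b):
    """Jaccard overlap of word sets; a stand-in for a real retriever."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    union = ta | tb
    return len(ta & tb) / len(union) if union else 0.0

@dataclass
class ExperienceMemory:
    records: list = field(default_factory=list)

    def search(self, task, k=2):
        # Return the k stored experiences most similar to the new task.
        ranked = sorted(self.records, key=lambda r: similarity(task, r.task), reverse=True)
        return ranked[:k]

    def update(self, task, outcome):
        self.records.append(Experience(task, outcome))

def build_prompt(task, memory, k=2):
    # Insert retrieved experiences ahead of the new task.
    blocks = [f"Past task: {e.task}\nWhat worked: {e.outcome}" for e in memory.search(task, k)]
    return "\n\n".join(blocks + [f"New task: {task}"])

def run_stream(tasks, solve, memory):
    """Process tasks one after another: search, use, then update memory."""
    answers = []
    for task in tasks:
        answer = solve(build_prompt(task, memory))  # `solve` is a hypothetical model call
        memory.update(task, answer)
        answers.append(answer)
    return answers
```

ReMem, as described, would add a third choice at each step (refine memory by pruning or rewriting records); that part is not shown here.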
----
Paper Link – arxiv.org/abs/2511.20857
Paper Title: "Evo-Memory: Benchmarking LLM Agent Test-time Learning with Self-Evolving Memory"
🔥 Andrew Ng just quietly revolutionized AI agents.
And nobody's talking about the real impact.
》Let me explain what just changed.
I've built 300+ production AI agents.
Healthcare systems.
Financial pipelines.
Aviation platforms.
And I can tell you:
document extraction has ALWAYS been the bottleneck that kills agent workflows.
Until yesterday.
》 The Problem We've All Been Ignoring
Your AI agent can orchestrate complex workflows. It can reason through multi-step problems. It can call APIs and make decisions.
But show it a scanned medical report with a messy table?
A financial statement with merged cells?
A handwritten construction log?
It hallucinates.
Shifts cells.
Loses data.
Fails silently.
And suddenly your brilliant agent architecture is worthless because it can't extract the INPUT DATA correctly.
I've watched hundreds of my students hit this wall.
Their LangGraph flows are perfect.
Their CrewAI orchestration is elegant.
But the document extraction?
That's where production breaks.
》 What Actually Changed
Andrew Ng's team just released DPT (Document Pre-trained Transformer), and here's why it matters for agent builders:
✸ The model breaks tables into structure FIRST
✸ Then extracts data from isolated sections
✸ Parallel processing makes it actually fast
✸ Three lines of code. Seriously.
The breakthrough isn't just accuracy. It's that the model was trained to use an agentic workflow internally.
Think about that.
》 Why This Makes Your Agents Unstoppable
Here's what I'm building this week:
✸ Healthcare agents that extract lab results from any hospital format
✸ Financial agents that pull data from messy quarterly reports into live dashboards
✸ Construction agents that digitize handwritten logs in real-time
The pattern?
Agentic document extraction becomes the FIRST node in your LangGraph. Your agent starts with clean, structured data instead of garbage.
No more "the AI got confused by the PDF" conversations with stakeholders.
DPT solves this.
For free.
With three lines of code.
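Here's a rough Python sketch of the "extraction first" pattern, with validation so bad rows fail loudly instead of silently. It does not use the DPT SDK or LangGraph: `extract` is a placeholder for whatever extraction call you plug in, and `LabResult`, `validate_rows`, and `run_pipeline` are hypothetical names for illustration.

```python
from dataclasses import dataclass

@dataclass
class LabResult:
    test: str
    value: float
    unit: str

def validate_rows(rows):
    """Convert raw extracted rows to typed records and collect every malformed row."""
    results, errors = [], []
    for i, row in enumerate(rows):
        try:
            results.append(LabResult(str(row["test"]), float(row["value"]), str(row["unit"])))
        except (KeyError, TypeError, ValueError) as exc:
            errors.append(f"row {i}: {exc!r}")
    return results, errors

def run_pipeline(document, extract, downstream):
    rows = extract(document)                 # first node: document -> structured rows
    results, errors = validate_rows(rows)
    if errors:
        # Stop here rather than hand garbage to the rest of the agent.
        raise ValueError("extraction failed validation: " + "; ".join(errors))
    return downstream(results)               # remaining agent logic sees clean data only
```

The point of the design is that every later step can assume typed, checked records, so a bad scan surfaces as one clear error at the start.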
》 What I'm Testing Right Now
→ Chaining DPT with PydanticAI for validated extraction
→ Using it as a tool in OpenAI Swarm multi-agent systems
→ Building MCP servers that expose DPT extraction to any agent framework
The SDK is stupid simple.
The accuracy is production-ready.
The speed makes real-time workflows possible.
》 Bottom Line
If you're building AI agents that touch documents (and you probably are), this changes your architecture.
Your agents just got significantly more powerful.
Not because they got smarter.
Because they can finally SEE the data correctly.
https://t.co/NUImGAmRmp
🚀 Launching DeepSeek-V3.2 & DeepSeek-V3.2-Speciale — Reasoning-first models built for agents!
🔹 DeepSeek-V3.2: Official successor to V3.2-Exp. Now live on App, Web & API.
🔹 DeepSeek-V3.2-Speciale: Pushing the boundaries of reasoning capabilities. API-only for now.
📄 Tech report: https://t.co/7EyydyNuG0
Introducing Whisper Thunder aka Gen-4.5.
Today, we are excited to share our new frontier model. Gen-4.5 was built by a team that fits onto two school buses and decided to take on the largest companies in the world.
We are David and we’ve brought one hell of a slingshot.
Almost 12k stars on GitHub 👏
Thanks orderly-jame for being star #11,815 on claude-code-templates
This CLI tool helps configure and monitor Claude Code. 27 contributors now, being used across 125 countries.
Repo: https://t.co/Rtpze8ZaEW
We believe this is the first documented case of a large-scale AI cyberattack executed without substantial human intervention. It has significant implications for cybersecurity in the age of AI agents.
Read more: https://t.co/VxqERnPQRJ
A 7 million parameter model from Samsung just outperformed DeepSeek-R1, Gemini 2.5 Pro, and o3-mini on reasoning benchmarks like ARC-AGI.
Let that sink in.
It’s 10,000x smaller yet smarter.
The secret is recursion.
Instead of brute-forcing answers like giant LLMs, it drafts a full solution, then “thinks” about it, revising, self-critiquing, and improving up to 16 times.
It literally learns to reason like a mind that pauses, reflects, and corrects itself.
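The draft-then-revise idea can be shown with a toy Python loop. This is not the model's architecture, just the control flow: the `refine` helper is hypothetical, and the "solution" here is a guess at the square root of 2 that gets revised with a Newton step, capped at 16 revisions to mirror the number in the post.

```python
def refine(draft, improve, score, max_steps=16):
    """Start from a draft, revise it repeatedly, and keep the best version seen."""
    best, best_score = draft, score(draft)
    current = draft
    for _ in range(max_steps):
        current = improve(current)           # revise the current solution
        s = score(current)                   # self-critique: how good is it now?
        if s > best_score:
            best, best_score = current, s
    return best

TARGET = 2.0

def improve(x):
    # One Newton step toward sqrt(TARGET); a toy stand-in for a learned revision.
    return 0.5 * (x + TARGET / x)

def score(x):
    # Higher is better: negative error of the squared guess.
    return -abs(x * x - TARGET)

answer = refine(1.0, improve, score)
```

The same small update rule is applied again and again, so the quality comes from repetition rather than from a bigger function.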
This could be the first real step toward thinking architectures instead of just scaling architectures.
Less compute, more thought.
Less size, more intelligence.
The future of AI might not be bigger.
It might be recursive.
@deedydas Been building in the Fintech AI space working with one of the top 5 wealth management firms in North America as a design partner. Would love to get in touch.
Sam Altman asks how the world is supposed to think about GPT-6 discovering new science -- a milestone within reach
The breakthroughs could cure disease, but they could also create new biosecurity threats.
"humanity will adapt, as it always does, until the extraordinary becomes normal"
we hijacked microsoft's copilot studio agents and got them to spill out their private knowledge, reveal their tools and let us use them to dump full crm records
these are autonomous agents.. no human in the loop
#DEFCON #BHUSA @tamirishaysh
We just shipped automated security reviews in Claude Code. Catch vulnerabilities before they ship with two new features:
- /security-review slash command for ad-hoc security reviews
- GitHub Actions integration for automatic reviews on every PR
This is crazy...
Alibaba just dropped Wan 2.2, the world's first open-source MoE-architecture video model with cinematic control!
A major upgrade in cinematic quality, smoother movements, and prompt following.
10 mind-blowing examples:
“Agentic Web” – a visionary paper on how AI agents powered by LLMs are evolving the internet into an autonomous, collaborative ecosystem. From history to risks & future directions. Must-read for AI/Web enthusiasts! #AI #AgenticAI #LLM
https://t.co/z2aYb6fYxO