Ramchandra Darade

@agentsmith011

Joined March 2026

65 Following

0 Followers

114 Posts

agentsmith011 retweeted

Avi Chawla

@_avichawla

8 days ago

8 RAG architectures for AI Engineers:

(explained with usage)

1) Naive RAG
- Retrieves documents purely based on vector similarity between the query embedding and stored embeddings.
- Works best for simple, fact-based queries where direct semantic matching suffices.

2) Multimodal RAG
- Handles multiple data types (text, images, audio, etc.) by embedding and retrieving across modalities.
- Ideal for cross-modal retrieval tasks like answering a text query with both text and image context.

3) HyDE (Hypothetical Document Embeddings)
- Addresses the problem that queries are often not semantically similar to the documents that answer them.
- This technique generates a hypothetical answer document from the query before retrieval.
- Uses this generated document’s embedding to find more relevant real documents.

4) Corrective RAG
- Validates retrieved results by comparing them against trusted sources (e.g., web search).
- Ensures up-to-date and accurate information, filtering or correcting retrieved content before passing to the LLM.

5) Graph RAG
- Converts retrieved content into a knowledge graph to capture relationships and entities.
- Enhances reasoning by providing structured context alongside raw text to the LLM.

6) Hybrid RAG
- Combines dense vector retrieval with graph-based retrieval in a single pipeline.
- Useful when the task requires both unstructured text and structured relational data for richer answers.

7) Adaptive RAG
- Dynamically decides if a query requires a simple direct retrieval or a multi-step reasoning chain.
- Breaks complex queries into smaller sub-queries for better coverage and accuracy.

8) Agentic RAG
- Uses AI agents with planning, reasoning (ReAct, CoT), and memory to orchestrate retrieval from multiple sources.
- Best suited for complex workflows that require tool use, external APIs, or combining multiple RAG techniques.
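The difference between items 1 and 3 above can be sketched in a few lines. This is a toy illustration only: `embed` is a bag-of-words stand-in for a real embedding model, `canned_answer` is a hard-coded stand-in for an LLM-generated hypothetical document, and every name and sample document here is hypothetical.

```python
import math
from collections import Counter

def embed(text):
    """Toy embedding: lowercase word counts (stand-in for a real embedding model)."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def retrieve(query_text, docs, k=1):
    """Naive RAG retrieval: rank documents by similarity to the query embedding."""
    q = embed(query_text)
    return sorted(docs, key=lambda d: cosine(q, embed(d)), reverse=True)[:k]

def hyde_retrieve(query_text, docs, generate_answer, k=1):
    """HyDE: embed a generated hypothetical answer instead of the raw query."""
    hypothetical = generate_answer(query_text)
    return retrieve(hypothetical, docs, k)

DOCS = [
    "close unused background programs to free memory and speed up the computer",
    "my favorite laptop bag is blue",
]

def canned_answer(query_text):
    """Assumption: a real pipeline would ask an LLM to draft this text."""
    return "free memory by closing background programs to speed up the computer"

naive_hit = retrieve("my laptop is slow", DOCS)                       # matches on "my", "laptop", "is"
hyde_hit = hyde_retrieve("my laptop is slow", DOCS, canned_answer)    # matches on the answer's vocabulary
```

With these two documents, the raw query shares words only with the laptop-bag sentence, while the hypothetical answer shares words only with the troubleshooting sentence, which is the gap HyDE is meant to close.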

Most architectures here involve some form of retrieval-time decision. But they all run on top of whatever was already indexed.

If that indexing step outputs messy chunks, every architecture inherits them. Improving it is a separate problem from the 8 above.

My co-founder wrote about a better unit for the indexing step. The technique:

- cuts corpus size by 40x.
- reduces tokens per query by 3x.
- improves vector search relevance by 2.3x.

And it doesn't alter the retrieval algorithm, the reranker, or the embedding model.

Read it below.

920

175

87K

agentsmith011 retweeted

Rahul

@sairahul1

11 days ago

Andrej Karpathy: "90% of Claude's mistakes come from missing context, not a weak model."

41% mistake rate without a CLAUDE.md.
11% with the 4-rule baseline.
3% with the 12-rule version below.

here are the 12 rules senior engineers settled on:

1. think before coding: state assumptions, don't guess. the model can't read your mind, stop hoping it will

2. simplicity first: minimum code, no speculative abstractions. the moment you let Claude add "for future flexibility," you've added 200 lines you'll delete next quarter

3. surgical changes: touch only what you must. don't let it improve adjacent code, that's how PRs blow up

4. goal-driven execution: define success criteria upfront, loop until verified. without them Claude either loops forever or stops too early

5. use the model only for judgment calls: classification, drafting, summarization, extraction. NOT routing, retries, status-code handling, deterministic transforms. if code can answer, code answers

6. token budgets are not advisory: per-task 4000, per-session 30000. by message 40 of a long debug, Claude is re-suggesting fixes you rejected at message 5

7. surface conflicts, don't average them: two patterns in the codebase? pick one. Claude blending them is how errors get swallowed twice

8. read before you write: read exports, callers, shared utilities. Claude will happily add a duplicate function next to an identical one it never read

9. tests verify intent, not just behavior: a test that can't fail when business logic changes is wrong. all 12 of Claude's tests can pass while the function returns a constant

10. checkpoint every significant step: Claude finished steps 5 and 6 on top of a broken state from step 4. nobody noticed for an hour

11. match the codebase conventions: class components? don't fork to hooks silently. testing patterns assumed componentDidMount, hooks broke them without surfacing

12. fail loud: "completed successfully" with 14% of records silently skipped is the worst class of bug. surface uncertainty, don't hide it

what actually compounds instead of the next framework:

- the CLAUDE.md file as institutional memory across sessions
- eval-driven changes, not vibe-driven
- checkpoints over speed
- explicit conflicts over silent blending
- discipline over framework, every time
- one repo, one rules file, no exceptions

you don't need a better AI
you need better context engineering

complete playbook below ↓
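Rule 12 ("fail loud") is the easiest of these to show in code. The sketch below is one possible reading of it, not something taken from the post: `process_batch`, `SkippedRecordsError`, and the `max_skip_rate` parameter are hypothetical names, and the choice to treat only `ValueError` and `KeyError` as skippable is an assumption.

```python
class SkippedRecordsError(RuntimeError):
    """Raised when a batch job drops more records than the caller allows."""

def process_batch(records, transform, max_skip_rate=0.0):
    """Apply transform to each record and report skips instead of hiding them."""
    results, skipped = [], []
    for index, record in enumerate(records):
        try:
            results.append(transform(record))
        except (ValueError, KeyError) as exc:
            # Keep the position and reason so the caller can inspect every skip.
            skipped.append((index, repr(exc)))
    skip_rate = len(skipped) / len(records) if records else 0.0
    if skip_rate > max_skip_rate:
        raise SkippedRecordsError(
            f"{len(skipped)} of {len(records)} records skipped "
            f"({skip_rate:.0%}); first failure: {skipped[0]}"
        )
    return results, skipped
```

By default any skipped record raises, so a caller has to opt in to tolerating losses by passing an explicit `max_skip_rate`, and even then the skipped entries come back alongside the results.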

282

415K

agentsmith011 retweeted

Daily Dose of Data Science

@DailyDoseOfDS_

14 days ago

Sub-Agents vs Agent Teams in Claude Code:

Sub-agents get their own system prompt, their own tool set, and a clean context window. They report back to the parent and terminate.

Agent teams get all of that plus three things sub-agents don't have:
- a shared task list with dependency tracking
- peer-to-peer messaging between teammates
- persistent context that accumulates over time.
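To make "a shared task list with dependency tracking" concrete, here is a minimal data-structure sketch. It is not how Claude Code implements agent teams; `Task`, `SharedTaskList`, and the task names are hypothetical, and the only idea illustrated is that a task becomes claimable once everything it depends on is done.

```python
from dataclasses import dataclass, field

@dataclass
class Task:
    name: str
    depends_on: set = field(default_factory=set)
    done: bool = False

class SharedTaskList:
    """Hypothetical shared list: teammates pick up tasks whose dependencies are done."""

    def __init__(self):
        self.tasks = {}

    def add(self, name, depends_on=()):
        """Register a task; every dependency must already be on the list."""
        missing = set(depends_on) - set(self.tasks)
        if missing:
            raise KeyError(f"unknown dependencies: {sorted(missing)}")
        self.tasks[name] = Task(name, set(depends_on))

    def ready(self):
        """Names of unfinished tasks whose dependencies are all complete."""
        return [
            t.name for t in self.tasks.values()
            if not t.done and all(self.tasks[d].done for d in t.depends_on)
        ]

    def complete(self, name):
        """Mark a task done, which may unblock the tasks that depend on it."""
        self.tasks[name].done = True

board = SharedTaskList()
board.add("schema")
board.add("api", depends_on=["schema"])
board.add("docs", depends_on=["api"])
```

Calling `board.ready()` at this point lists only `schema`; completing it unblocks `api`, and so on down the chain.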

We published an article that dives into a lot more detail.

Read it below.

185

167

12K

agentsmith011 retweeted

Rahul

@sairahul1

14 days ago

Anthropic pays $750,000+ a year for engineers who can build LLMs from scratch. Not how to prompt them. Not how to fine-tune them. Not how to build RAG pipelines. But how to build them from scratch. This 2-hour Stanford lecture teaches you everything. Scaling laws. Data collection. Architecture design. Post-training alignment. Free. From Stanford. Watch first. Then read this. The lecture is the theory. And this article shows you how to actually build it (with code) ↓

485

416K

agentsmith011 retweeted

Viktor Oddy

@viktoroddy

17 days ago

Claude Mythos is insane! ❤️‍🔥 Just recorded a 12-min tutorial on how to build animated, award-winning websites with Claude Fable 5!

232

17K

31K

agentsmith011 retweeted

Hamza Khalid

@humzaakhalid

18 days ago

KARPATHY JUST HANDED EVERY DEVELOPER THE EXACT FILE CLAUDE CODE NEEDED FROM DAY ONE.

65 lines. 110K stars. the cheat code for every broken workflow you've been blaming on the model.

if I had this a year ago, I would've shipped twice as fast.

make sure to bookmark it before it gets lost in your feed.

I was losing 2 hours a day to Claude rewriting code I didn't ask it to touch.
then I found CLAUDE.md.

90 seconds to set up. changed everything.

Karpathy identified 4 failure patterns Claude Code repeats constantly, in his own words:

→ silent assumptions: Claude makes decisions without checking with you
→ code bloat: 1000 lines written when 100 would do
→ collateral damage: Claude edits code unrelated to the task
→ no success criteria: Claude loops with no finish line

these aren't model failures. they're missing instructions.

CLAUDE.md gives Claude the 4 rules it needed from day one:
→ think before coding, state assumptions. ask before assuming.
→ simplicity first, minimum code. nothing speculative.
→ surgical changes, touch only what is required. nothing adjacent.
→ goal-driven execution, define success before starting. loop until verified.
65 lines. no build step. no framework. no dependencies.

just the 4 principles every developer already knew, but needed Karpathy to write down.

(Link to the REPO in the comment below)

the guide on how to build a second brain with CLAUDE is in the article below.

210

257K

agentsmith011 retweeted

Dario Amodei

@DarioAmodei

17 days ago

Today I'm publishing a new essay, Policy on the AI Exponential. AI is progressing extremely fast—much faster than the policy process was built to handle. The essay lays out where I think the technology is now, and the action needed to close the gap: https://t.co/Lh6PWae178

14K

12K

agentsmith011 retweeted

ollama

@ollama

18 days ago

Use Ollama with Hermes Desktop by @NousResearch.

Hermes Desktop brings the same agent (its multi-agent engine, self-improving skills, and messaging integrations) into a desktop app on macOS, Windows, and Linux.

Run it on Ollama using local or cloud with one command:

ollama launch hermes-desktop

🧵

904

119

491

53K

agentsmith011 retweeted

Prompter

@PromptLLM

18 days ago

Insane advice from Fable 5

260

14K

812

10K

agentsmith011 retweeted

Brij Pandey

@LearnWithBrij

18 days ago

CLAUDE.md is NOT a README.

Most devs:
→ Add a few bullets
→ Maybe a build command
→ Call it “done”

Then complain:
“Claude writes bad code” 🤦‍♂️
No.
Your CLAUDE.md is just… useless.
Here’s how to fix it 👇

1️⃣ Use ALL 3 scopes (not just one)
• Global → ~/.claude/CLAUDE.md
• Project → ./CLAUDE.md
• Folder → ./src/CLAUDE.md

Merge order:
Global → Project → Folder (last wins)

Most people miss this.

2️⃣ Follow WHAT / WHY / HOW

• WHAT → stack, structure, dependencies
• WHY → decisions, patterns, anti-patterns
• HOW → commands, tests, deploy flow

Skip one = Claude guesses.
And it guesses wrong.
3️⃣ Be SPECIFIC

❌ “Write clean code”
✅ “camelCase vars, PascalCase components”
❌ “Test everything”
✅ “80% coverage, npm test --watch”

Vague = ignored
Specific = followed

4️⃣ Follow these 5 rules

• Run /init first
• Keep it < 500 lines
• Expect ~70% compliance
• Update monthly
• Reference configs (don’t copy)
The truth?
Top engineers aren’t better at prompting.

They’re better at designing CLAUDE.md.

Fix this → your AI code quality 10x 🚀
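The three scopes and the "Global → Project → Folder (last wins)" order described in this post can be sketched as a small file lookup. This only illustrates the ordering the post claims; it is not Claude Code's actual loading logic, and `rules_files` and `merged_rules` are hypothetical helpers.

```python
from pathlib import Path

def rules_files(home, project_root, folder):
    """Return the CLAUDE.md files that exist, ordered global -> project -> folder."""
    candidates = [
        Path(home) / ".claude" / "CLAUDE.md",
        Path(project_root) / "CLAUDE.md",
        Path(folder) / "CLAUDE.md",
    ]
    # dict.fromkeys drops duplicates (e.g. folder == project_root) but keeps order.
    return [p for p in dict.fromkeys(candidates) if p.is_file()]

def merged_rules(home, project_root, folder):
    """Concatenate the files so the most specific scope appears last."""
    return "\n\n".join(
        p.read_text(encoding="utf-8").strip()
        for p in rules_files(home, project_root, folder)
    )
```

For example, `merged_rules(Path.home(), ".", "./src")` would return the global text first and the `./src` text last, skipping any scope whose file does not exist.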

agentsmith011 retweeted

Paweł Huryn

@PawelHuryn

19 days ago

Six patterns for building dynamic workflows and loops identified by Anthropic:

1. Classify-and-act: one agent decides the type, the script routes it. Example: bug vs feature vs noise.

2. Fan-out-and-synthesize: one agent per piece, merged in code. Examples: market research, competitor teardown.

3. Adversarial verification: a separate agent checks the output against a rubric. Example: fact-checking a PRD against the sources.

4. Generate-and-filter: many candidates, deduped, the survivors kept. Examples: naming, positioning, ideation.

5. Tournament (compare): agents attempt the task different ways, judges compare until one wins. Example: product strategy.

6. Loop-until-done: spawn until a stop condition. Example: implement, document, and test a feature in one shot.
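Patterns 1 and 2 above share one idea: the agent makes the judgment call and ordinary code does the routing and merging. A rough sketch follows, with a keyword rule standing in for the agent so the example runs on its own; `classify`, `HANDLERS`, and the other names are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor

def classify(ticket):
    """Stand-in for the agent's judgment call; a real version would call a model."""
    text = ticket.lower()
    if "error" in text or "crash" in text:
        return "bug"
    if "please add" in text or "feature" in text:
        return "feature"
    return "noise"

# Deterministic routing table: one plain function per label.
HANDLERS = {
    "bug": lambda t: f"opened bug report: {t}",
    "feature": lambda t: f"added to roadmap: {t}",
    "noise": lambda t: "ignored",
}

def classify_and_act(ticket):
    """The agent only picks the label; plain code does the routing."""
    return HANDLERS[classify(ticket)](ticket)

def fan_out_and_synthesize(pieces, worker, merge):
    """Run one worker per piece in parallel, then merge the results in code."""
    with ThreadPoolExecutor() as pool:
        partials = list(pool.map(worker, pieces))  # map preserves input order
    return merge(partials)
```

In a real pipeline `worker` would wrap one agent call per piece (one competitor, one market segment), while `merge` stays a plain function so the synthesis step is reproducible.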

515

823

61K

agentsmith011 retweeted

Max Stone

@Maxsteinbrenner

19 days ago

Prompt engineering has been replaced by loop engineering.
What is it? (Explained in 60 seconds)

For the past 2 years we have been prompting agents with individual tasks. That is starting to change.

So far, if you wanted an agent to build a dashboard for a client, you would give it a task, review the output, improve the prompt, and repeat the process until the work was done.

Looping changes that.

Instead of giving an agent individual tasks, you give it a goal and let it work through a recursive loop until that goal is met.

For example:

→ Research
→ Draft
→ Evaluate
→ Test
→ Improve
→ Repeat
The agent keeps cycling through the loop until it reaches the standard you defined.
Within loop engineering there are two main approaches:
1. Open Looping
You give the agent a goal and allow it significant freedom in how it achieves it.
This is powerful, but also expensive and harder to control.

2. Closed Looping

The human defines the architecture, constraints and evaluation criteria.

The agent is then responsible for executing, improving and iterating within those boundaries until the goal is reached.

The next evolution is orchestrated looping.

Instead of a single agent running a loop, one agent breaks the goal into smaller tasks and assigns them to specialist agents.

Each specialist runs its own loop and reports back.

In other words:

You move from one agent improving itself to an entire team of agents iterating together until the goal is achieved.
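The "closed looping" approach described above maps onto a short control loop: the human supplies the evaluation function, the target, and an iteration cap, and the agent's only job is the improve step. The sketch below uses hypothetical names (`closed_loop`, `improve`, `evaluate`) and plain numbers in place of real drafts.

```python
def closed_loop(draft, improve, evaluate, target, max_iterations=10):
    """Iterate until evaluate(draft) reaches target or the iteration budget runs out."""
    history = [evaluate(draft)]
    for _ in range(max_iterations):
        if history[-1] >= target:
            break  # the human-defined standard has been met
        draft = improve(draft)           # the agent's step
        history.append(evaluate(draft))  # the human-defined check
    return draft, history

# Toy run: the "draft" is a number, each improvement adds 3, and the score is the value.
final, scores = closed_loop(0, improve=lambda d: d + 3, evaluate=lambda d: d, target=10)
```

The `max_iterations` cap is the constraint that keeps the loop from running forever when the target is out of reach; the returned score history shows whether the loop stopped on success or on budget.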

711

114

879

70K

agentsmith011 retweeted

aditya

@adxtyahq

19 days ago

Good list.

I'd add:
- Dataset Engineering - https://t.co/9v0BWmOe4v
- Product Evals - https://t.co/zGn1SrznLs
- OpenAI Evals - https://t.co/JkNoFreo0P
- Context Engineering - https://t.co/caRNtIw1Ne
- Agent Memory & Context Lifecycle - https://t.co/HN7c6OOxyN
- MCP & Tool Ecosystems - https://t.co/lKawRb4etX
- Observability & Tracing - https://t.co/0UZ2iSvijf
- AI Agent Evaluation - https://t.co/sclfEZt01U
- Inference Optimization (KV Cache, PagedAttention, Batching) - https://t.co/T7xv6lIirV
- Security Engineering (Prompt Injection, Agent Safety) - https://t.co/U1wO0AQnnt
- Business Metrics & Product Thinking - https://t.co/aoM9kuqkDB

A surprising number of AI engineers spend more time debugging retrieval, context, caching, queues, permissions, and analytics than they do writing prompts.

225

84K

agentsmith011 retweeted

Rahul

@sairahul1

19 days ago

this is f*cking gold

the Claude setup most people will never find on their own

if I had this a year ago, I would've shipped my first app in a day instead of 3 weeks.

in the right hands, this changes everything: https://t.co/l5a0RjMO3O

271

253K

agentsmith011 retweeted

Elon Musk

@elonmusk

18 days ago

Tesla AI chip design engineering reviews are so great! Team is awesome. Our AI6 chip might set a record for most amount of usable intelligence from a wafer when factoring in yield.

75K

15M

agentsmith011 retweeted

Andrej Karpathy

@karpathy

18 days ago

This is a super exciting release - Claude Fable 5 is the same underlying model as Mythos but with added safeguards. The benchmarks are great and it's SOTA on everything by a margin but I'll add that *qualitatively* also, this is a major-version-bump-deserving step change forward (imo of the same order as Claude 4.5 was in November), peaking especially for long problem-solving sessions on very difficult problems. You can give it a lot more ambitious tasks than what you're used to, the model "gets it" and it will just go, and it's never felt this tempting to stop looking at the code at all (but don't do this in prod!). The model still has quirks that people will run into and the safeguards are configured to be a little too trigger happy for launch, which can hopefully be tuned over time. I feel a lot of things changing as working software increasingly comes out on a tap. Jevons paradox kicks in and I feel my own demand for software growing substantially. You can ask for anything - explainers, visualizers, dashboards, bespoke single-use apps (e.g. a full wandb that is hyper-specific just for your project), you can 10X your test suite, auto-optimize code, run giant research projects with custom HTML for the results, anything! "Free your mind" (Matrix ref). Really looking forward to all the things people build!

26K

agentsmith011 retweeted

Ramchandra Darade @agentsmith011

19 days ago

Patience, dedication, resilience, and a humble attitude throughout this impressive journey are truly inspiring. Thank you for setting such a powerful example of what perseverance and hard work can achieve. Read the success story: https://t.co/AMvWa6K3xv @Beesolverindia

agentsmith011 retweeted

Rahul

@sairahul1

20 days ago

This is the best site on the internet to learn harness engineering.

Free. Completely.

Most AI engineers have never heard the term.

https://t.co/bwDbTTYsjM

Bookmark this site.

Then read this setup ↓ https://t.co/ddEP0XowXM

444

448K

agentsmith011 retweeted

Hyperagent

@hyperagentapp

about 1 month ago

7 things we built with Opus 4.8 on Hyperagent 👇

1. Mars rover pathfinding simulator
2. Standup Island: a cozier place to review the kanban, inspired by @every's livestream today
3. SpaceXAI + Anthropic partnership visualized
4. Landing page for an outdoor brand w/ Nano Banana + Veo
5. Multi-agent command center
6. Black hole explainer
7. Emergent ecosystem simulator

In our vibe check, 4.8 shows:
- more varied design sensibilities
- better self-correction over long-running tasks
- excellent spatial reasoning
- more natural copywriting
- fewer obvious coding errors
- more resourcefulness during reasoning

Links below to every interactive artifact shown

270

139
