AI | Cloud | Futuristic Designs | Productivity🔍
Deep dives & expert tips on AI, LLMs, AWS, Salesforce, general self improvement tips and office productivity
@Sheetal2205 This is the one of solid course which I've myself gone through :
Selling point : Concepts + theory + practicals all are thoroughly covered !
Good for even beginners !
https://t.co/uOdyHGcuvU
I DID NOT clear the Claude Certified Architect Foundation exam. Scored 703, while 720 is the passing mark.
My Initial learnings /reflections from the experience:
1. Good mock test scores are NOT enough !!!
I scored 846/1000 in mocks and made the mistake of thinking I would comfortably clear the real exam.
2. The actual exam felt much tougher and very different from mock tests and online question sets available at various sites .
3. Don’t neglect concepts you already know.
I blanked on things like claude --continue and claude --resume, and there were multiple questions around them.
4. Revise. Re-revise. Don’t rush the exam.
Have your own last-minute notes for factual and command-based concepts like above !
5. The most recurring theme across the exam was Reliability:
How do you make AI agents produce consistent, dependable outputs?
Single agent or multi-agent, chat agent or code... this theme was just ...everywhere.
6.Another big theme:
What is the BEST thing you can do when the model starts forgetting context?
The tricky part: options were not that straightforward.
7.Surprisingly, I expected far more questions on CLAUDE.md, MCP servers, skills, subagents, etc. Topics I personally felt stronger in.
But I barely saw more than 1-2 questions there. No questions on project-level vs personal-level CLAUDE.md either.
8.The exam is highly conceptual and practical.
Almost every question is scenario-based.
You cannot pass just because you “know” some related concept.
You need to deeply understand tradeoffs and choose the best possible approach.
9.I’ll probably make another post once I get a detailed topic-wise performance breakdown.
10.After the initial disappointment faded(took few hours though !) I actually appreciated the exam more.
It genuinely tests whether you have knowledge, experience & thorough understanding design reliable production-grade AI systems or not !!
11.I’ve worked on some AI agents/systems professionally, but I still haven’t built large-scale production AI systems deeply enough.
This exam stripped away any false confidence I've and laid my gaps bare in front of me !!!
12. This result made one thing clear:
I’m not that far, but not there yet.
And honestly, that realization has motivated me even more.
Time to learn more !!
Build more.
Experience more
Come back stronger !
13. I honestly think , everyone interested in building AI agents, should give this exam, NOT just to get another certificate but to know where exactly they stand today ! . I believe that clarity is more valuable than certificate itself !
Thanks Claude, for showing me that !!
I DID NOT clear the Claude Certified Architect Foundation exam. Scored 703, while 720 is the passing mark.
My Initial learnings /reflections from the experience:
1. Good mock test scores are NOT enough !!!
I scored 846/1000 in mocks and made the mistake of thinking I would comfortably clear the real exam.
2. The actual exam felt much tougher and very different from mock tests and online question sets available at various sites .
3. Don’t neglect concepts you already know.
I blanked on things like claude --continue and claude --resume, and there were multiple questions around them.
4. Revise. Re-revise. Don’t rush the exam.
Have your own last-minute notes for factual and command-based concepts like above !
5. The most recurring theme across the exam was Reliability:
How do you make AI agents produce consistent, dependable outputs?
Single agent or multi-agent, chat agent or code... this theme was just ...everywhere.
6.Another big theme:
What is the BEST thing you can do when the model starts forgetting context?
The tricky part: options were not that straightforward.
7.Surprisingly, I expected far more questions on CLAUDE.md, MCP servers, skills, subagents, etc. Topics I personally felt stronger in.
But I barely saw more than 1-2 questions there. No questions on project-level vs personal-level CLAUDE.md either.
8.The exam is highly conceptual and practical.
Almost every question is scenario-based.
You cannot pass just because you “know” some related concept.
You need to deeply understand tradeoffs and choose the best possible approach.
9.I’ll probably make another post once I get a detailed topic-wise performance breakdown.
10.After the initial disappointment faded(took few hours though !) I actually appreciated the exam more.
It genuinely tests whether you have knowledge, experience & thorough understanding design reliable production-grade AI systems or not !!
11.I’ve worked on some AI agents/systems professionally, but I still haven’t built large-scale production AI systems deeply enough.
This exam stripped away any false confidence I've and laid my gaps bare in front of me !!!
12. This result made one thing clear:
I’m not that far, but not there yet.
And honestly, that realization has motivated me even more.
Time to learn more !!
Build more.
Experience more
Come back stronger !
13. I honestly think , everyone interested in building AI agents, should give this exam, NOT just to get another certificate but to know where exactly they stand today ! . I believe that clarity is more valuable than certificate itself !
Thanks Claude, for showing me that !!
“design a RAG pipeline for 10M docs with zero hallucination”
apparently this was asked in a Google L5 interview round. came across it somewhere on the internet and honestly it’s a way more interesting system design problem than most classic distributed systems questions
1. ingest + normalize docs
- remove duplicates, standardize formats, extract metadata, maintain version history
2. hybrid retrieval (BM25 + embeddings)
- BM25 handles exact keyword matching while embeddings capture semantic meaning
- semantic search alone usually struggles with precision at massive scale
3. ANN retrieval + reranking
- ANN (Approximate nearest neighbor ) quickly pulls top candidate chunks from millions of docs
- then a reranker rescoring step improves relevance by deeply comparing query vs retrieved chunks
4. source confidence scoring
- every retrieved chunk gets scored based on freshness, trust level, overlap and retrieval consistency
- low-confidence context should never heavily influence generation
5. constrained generation
- the model is only allowed to answer using retrieved context (nothing new to be invented outside of the retrieved context)
6. citation-backed responses
- every major claim links back to exact chunks, documents or timestamps
7. hallucination fallback layer
- if retrieval confidence drops below a threshold: “insufficient evidence found”
8. continuous evals
- run adversarial queries, retrieval recall benchmarks and hallucination tests continuously
9. caching + memory layer
- cache high-frequency enterprise queries and retrieval paths (improves latency and output)
10. observability everywhere
- trace retrieval paths, chunk rankings, token attribution and failure points
Also at 10M docs, retrieval quality matters more than the frontier model itself.
@rubenhassid Great post, i'll add one more thing ... just ask to give LLM a confidence score between 0 to 1 , about its response.
Anything less than 0.8 is a red flag ...
This is something I have done for long and it always helped !!
As you have mentioned, you've gone through process of learning , unlearning multiple times in last 15 years.
Do you think, you can't do it now just because you hit 40.
I've seen few years back. .. a 60 year old learning and becoming python engineer as thas was something he wanted to do !
The surprising thing was .. he was senior architect in AWS but somehow got more interested in getting hands dirty in python.
Just go for it, if you think you can also get practical experience in whatever you want to do next --- that is the only difficult part in learning something new !!
RAG vs. CAG, clearly explained!
RAG is great, but it has a major problem:
Every query hits the vector DB. Even for static information that hasn't changed in months.
This is expensive, slow, and unnecessary.
Cache-Augmented Generation (CAG) addresses this issue by enabling the model to "remember" static information directly in its key-value (KV) memory.
In fact, you can combine RAG and CAG for the best of both worlds.
Here's how it works:
RAG + CAG splits your knowledge into two layers:
↳ Static data (policies, documentation) gets cached once in the model's KV memory
↳ Dynamic data (recent updates, live documents) gets fetched via retrieval
This gives faster inference, lower costs, and less redundancy.
The trick is being selective about what you cache.
Only cache static, high-value knowledge that rarely changes. If you cache everything, you'll hit context limits. Separating "cold" (cacheable) and "hot" (retrievable) data keeps this system reliable.
You can start today. OpenAI and Anthropic already support prompt caching in their APIs.
I have shared my recent article on prompt caching below if you want to dive deeper.
Have you tried CAG in production yet?
Below, I have quoted an article that I wrote on prompt cashing and how Claude Code achieves a 92% cache hit-rate. Give it a read.
Creator of C++, Bjarne Stroustrup:
AI-generated code isn't ready — it generates more bugs, more bloat, more security holes, and is nearly impossible to validate
"senior developers are already retiring rather than deal with it"
The problem is that even a small prompt change can shift the entire codebase in unpredictable ways
Nice .. but in today's world your'e seriously far behind if you still have to provide all this info as a part of prompt
A lot of these I think , like context files , rules etc should either be part of CLAUDE.md or Skills.
Let Claude load atleast skills progressively instead of providing all the info at one go , consuming unnecessary tokens.
@Sheetal2205 This is the one of solid course which I've myself gone through :
Selling point : Concepts + theory + practicals all are thoroughly covered !
Good for even beginners !
https://t.co/uOdyHGcuvU
Not all context present in docs.
A lot present in people's mind in any enterprise in any industry, especially in fast changing world.
I can talk and get that fast and thus make faster design decisions . Claude can't do that yet and I doubt when it will be that capable !
And putting everything in docs , updating it continously so that clause can refer via Claude.md, Skill etc is still something a distant dream !
@Its_Nova1012 Not sure about 3 years but as of now, it seems, your value no longer resides in writing isolated code blocks but in capability to orchestrate complex, probabilistic systems.... which can do, what earlier team of people were doing .... with reliability !!
Karpathy just described what hiring looks like in 2026:
"Build a large project with Claude Code — like a Twitter clone. Make it secure. Have real agents using the platform doing stuff. The interviewer uses parallel agents trying to break in to verify security."
One person. Multiple agents. Shipping and defending production code simultaneously.
This is not a future job description.
This is happening right now.
The founders who get there first are not the smartest ones in the room. They are the ones who stopped doing everything themselves and built agents to do it for them.
Here is the complete playbook — 13 agents, exact prompts, 90-day build plan ↓
Read this before your competition does.
Anthropic just shipped Claude's 10 finance agents.
Available in Cowork, Code, API, and Office.
How to install in 4 steps.
1. Install in Cowork.
- Open Settings → Plugins → Add plugin.
- Paste: https://t.co/SCo2Vbrqty
- Pick the agents you want from the list.
2. Install for Microsoft Office.
- Open the GitHub link (above).
- Copy the install command into Claude Code.
- Run /claude-for-msft-365-install:setup to finish.
3. Connect your data sources.
- There are 17 data partners at launch.
- Add the ones you pay for as connectors.
4. Pick one. Run today.
- Map one agent to a job on your plate this week.
- Paste the prompt. Edit it. Run it.
Try these prompts:
✦ pitch-agent
Pulls comps, precedents and LBO numbers into a branded pitch deck.
"Draft a 12-slide pitchbook for our acquisition of [TargetCo]."
✦ meeting-prep-agent
Pulls past notes, recent news and talking points into a one-page brief.
"Build a one-page brief for tomorrow's 10am with [Client]."
✦ earnings-reviewer
Reads earnings reports and flags the surprises and risky wording.
"Summarise [Ticker]'s Q1 earnings and flag every surprise vs forecast."
✦ model-builder
Builds a working financial model in your spreadsheet from one prompt.
"Build a valuation model for [TargetCo] vs six similar companies."
✦ market-researcher
Pulls sector trends, competitor moves and pricing into one memo.
"Write a 1,000-word memo on European fintech lenders."
✦ valuation-reviewer
Audits a valuation model and challenges every assumption inside it.
"Review the valuation model on [sheet]. Challenge every assumption."
✦ gl-reconciler
Matches your books against bank statements and flags any mismatches.
"Reconcile our bank books against last month's statement."
✦ month-end-closer
Runs your monthly accounts checklist end to end and flags any issues.
"Run the April month-end close on our standard checklist."
✦ statement-auditor
Audits the books for errors and control gaps before they ship.
"Audit April's P&L and balance sheet against our books."
✦ kyc-screener
Vets new clients against watchlists and ownership records.
"Run a background check on [NewClient]. Pull ownership records."
Free Claude playbooks → https://t.co/1F12fOTjss
Repost ♻️ to help someone in your network.