Here is Gemini CLI’s February 2nd weekly update for v0.28.0
- ⚙️ Extension Settings: Gemini CLI extensions now support extension settings, allowing users to be prompted on installation to set API keys, base URLs, passwords, etc.
- 🎨 Custom Themes for Extensions: Extensions can now contribute custom themes via a themes property in their gemini-extension.json. These themes are automatically discovered and listed in the /theme selection dialog.
- 🎉Gemini CLI Extensions:
- Miro: Link Gemini CLI to Miro boards to build diagrams, extract context, generate code, and track tasks using natural language.
- Apiiro: Integrate Apiiro Guardian Agent to secure code, query risks, analyze repositories, and manage findings directly within Gemini CLI
- 💻 Background Shell Commands: You can now run shell commands in the background, which is useful for long-running tasks.
- ⌨️ Undo/Redo Keybindings: Updated undo/redo keybindings to Cmd+Z/Alt+Z and Shift+Cmd+Z/Shift+Alt+Z for a more native editing experience.
- 🧩 Skill Discovery Alias: Added .agents/skills directory alias to streamline skill discovery and management.
- 📂 Directory Persistence on Resume: Fixed an issue where directories added via /dir add were not saved across sessions. Now, added directories are correctly persisted and restored when using the --resume flag.
- 📚 CLI Reference Documentation: Comprehensive improvements to CLI reference documentation covering all commands, options, and parameters.
Other Updates:
- 🆔 User Identity Display: Your authentication status, email, and tier are now displayed on startup and with the stats command.
-🛡️ Folder Trust: When opening Gemini CLI in a folder for the first time surfaces a trust dialog. This is now enabled by default.
https://t.co/HcVTiRyXSU
🧵
@SumitM_X API gateway per microservice defeats the entire purpose. gateway exists to give you a single entry point - auth, rate limiting, routing in one place.
1 is the lie. indexes hurt writes every insert/update/delete must update the index tree too. reads get faster, writes pay the price. 2 is true (leftmost prefix rule). 3 is true 10 indexes on a write-heavy table is a silent killer. more indexes = more maintenance overhead per write.
@SumitM_X delete has to earn it, one row at a time, with undo logs, index updates, and fk checks for each. drop just deletes a metadata entry. the rows are still on disk, just unreachable.
B, but only if you actually need it.
50M records isn't that scary postgres full text search with a GIN index handles it fine for most use cases. faster to ship, one less infra piece to manage.
jump to elasticsearch when you need relevance scoring, fuzzy matching, or you're already getting killed on query latency.
The bigger your LLM context window, the more your KV cache bloats and that means more GPU memory, more cost, more pain.
this is one of the dirtiest bottlenecks in running LLMs at scale. you want long context? pay up.
Google just dropped turboquant.
turboquant compresses your KV cache down to 3 bits using two clever tricks, PolarQuant rotates vectors to compress cleanly, then QJL (a 1-bit error checker) kills any leftover bias.
early results shows 6x less memory. 8x faster attention on H100s. zero accuracy loss. zero retraining needed.
not all details are public yet, full paper drops at ICLR 2026. but what's out already is wild.
https://t.co/kfrtFJS6TD
Day 4 of agentic
AI engineering series human-in-the-loop (HITL)
Why do fully autonomous agents fail in production?
Because not every decision should be made by the model alone. irreversible actions, high stakes outputs, low confidence steps, the agent should pause, not guess.
Most devs build agents that either interrupt too much (useless) or never interrupt (dangerous). no middle ground.
The pattern is simple. define checkpoints. before destructive actions like delete, send, deploy pause and confirm. for low confidence branches surface to human. for everything else let the agent run. tools like langgraph make this clean with interrupt_before on specific nodes, stream state back to user and resume after approval.
the goal isn't a fully autonomous agent. it's an agent that knows when it's not enough.
@SumitM_X None of them based on company name. i want to be where the hardest problems live hotstar’s 60M concurrent streams, oracle’s db internals, goldman’s low-latency trading infra, zomato’s real-time logistics.
@SumitM_X like '%abc' is basically select * from table with extra steps. db can't use any index has to read every row. like 'xyz%' hits the B-tree, seeks to 'xyz', stops there. the % position isn't a style choice, it's the difference between an index seek and a full scan.
GET is a "simple request" no preflight. POST is not, so browser fires OPTIONS before the actual call. if your server doesn't respond to OPTIONS with Access Control-Allow-Origin + Access-Control-Allow Methods, browser kills it before POST even leaves the client. endpoint works, CORS config doesn't.
@vivoplt ai writes the 80% that's easy. the 20% why this architecture fails at scale, what that incident at 2am actually means, whether the abstraction holds under real load that's still a human call. you're not hiring me to write code. you're hiring me to know when the code is wrong.
Day 3 of agentic AI engineering series
Agent observability & tracing
your agent failed at step 12. was it a bad tool call? hallucinated output? wrong context passed? you have no idea.
That's the real problem multi-step agents are black boxes by default. no visibility into what the model decided, what tools fired, what context it had at that point.
the fix: trace every llm call, tool invocation, and decision point.
tools like langfuse, langsmith let you see the full agent run inputs, outputs, latency, token usage, where it went off track.
@0xlelouch_ CAP theorem. you can't have consistency + availability + partition tolerance all at once. eventual consistency picks A+P scales great, but reads can return stale data. for likes on a tweet? fine. for stock trades or payment deductions? you cannot afford a stale read.
Day 2 of agentic AI engineering series. Context Compacting.
Your agent works great for 5 steps. by step 20 it's hallucinating, repeating itself, forgetting what it already did.
Not a model problem. it's a context problem.
as conversation grows, old noise drowns out what actually matters. the model attends to everything including junk.
context compacting = summarize + prune the history periodically. keep the signal, drop the noise. agent stays sharp across 100+ steps.
@OjasSharma276 Not the exact explanation but think of it as
Hhhhooooyyyll
This is nothing but
4h4o3y2l
See you compressed the data without loosing the details
Day 1 of agentic AI engineering series context engineering
Most agent systems fail not because the model is dumb, but because we don’t trust it enough too many guardrails, conflicting instructions, over-specified prompts you’re not guiding the agent, you’re confusing it give the model clean context, clear objective, right tools. that’s it.
The model knows what to do. your job is to not get in the way.
@0xlelouch_ postgres doesn’t release disk space on DELETE it just marks rows as “dead tuples.” the space is held for future reuse. run VACUUM FULL table_name to reclaim it. warning: it locks the table. for prod, use pg_repack instead does the same thing without locking.
Do not keep looking for beginner-friendly content, because for harder topics, there is none. Also, they often optimize for feeling smart rather than becoming smart.
Try to build a habit of reading, watching, and absorbing detailed information and concepts. This will not only help you move away from being spoon-fed, but it will also make you better at abstract thinking.
I believe abstract thinking will become crucial in the coming decade, as the world shifts toward solving more complex and ambiguous problems that are not easily visualized and comprehended.
We are seeing an early sign of the drift with you-know-what. Extrapolating from here, I would say it is going to be increasingly vital for us to think beyond the obvious.
The people who will navigate that well are not the ones who found the best beginner guide. They will be the ones who learned to sit with the hard stuff until it made sense.