Prompt engineering has been replaced by loop engineering.
What is it? (Explained in 60 seconds)
For the past two years we have prompted agents one task at a time. That is starting to change.
Until now, if you wanted an agent to build a dashboard for a client, you would give it a task, review the output, refine the prompt, and repeat until the work was done.
Looping changes that.
Instead of giving an agent individual tasks, you give it a goal and let it iterate in a loop until that goal is met.
For example:
→ Research
→ Draft
→ Evaluate
→ Test
→ Improve
→ Repeat
The agent keeps cycling through the loop until it reaches the standard you defined.
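A minimal sketch of that cycle in Python, under loud assumptions: `run_loop`, `LoopResult`, `toy_draft` and `toy_evaluate` are hypothetical names invented for illustration, and the toy functions stand in for real model calls. The point is only the shape: draft, evaluate, feed the feedback back in, and stop when the standard is met or the iteration budget runs out.

```python
from dataclasses import dataclass
from typing import Callable, Tuple


@dataclass
class LoopResult:
    draft: str
    score: float
    iterations: int
    met_standard: bool


def run_loop(
    goal: str,
    draft_fn: Callable[[str, str], str],              # (goal, feedback) -> draft
    evaluate_fn: Callable[[str], Tuple[float, str]],  # draft -> (score, feedback)
    threshold: float,
    max_iterations: int = 5,
) -> LoopResult:
    """Cycle draft -> evaluate until the score reaches the threshold or the budget is spent."""
    draft, score, feedback = "", 0.0, ""
    for iteration in range(1, max_iterations + 1):
        draft = draft_fn(goal, feedback)
        score, feedback = evaluate_fn(draft)
        if score >= threshold:
            return LoopResult(draft, score, iteration, True)
    return LoopResult(draft, score, max_iterations, False)


# Toy stand-ins (assumptions): each pass extends the previous draft by one mark,
# and the evaluator scores a draft by its length, capped at 1.0.
def toy_draft(goal: str, feedback: str) -> str:
    return feedback + "#"


def toy_evaluate(draft: str) -> Tuple[float, str]:
    return min(len(draft) / 3, 1.0), draft


result = run_loop("three marks", toy_draft, toy_evaluate, threshold=1.0)
print(result)
```

The `max_iterations` cap is the part that matters most in practice: without it, a goal the agent cannot reach turns into an unbounded bill.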
Within loop engineering there are two main approaches:
1. Open Looping
You give the agent a goal and allow it significant freedom in how it achieves it.
This is powerful, but also expensive and harder to control.
2. Closed Looping
The human defines the architecture, constraints and evaluation criteria.
The agent is then responsible for executing, improving and iterating within those boundaries until the goal is reached.
The next evolution is orchestrated looping.
Instead of a single agent running a loop, one agent breaks the goal into smaller tasks and assigns them to specialist agents.
Each specialist runs its own loop and reports back.
In other words:
You move from one agent improving itself to an entire team of agents iterating together until the goal is achieved.
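Here is a rough sketch of orchestrated looping, assuming a hypothetical `orchestrate` function, a hypothetical `plan_fn` that splits the goal into (role, task) pairs, and toy lambdas in place of real specialist agents. None of these names come from a real framework. Each specialist runs its own bounded loop and reports a result plus a done flag back to the orchestrator.

```python
from typing import Callable, Dict, List, Tuple

AttemptFn = Callable[[str, int], str]   # (task, attempt number) -> result
AcceptFn = Callable[[str], bool]        # result -> good enough?


def specialist_loop(
    task: str, attempt_fn: AttemptFn, accept_fn: AcceptFn, max_attempts: int = 3
) -> Tuple[str, bool]:
    """One specialist iterating on its own task until accepted or out of attempts."""
    result = ""
    for attempt in range(1, max_attempts + 1):
        result = attempt_fn(task, attempt)
        if accept_fn(result):
            return result, True
    return result, False


def orchestrate(
    goal: str,
    plan_fn: Callable[[str], List[Tuple[str, str]]],
    specialists: Dict[str, Tuple[AttemptFn, AcceptFn]],
    max_attempts: int = 3,
) -> Tuple[bool, Dict[str, Tuple[str, bool]]]:
    """Split the goal into tasks, run each specialist loop, and collect the reports."""
    report: Dict[str, Tuple[str, bool]] = {}
    for role, task in plan_fn(goal):
        attempt_fn, accept_fn = specialists[role]
        report[task] = specialist_loop(task, attempt_fn, accept_fn, max_attempts)
    goal_met = bool(report) and all(done for _, done in report.values())
    return goal_met, report


# Toy plan and specialists (assumptions for illustration only).
def toy_plan(goal: str) -> List[Tuple[str, str]]:
    return [("research", "find sources"), ("writing", "write summary")]


toy_specialists = {
    "research": (lambda task, n: f"{task}: {n} found", lambda r: r.endswith("2 found")),
    "writing": (lambda task, n: f"{task} v{n}", lambda r: r.endswith("v1")),
}

goal_met, report = orchestrate("client briefing", toy_plan, toy_specialists)
print(goal_met, report)
```

The specialists run one after another here to keep the sketch short; a real setup would more likely run them concurrently and let the orchestrator re-plan when one of them fails.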
Anthropic x https://t.co/5iQpMPQ9sQ just dropped a free course on Agent Skills
stop re-explaining the same workflows to your agents
package them as skills → deploy anywhere
2 hours. beginner. actually worth it
https://t.co/0RZdoQFHGv
As an AI Engineer, please learn:
Harness engineering, not just prompt engineering
Context engineering, not just long prompts
Prompt caching vs. semantic caching tradeoffs
KV cache management, eviction, reuse, and memory pressure at scale
Prefill vs. decode latency and why they optimize differently
Continuous batching, paged attention, and throughput optimization
Speculative decoding vs. quantization vs. distillation tradeoffs
INT8, INT4, FP8, AWQ, GPTQ, and when quantization hurts quality
Structured output failures, schema validation, repair loops, and fallback chains
Function calling reliability, tool contracts, argument validation, and idempotency
Agent guardrails, loop budgets, tool budgets, and termination conditions
Model routing, graceful fallback logic, and degraded-mode UX
RAG architecture: chunking, embeddings, hybrid search, reranking, and freshness
Retrieval evals: recall, precision, grounding, attribution, and citation quality
Evals: golden sets, regression tests, adversarial tests, LLM-as-judge, and human evals
LLM observability as a first-class discipline: traces, spans, tokens, latency, errors, and drift
Cost attribution per feature, workflow, tenant, and user journey, not just per model
Safety engineering: prompt injection defense, data leakage prevention, and permission boundaries
Multi-tenant isolation, cache safety, and cross-user context contamination prevention
Fine-tuning vs. in-context learning vs. RAG vs. distillation and when each is the wrong tool
Latency, quality, cost, and reliability tradeoffs across the full inference stack
Production failure modes: hallucinated tool calls, malformed JSON, stale retrieval, runaway agents, and silent eval regressions
Shipping LLM systems as reliable infrastructure, not demos wrapped around prompts
https://t.co/OhK9MK04ld
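To make one item from the list concrete (schema validation, repair loops, and fallback chains), here is a minimal sketch using only the standard library. Everything named here is an assumption for illustration: `REQUIRED_FIELDS` is a made-up schema, `generate_with_repair` is a hypothetical helper, and `call_model` is whatever function wraps your model call. It validates the reply, sends the error back for a bounded number of repairs, then falls back.

```python
import json
from typing import Callable, Dict, Optional, Tuple

# Hypothetical schema: field name -> required Python type.
REQUIRED_FIELDS = {"title": str, "priority": int}


def validate(raw: str) -> Tuple[Optional[Dict], str]:
    """Return (parsed, "") when the reply is valid, else (None, reason)."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        return None, f"invalid JSON: {exc.msg}"
    if not isinstance(data, dict):
        return None, "top-level value must be an object"
    for field, expected in REQUIRED_FIELDS.items():
        if field not in data:
            return None, f"missing field: {field}"
        if not isinstance(data[field], expected):
            return None, f"field {field} must be {expected.__name__}"
    return data, ""


def generate_with_repair(
    call_model: Callable[[str], str],
    prompt: str,
    max_repairs: int = 2,
    fallback: Optional[Dict] = None,
) -> Optional[Dict]:
    """Ask, validate, re-ask with the validation error, and give up after a fixed budget."""
    message = prompt
    for _ in range(max_repairs + 1):
        data, error = validate(call_model(message))
        if data is not None:
            return data
        message = f"{prompt}\n\nYour previous reply was rejected ({error}). Reply with valid JSON only."
    return fallback


# Fake model (assumption): first reply is truncated JSON, second is valid.
replies = iter(['{"title": "Fix login"', '{"title": "Fix login", "priority": 1}'])
ticket = generate_with_repair(lambda message: next(replies), "Create a ticket as JSON.")
print(ticket)
```

The repair budget doubles as a loop budget: the caller always gets either a validated object or the explicit fallback, never an unbounded retry.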
Stop learning LLMs from disconnected tutorials.
LLM from Scratch is a hands-on PyTorch curriculum for builders who want to understand how LLMs are trained, modernized, and aligned.
It helps you move from concepts to implementation by organizing the path from transformer basics to tiny-model training, scaling, fine-tuning, reward modeling, and RLHF.
Key features:
• End-to-end curriculum: follows pretraining → finetuning → alignment from foundations through RLHF
• Transformer from first principles: covers positional embeddings, self-attention, attention heads, MLPs, residuals, LayerNorm, and full blocks
• Tiny LLM training loop: includes tokenization, batching, cross-entropy, sampling, validation loss, and a no-Trainer training loop
• Modern architecture upgrades: walks through RMSNorm, RoPE, SwiGLU, KV cache, sliding-window attention, and streaming cache ideas
• Alignment path included: covers SFT, reward modeling, PPO-style RLHF, and GRPO with concrete training-loop notes
It's open-source (GPL-3.0 license).
Link in the reply.
Interested in learning how to run RL at scale? Here are the best resources to read…
Research on Scaling RL
1. The Art of Scaling RL compute for LLMs: https://t.co/PGjI6Gwgv0
2. Scaling Behaviors of LLM RL Post-Training: https://t.co/2u2saB3C0h
3. Optimally Scaling Sampling Compute for LLM RL: https://t.co/rUSdUvJyNH
4. Scaling up RL: https://t.co/O8vV6z8ymx
5. ProRL V2 - Prolonged Training Validates RL Scaling Laws: https://t.co/vu72juvRW4
6. Polaris - A Recipe for Scaling RL with Reasoning Models: https://t.co/rMibSAeJbg
RL Frameworks
1. Hybrid Flow (early outline of the verl framework): https://t.co/GnWXx131uD
a. More up-to-date info can be found here: https://t.co/j801HcJmPP
2. AReal - Large-Scale Async RL: https://t.co/qhOvsQK09N
3. PipelineRL - Fast On-Policy RL: https://t.co/iRM7KzySXe
4. AsyncFlow - Async Streaming RL: https://t.co/YwmzFtiU2q
RL for Agents
1. DeepSWE - Open Coding Agent Trained w/ RL: https://t.co/GHQHcmtE6F
2. AutoForge - Environment Synthesis for Agentic RL: https://t.co/mr3WDIL5vq
3. Agent-R1 - Training Agents w/ End-to-End RL: https://t.co/xpfQJGgzEv
4. AgentRL - Scaling RL for Multi-Turn, Multi-Task Agents: https://t.co/7fbVl0RWXG
5. The Landscape of Agentic RL: https://t.co/OMnSV4rgdW
6. Training SWE Agents with RL: https://t.co/YqMqySbyXS
Case Studies & Tech Reports
1. Kimi tech reports:
a. Kimi K2 - Open Agentic Intelligence: https://t.co/aAw17SXrIw
b. Kimi End-to-end Agentic RL: https://t.co/ProBpOPIiI
c. Kimi K1.5 - Scaling RL for LLMs: https://t.co/kRGOxY9Jvp
2. Composer series from Cursor:
a. Composer 2: https://t.co/K0v8rNCE6Z
b. Composer 2.5: https://t.co/D9PYimfOMU
3. Olmo 3 (also has open code / data): https://t.co/khetJFvp6N
4. MiniMax tech reports:
a. MiniMax-M2: https://t.co/HApb0OB80S
b. MiniMax-M1: https://t.co/mZj9UQsrnC
5. Nemotron 3 (NVIDIA): https://t.co/lCpE1GzxSi
This week's top AI/ML research papers:
- DiffusionBlocks
- A Bitter Lesson for Data Filtering
- Neural Weight Norm = Kolmogorov Complexity
- When Does LeJEPA Learn a World Model?
- Do Language Models Need Sleep?
- Parallax
- Gemini Embedding 2
- Qwen-VLA
- The MiniMax-M2 Series
- Looped Diffusion Language Models
- LocateAnything
- Learn from your own latents and not from tokens
overview for each + authors' explanations
read this in thread mode for the best experience
6 Types of LLMs powering today's AI Agents
1️⃣ GPT - Generative Pre-trained Transformer
(The Generalist)
Trained on massive datasets, these autoregressive models are the foundational engines for writing, reasoning, coding, and open-ended conversation.
- Highly versatile across diverse domains
- Excels at zero-shot and in-context learning
- A strong foundation for downstream fine-tuning
2️⃣ MoE - Mixture of Experts
(The Scaler)
Instead of activating the full neural network, MoE uses sparse routing to send each input only to the most relevant subset of "expert" sub-networks.
- Lower compute per token at inference, since only a few experts run
- Scales to trillions of parameters
- Allows specialization across experts without sacrificing overall performance
3️⃣ VLM - Vision-Language Model
(The Multimodal)
Combines vision encoders with language models to process and reason over visual and spatial data such as images, complex diagrams, and video streams.
- Understands visual and spatial context
- Aligns pixel data with semantic text
- Enables rich multimodal tasks (like visual QA and image-based telemetry)
4️⃣ LRM - Large Reasoning Model
(The Thinker)
Built for "System 2" thinking. Optimized for multi-step reasoning, logical problem-solving, and planning through explicit verification and self-correction loops.
- Strong mathematical and logical planning
- Fewer hallucinations through step-by-step verification
- Excels at complex, highly constrained problem-solving
5️⃣ SLM - Small Language Model
(The Lightweight)
Compact, highly optimized models engineered specifically for edge devices, offline execution, or highly cost-sensitive environments.
- Low latency and fast inference
- Cost-effective to deploy and maintain
- Keeps data on-device, which helps with privacy
6️⃣ LAM - Large Action Model
(The Doer)
Designed not just to generate text, but to execute real-world tasks using tools, APIs, and external environments. It operates on a continuous agent loop:
Plan → Action → Observation → Reflect → Update Memory
- Autonomous real-world execution
- Native integration with external systems and software
- Dynamically adapts to environmental feedback
Agents aren't just chatbots anymore. They see, act, reason, and run anywhere from cloud GPUs to edge devices. Choosing the right LLM type directly impacts cost, latency, reliability, and real-world capabilities.
Cc : Author
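The sparse routing described for MoE above can be sketched in a few lines of plain Python. This is a toy under stated assumptions, not how any production model implements it: `moe_forward` is a hypothetical name, the experts are simple functions on a single number, and the gate logits are passed in by hand, whereas a real MoE layer computes them with a learned router for every token.

```python
import math
from typing import Callable, List, Sequence


def softmax(values: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a short list."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def moe_forward(
    x: float,
    gate_logits: Sequence[float],
    experts: Sequence[Callable[[float], float]],
    k: int = 2,
) -> float:
    """Run only the k highest-scoring experts and mix their outputs."""
    # Indices of the k experts with the largest gate logits.
    top = sorted(range(len(experts)), key=lambda i: gate_logits[i], reverse=True)[:k]
    # Renormalise the gate weights over the selected experts only.
    weights = softmax([gate_logits[i] for i in top])
    # Experts outside `top` are never called, which is where the compute saving comes from.
    return sum(w * experts[i](x) for w, i in zip(weights, top))


# Toy experts and hand-written gate logits (assumptions for illustration).
experts = [lambda x: x + 1, lambda x: x * 2, lambda x: -x]
output = moe_forward(3.0, gate_logits=[0.0, 0.0, -5.0], experts=experts, k=2)
print(output)
```

With three experts and `k=2`, one expert is skipped per input. The saving grows with the ratio of total experts to active experts.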
9 database types explained in one sentence:
1) Relational
↳ Stores structured data in tables with predefined schemas & SQL queries.
2) Key-Value
↳ Stores simple key-value pairs for ultra-fast lookups & caching.
3) Document
↳ Stores data as JSON-like documents with flexible, nested structures.
4) Wide-Column
↳ Stores data in flexible column families for large-scale distributed workloads.
5) Time-Series
↳ Stores time-stamped data for real-time metrics, logs, events, & telemetry.
6) Graph
↳ Stores relationships between entities to query connected data efficiently.
7) Vector
↳ Stores embeddings to enable similarity search & AI-powered retrieval.
8) Columnar
↳ Stores data by columns instead of rows to optimize analytical queries.
9) Search
↳ Stores indexed text and structured data to enable fast full-text and relevance-based queries.
Most modern systems use several of these together.
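To make the Vector entry concrete, here is a minimal sketch of similarity search over embeddings. The names `cosine` and `top_k` and the two-dimensional toy vectors are assumptions for illustration. Real vector databases typically use approximate nearest-neighbour indexes rather than the brute-force scan shown here, and real embeddings have hundreds or thousands of dimensions.

```python
import math
from typing import Dict, List, Sequence, Tuple


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query: Sequence[float], index: Dict[str, List[float]], k: int = 2) -> List[Tuple[str, float]]:
    """Score every stored vector against the query and keep the k most similar."""
    scored = [(cosine(query, vector), doc_id) for doc_id, vector in index.items()]
    scored.sort(reverse=True)
    return [(doc_id, score) for score, doc_id in scored[:k]]


# Toy index (assumption): document id -> hand-written 2-D "embedding".
index = {
    "cats": [1.0, 0.0],
    "dogs": [0.9, 0.1],
    "taxes": [0.0, 1.0],
}
matches = top_k([1.0, 0.0], index, k=2)
print(matches)
```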
As systems become more real-time and AI-driven, the need for time-series infrastructure has grown significantly.
I like using TimescaleDB by Tiger Data because it keeps the simplicity of Postgres while making it much easier to work with large volumes of time-series and real-time data.
Try Tiger Data free with my link below. You'll get a $1,000 30-day credit, no credit card required. It takes just a few minutes to get started, and you can use the credit to build and experiment with whatever you want (new accounts only).
Try it here (for free) → https://t.co/Gcz0RaBFxg
What else would you add?
Repost to help others learn and grow.
Thanks to @TigerDatabase for sponsoring this post.
Follow me (Nikki Siapno) + turn on notifications.
HUGGING FACE DROPPED A FREE CONTEXT ENGINEERING COURSE
and the curriculum is stacked:
▫️ unit 1: agent skills + SKILL.md format
▫️ unit 2: MCP (model context protocol)
▫️ unit 3: plugins for tool distribution
▫️ unit 4: subagents + multi-agent workflows
▫️ unit 5: hooks to guard the agent lifecycle
▫️ bonus: build your own agent from scratch
https://t.co/1HjjaXVOek
How The Internet Actually Works - The 20 Concepts Every Dev Should Know
Most people use the internet every day.
Very few understand what happens after pressing Enter.
If I had learned these concepts earlier,
I would've understood backend, DevOps, cloud, APIs, and scaling way faster.
Here are the 20 internet concepts that make everything click:
- DNS
- IP Address
- TCP/IP
- HTTP vs HTTPS
- SSL/TLS
- CDN
- Load Balancer
- Reverse Proxy
- Caching
- Cookies & Sessions
- REST APIs
- WebSockets
- Authentication
- Databases
- Serverless
- Containers
- Kubernetes
- Message Queues
- Rate Limiting
- Edge Computing
Once you understand these,
modern system design stops feeling like magic.
bookmark this before your next backend project.
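As one worked example from the list, rate limiting is often explained with a token bucket. The sketch below is a minimal single-process version under stated assumptions: the `TokenBucket` class name and its parameters are made up for illustration, and a production limiter would usually keep its state in a shared store so that every server sees the same counts.

```python
import time
from typing import Callable


class TokenBucket:
    """Allow bursts up to `capacity`, refilling at `rate` tokens per second."""

    def __init__(self, rate: float, capacity: float, clock: Callable[[], float] = time.monotonic):
        self.rate = rate
        self.capacity = capacity
        self.clock = clock          # injectable so the behaviour can be tested without sleeping
        self.tokens = capacity      # start with a full bucket
        self.updated = clock()

    def allow(self, cost: float = 1.0) -> bool:
        """Refill for the time elapsed, then spend `cost` tokens if enough are available."""
        now = self.clock()
        self.tokens = min(self.capacity, self.tokens + (now - self.updated) * self.rate)
        self.updated = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False


# Usage with a fake clock (assumption) so the example runs instantly.
fake_now = [0.0]
bucket = TokenBucket(rate=1.0, capacity=2, clock=lambda: fake_now[0])
burst = [bucket.allow(), bucket.allow(), bucket.allow()]
fake_now[0] = 1.0  # one second later, one token has been refilled
after_wait = [bucket.allow(), bucket.allow()]
print(burst, after_wait)
```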
Most people use AI every day…
without understanding the 20 core concepts that actually make it work.
If you learn these once, AI suddenly becomes far less "magic" and far more predictable.
Here's the roadmap every modern AI engineer should understand in 2026:
• Neural Networks
• Tokenization
• Embeddings
• Attention Mechanism
• Transformers
• Pretraining
• Fine-tuning
• RLHF
• Guardrails
• RAG
• Vector Databases
• Chunking
• Decoding
• Temperature & Top-P
• AI Agents
• Tools & Function Calling
• Planning
• Evaluation
• Iterative Improvement
• Bias & Fairness
The biggest realization?
AI is not magic.
It's:
Math + Data + Compute + Iteration.
Once you understand how these layers connect:
• RAG starts making sense
• Agents become easier to build
• Hallucinations become easier to reduce
• Prompt engineering becomes more effective
• Production AI systems become less intimidating
The people winning in AI right now are not the ones memorizing prompts.
They're the ones who deeply understand the architecture behind the models.
Learn the foundations once.
You'll be able to adapt to every new AI trend that comes next.
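Two of the concepts in the roadmap, Temperature and Top-P, are easy to see in code. The sketch below is illustrative only: `nucleus` and `sample` are hypothetical names, and the three-token vocabulary with hand-written logits is an assumption. Temperature rescales the logits before the softmax, and top-p keeps the smallest set of most likely tokens whose cumulative probability reaches the threshold.

```python
import math
import random
from typing import Dict, List, Optional, Tuple


def nucleus(logits: Dict[str, float], temperature: float = 1.0, top_p: float = 1.0) -> List[Tuple[str, float]]:
    """Return the (token, probability) pairs that survive temperature scaling and top-p filtering."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = {token: value / temperature for token, value in logits.items()}
    peak = max(scaled.values())
    exps = {token: math.exp(value - peak) for token, value in scaled.items()}
    total = sum(exps.values())
    ranked = sorted(((e / total, token) for token, e in exps.items()), reverse=True)
    kept, cumulative = [], 0.0
    for probability, token in ranked:
        kept.append((token, probability))
        cumulative += probability
        if cumulative >= top_p:
            break
    return kept


def sample(logits: Dict[str, float], temperature: float = 1.0, top_p: float = 1.0,
           rng: Optional[random.Random] = None) -> str:
    """Draw one token from the filtered distribution (weights are renormalised by `choices`)."""
    rng = rng or random.Random()
    kept = nucleus(logits, temperature, top_p)
    tokens = [token for token, _ in kept]
    weights = [probability for _, probability in kept]
    return rng.choices(tokens, weights=weights, k=1)[0]


# Toy next-token logits (assumption for illustration).
logits = {"the": 2.0, "a": 1.0, "zebra": -3.0}
print(nucleus(logits, temperature=1.0, top_p=0.9))
print(sample(logits, temperature=0.1, top_p=0.9, rng=random.Random(0)))
```

Lowering the temperature sharpens the distribution, so the nucleus shrinks toward the single most likely token. Raising it flattens the distribution and lets rarer tokens back in.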