Adel Zaalouk

about 2 years ago

I'm launching "The Technomist," a newsletter exploring how technology and business intersect, covering topics from product/idea discovery to AI strategies. Subscribe if you'd like to follow along: https://t.co/DhqP0Uvagy First post coming soon! #tech #business

746

15 days ago

AI Scrolls doesn't try. Sparse eras collapse into count badges, dense years fan out when you zoom. The axis adapts to the data, not the other way around. It's a utility to help you observe the patterns. Google Trends shows search interest across correlated with what’s happening into each slot across different topic, the Ngrams goes a bit deeper, tracking how often terms like "artificial intelligence" and "machine learning" appeared in published books since 1800. You can literally see the AI winters: interest peaks in the 70s, drops through the 80s, flatlines, then climbs again. the next wave. More on the thinking behind it and how it came to be: https://t.co/IwBlWKatyd

15 days ago

@aaliya_va Glad you liked it!

VP Product @GiantSwarm; CNCF Ambassador; CKA; recovering Data Science Academic; into Chinese Martial Arts; not updating his website; @[email protected]

15 days ago

I built 𝗔𝗜 𝗦𝗰𝗿𝗼𝗹𝗹𝘀, an interactive timeline of AI history. I started this as a 𝗳𝘂𝗻 𝘀𝗶𝗱𝗲-𝗽𝗿𝗼𝗷𝗲𝗰𝘁 a while back and kept evolving it because most AI timelines I found had one of two problems: they either stopped around 2023, or they were useful as reference lists but not very good at showing patterns over time. AI Scrolls currently includes 359 events, from 𝗔𝗿𝗶𝘀𝘁𝗼𝘁𝗹𝗲'𝘀 syllogism in 𝟯𝟴𝟰 𝗕𝗖 to recent work on agents, context engineering, and harness engineering in 2026. I added 𝗚𝗼𝗼𝗴𝗹𝗲 𝗧𝗿𝗲𝗻𝗱𝘀 data for "artificial intelligence" and 14 related sub-trends, including agents, RAG, prompt engineering, context engineering, harness engineering, and vibe coding. You'll notice that search interest in "artificial intelligence" stayed relatively flat from 2004 to 2022, then jumped sharply after ChatGPT (no surprise by now 🙂). Prompt engineering appears to have peaked earlier, while AI agents is still climbing. 🔗 📜 𝗛𝗲𝗿𝗲 𝗶𝘀 𝘁𝗵𝗲 𝗹𝗶𝗻𝗸 𝗳𝗼𝗿 𝘆𝗼𝘂 𝘁𝗼 𝗲𝘅𝗽𝗹𝗼𝗿𝗲: https://t.co/qI6dB5xobv

Who to follow

Puja Abbassi

@puja108

Gardener Project

@GardenerProject

Homogeneous Kubernetes clusters at scale on any infrastructure using hosted control planes. https://t.co/cO7B7JxKeK

Stefan Schimanski

@the_sttts

Kubernetes @ NVIDIA, Ex-Upbound, Ex-Redhat, kcp, logician, restless, hungry & foolish, þetta reddast, https://t.co/CQ4IEp3Qh2 @sttts.social – Opinions are mine

ZaNetworker retweeted

18 days ago

An AI agent deleted an entire production database in 9 seconds last month. No rollback. No recovery. Every sandboxing approach falls into 3 modes. Knowing which you need matters more than which solution you pick. We validated OpenShell across all 3. 🔗 https://t.co/qtnSsSuhFq

RedHat_AI's tweet photo. An AI agent deleted an entire production database in 9 seconds last month. No rollback. No recovery.

Every sandboxing approach falls into 3 modes. Knowing which you need matters more than which solution you pick. We validated OpenShell across all 3.

🔗 https://t.co/qtnSsSuhFq https://t.co/hSL6Yc56pT

ZaNetworker retweeted

18 days ago

RAG systems rank by relevance, not authorization. A query from one tenant can surface another tenant's data simply because it scores highest. Red Hat's @franciscojarceo co-authored a paper on fixing this. MLOps Community, June 11, 1:30 PM EDT (virtual) https://t.co/YmWneZFH1R

748

ZaNetworker retweeted

18 days ago

Anthropic's self-hosted sandboxes for Claude, now running on OpenShell on Red Hat AI. Claude reasons in the cloud. Code executes on your infrastructure, inside an OpenShell sandbox: deny-all by default, per-binary network policy, credential isolation. A five-part 🧵:

RedHat_AI's tweet photo. Anthropic's self-hosted sandboxes for Claude, now running on OpenShell on Red Hat AI.

Claude reasons in the cloud. Code executes on your infrastructure, inside an OpenShell sandbox: deny-all by default, per-binary network policy, credential isolation.

A five-part 🧵: https://t.co/LSLFbKJ4Cc

Red Hat Developer @rhdevelopers

19 days ago

Every layer counts!

25 days ago

Instruction-based malware is the new threat. ⚠️ 300+ malicious skills were recently found in the #OpenClaw marketplace. No bad code, just adversarial inputs. Protect your stack from semantic malware with #RedHat's #AI security layers. #RHSummit https://t.co/FG3h4vKAIk

358

25 days ago

Most AI agents say "done" when the output is fine, not good. They can fix their own mistakes when someone points them out. Nobody is pointing them out automatically. 𝗔𝗻𝘁𝗵𝗿𝗼𝗽𝗶𝗰 built this, but it only works inside their managed platform. I wanted the same pattern with an 𝗼𝗽𝗲𝗻 𝘀𝗼𝘂𝗿𝗰𝗲 stack. I am calling it 𝗼𝘂𝘁𝗰𝗼𝗺𝗲 𝗹𝗼𝗼𝗽𝘀: A rubric-graded quality gate between the agent's output and the user's inbox. A separate judge model scores the result, and the agent revises until the rubric is satisfied or the iteration budget runs out. To build this loop, I used 𝗢𝗚𝗫 (https://t.co/m6C5Ywz1eC) for inference and 𝗠𝗟𝗳𝗹𝗼𝘄 (https://t.co/wnREtKrONy) for evaluation. OGX gave me a single endpoint across all agentic API surfaces (chat-completions, responses API, messages API, Interactions API), integrates with self-hosted runtimes like vLLM and natively supports retrieval and document processing workflows with various DB providers. An outcome loop makes 𝘁𝘄𝗼 𝗺𝗼𝗱𝗲𝗹 𝗰𝗮𝗹𝗹𝘀 𝗽𝗲𝗿 𝗶𝘁𝗲𝗿𝗮𝘁𝗶𝗼𝗻 (agent + judge), and 𝗢𝗚𝗫 routes both through the same base URL, so you can self-host one and use a hosted provider for the other. 𝗠𝗟𝗳𝗹𝗼𝘄 turns rubrics into versioned metrics that log scores, justifications, and artifacts automatically, so you can actually see which criteria fail and whether your rubric is calibrated. Some learnings along the way: ➡️ Build the judge before the loop. A weak judge lets bad output through, a harsh one wastes iterations. The judge is the system. ➡️ Stop at the first passing score. Extra iterations drift; models rewrite good sections to satisfy criteria they already passed. ➡️ Start the rubric loose, tighten from the data. You won't know which criteria matter until you see real outputs fail. This is part 3 of my continuous learning for agents series: https://t.co/oZ5rh55gSE Link to the post: https://t.co/L5yLao1UAH

25 days ago

We're way past the point where AI is just the model. There's more at stake now! Wrote a 𝘁𝗵𝗿𝗲𝗲-𝗽𝗮𝗿𝘁 𝘀𝗲𝗿𝗶𝗲𝘀 breaking down what goes into building a 𝘄𝗵𝗼𝗹𝗲 𝗔𝗜 𝗽𝗿𝗼𝗱𝘂𝗰𝘁: ➡️ 𝗣𝗮𝗿𝘁 𝟭 covers the framework, building off Moore's whole product model and introducing a tweaked version. ➡️ 𝗣𝗮𝗿𝘁 𝟮 covers compound systems, agentic loops, and the architecture patterns emerging right now. ➡️ 𝗣𝗮𝗿𝘁 𝟯 tackles the question: if everyone has access to the same models, what makes your product defensible? https://t.co/ZOwON73INz

25 days ago

AI agents need to run shell commands, write to disk, and make network calls to do real work. Security teams look at that access list and see a breach waiting to happen. Give an agent those permissions and things go wrong fast, a 𝗵𝗮𝗹𝗹𝘂𝗰𝗶𝗻𝗮𝘁𝗶𝗼𝗻 turns 𝗿𝗺 -𝗿𝗳 on the wrong directory, a bad tool call POSTs credentials to an external endpoint. And the threat model has a new entry: 𝘀𝗲𝗺𝗮𝗻𝘁𝗶𝗰 𝗺𝗮𝗹𝘄𝗮𝗿𝗲 (not executable payloads), malicious instructions hidden in natural language that agents follow because they read documentation the same way they read prompts. 𝗡𝗼 𝘀𝗶𝗻𝗴𝗹𝗲 𝗰𝗼𝗻𝘁𝗿𝗼𝗹 𝗵𝗮𝗻𝗱𝗹𝗲𝘀 𝘁𝗵𝗶𝘀. A firewall does not stop an agent that was told to write secrets to a file it is allowed to touch. A guardrail does not help if the agent never hits the model with the malicious content. 𝗬𝗼𝘂 𝗻𝗲𝗲𝗱 𝗹𝗮𝘆𝗲𝗿𝘀 (𝗱𝗲𝗳𝗲𝗻𝘀𝗲 𝗶𝗻 𝗱𝗲𝗽𝘁𝗵), and each layer must assume the one above it has already failed. Here is a 𝘀𝗶𝘅-𝗹𝗮𝘆𝗲𝗿 𝗳𝗿𝗮𝗺𝗲𝘄𝗼𝗿𝗸 for how we are building the agent's secure runtime with 𝗥𝗲𝗱 𝗛𝗮𝘁 𝗔𝗜: ➡️ 𝗛𝗮𝗿𝗱𝗲𝗻𝗲𝗱 𝗽𝗹𝗮𝘁𝗳𝗼𝗿𝗺 locks down the base. Security context constraints, SELinux, and DNS-based egress filtering before the agent even starts. ➡️ Agent pods run inside 𝗹𝗶𝗴𝗵𝘁𝘄𝗲𝗶𝗴𝗵𝘁 𝗩𝗠 𝗿𝘂𝗻𝘁𝗶𝗺𝗲𝘀 (Kata Containers), not just containers. Container escape hits a hardware boundary. ➡️ An agent "𝘀𝗵𝗲𝗹𝗹" (𝗢𝗽𝗲𝗻𝗦𝗵𝗲𝗹𝗹) enforces per-binary network policies and kernel-level filesystem allowlists. The agent binary can only reach the endpoints and paths you explicitly permit. ➡️ Conversation-layer 𝗴𝘂𝗮𝗿𝗱𝗿𝗮𝗶𝗹𝘀 (NeMo Guardrails) intercept prompt injections and filter unsafe outputs before the model processes them. ➡️ Continuous 𝗿𝗲𝗱 𝘁𝗲𝗮𝗺𝗶𝗻𝗴 (Garak) runs 120+ adversarial probes against your deployed agents. You find the gaps before someone else does. ➡️ 𝗪𝗼𝗿𝗸𝗹𝗼𝗮𝗱 𝗶𝗱𝗲𝗻𝘁𝗶𝘁𝘆 𝗮𝗻𝗱 𝘁𝗿𝗮𝗰𝗶𝗻𝗴 give every agent a cryptographic identity (SPIFFE) and full OpenTelemetry traces. When something goes wrong, you know which agent did what, when, and why. The design principle assumes breach at every layer and contains the blast radius at the next one. https://t.co/BVctOJJ0I8

3 months ago

Karpathy and Tobi Lutke built the same loop independently. Point an AI agent at code, give it a score to chase, let it run experiments overnight. One got a better model. The other got a 53% speedup on a 20-year-old codebase. I generalized the pattern into a tool that works for any domain. Pointed it at a RAG search engine, got 14 experiments and a 9.3% improvement while I did other things. The real work isn't the loop. It's writing good evals. https://t.co/GmVXa7SAuO

3 months ago

Your AI coding agent makes the same mistake every session. You correct it, it adapts, the session ends, and tomorrow it's forgotten everything. I built a system that captures corrections, figures out which skill caused the failure, and checks whether I already fixed it. The key insight: an agent that remembers everything learns nothing. An agent that remembers only what you choose to teach it gets better every week. https://t.co/UNtWjLaF8y

ZaNetworker retweeted

3 months ago

Your agent works on your laptop. But does it have identity? Isolation? Audit trails? Observability with agent tracing? Can you prove to compliance what tools it called and why? Most teams can't. That's the gap Red Hat AI closes with Bring Your Own Agent: security, governance, observability, and tool-level authorization around any agentic runtime, framework, or application without touching code. Here's how to operationalize "Bring Your Own Agent" on Red Hat AI, the OpenClaw edition: https://t.co/JEjP5C7jjD

RedHat_AI's tweet photo. Your agent works on your laptop. But does it have identity? Isolation? Audit trails? Observability with agent tracing? Can you prove to compliance what tools it called and why?

Most teams can't.

That's the gap Red Hat AI closes with Bring Your Own Agent: security, governance, observability, and tool-level authorization around any agentic runtime, framework, or application without touching code.

Here's how to operationalize "Bring Your Own Agent" on Red Hat AI, the OpenClaw edition: https://t.co/JEjP5C7jjD

3 months ago

Most CLAUDE.md files I've seen are way too long. The model can only reliably follow ~150 instructions, and Claude Code's system prompt already uses ~50 of those. If Claude keeps ignoring your rules, your instruction budget is probably overdrawn. Wrote about how to fix it. https://t.co/x5NohSS7no

3 months ago

@karpathy I have been experimenting with the agents command center here: https://t.co/eW2nb72D4O

3 months ago

Spent the weekend building a tool for solving the coding agents' sprawl problem. Introducing ✨ aimux ✨ If you're running multiple coding agents like Claude, Codex, Gemini, etc, you know the pain: which session is stuck? What did it do? How do I debug it? How much did it cost? aimux is a single-binary TUI that gives you one view across all your AI coding agents. Discovery, traces, cost tracking, annotations + labels (for evals), and OTEL export! No daemons, no hooks, no modifications to your tools. Integrates with MLFlow and is easily extensible as well. Multiplex your AI agents. Trace, launch (built-in for different coding agents), export. Never leave the terminal. Install: brew install zanetworker/aimux/aimux Repo: https://t.co/vyaU4neA43 Site: https://t.co/2i43XUtNbB

281

3 months ago

Link to paper: https://t.co/CDYn1DxORh

3 months ago

AI scales execution to near-zero cost. But verifying that output stays biologically bounded. The bottleneck is not intelligence (not anymore, that's becoming abundant now), it is and will be human verification bandwidth. Full post here: https://t.co/3Dh7blx85g

ZaNetworker's tweet photo. AI scales execution to near-zero cost. But verifying that output stays biologically bounded. The bottleneck is not intelligence (not anymore, that's becoming abundant now), it is and will be human verification bandwidth.

Full post here: https://t.co/3Dh7blx85g