Zack

@altereddude

I'm the guy who thinks java is structured and non verbose. Thionville, Luxembourg.

Joined August 2010

140 Following

927 Followers

20.8K Posts

altereddude retweeted

Vaishnavi

@_vmlops

4 days ago

MICROSOFT JUST DROPPED COREUTILS FOR WINDOWS the same linux commands you use every day...now native on windows ▪️ ls, cat, grep, find, cp, mv, rm all of them ▪️ one install: `winget install Microsoft.Coreutils` ▪️ built in rust, no WSL needed ▪️ your bash scripts? they just work https://t.co/aMpZ3o8psy

altereddude retweeted

Vaishnavi

@_vmlops

4 days ago

Spent 3 hours tailoring a CV for a job wrote a cover letter that felt generic anyway got rejected in 2 days that's the job search loop most people are stuck in Someone built a way out ai-job-search is a Claude Code framework that turns your entire job application process into an agentic pipeline you run /setup once. claude interviews you about your background, skills, and goals then populates your full profile automatically then /scrape hits multiple job portals, deduplicates results, and ranks every match against your actual profile. find a role you like? /apply <url> and this is where it gets interesting. it doesn't just write a CV. it runs a drafter-reviewer loop one agent tailors your CV and cover letter in LaTeX, a second agent researches the company and tears the drafts apart, then revisions happen automatically before final output lands. every claim verified against your real profile. nothing fabricated three commands. zero generic applications the deeper insight buried in the readme: "a thin profile produces generic applications. a detailed one enables genuinely tailored results." the tool is only as sharp as the story you give it

altereddude retweeted

Peter Dedene

@dedene

5 days ago

POV: you're still using GitHub Copilot after June 1st, 2026

250

17K

altereddude retweeted

Thariq

@trq212

5 days ago

https://t.co/R6exTuF7P8

243

10K

23K

Who to follow

mr. Kahwa 🇲🇦 🇵🇸 ☕

@iZakben

B.C stands for life Before Coffee

altereddude retweeted

Muratcan Koylan

@koylanai

13 days ago

Gradient descent for SKILL.md files sounds interesting, maybe a bit complex but it's becoming a real part of agent harness. SkillOpt is one of the first papers to treat markdown skill files as trainable parameters and provides a proper optimization framework for them. A few things I learned that you should consider too. 1. The validation gate is the only thing that matters in a self-editing loop. Held-out set, strict improvement, ties rejected. End-to-end, their best skills land with 1 to 4 accepted edits total. If your "self-improving agent" is accepting most of what it proposes, you're shipping slop. 2. Bounded edits are better than full rewrites. 4 to 8 edits per step is the sweet spot. Remove the budget and performance collapses. This is the textual analog of learning rate, and it transfers to any LLM-as-author loop. If you're using an agent to refactor your docs, your prompts, or your skills, cap the diff size. 3. Compactness wins. Median final skill: ~920 tokens. Skills do not need to be long. They need to be high-signal. Most skill files I see are bloated because length feels like effort. It isn't. 4. The harness is becoming less important; the skill is becoming more important. A Codex-trained skill ported into Claude Code hit +59.7 points on SpreadsheetBench. Procedural knowledge is more general than the runtime that produced it. 5. Frozen model + trained context is the practical adaptation. GPT-5.4-nano with a SkillOpt'd skill ≈ frontier behavior on procedural benchmarks. Cheaper, portable, inspectable, zero inference-time cost. This is the answer to "how do we adapt a frontier model for our domain" for almost everyone who isn't training their own models. 6. Verification is the bottleneck. Every gate in this paper depends on an auto-grader. That works for benchmarks. It fails for writing, design, and strategy, exactly the open-ended work we want to automate. Whoever builds the verifier for open-ended tasks owns the next stage. There are also two leassons I learned while shipping v2.3.0 of my Context Engineering Agent Skills repo, measured across composer-2, claude-opus-4-7, gpt-5.5, and gemini-3.1-pro via the @cursor_ai SDK: - Description and body are two different surfaces. The router only sees the description. The agent sees the body once activated. They can quietly disagree, and only end-to-end task tests catch it. - Aggregate accuracy is the wrong unit. When I rewrote three descriptions, the corpus average moved ~1pp. Individual skills moved 23–25pp. Per-skill effect size is where the action is. Also, in Feb 2026 I shared a piece called Personal Brain OS arguing that the markdown file is a first-class substrate for agent state. SkillOpt is the optimizer-shaped version of that same argument: not "store memory in files" but "treat files as trainable parameters with proper optimization machinery around them." That's the move from static to measured. The fast/slow split they describe already lives implicitly in the digital-brain-skill repo: - voice-guide and tone-of-voice.md are slow-state (rarely touched) - posts.jsonl and bookmarks.jsonl are fast-state What SkillOpt adds that I didn't have is a protected section invariant, a structural guarantee that fast edits cannot overwrite slow lessons. Removing that mechanism cost them 22 points on SpreadsheetBench. Worth borrowing. If you're building agents, SkillOpt: Executive Strategy for Self-Evolving Agent Skills is a good paper to read: https://t.co/ZS9SZXQ6Mv

koylanai's tweet photo. Gradient descent for SKILL.md files sounds interesting, maybe a bit complex but it's becoming a real part of agent harness.

SkillOpt is one of the first papers to treat markdown skill files as trainable parameters and provides a proper optimization framework for them.

A few things I learned that you should consider too.

1. The validation gate is the only thing that matters in a self-editing loop.

Held-out set, strict improvement, ties rejected. End-to-end, their best skills land with 1 to 4 accepted edits total. If your "self-improving agent" is accepting most of what it proposes, you're shipping slop.

2. Bounded edits are better than full rewrites. 4 to 8 edits per step is the sweet spot.

Remove the budget and performance collapses. This is the textual analog of learning rate, and it transfers to any LLM-as-author loop. If you're using an agent to refactor your docs, your prompts, or your skills, cap the diff size.

3. Compactness wins. Median final skill: ~920 tokens.

Skills do not need to be long. They need to be high-signal. Most skill files I see are bloated because length feels like effort. It isn't.

4. The harness is becoming less important; the skill is becoming more important.

A Codex-trained skill ported into Claude Code hit +59.7 points on SpreadsheetBench. Procedural knowledge is more general than the runtime that
produced it.

5. Frozen model + trained context is the practical adaptation.

GPT-5.4-nano with a SkillOpt'd skill ≈ frontier behavior on procedural benchmarks. Cheaper, portable, inspectable, zero inference-time cost. This is
the answer to "how do we adapt a frontier model for our domain" for almost everyone who isn't training their own models.

6. Verification is the bottleneck.

Every gate in this paper depends on an auto-grader. That works for benchmarks. It fails for writing, design, and strategy, exactly the open-ended work we want to automate. Whoever builds the verifier for open-ended tasks owns the next stage.

There are also two leassons I learned while shipping v2.3.0 of my Context Engineering Agent Skills repo, measured across composer-2, claude-opus-4-7,
gpt-5.5, and gemini-3.1-pro via the @cursor_ai SDK:
- Description and body are two different surfaces. The router only sees the description. The agent sees the body once activated. They can quietly disagree, and only end-to-end task tests catch it.
- Aggregate accuracy is the wrong unit. When I rewrote three descriptions, the corpus average moved ~1pp. Individual skills moved 23–25pp. Per-skill effect size is where the action is.

Also, in Feb 2026 I shared a piece called Personal Brain OS arguing that the markdown file is a first-class substrate for agent state. SkillOpt is the optimizer-shaped version of that same argument: not "store memory in files" but "treat files as trainable parameters with proper optimization machinery around them." That's the move from static to measured.

The fast/slow split they describe already lives implicitly in the digital-brain-skill repo:
- voice-guide and tone-of-voice.md are slow-state (rarely touched)
- posts.jsonl and bookmarks.jsonl are fast-state

What SkillOpt adds that I didn't have is a protected section invariant, a structural guarantee that fast edits cannot overwrite slow lessons. Removing that mechanism cost them 22 points on SpreadsheetBench. Worth borrowing.

If you're building agents, SkillOpt: Executive Strategy for Self-Evolving Agent Skills is a good paper to read: https://t.co/ZS9SZXQ6Mv

241

768K

altereddude retweeted

Khushi Agarwal

@Aicoder786

15 days ago

ANDREJ KARPATHY COULD HAVE CHARGED $2,000 FOR THIS COURSE. He put it on YouTube. The full training stack. Tokenization. Neural network internals. Hallucinations. Tool use. Reinforcement learning. RLHF. DeepSeek. AlphaGo. 3 hours of the most comprehensive LLM education that exists anywhere at any price. Not how to use the tools. How the entire system was built from the ground up and why it behaves the way it does. The engineers who understand this build things the ones who only use the tools cannot even conceive of. The gap between those two groups is not 3 hours. It is everything those 3 hours quietly unlock for the rest of your career.

660

273K

altereddude retweeted

Aria Westcott

@AriaWestcott

16 days ago

🚨 BREAKING: I asked Claude to improve my LinkedIn profile. It didn’t just improve it. It made it a recruiter magnet. Here are the 7 exact prompts I used:

288

337K

altereddude retweeted

Vaishnavi

@_vmlops

18 days ago

OPENAI DROPPED A PDF ON HOW THEY USE CODEX INTERNALLY and it's actually useful their engineers across security, infra, frontend, and api teams use it daily for: ▫️ understanding unfamiliar codebases fast (especially during incidents) ▫️ refactoring changes that span dozens of files ▫️ generating tests for edge cases devs usually miss ▫️ scaffolding boilerplate so you ship faster ▫️ staying in flow when your calendar is a disaster the one that hit different: one engineer said "i was in meetings all day and still merged 4 PRs because codex was working in the background" https://t.co/2GFtZ0cRsf

_vmlops's tweet photo. OPENAI DROPPED A PDF ON HOW THEY USE CODEX INTERNALLY

and it's actually useful

their engineers across security, infra, frontend, and api teams use it daily for:
▫️ understanding unfamiliar codebases fast (especially during incidents)
▫️ refactoring changes that span dozens of files
▫️ generating tests for edge cases devs usually miss
▫️ scaffolding boilerplate so you ship faster
▫️ staying in flow when your calendar is a disaster

the one that hit different:
one engineer said "i was in meetings all day and still merged 4 PRs because codex was working in the background"

https://t.co/2GFtZ0cRsf

153

204K

altereddude retweeted

Thariq

@trq212

20 days ago

a prompt I've been using a lot recently: implement <SPEC> and while you do, keep a running implementation-notes.html file (or markdown) with decisions you had to make weren't in the spec, things you had to change, tradeoffs you had to make or anything else I should know

trq212's tweet photo. a prompt I've been using a lot recently:

implement <SPEC> and while you do, keep a running implementation-notes.html file (or markdown) with decisions you had to make weren't in the spec, things you had to change, tradeoffs you had to make or anything else I should know https://t.co/qQFTES4fjo

343

10K

582

12K

821K

altereddude retweeted

CJ Zafir

@cjzafir

about 1 month ago

Codex can get dumber and slower on long sessions. Here's the fix: 1. Run Process_narration=false This will stop Codex from showing you all the planning steps, resulting in saving a lot of output tokens. 2. Prompt: "Act as an orchestrator. Use parallel agents to do the research and execution work. Write detailed tasks for each parallel agent and force them to act, iterate, get their tasks done, and bring back an in-depth report. Your job is to deeply analyze the agents' work, provide feedback, and provide them with continuous tasks." This prompt offloads the majority of the context-burning work to agents, and each agent has its own context window. So you can utilize 5 agents (5 context windows). 3. Add this hard rule: "Measure twice, cut once policy." Debugging and patching is messy work. Force Codex to plan first, act after (don't use plan mode; it's just overcomplicated). Ask it to make a task list for every task so it can track progress and iterate better. 4. Add this hard rule: "Keep the codebase clean, no tmp files, no dead code, no dead files. Stay organized all the time. No unnecessary folders, subfolders, or files." Claude keeps most of its working files in cache as temporary files (which is bloatware, but it keeps the codebase neat), but Codex is output-heavy. It creates tons of folders and files, and your workspace can become a mess after a few sessions. As a result, this contaminates the context window and degrades performance. Force Codex to stay organized and follow the file structure. These 4 techniques can help you save 40% context every session, and performance will be a lot better. For planning, use Codex 5.5 (extra high), and after the plan is done, shift to Codex 5.5 (high) with fast mode. This works faster.

cjzafir's tweet photo. Codex can get dumber and slower on long sessions.

Here's the fix:

1. Run Process_narration=false

This will stop Codex from showing you all the planning steps, resulting in saving a lot of output tokens.

2. Prompt: "Act as an orchestrator. Use parallel agents to do the research and execution work. Write detailed tasks for each parallel agent and force them to act, iterate, get their tasks done, and bring back an in-depth report. Your job is to deeply analyze the agents' work, provide feedback, and provide them with continuous tasks."

This prompt offloads the majority of the context-burning work to agents, and each agent has its own context window. So you can utilize 5 agents (5 context windows).

3. Add this hard rule: "Measure twice, cut once policy."

Debugging and patching is messy work. Force Codex to plan first, act after (don't use plan mode; it's just overcomplicated). Ask it to make a task list for every task so it can track progress and iterate better.

4. Add this hard rule: "Keep the codebase clean, no tmp files, no dead code, no dead files. Stay organized all the time. No unnecessary folders, subfolders, or files."

Claude keeps most of its working files in cache as temporary files (which is bloatware, but it keeps the codebase neat), but Codex is output-heavy. It creates tons of folders and files, and your workspace can become a mess after a few sessions. As a result, this contaminates the context window and degrades performance. Force Codex to stay organized and follow the file structure.

These 4 techniques can help you save 40% context every session, and performance will be a lot better.

For planning, use Codex 5.5 (extra high), and after the plan is done, shift to Codex 5.5 (high) with fast mode. This works faster.

124

226K

altereddude retweeted

AilaunchX

@Ai_Tech_tool

24 days ago

Instead of watching an hour of Netflix, watch this 2 hour hour Stanford lecture will teach you more about how LLMs like ChatGPT and Claude are built than most people working at top AI companies learn in their entire careers.

308

231K

altereddude retweeted

Nainsi Dwivedi

@NainsiDwiv50980

28 days ago

Claude Code ships with 5 architectural layers most engineers never open. Not features. Not settings. Layers — each solving a distinct problem that LLMs alone can't solve. And four of them have nothing to do with prompting. Here's the full Agent Development Kit: Layer 1 — CLAUDE.md → The Memory Layer Architecture rules, naming conventions, test expectations, repo map. Always loaded. Always active. Two scopes: • ~/.claude/CLAUDE.md → global • .claude/CLAUDE.md → project This isn't context you paste in before every session. It's context that never needs repeating. The agent's constitution. Layer 2 — Skills → The Knowledge Layer Each SKILL.md carries a description. Claude matches it at runtime and forks the skill into an isolated subagent. On-demand, never always-on. Task-specific knowledge without inflating your main context window. Modular by design. Layer 3 — Hooks → The Guardrail Layer PreToolUse → PostToolUse → SessionStart → Stop → SubagentStop This is the layer most teams skip. And the one they regret skipping first. Hooks are NOT AI. They're deterministic event-driven shell commands. • Auto-lint on every Write • Hard-block on rm -rf • Slack notification on Stop Event fires → Matcher checks → Command runs Quality enforced at the infrastructure level. Not the prompt level. Layer 4 — Subagents → The Delegation Layer Each subagent gets its own context window, model, tools, and permissions. Main agent delegates down. Receives results up. That's it. No infinite recursion — subagents can't spawn subagents. Main context stays clean. Hard boundaries by design. Layer 5 — Plugins → The Distribution Layer Bundle your skills + agents + hooks + commands into a plugin. One install. Whole team inherits the behavior. Think npm packages — but for what your agent knows how to do. Wrapping everything: → MCP Servers on the left (GitHub, databases, APIs, custom integrations) → Agent Teams on the right (parallel execution, message passing, shared permissions) The 5-layer stack in one line: CLAUDE.md sets rules → Skills provide expertise → Hooks enforce quality → Subagents delegate work → Plugins distribute to the team Most production failures in agentic systems trace back to one missing layer. Which one is the gap in your current setup?

NainsiDwiv50980's tweet photo. Claude Code ships with 5 architectural layers most engineers never open.
Not features. Not settings. Layers — each solving a distinct problem that LLMs alone can't solve. And four of them have nothing to do with prompting.
Here's the full Agent Development Kit:

Layer 1 — CLAUDE.md → The Memory Layer
Architecture rules, naming conventions, test expectations, repo map. Always loaded. Always active.
Two scopes:
• ~/.claude/CLAUDE.md → global
• .claude/CLAUDE.md → project
This isn't context you paste in before every session. It's context that never needs repeating. The agent's constitution.

Layer 2 — Skills → The Knowledge Layer
Each SKILL.md carries a description. Claude matches it at runtime and forks the skill into an isolated subagent. On-demand, never always-on.
Task-specific knowledge without inflating your main context window. Modular by design.

Layer 3 — Hooks → The Guardrail Layer
PreToolUse → PostToolUse → SessionStart → Stop → SubagentStop

This is the layer most teams skip. And the one they regret skipping first.

Hooks are NOT AI. They're deterministic event-driven shell commands.
• Auto-lint on every Write
• Hard-block on rm -rf
• Slack notification on Stop
Event fires → Matcher checks → Command runs
Quality enforced at the infrastructure level. Not the prompt level.

Layer 4 — Subagents → The Delegation Layer
Each subagent gets its own context window, model, tools, and permissions.
Main agent delegates down. Receives results up. That's it.
No infinite recursion — subagents can't spawn subagents. Main context stays clean. Hard boundaries by design.

Layer 5 — Plugins → The Distribution Layer
Bundle your skills + agents + hooks + commands into a plugin. One install. Whole team inherits the behavior.
Think npm packages — but for what your agent knows how to do.

Wrapping everything:
→ MCP Servers on the left (GitHub, databases, APIs, custom integrations)
→ Agent Teams on the right (parallel execution, message passing, shared permissions)

The 5-layer stack in one line:
CLAUDE.md sets rules → Skills provide expertise → Hooks enforce quality → Subagents delegate work → Plugins distribute to the team
Most production failures in agentic systems trace back to one missing layer.

Which one is the gap in your current setup?

169

158

altereddude retweeted

Akshay 🚀

@akshay_pachaar

27 days ago

As an AI Engineer. Please learn: - Harness engineering, not just prompt engineering - Prompt caching vs. semantic caching tradeoffs - KV cache management at scale - Speculative decoding vs quantization - Structured output failures & fallback chains - Evals (LLM-as-judge + human evals) - Cost attribution per feature, not just per model - Agent guardrails & loop budgets - LLM observability as a first-class discipline - Model routing & graceful fallback logic - Knowing when to fine-tune vs. in-context learning

331

222K

altereddude retweeted

Bilgin Ibryam

@bibryam

28 days ago

🔖 Claude Cookbook is worth bookmarking. 81 practical guides across 15 categories, covering agents, tools, RAG, evals, multimodal apps, Skills, integrations, and production workflows by @AnthropicAI https://t.co/lgkTL6Av8F

569

39K

altereddude retweeted

Shabnam Parveen

@shabnam_774

about 1 month ago

Instead of watching an hour of Netflix, watch this 2-hour Stanford lecture. It will teach you more about how LLMs like ChatGPT and Claude are actually built than most people in top AI companies learn across their entire careers. Save this.

179

410

53K

altereddude retweeted

Neo Kim

@systemdesignone

about 1 month ago

If I had to start with LLM from scratch, I'd learn these 30 concepts: 1 LLM 2 Token 3 Tokenization 4 Embeddings 5 Latent Space 6 Parameters 7 Pre-training 8 Base Model 9 Instruct Model 10 Fine-Tuning 11 Alignment 12 RLHF 13 Prompt 14 System Prompt 15 User Prompt 16 Context Window 17 Zero-Shot Learning 18 Few-Shot Learning 19 Chain-of-Thought 20 Inference 21 Latency 22 Temperature 23 Hallucination 24 Grounding 25 RAG 26 Workflow 27 Agent 28 Multimodality 29 Benchmarks 30 Guardrails What else should make this list?

systemdesignone's tweet photo. If I had to start with LLM from scratch, I'd learn these 30 concepts:

1 LLM
2 Token
3 Tokenization
4 Embeddings
5 Latent Space
6 Parameters
7 Pre-training
8 Base Model
9 Instruct Model
10 Fine-Tuning
11 Alignment
12 RLHF
13 Prompt
14 System Prompt
15 User Prompt
16 Context Window
17 Zero-Shot Learning
18 Few-Shot Learning
19 Chain-of-Thought
20 Inference
21 Latency
22 Temperature
23 Hallucination
24 Grounding
25 RAG
26 Workflow
27 Agent
28 Multimodality
29 Benchmarks
30 Guardrails

What else should make this list?

696

163

735

25K

altereddude retweeted

Paul Young

@PaulYoungX

about 2 months ago

Cet atelier de 30 minutes, animé par le créateur de Claude Code, vous en apprendra plus sur le vibe-coding que 100 tutoriels vidéo sur YouTube. Ajoutez-le à vos favoris et consacrez-y 30 minutes dès aujourd'hui. Cette vidéo transformera votre utilisation de Claude à jamais.

455

385K

altereddude retweeted

Evan Luthra

@EvanLuthra

about 2 months ago

Anthropic pays engineers $750,000+ a year to understand how LLMs work. Stanford just put a 2 hour lecture that covers 80% of it for FREE. Bookmark this. Give it 2 hours today. It might be the highest ROI thing you do this month:

232

22K

53K

altereddude retweeted

Akshay Shinde

@ConsciousRide

about 2 months ago

Path param identifies one exact resource in the url structure. It fits single item lookup by its unique id. Query param works for filters on a list endpoint. It stays optional and supports multiple conditions at once. Both can technically return the same data but the first one signals clear resource identity in rest design. The second one signals search or narrow down on collection. This separation helps routing, caching and overall clarity in bigger systems.

264

127

31K

Zack

@altereddude

Who to follow

Last Seen Users on Sotwe

Trends for you

Most Popular Users