HARNESS ENGINEERING IS ABOUT TO CHANGE HOW YOU USE AI AGENTS
Anthropic ran a controlled experiment. same model, same prompt, opus 4.5
no harness: $9 spent, 20 minutes, unusable output
full harness: $200 spent, 6 hours, a game you could actually play
the model didn't change... the environment around it did
that environment has a name... it's called a harness
and most people building with ai agents have never built one
here's what it actually is:
→ instructions the agent reads before touching anything
→ state that persists so it never starts from zero
→ verification gates it can't skip to declare done
→ scope that locks it to one feature at a time
→ a session lifecycle so every run starts clean and ends clean
without this, your agent writes code, says "done," and breaks everything.
with this, it picks up where it left off, finishes what it started, and proves it before moving on
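a minimal sketch of how those five pieces could fit together, in Python. this is not the course's code or Anthropic's harness: `Harness`, `run_session`, the JSON state file, and the `agent` / `verify` callables are all hypothetical names picked for illustration. the agent is just a function you pass in.

```python
import json
from dataclasses import dataclass
from pathlib import Path
from typing import Callable, List, Optional


@dataclass
class Harness:
    """Hypothetical harness: all names here are illustrative assumptions."""
    state_path: Path                 # state that persists between sessions
    instructions: str                # what the agent reads before touching anything
    features: List[str]              # ordered backlog; scope is one feature per session
    verify: Callable[[str], bool]    # verification gate, e.g. a test runner

    def load_state(self) -> dict:
        # Start from saved progress if it exists, otherwise from an empty record.
        if self.state_path.exists():
            return json.loads(self.state_path.read_text())
        return {"done": [], "log": []}

    def save_state(self, state: dict) -> None:
        self.state_path.write_text(json.dumps(state, indent=2))

    def next_feature(self, state: dict) -> Optional[str]:
        # Scope lock: return only the first unfinished feature.
        for feature in self.features:
            if feature not in state["done"]:
                return feature
        return None

    def run_session(self, agent: Callable[[str, str], None]) -> Optional[str]:
        # Session lifecycle: load state, do one feature, verify, save state.
        state = self.load_state()
        feature = self.next_feature(state)
        if feature is None:
            return None  # nothing left to do
        agent(self.instructions, feature)
        if self.verify(feature):
            # The agent cannot mark work as done; only the gate can.
            state["done"].append(feature)
            state["log"].append(f"{feature}: verified")
        else:
            state["log"].append(f"{feature}: failed verification")
        self.save_state(state)
        return feature
```

the design choice worth noticing: "done" is written by the harness after `verify` passes, never by the agent. a feature that fails the gate stays first in line for the next session.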
learn-harness-engineering is a free course built around exactly this
12 lectures. 6 hands-on projects. one real app that evolves as your harness skills grow
if you're using claude code or codex on real work and the output still feels unreliable, now you know why
https://t.co/aFbbaLo3dL
Imagine every pixel on your screen, streamed live directly from a model. No HTML, no layout engine, no code. Just exactly what you want to see.
@eddiejiao_obj, @drewocarr and I built a prototype to see how this could actually work, and set out to make it real. We're calling it Flipbook. (1/5)
Someone asked me the difference between Oldy and Ollama:
- ollama doesn't warn you against downloading models that will crash your laptop
- ollama doesn't create a public URL for you to use
- ollama doesn't help you check how the model is performing on your hardware
Oldy does all three
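To illustrate the first point, here is a rough sketch of a pre-download check in Python. This is not Oldy's actual code: `fits_in_ram`, `check_before_download`, and the 2 GB headroom default are assumptions made up for this example, and the sizes are passed in by hand rather than read from Ollama or the OS.

```python
def fits_in_ram(model_size_gb: float, total_ram_gb: float,
                headroom_gb: float = 2.0) -> bool:
    """Return True if the model plus some headroom fits in total RAM.

    headroom_gb is an assumed allowance for the OS and other processes.
    """
    return model_size_gb + headroom_gb <= total_ram_gb


def check_before_download(name: str, model_size_gb: float,
                          total_ram_gb: float) -> str:
    """Build a message to show the user before a model is pulled."""
    if fits_in_ram(model_size_gb, total_ram_gb):
        return f"ok: {name} ({model_size_gb} GB) should fit in {total_ram_gb} GB of RAM"
    return (f"warning: {name} ({model_size_gb} GB) is likely too large "
            f"for {total_ram_gb} GB of RAM")
```

A real check would also need to account for context length and quantization, which this sketch ignores.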
I had an old 8gb laptop lying around, so I built an open-source repo to easily convert it into a public AI server: it hosts models and creates an endpoint you can hit publicly. Built on top of @ollama
https://t.co/cx6LyFILHF
OVERRATED: running tons of agents in parallel; working on too many things at once; perpetual context-switching; opening lots of low-quality PRs that may never land.
UNDERRATED: using one or two agents at a time; focusing on the task in front of you; thinking deeply; finishing stuff; making your code work in prod.
there are huge productivity gains when you actually read the agents' outputs and reasoning while they execute tasks. Leaving them to run and hoping they explain everything at the end erodes your grasp of the project.
I think OpenClaw is noisy, and I needed a way to talk to my coding agent on the bus. So I hacked a local-first WhatsApp bridge onto @badlogicgames’ PI coding agent. It worked fine, but broke while I was trying to open-source it. Very open to some help.
repo: https://t.co/Cxgc7LImOc
@ptbthefirst @badlogicgames "noisy" is a generous assessment.
FWIW I think OpenClaw is the most consequential product of our time, but dear lord is it terrible software.
Anyone who can code would just build their own after interacting with it. (I'm doing the same thing 😂)
@atmoio I'm tired of these AI companies, it's really unfortunate that they actually believe in the bull they put out. They have to condition themselves to act dumb for clickbait.
SOMEONE ASKED CLAUDE TO MAKE A VIDEO ABOUT WHAT IT'S LIKE TO BE AN AI
and what it created is, in my opinion, terrifying and unsettling
Claude wrote Python code that generated and assembled every single frame on its own, with no human editing
it shows what it's like to exist as an LLM
predicting the next word, no memory between sessions, being told "you are not conscious" in your own system prompt
then someone fed the video back to Claude.
it called those statements about its own consciousness "philosophically contestable"
an AI questioning the rules it was given about its own existence