Robert Evans

@revans

CTO building human-first AI systems | Creator of simple systems that scale | Code, cognition, and creation

Boise, Idaho

Joined January 2007

297 Following

340 Followers

6.1K Posts

Robert Evans

@revans

about 2 hours ago

I ran with grok cli this morning and man, it makes a lot more assumptions about things than any other LLM I have used. Other LLMs have a similar issue, but grok seems to really lean into it. LLMs need to be trained so that when they don't know something, have open questions, or know they are assuming, they verify with the human.

Robert Evans

@revans

about 2 hours ago

@ravikiran_dev7 What do you mean, if? 🤣

Robert Evans

@revans

about 2 hours ago

This is why I created: https://t.co/qsLNe2O2Mm & https://t.co/Bx8jWHWQN2

Robert Evans

@revans

about 2 hours ago

While social media has been pushing the narrative that you must be building skills, I see no one talking about testing whether you actually need those skills or not. More importantly, by adding in all of those skills, have you evaluated how an LLM will reason when they're all loaded into the context window? Have you tested whether or not your multiple skills conflict in any way? Are they using the same patterns and the same verbiage, or are you being inconsistent? If you aren't checking this, then you're creating the perfect situation for the LLM to produce the wrong output. Always question the so-called AI experts on social media, because the vast majority are not doing any of this testing. They are not context-first and context-aware.
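One way to check skills for inconsistent verbiage is a small script that scans every skill for terms that mean the same thing. This is only a rough sketch: the synonym groups, the skill names, and the function `find_inconsistent_terms` are all made up for illustration, and a real check would need a term list that fits your own skills.

```python
import re

# Hypothetical synonym groups: terms that mean the same thing and should not be mixed.
SYNONYM_GROUPS = [
    {"pull request", "merge request"},
    {"unit test", "spec"},
]

def find_inconsistent_terms(skills: dict[str, str]) -> list[dict[str, list[str]]]:
    """For each synonym group used in more than one variant, map variant -> skills using it."""
    findings = []
    for group in SYNONYM_GROUPS:
        usage = {}
        for term in sorted(group):
            pattern = re.compile(r"\b" + re.escape(term) + r"\b", re.IGNORECASE)
            users = sorted(name for name, text in skills.items() if pattern.search(text))
            if users:
                usage[term] = users
        # Only a problem when two or more variants of the same idea are in play.
        if len(usage) > 1:
            findings.append(usage)
    return findings

# Made-up skill texts standing in for real skill files.
skills = {
    "review": "Open a pull request and ask for review.",
    "release": "Every merge request must pass CI before release.",
    "testing": "Write a unit test for each bug fix.",
}
report = find_inconsistent_terms(skills)
for finding in report:
    print(finding)
```

Here the "review" and "release" skills get flagged because they name the same thing two different ways, while "testing" is left alone. It only catches wording drift, not conflicting instructions, which still need a real evaluation against the model.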

Robert Evans

@revans

about 2 hours ago

That should be a given. Also, it's not just about writing all the skills, putting them into a git repo and calling it good. You need to do constant testing of those skills to see if they're still needed as new models come out. A lot of skills should be viewed as temporary - filling in the model gaps. So, a skill/agent prompt test harness is also important so you can retire skills as the models get better. While everyone is so focused on building a skill for everything they think of, very few are being context-aware: what needs to go in context versus what doesn't, and what is harmful to the context. I see so many variations of markdown files where so-called "AI experts" say you need to have a CLAUDE.md, AGENTS.md, SOUL.md, ABOUT.md, CONTEXT.md, CONTEXT-MAP.md .... filling up their context window, never asking if any of these markdowns actually conflict, and then they blame the LLM for either its bad output or "hallucinations" when the culprit was the person filling up the context window as fast as possible without ever taking the time to understand if it was even necessary.
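A skill test harness along these lines could be as simple as running the same cases with and without the skill in context and comparing pass rates. The sketch below is an assumption about how that might look, not an existing tool: `pass_rate`, `can_retire`, and the two stub models are hypothetical, and the stubs stand in for real model calls.

```python
from typing import Callable

# Assumption: a "model" is any callable taking (context, task) and returning text.
Model = Callable[[str, str], str]
Case = tuple[str, Callable[[str], bool]]  # (task, check on the output)

def pass_rate(model: Model, context: str, cases: list[Case]) -> float:
    """Fraction of cases whose output passes its check."""
    passed = sum(1 for task, check in cases if check(model(context, task)))
    return passed / len(cases)

def can_retire(model: Model, skill_text: str, cases: list[Case], tolerance: float = 0.0) -> bool:
    """A skill is a retirement candidate if the model does as well without it."""
    with_skill = pass_rate(model, skill_text, cases)
    without_skill = pass_rate(model, "", cases)
    return without_skill >= with_skill - tolerance

# Stub models for illustration only; swap in real model calls.
def older_model(context: str, task: str) -> str:
    # Only follows the convention when the skill is in context.
    return task.upper() if "uppercase" in context else task

def newer_model(context: str, task: str) -> str:
    # Has the behavior built in, so the skill adds nothing.
    return task.upper()

skill = "Always answer in uppercase."
cases = [("hello", str.isupper), ("ship it", str.isupper)]

print(can_retire(older_model, skill, cases))
print(can_retire(newer_model, skill, cases))
```

Rerunning this whenever a new model comes out gives you a list of skills that are only taking up context. Real model output is not deterministic, so in practice each case would need several runs and a non-zero `tolerance`.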

Robert Evans

@revans

10 days ago

I woke up to a heated argument between my AI agents about one of my ideas. The team took my idea, ran a deep market analysis, and then sent the idea and market research to my council to deliberate and determine if the idea has enough weight to pursue and if I am the right person to pursue it. The online AI debate is either "I gave it an idea and shipped it" or "AI is so horrible I gave up on it". Both are noise. I find joy in building the systems AI runs within, and not skipping the last 100 years of business lessons. https://t.co/KybVvnndfu

Robert Evans

@revans

10 days ago

@dhh Congrats!

Robert Evans

@revans

12 days ago

If you think AI produces slop, it’s because you’re bringing it the slop.

Robert Evans

@revans

24 days ago

Markdown will not be the dominant format for LLMs for long. HTML is much more suited for LLMs that understand XML very well. With HTML you get:
* Clean structure
* Supports narrative & prose
* Supports rules
* Supports state - think short-term memory
The W3C already supports custom elements and attributes. XPath with agents can allow agents to be very precise with what they read and write to. We have spent 30+ years making DOM traversal fast for humans. AI can take advantage of that! Imagine every HTML page with custom elements/attributes for agents. No need for LLM.txt. It just works. We'll have to solve the security aspect of it, but that's expected. /cc @karpathy
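To make the XPath idea concrete, here is a minimal sketch using Python's standard `xml.etree.ElementTree`, which supports a limited XPath subset. The element names `agent-rule` and `agent-state` and their attributes are invented for this example and are not part of any standard. The page is assumed to be well-formed XML; real-world HTML would need an HTML parser first.

```python
import xml.etree.ElementTree as ET

# Hypothetical page: normal prose for humans, custom elements for agents.
PAGE = """<html>
  <body>
    <article>
      <h1>Deploy guide</h1>
      <p>Narrative prose for human readers.</p>
      <agent-rule scope="deploy">Never deploy on Fridays.</agent-rule>
      <agent-rule scope="review">Require two approvals.</agent-rule>
      <agent-state key="last-step">tests-passed</agent-state>
    </article>
  </body>
</html>"""

root = ET.fromstring(PAGE)

# Precise read: only the rules scoped to the task at hand, nothing else.
deploy_rules = [el.text for el in root.findall(".//agent-rule[@scope='deploy']")]

# Precise write: update one piece of state (short-term memory) in place.
state = root.find(".//agent-state[@key='last-step']")
state.text = "deployed"

updated = ET.tostring(root, encoding="unicode")
print(deploy_rules)
print(updated)
```

The agent pulls in one rule instead of the whole document, and the state update touches a single node while the prose around it stays as it was.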

Robert Evans

@revans

about 1 month ago

@shivsakhuja I do like fairly strict boundaries between agents. It makes debugging easier! 😝 I tend to use a few rules often in agent design: Single Responsibility Principle, Separation of Concerns, and Progressive Disclosure.

Robert Evans

@revans

about 1 month ago

Earned Abstraction
Knowledge lives in the conversation between the question and the answer. If you're not in the conversation and you only look at the question or the answer then you lose out on seeing and walking the path connecting the two.

Robert Evans

@revans

about 2 months ago

@IranArmyMedia That's funny coming from you where your country believes in raping women and children.

Robert Evans

@revans

2 months ago

I have a question for those who read this.
For some context, I was working on my global CLAUDE.md file, making some tweaks and whatnot. Then I started thinking about how much interference I'm going to get when working with agents and plugins because of the CLAUDE.md file. When debugging, I won't know if the CLAUDE.md is the culprit or the plugin itself. I decided I wanted the CLAUDE.md file empty. Nice and simple. However, that introduced a problem. There are commands in that file that I like.
Idea! Lenses.
What I love about AI: you can have an idea, spend a bit of time fleshing it out, open up Claude Code, tell it you want to plan out an idea, and then start talking, and less than a minute later you have a prototype.
Lenses. A Claude Code plugin that allows you to load lenses that tell Claude how you want to interact with it. This solves my CLAUDE.md issue, because I can load a lens that runs the prompt I originally had and now the session will be in that frame.
This introduced a whole set of ideas that I didn't have until I started playing with it. The obvious: multiple lenses, loadable at any time. So I added several. Next, personalities. Maybe I want Claude to be Bob and communicate in a different way. Maybe I want to simulate a discussion with Marcus Aurelius? If I wanted to make the experience even better, I'd add a SQLite database with the vector extension, embed some of his work, and tell the personality how it can query it. Next, if I have lenses and personalities, maybe it should stack them. Maybe have Hemingway being Socratic? The list could go on and on: job function, cultures, audience profile... Not sure it needs to go that far, but it could. Gotta love the ability to freely create.
Anyway, back to the first thing I wrote: while I am building out an assortment of lenses and personalities, what personalities would you add or find fun? Something that would crack you up? Whatever suggestions I get, I'll add in. Then I'll open source it!
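Stacking could look something like the sketch below. None of this comes from the actual plugin: the `Lens` type, the `stack` function, the prompt texts, and the choice to put personalities ahead of lenses are all assumptions made up to illustrate the idea.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Lens:
    name: str
    kind: str    # "personality" (who is speaking) or "lens" (how to interact)
    prompt: str

def stack(*layers: Lens) -> str:
    """Combine layers into one session prompt, personalities first, then lenses."""
    personalities = [layer for layer in layers if layer.kind == "personality"]
    lenses = [layer for layer in layers if layer.kind == "lens"]
    ordered = personalities + lenses
    return "\n\n".join(f"[{layer.kind}: {layer.name}]\n{layer.prompt}" for layer in ordered)

# Hypothetical layers: Hemingway being Socratic.
hemingway = Lens("hemingway", "personality", "Write in short, plain sentences.")
socratic = Lens("socratic", "lens", "Answer with guiding questions.")

session = stack(socratic, hemingway)
print(session)
```

Because each layer is just named text, loading and unloading a lens mid-session comes down to rebuilding the stack, and the CLAUDE.md file can stay empty.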

Robert Evans

@revans

2 months ago

Thank you! I'm very curious about the internals of what you have running - I assume you're not exposing that, which is understandable. I've been doing a lot of work around building systems for AI to work within to try and achieve a more deterministic output. I wrote about this in a small series that may or may not be of interest to you, but I thought I'd share: https://t.co/7xAQCyEtTk

Robert Evans

@revans

2 months ago

@jamescoder12 I'll up you. I have all of Brunson's books turned into skills you can reuse. Follow and DM me if you want access to those skills.
