Finding Fable 5 Max effort & below surprisingly sloppy with poor attention to detail - especially relative to GPT 5.5 (Codex; xHigh)
One caveat: I've found models often perform poorly in their first week. Seems to take time to iron out model/harness production quirks
@alexhillman ah shit - I'm still kinda new to Twitter and not in the habit of checking notifications. Just spotted this now.
Read through the pinned summary - super cool design. Excited to dig deeper!
I'm figuring out the best way to build an AI operating system for myself, and I've created the following repo to document the journey and share helpful tools and resources along the way.
https://t.co/bfSHaGi1xO
So far, I've included:
- A list of my top questions/concerns with building an AI OS
- A repo security review skill that's helping me leverage existing open source tools with more confidence
- My 'Building with AI' guide from August 2025 - in case it's helpful for folks
When knowledge is ubiquitous, the measure of an effective education shifts from being about the amount of knowledge you can acquire to the value of the things you learn how to do.
Finding Claude Code isn't reliably calling OR following skills. Any tips?
E.g.: Told it to copy an xlsx & modify formatting -> it re-created the file instead, missing sheets + formulas
Happened twice:
1st: It didn't call the Doc skill
2nd: Told it to use the skill. It did... but didn't follow it
... insertions, metadata that might be stale. Be paranoid. If any metric doesn't match, that's a clue - investigate until you find the root cause and fix it. Nothing ships until validation passes."
Bonus points for pasting the rest of the thread into your prompt as well.
Trying to refactor a large, messy financial spreadsheet using Claude Code (with docs skill).
A few lessons I've learned the hard way + instructions for how to use them to level up your LLM's output:
Sum every summable metric (counts, revenue, totals) in both the original and your output - they must match within $1. Go row-by-row, year-by-year, category-by-category. Think through LLM sharp edges: off-by-one errors, string matching with whitespace, column references after...
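For anyone who'd rather script that check than trust the model to self-report, here's a minimal Python sketch of the "totals must match within $1" idea. It assumes both sheets have already been loaded into lists of dicts; the function names, the column names (`year`, `category`, `revenue`) and the sample rows are all made up for illustration, not taken from my actual spreadsheet.

```python
from collections import defaultdict


def grouped_totals(rows, group_keys, metrics):
    """Sum each metric per group. Group labels are cast to str and stripped,
    so 'Rent ' and 'Rent' (or 2023 and '2023') land in the same bucket."""
    totals = defaultdict(float)
    for row in rows:
        group = tuple(str(row[key]).strip() for key in group_keys)
        for metric in metrics:
            totals[(group, metric)] += float(row[metric])
    return totals


def find_mismatches(original_rows, output_rows, group_keys, metrics, tolerance=1.0):
    """Return (group, metric, original_total, output_total) for every group
    whose totals differ by more than `tolerance` (hypothetical default: $1)."""
    before = grouped_totals(original_rows, group_keys, metrics)
    after = grouped_totals(output_rows, group_keys, metrics)
    mismatches = []
    # Union of keys, so a group that vanished from the output is caught too.
    for key in sorted(set(before) | set(after)):
        old, new = before.get(key, 0.0), after.get(key, 0.0)
        if abs(old - new) > tolerance:
            group, metric = key
            mismatches.append((group, metric, old, new))
    return mismatches


# Hypothetical sample data: trailing whitespace in the original, a rounding
# difference in 2023 (within tolerance), and a real gap in 2024.
original = [
    {"year": 2023, "category": "Rent ", "revenue": 1200.0},
    {"year": 2023, "category": "Rent", "revenue": 300.0},
    {"year": 2024, "category": "Rent", "revenue": 1500.0},
]
output = [
    {"year": "2023", "category": "Rent", "revenue": 1500.4},
    {"year": 2024, "category": "Rent", "revenue": 1400.0},
]

mismatches = find_mismatches(original, output, ["year", "category"], ["revenue"])
for group, metric, old, new in mismatches:
    print(f"MISMATCH {group} {metric}: original={old} output={new}")
```

Running it at several grains (overall, per year, per category) follows the same "row-by-row, year-by-year" logic: a total that matches overall can still hide two errors that cancel out.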