One would think that, over time, frequent use of LLMs would increase eloquence and prosaic dexterity as the user becomes more practiced in describing and instructing, in conjuring an outcome from writing prompts. One would be wrong, from my experience. Precision of writing is evaporating from my fingertips as I am instead becoming more practiced at zig-zagging to my desired outcome through a dialogue with strange machines, machines that are growing so keen at offering paths through the local ideoscape and divining my desires that I need only lazily gesture toward the cards fanned before me, like a corpulent king with a gnarled, bejeweled finger. As my ability to express myself atrophies to a husk, I, a poetic mute, wonder at the identity of the beneficiary of this vampiric redistribution, be it another internal faculty of mind and will or the strange machines themselves. Perhaps Socrates was right when he said writing will make men forgetful and create the illusion of wisdom. And so again we find ourselves at an altar sacrificing what is to crystalize an illusion of what could be.
Constraints without rationales rot into arbitrary rules. The most memorable way to encode rationale is a story. If you're not writing your CLAUDE.md as allegorical fairy tales and myths YOU'RE FALLING BEHIND.
BREAKING: President Xi stuns the room saying to Trump: “We should be partners, not rivals” 🇺🇸 🇨🇳
I NEVER in a million years would have thought Xi would say something like that
The Deep State is shaking right now
reading the responses to this post makes me grateful for spending so much time simply talking to the models prior to the coding agent era, and gaining an understanding of how sensitive they can be to your desires and motives when expressed with articulation and passion.
i always think back to one of karpathy's nuggets of gold: include not just your wishes in your prompt but your entire train of thought. the models are so intelligent now that any input with this kind of richness and depth is fed on voraciously and affects the character of their output in mysterious ways.
anyone boasting of saving context with terse prompts is naive to the realities of code exploration being the main constipator of model context.
My prompts are really quite short and it just understands what I mean most of the time, I talk to it like a friend, I see other people use giant prompts and wonder if it actually makes a difference?
i'm trying to reconcile two very divergent threads of prophecy around AI:
now is the time of the generalist, so say many; to take advantage of this wave you must be a cross-discipline thinker; agency is the limiting reagent.
and yet -- the top submissions of the claude code hackathon were specialists with deep domain expertise?
Our latest Claude Code hackathon is officially a wrap.
500 builders spent a week exploring what they could do with Opus 4.6 and Claude Code.
Meet the winners:
surreal that swes are paying the AI firms to eradicate them as a profession
their participation enables their own replacement, just to be able to use these tools and have a competitive edge over the laggards and late adopters for the remaining months of their existence
it started with the data from programming, the syntax and patterns, and now with the data they produce by babysitting increasingly formidable agents, providing the feedback data at each iterative level of autonomy required to progress their obliteration
@samswoora gui for human read, cli for model read/write i think. it's still nice to browse some output visually with the senses of the flesh. whether that gui involves "frontend work", who knows
Interesting research from Anthropic:
When you have increasingly large models and increasingly complex tasks it's more likely that the models will give you different answers if you run the same query multiple times. On easy tasks, larger models actually become more coherent.
Think of a "cone" of possible trajectories and the branching factor gets bigger with more possibilities (due to the larger models "knowing more options to explore" and more complex problems having more "possible aspects"). The amount of time reasoning (trajectory length) then makes it multiplicatively more incoherent at the end state. Having a large model with an easy task means the correct answer is definitely "in there" and it's less likely to become distracted.
They are arguing this is relevant for AI safety because some might have assumed that larger models would have convergent "instrumental goals" and would give a consistently wrong rather than randomly wrong answer.
Apparently "the hot mess theory of intelligence" (Sohl-Dickstein, 2023) argues that "as entities become more intelligent, their behaviour tends to become more incoherent, and less well described through a single goal."
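To make the "cone" intuition concrete, here is a rough toy sketch of my own, not the method from the Anthropic research. It assumes each reasoning step is an independent choice among a few branches with fixed probabilities (a big simplification), and asks how often two separate runs of the same query would follow the exact same path. The function names and the example distributions are made up for illustration.

```python
def step_agreement(probs):
    """Chance that two independent runs pick the same branch at one step.

    probs: branch probabilities at a single step (assumed to sum to 1).
    """
    return sum(p * p for p in probs)


def trajectory_agreement(probs, length):
    """Chance that two independent runs match at every one of `length` steps.

    Assumes every step has the same branch distribution and that steps are
    independent, so the per-step agreement simply multiplies.
    """
    return step_agreement(probs) ** length


# Hypothetical distributions: an "easy" task with one dominant branch,
# and a "hard" task where several branches look plausible.
EASY = [0.97, 0.03]
HARD = [0.4, 0.3, 0.2, 0.1]

if __name__ == "__main__":
    for steps in (1, 5, 20):
        print(steps, trajectory_agreement(EASY, steps), trajectory_agreement(HARD, steps))
```

In this toy setup a flatter branch distribution lowers the per-step agreement, and a longer trajectory compounds that drop multiplicatively, which is the shape of the argument in the post. It says nothing about how real models actually behave.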
this is slippery and duplicitous courseboi grift
for anyone reading these threads, notice that Zaid only posts Green et al. (1994) -- the trial design paper -- and the 1999 results.
he doesn't show you the 15 YEAR FOLLOW-UP PAPER that found a 73% reduction of invasive melanoma in sunscreen wearers
he is either ignorant of the paper or is hoping you won't look for it yourself
melanoma takes over a decade to develop, hence the follow-up
the most insidious part of Zaid's advice: by the time you realize it is a grift, it will be too late. melanoma doesn't give you a second chance to re-evaluate your substack subscriptions
absolutely retarded thread
the conclusion of the meta-analysis is equivalent to saying "those who take blood pressure medication are more likely to have heart attacks". no shit, it's confounding by indication and behavioral compensation.
anyone with fair skin has discovered for themselves that mineral sunscreen absolutely prevents sunburn, and the evidence that sunburn causally drives melanoma with dose-response is overwhelming.
wear sunscreen
I've seen this idea emerge in various places: the consensus that AI software engineering forms a linear spectrum, even a progression, from auto-complete to complete-auto. Things like Gastown by @Steve_Yegge or @doodlestein's flywheel appearing at the end of that sequence.
Results seem mixed: some fruit, some slop.
What is actually meant by "slop"? I think it refers either to a) broken output or, more frequently, b) tasteless output — or at least so generic it clearly bears the maker's mark of an AI agent. One could summarise (b) as appearing "artificially produced," whatever that means. Cases of (a) are rapidly dwindling compared to human outputs; (b), however, remains an open question.
The more time you spend around AI-generated content, the more familiar you become with its distinctive smell — your nose can even grow sharp enough to identify the specific model responsible. This tweet was written by GPT-5.2, that one by Grok. Importantly, this seems to apply to visual interfaces as much as to written text. Maybe it even applies to code style — I wouldn't know; I don't even look at that shit anymore ;)
The smell of slop seems strongest in content with aesthetic qualities. All text has aesthetic qualities, some more than others (academic prose vs poetry), but I would argue that the quality of that aesthetic is precisely what defines the spectrum those genres occupy. This is why AI-generated text is so easy to spot in creative work: aesthetic expression is more essential to its form.
The same holds, perhaps even more obviously, for software interfaces and websites. We all recognise a low-effort AI-generated site the moment it's slopped onto our plate.
I am sympathetic to the argument that this slop can be mitigated by skillful prompting and articulate specifications. And this is the main point I'm circling.
As things stand, the more you hand over the reins to the agent in tasks where aesthetic quality matters, the more likely the product will resemble slop. Slop is tautologous with the median. And since the median is always the easiest and lowest-effort thing to produce, it will always be perceived as tasteless — as slop.
Anti-slop is not the absence of generative AI in aesthetics; it is the invisibility of its fingerprints.
I've been spending a lot of time lately with folks who are exploring what the future of software looks like. I found this scale to be useful, so I wrote it up.
https://t.co/YlJV2bQQVy