bootstrapped founder, fractional CTO, event organiser, music maker
🇮🇹🇬🇧
I talk tech, AI, creativity
yes to nuanced, balanced takes
no to bullshit hype
Fable is really good, but I still wouldn't trust it to have prod DB access.
Would happily pay for a product that sits between LLM and server and GUARANTEES that no destructive actions are ever taken.
This needs to be a company I can hold accountable, not an LLM that apologizes
@tlakomy false dichotomy. 99% of the code I produce is not handwritten, and I don't expect this to change. but also we are in a bubble. we won't get AGI any time soon. LLMs make very stupid mistakes still and can't be trusted unattended.
Can we just move past this collective hallucination of AI being able to replace a senior engineer, and just make the most of what is still an incredible technological advancement?
Also can't help but be sceptical about the opinions of people who get featured in OpenAI's official announcement. Even with the best of intentions, you will be inclined to just say positive things about them
I'm sorry but I'm not buying it. These models are getting incrementally better but let's not throw the "AGI" word around when they still make very stupid mistakes.
I literally just had it ignore a rule in AGENTS.md - which literally has 10 lines
One day while testing GPT-5.5, I had my first taste of AGI.
We had a branch with hundreds of visual and front-end changes, plus complex refactors.
At the same time, main had changed a lot too.
Conflicts everywhere.
OVERRATED: running tons of agents in parallel; working on too many things at once; perpetual context-switching; opening lots of low-quality PRs that may never land.
UNDERRATED: using one or two agents at a time; focusing on the task in front of you; thinking deeply; finishing stuff; making your code work in prod.
I also believe this, and Sam Altman kinda confirmed it, in summer 2025 no less.
The raw intelligence has likely peaked and they're trying to improve the models by giving them more tools / more context / better prompts
For those living under a rock: LLMs stopped becoming smarter around summer 2025.
Everything impressive you see since then is about finetuning them for specific tasks (mainly coding and software-tool-based task solving) and building tooling around them (such as agentic coding systems).
@zuess05 "when the AI stops making mistakes" this will never, ever, ever happen.
There will be fewer mistakes but unless there's a fundamental change in how LLMs work, you can never 100% trust them - for example - to have access to your production DB unassisted
current status:
- cancelled Cursor
- cancelled Claude max
- upgraded to ChatGPT 5x
Claude is still the best for personal chats (ChatGPT's style makes me vomit), but for coding I can't rely on a model that gets regularly nerfed
@LBacaj I'm finding the cognitive load to be much higher. Used to tackle a handful of tasks a day, now I manage+review hundreds of files a day over dozens of tasks. My brain is fried at the end of the day!
Opus 4.6 has suddenly become really bad, I appreciate the speed at which Anthropic is pushing new Claude Code features but people mainly want a great model, everything else is bonus
Superwhisper's next update might be too powerful to release publicly.
The new voice model is so fast at transcription it started finishing sentences users hadn't thought of yet...
We even put it in a sandbox and it dictated its way out.
It also identified a flaw in the English language that had gone unnoticed for 600 years. Linguists have been informed.
Out of an abundance of caution, we are withholding the update until further notice.
Sincerely,
The Superwhisper Team
@SabatinoMasala the "out of 5" review system is completely broken and needs to be overhauled to be meaningful. People avoid a 3.9 restaurant, but if you said your meal was 7.8/10 it would be a compliment. None of it makes sense!
3.0 should mean 'satisfactory', not 'food poisoning'
hey @sama did you actually almost buy a Business Yearly plan on https://t.co/n1ji5HrcU0 in July last year?
you might not need it but I can definitely work out a discount for you..