COO, coach and the guy that gets things done. Likes are simply a way to store a reference to content/posts, and do not represent me liking any content.
@antirez @liuliu I've seen this happen sometimes. It seems to be some sort of context pollution that suddenly makes the model much less capable.
The only solution I found was to start a brand new session and continue with the same exact checklist item that failed, and it would then work perfectly.
@hirenpatelatl @jun_song @grok M5 Max also wins in batch if you use oMLX as the LLM runner for agentic coding, due to prefix caching. Batch is then almost instant.
@ivanfioravanti @Kev96790724 @Teknium I have a separate MEMORY.md file, which keeps the "game save" of the current situation and gets cleaned up once in a while, with the stale entries being moved to ARCHIVE-yyyy-mm-dd.md files. I found this to work best for me.
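A minimal sketch of that cleanup step, under loud assumptions: the post does not say how stale entries are identified, so the `[stale]` tag, the one-entry-per-line layout, and the function name `archive_stale_entries` are all hypothetical. Only the MEMORY.md and ARCHIVE-yyyy-mm-dd.md file names come from the post.

```python
from datetime import date
from pathlib import Path
from typing import Optional

# Hypothetical marker: the original post does not say how stale entries are flagged.
STALE_TAG = "[stale]"


def archive_stale_entries(memory_file: Path, today: Optional[date] = None) -> Optional[Path]:
    """Move lines tagged as stale from MEMORY.md into ARCHIVE-yyyy-mm-dd.md.

    Assumes one entry per line. Returns the archive path, or None if
    nothing was stale and no file was changed.
    """
    today = today or date.today()
    lines = memory_file.read_text(encoding="utf-8").splitlines()

    stale = [line for line in lines if STALE_TAG in line]
    if not stale:
        return None
    fresh = [line for line in lines if STALE_TAG not in line]

    # date.isoformat() yields yyyy-mm-dd, matching the archive naming in the post.
    archive_file = memory_file.parent / f"ARCHIVE-{today.isoformat()}.md"

    # Append, so several cleanups on the same day land in one archive file.
    with archive_file.open("a", encoding="utf-8") as fh:
        fh.write("\n".join(stale) + "\n")

    # Rewrite MEMORY.md with only the entries that are still current.
    memory_file.write_text("\n".join(fresh) + "\n", encoding="utf-8")
    return archive_file
```

Calling `archive_stale_entries(Path("MEMORY.md"))` would leave the live "game save" short while keeping the history around in dated files. Whether the tagging is done by hand or by the agent is left open here.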
@camilobayarri @jun_song Assuming it's for agentic coding and vibe coding, then Qwen3.6 27b or Qwen3.6 35b a3b on an M4 Max or M5 Max MacBook with at least 64 GB RAM. The 35b model will fly even at q8, and the 27b model will be more clever but much slower.
@bridgemindai Have you tried qwen3.6 35b a3b q8 mlx running on oMLX?
You'll get 90 tps at low context and 64 tps at 70k context, and it will feel faster than gpt at times.
Granted, it's not as clever, but it is a very good and fast coding model if you instruct it well what to do.
I saw that months ago while building a mobile app with Codex and Gpt5.3. Gpt was surgical, precise and only did what I asked. Opus decided it knew better than what I asked, and implemented what it wanted instead. I would ask a question and it would start coding. That was really annoying.
Gpt > Opus