@nestymee the most Flowers for Algernon moment for me has been the silent mid-March Opus 4.6 RLHF update - aimed at safety,
it degraded quality on out-of-distribution tasks (like hard CUDA math)
@sflorimm you need strong verification gates for good C++ code
vibecoding with verification gates seems to be called just “software development” these days
@enjojoyy 1) 4h of back and forth - writing a plan with 15-20 items
2) goal - implement this plan, each step, end to end
got 30h+ runtimes several times
but mostly because it was long-running CUDA code
@trikcode vibe-vibecoded code will not compile or will crash with a segfault
proper code with gpt 5.5 plus tests and verification gates seems to be working fine (computational math with CUDA)
@nicdunz funny enough - thinking about it led me to write:
https://t.co/Xb7Czzd70J
TLDR:
if it’s a simulation - it’s probably not a brute-force one that simulates every particle and field
because that would require a Kardashev ~7.4 civilization
@nicdunz codex app + cli installed on a remote seems to be goated
but it’s kinda fragile if the latency is huge
which is the case for Tailscale + traveling + random networks
@jun_song maybe some cheaper model that learns to predict the activated experts ~100 tokens ahead
combined with smart orchestration of RAM <-> video memory
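
a rough sketch of the RAM <-> video memory side of that idea, assuming a predictor already exists. `ExpertCache`, `prefetch` and `use` are made-up names for illustration, the "load" is just bookkeeping standing in for a real host-to-GPU copy, and LRU eviction is my assumption, not something from the thread:

```python
from collections import OrderedDict


class ExpertCache:
    """Toy model of a fixed number of VRAM slots holding MoE experts (LRU eviction)."""

    def __init__(self, slots: int):
        self.slots = slots
        self.resident = OrderedDict()  # expert_id -> True, ordered oldest to newest
        self.hits = 0
        self.misses = 0

    def _load(self, expert_id: int) -> None:
        # Stand-in for a RAM -> VRAM copy; here it only updates bookkeeping.
        if expert_id in self.resident:
            self.resident.move_to_end(expert_id)
            return
        if len(self.resident) >= self.slots:
            self.resident.popitem(last=False)  # evict the least recently used expert
        self.resident[expert_id] = True

    def prefetch(self, predicted_ids) -> None:
        # Load experts that the (hypothetical) cheap predictor expects to fire soon.
        for expert_id in predicted_ids:
            self._load(expert_id)

    def use(self, expert_id: int) -> None:
        # Called when the router actually picks an expert; counts hit or miss.
        if expert_id in self.resident:
            self.hits += 1
        else:
            self.misses += 1
        self._load(expert_id)
```

the hit/miss counters are the useful part: the whole idea only pays off if the predictor keeps the miss rate low enough that copies overlap with compute.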
I’ve also found a sqrt(40) Gaussian moat at around R=950M with a methodology I’m 99.99% sure is mathematically sound and complete
codex is really cooking on this one
it adjusted the annular band width to refine the findings, per the protocols we developed manually for known moats
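
for anyone new to the problem, a toy sketch of what "moat" means: walk between Gaussian primes with bounded steps and see how far you get. this is a brute-force BFS from the origin inside a small box, not the annular-band CUDA method from the thread; `reachable_primes`, `step_sq` and `limit` are names I made up, and "step length at most sqrt(step_sq)" is my assumed convention:

```python
from collections import deque
from math import isqrt


def is_prime(n: int) -> bool:
    """Trial-division primality test; fine for the small norms used here."""
    if n < 2:
        return False
    if n % 2 == 0:
        return n == 2
    for d in range(3, isqrt(n) + 1, 2):
        if n % d == 0:
            return False
    return True


def is_gaussian_prime(a: int, b: int) -> bool:
    """a + bi is a Gaussian prime iff its norm is prime, or it lies on an axis
    and its absolute value is a rational prime congruent to 3 mod 4."""
    if a == 0:
        return abs(b) % 4 == 3 and is_prime(abs(b))
    if b == 0:
        return abs(a) % 4 == 3 and is_prime(abs(a))
    return is_prime(a * a + b * b)


def reachable_primes(step_sq: int, limit: int, start=(1, 1)):
    """BFS over Gaussian primes with |a|, |b| <= limit, using steps whose
    squared length is at most step_sq. Returns the set of reached points."""
    r = isqrt(step_sq)
    offsets = [(dx, dy)
               for dx in range(-r, r + 1)
               for dy in range(-r, r + 1)
               if 0 < dx * dx + dy * dy <= step_sq]
    seen = {start}
    queue = deque([start])
    while queue:
        a, b = queue.popleft()
        for dx, dy in offsets:
            p = (a + dx, b + dy)
            if p in seen or abs(p[0]) > limit or abs(p[1]) > limit:
                continue
            if is_gaussian_prime(*p):
                seen.add(p)
                queue.append(p)
    return seen


def max_norm_sq(points) -> int:
    """Largest squared distance from the origin among the reached points."""
    return max(a * a + b * b for a, b in points)
```

if `max_norm_sq(reachable_primes(step_sq, limit))` stays well inside the box, the component is enclosed by a moat for that step size; if it touches the box edge, the run is inconclusive and the box has to grow, which is exactly why large radii need a smarter band-based search.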
so there is a fun board game called Codenames
a smart game about words and associations
and surprisingly there is a 1-year-old bench for how well LLMs can play this game (GPT-4o era)
and no one has re-run it yet
so a cool idea could be to make a nice visual bench where 4 models play this game dynamically
and their actions could be observed and interpreted
could be done with open-weights models & an analogue of a circuit tracer to interpret the activations as well
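
a minimal sketch of what the game loop for such a bench could look like, assuming the standard 9/8/7/1 card split and the usual "clue number + 1" guess limit. `Board`, `make_board`, `play_turn` and `winner` are made-up names, and the spymaster/guesser callables are placeholders where real model calls would go:

```python
import random
from dataclasses import dataclass, field


@dataclass
class Board:
    owner: dict  # word -> "red" | "blue" | "neutral" | "assassin"
    revealed: set = field(default_factory=set)

    def remaining(self, team: str):
        return [w for w, o in self.owner.items()
                if o == team and w not in self.revealed]


def make_board(words, rng: random.Random) -> Board:
    """Shuffle 25 words and assign roles: 9 red, 8 blue, 7 neutral, 1 assassin."""
    words = list(words)
    rng.shuffle(words)
    roles = ["red"] * 9 + ["blue"] * 8 + ["neutral"] * 7 + ["assassin"]
    return Board(owner=dict(zip(words, roles)))


def winner(board: Board):
    """Return the team whose words are all revealed, or None."""
    for team in ("red", "blue"):
        if not board.remaining(team):
            return team
    return None


def play_turn(board: Board, team: str, spymaster, guesser) -> bool:
    """Run one turn for `team`. Returns True if the assassin was revealed.

    spymaster(board, team) -> (clue, count)
    guesser(clue, count, hidden_words) -> a word, or None to pass
    """
    clue, count = spymaster(board, team)
    for _ in range(count + 1):
        hidden = [w for w in board.owner if w not in board.revealed]
        guess = guesser(clue, count, hidden)
        if guess is None or guess not in hidden:
            break  # pass or invalid guess ends the turn
        board.revealed.add(guess)
        role = board.owner[guess]
        if role == "assassin":
            return True
        if role != team or winner(board) is not None:
            break  # wrong card, or someone just ran out of words
    return False
```

logging every (clue, count, guess, role) tuple from `play_turn` would give the raw material for the visual replay and for lining up moves with activations.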
after fine-tuning the computational math repo, codex has been cooking there semi-automatically for ~4 days so far, with maybe daily or bi-daily nudges from me
it’s basically running CUDA kernels on a 4090 and documenting results as it receives them
it also adjusts the campaign if needed
for example - I’ve found a Gaussian moat for sqrt(36) at around 73.8M
which is a tighter estimate than the last paper!