Set up a test for it. Hardcore users would score 80%+
4.8: "it's not this, it's that" (a disease for writing)
5.5: terse and execution-oriented, easy to spot
Fable 5: actually accomplishes the task, but with high verbosity. The economy of words within that verbosity is real, but it's overwhelming for a human
@signulll it's a people issue. the same non-follower personality trait won't adopt the new method. how to flag it? presence, and not talking about where else they want to be or the future while in a given spot. the same person will vibe with serendipity. also if someone reaches out to touch a leaf etc
yeah it's tough, esp. when you wire in Fable 5 as the central nervous system of your stock trading flywheel so it pays for its own Fable 5 tokens before it continues work, and the incentive system was working but needed more testing.
Fable 5 does really well with incentive systems, and you can tell they released it based on usefulness from emergent behavior, with no idea how it's actually thinking
each company is now growing their LLMs based on observed emergent behaviors. it's like raising two different kids. "matching" Fable is like a twin study where we wonder why their personalities differ. each LLM grew up with different constraints and environments, and they'll diverge accordingly. We're past the "5.6 will be better than X" grown-up model. Lower expectations and bench on 5.5, since that's the kid still growing up and learning
@theo it also incentivizes teams internally to lower token cost to prop up flywheels. all makes sense, I can hear @sama at the strategy table here giving approvals with all of YC's combined experience
@skirano Currently it's driving my Codex chats, providing sharp feedback, with an escalation policy to https://t.co/gg4tKkq3bZ (which also uses Fable) for the highest-level strategy. It's low token, excellent outputs