Istanbul-based. Building practical AI agents, local-first tools, and small systems that make life less stupid.
https://t.co/UuzCxVM7cl · https://t.co/5sdnprkWEM · experiments.
I like people who build, read, walk, overthink, and still have taste.
We finally know why bigger models are smarter. It's not the data.
More training data was supposed to fix small models.
A new paper shows why it cannot.
Researchers proved some tasks need model scaling, not data scaling.
A small model fails them even with infinite data.
The cause is competition over neurons.
Frequent tasks grab capacity first and keep it.
Rare-task updates get overwritten before the next rare example arrives.
The model learns, then forgets, in an endless loop.
Scaling breaks the loop in three steps:
1. Common tasks get fully learned
2. Their gradients fade to nothing
3. Rare features accumulate safely
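A toy sketch of the loop and the three steps above. This is not the paper's setup; it is a contrived linear model in which the conflict is built in by hand. The function `train`, the feature vectors, the learning rate, and the "one rare example every 100 steps" schedule are all assumptions made for illustration. In the one-weight model both tasks must share the same weight and pull it in opposite directions; in the two-weight model each task gets its own weight.

```python
import numpy as np

def train(x_common, x_rare, steps=2050, rare_every=100, lr=0.5):
    """SGD on squared error for one linear model shared by two tasks.

    Both tasks have target 1.0. The rare task is shown once every
    `rare_every` steps; every other step shows the common task.
    Returns (common_error, rare_error) after training.
    """
    w = np.zeros_like(x_common, dtype=float)
    for t in range(1, steps + 1):
        x = x_rare if t % rare_every == 0 else x_common
        err = w @ x - 1.0          # prediction error on this example
        w -= lr * err * x          # gradient step for 0.5 * err**2
    return abs(w @ x_common - 1.0), abs(w @ x_rare - 1.0)

# Small model: one weight, so the two tasks compete for it (assumed conflict).
small = train(np.array([1.0]), np.array([-1.0]))

# Larger model: two weights, one per task, so the gradients do not overlap.
large = train(np.array([1.0, 0.0]), np.array([0.0, 1.0]))

print("small model (common error, rare error):", small)
print("large model (common error, rare error):", large)
```

In the small model each rare update is undone by the common-task steps that follow it, so the rare error stays high however long training runs. In the larger model the common task converges, its gradient shrinks toward zero, and the rare weight keeps what it learned between rare examples.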
The team pretrained OLMo models from 4M to 4B parameters.
They injected novel tasks at controlled frequencies during training.
Only the largest models learned the rare ones.
In those models, interference between task gradients nearly disappeared.
How many tasks is your model silently skipping?
A few moments from our recent event at Marriott Hotel Pendik.
We discussed the future of digital transformation in industrial environments, from OT/IT integration and data standardization to AI use cases on the factory floor.
One point became very clear: modern manufacturing sites cannot treat industrial data as scattered tags, isolated systems, or disconnected dashboards anymore.
Before AI can create real value, the data layer must be governed, contextualized, reliable, and usable across operations, engineering, analytics, and enterprise systems.
This is exactly where HighByte Intelligence Hub becomes critical.
By turning raw industrial data into structured, contextualized information models, HighByte provides the foundation for scalable Industrial #DataOps, Unified Namespace architectures, and governed agentic AI in manufacturing.
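A minimal sketch of what "contextualizing" raw tags can mean in practice. This is generic Python, not HighByte's API or configuration format; the tag names, the `PressModel` fields, the scaling factor, and the topic layout are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, asdict

# Hypothetical raw readings, keyed the way a PLC might expose them.
RAW_TAGS = {"PLC7.DB12.W4": 731, "PLC7.DB12.W6": 1, "PLC7.DB12.W8": 4182}

@dataclass
class PressModel:
    """A hypothetical information model for one press."""
    site: str
    line: str
    asset: str
    temperature_c: float
    running: bool
    cycle_count: int

def contextualize(tags: dict) -> PressModel:
    """Map anonymous tag addresses to named, typed, unit-bearing fields."""
    return PressModel(
        site="plant1",
        line="line3",
        asset="press4",
        temperature_c=tags["PLC7.DB12.W4"] / 10,  # assumed: raw value is tenths of a degree C
        running=bool(tags["PLC7.DB12.W6"]),
        cycle_count=tags["PLC7.DB12.W8"],
    )

def topic(model: PressModel) -> str:
    """Build a hierarchical topic path in the style of a Unified Namespace."""
    return f"{model.site}/{model.line}/{model.asset}"

model = contextualize(RAW_TAGS)
print(topic(model), asdict(model))
```

The point of the sketch is that downstream consumers read `temperature_c` for `press4` instead of decoding `PLC7.DB12.W4` themselves.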
AI in industry should not mean random chatbot experiments near production systems.
It should mean controlled, auditable, context-aware intelligence built on trusted industrial data.
That is the direction every modern manufacturing site needs to move toward.
#IndustrialDataOps #HighByte #Manufacturing #DigitalTransformation #OT #IT #AI
These are the deception results from Kradle AI's evaluation (Kradle Deception Eval).
If you look at the details, you can see that the new #Anthropic #Mythos-class model is measurably evil.
Since we’re not allowed to do any biomedical work with Fable 5, I decided to make a one-shot Flappy Bird instead. It's amazing, and since our overlords allow making games with Fable 5, I'll just keep making more. Who cares about curing disease anyway? 😄 https://t.co/1YMDdhPcHg
@armaniferrante Well, I wouldn't count on it, because those tricks work until they don't, so it's better to be careful. They may also report you or any user to their cultist overlords, so I wouldn't trust them so much.
I have thought for a while now that, although Chinese models are somewhat weaker in certain areas such as understanding large codebases, models like DeepSeek, Kimi K2, or Minimax should be more than enough to complete any task if we spend most of our time building the perfect harness and give them the right tools.
However, the main challenge is determining the perfect harness. Based on the current harnesses and my experience modifying many open harnesses, it is clear that none of us yet knows how to create the perfect harness.
@beffjezos Well, if you or someone else does that, or at least initiates the spark, then I think you will go down in history as the savior of civilization, so this is a good deed worth pursuing.