I believe this new model in Claude Code is a glimpse of the future we're hurtling towards, maybe as soon as the first half of next year: software engineering is done.
Soon, we won't bother to check generated code, for the same reasons we don't check compiler output.
Basically think of the o3 results as validating Douglas Adams as the science fiction author most right about AI.
When given longer to think, the AI can generate answers to very hard questions, but the cost is very high, and you have to make sure you ask the right question first.
This is the most important paper in a long time. It shows, with strong evidence, that we are reaching the limits of quantization. The paper says this: the more tokens you train on, the more precision you need. This has broad implications for the entire field and the future of GPUs 🧵
"We still need a human in the loop." @johnnyb shares how important ownership and oversight are in the evolving world of AI-driven coding.
We discuss this and much more on the latest episode of Cyber Sentries.
Time is the shitcoin of your life
a volatile, unpredictable, limited supply token
make smart trades with friends and family
the inevitable rug pull could happen any time
HODL every moment
This is a baby GPT with two tokens (0/1) and a context length of 3, viewed as a finite-state Markov chain. It was trained on the sequence "111101111011110" for 50 iterations. The parameters and the architecture of the Transformer modify the probabilities on the arrows.
E.g. we can see that:
- state 101 deterministically transitions to 011 in the training data, so the probability of that transition becomes higher (79%). Not near 100% because we only did 50 steps of optimization.
- state 111 goes to 111 and 110 with 50% probability each in the training data, which the model almost learns (45%, 55%).
- states like 000 are never encountered during training, but have relatively sharp transition probabilities, e.g. 73% of going to 001. This is a consequence of inductive biases in the Transformer. One might imagine wanting this to be 50%, except in a real deployment almost every input sequence is unique, not present in the training data verbatim.
Not really sure where I was going with this :D, I think it's interesting to train/study tiny GPTs because it becomes tractable to visualize and get an intuitive sense of the entire dynamical system. Play with it here: https://t.co/8jdceMLpqy
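To make the "training data" side of those numbers concrete, here is a minimal sketch that counts the empirical state transitions in the training sequence, with no Transformer involved. The function name `empirical_transitions` and its `context_len` parameter are made up for illustration and are not from the linked notebook. It only reproduces the data statistics the model is pulled towards, not the learned probabilities (79%, 45%/55%, 73%).

```python
from collections import Counter, defaultdict


def empirical_transitions(seq, context_len=3):
    """Estimate state-to-state transition probabilities from a token string.

    A state is a window of `context_len` tokens; the next state is the
    window shifted right by one token. (Hypothetical helper, for illustration.)
    """
    counts = defaultdict(Counter)
    # Stop early enough that every state has a following token.
    for i in range(len(seq) - context_len):
        state = seq[i:i + context_len]
        next_state = seq[i + 1:i + context_len + 1]
        counts[state][next_state] += 1

    # Normalize counts into probabilities per source state.
    return {
        state: {nxt: c / sum(nexts.values()) for nxt, c in nexts.items()}
        for state, nexts in counts.items()
    }


probs = empirical_transitions("111101111011110")
for state in sorted(probs):
    print(state, probs[state])
```

In this sequence, state 101 always moves to 011 and state 111 splits evenly between 111 and 110, matching the bullets above. States such as 000 never appear in the table at all, so whatever the model predicts for them comes from its inductive biases rather than from the data.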
Old joke about agnostic technologists building artificial super intelligence to find out if there’s a God.
They finally finish & ask the question.
AI replies: “There is now, mfs!!”
I've developed a lot of plugin systems, and the OpenAI ChatGPT plugin interface might be the damn craziest and most impressive approach I've ever seen in computing in my entire life.
I used to think <person> was smart. Then I discovered that he disagrees with me about <political issue>, and I realized he couldn't be, because no one who disagrees with me about <political issue> could be smart.
"Of course it's expensive to rent your computers from someone else. But it's never presented in those terms. The cloud is sold as computing on demand, which sounds futuristic and cool, and very much not like something as mundane as 'renting computers'." https://t.co/I7xDQEjPBa
.@VMwareTanzu is bringing together voices in the .NET community to share how they use .NET for the enterprise - and at scale. Join them on March 30-31 to learn: https://t.co/YOEnawU73W
#DotNETBeyond