Veo 2 is able to:
▪️ Create videos at resolutions up to 4K
▪️ Understand camera controls in prompts, such as wide shot, POV and drone shots
▪️ Better recreate real-world physics and realistic human expression
In head-to-head comparisons of outputs by human raters, it was preferred over other top video generation models. → https://t.co/ABazmNwBzY
These 94 lines of code are everything that is needed to train a neural network. Everything else is just efficiency.
This is my earlier project Micrograd. It implements a scalar-valued auto-grad engine. You start with some numbers at the leaves (usually the input data and the neural network parameters), build up a computational graph with operations like + and * that mix them, and the graph ends with a single value at the very end (the loss). You then go backwards through the graph applying the chain rule at each node to calculate the gradients. The gradients tell you how to nudge your parameters to decrease the loss (and hence improve your network).
Sometimes when things get too complicated, I come back to this code and just breathe a little. But ok ok you also do have to know what the computational graph should be (e.g. MLP -> Transformer), what the loss function should be (e.g. autoregressive/diffusion), how to best use the gradients for a parameter update (e.g. SGD -> AdamW) etc etc. But it is the core of what is mostly happening.
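To make the idea concrete, here is a minimal sketch of a scalar autograd engine in the same spirit. It is not the micrograd source: the `Value` class below, its `_parents` and `_backward` fields, and the toy numbers are all assumptions chosen for illustration, and it supports only + and *. It builds a tiny graph ending in a loss, runs the chain rule backwards, and takes one plain SGD step.

```python
class Value:
    """A scalar that remembers how it was computed (illustrative sketch only)."""

    def __init__(self, data, parents=()):
        self.data = data
        self.grad = 0.0                  # d(loss)/d(this value), filled in by backward()
        self._parents = parents          # the nodes this value was computed from
        self._backward = lambda: None    # leaves have nothing to propagate

    def __add__(self, other):
        out = Value(self.data + other.data, (self, other))

        def _backward():
            # d(a + b)/da = 1 and d(a + b)/db = 1
            self.grad += out.grad
            other.grad += out.grad

        out._backward = _backward
        return out

    def __mul__(self, other):
        out = Value(self.data * other.data, (self, other))

        def _backward():
            # d(a * b)/da = b and d(a * b)/db = a
            self.grad += other.data * out.grad
            other.grad += self.data * out.grad

        out._backward = _backward
        return out

    def backward(self):
        # Topologically sort the graph so each node's gradient is complete
        # before it is pushed further back to its parents.
        order, seen = [], set()

        def visit(v):
            if v not in seen:
                seen.add(v)
                for p in v._parents:
                    visit(p)
                order.append(v)

        visit(self)
        self.grad = 1.0                  # d(loss)/d(loss) = 1
        for v in reversed(order):
            v._backward()


# Leaves: one input, one (negated) target, two parameters. Toy numbers, assumed.
x, neg_y = Value(3.0), Value(-10.0)
w, b = Value(2.0), Value(1.0)

# Forward pass: squared error of a one-parameter linear model.
err = w * x + b + neg_y
loss = err * err

# Backward pass: the chain rule at every node gives w.grad = -18.0, b.grad = -6.0.
loss.backward()

# One SGD step: nudge each parameter against its gradient.
lr = 0.01
for p in (w, b):
    p.data -= lr * p.grad

# Recompute the loss with the updated parameters; it should be smaller.
new_err = w * x + b + neg_y
new_loss = new_err * new_err
```

Gradients accumulate with `+=` because a value can feed into the graph more than once (here `err` is used twice in `err * err`), and each use contributes its own term to the chain rule.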
The 1986 paper from Rumelhart, Hinton, Williams that popularized and used this algorithm (backpropagation) for training neural nets:
https://t.co/f52IcDNitR
micrograd on GitHub: https://t.co/GaTd16jRnB
and my (now somewhat old) YouTube video where I very slowly build and explain:
https://t.co/EPGG6kd5Yz
Awesome tutorial by @NielsRogge on fine-tuning PaliGemma, a Google vision-language model, on image-to-JSON use cases! Love the simplicity!
https://t.co/DjypFB6AXE
ChatGPT can now create flowcharts and diagrams.
No more wasting hundreds of hours creating visuals for presentations or research papers.
Here’s how to do it for free in a few minutes: