hi there! we couldn’t help but overhear you discussing biology from across the room. we really appreciate your enthusiasm, but we’re going to need that conversation to come to an end now
A little perspective: RL as a field spent 10 years making algorithms slower and slower. If you look at the original ALE, it actually can sim a few thousand frames per second per core. If you look at some of the last big env releases before a ton of people moved over to LLMs, you'll find several at dozens to hundreds of steps per second with such bad engineering that they don't even scale with vectorization.
The field did this exactly because they presumed they would have to train directly in the real world. In reality, what we got out of this is a bunch of brittle off-pol and model-based algorithms that burn a ton of compute and don't work outside of the benchmarks shown in the original pubs. There's a clear gap between on-pol and other methods. You don't simply switch and scale up compute to save data. You have to spend a TON more compute to match the perf of on-pol, and then you spend even more compute to gain in sample efficiency.
Our whole core realization with PufferLib is that we can write good sims for a lot of problems 10000x faster. Good doesn't even mean accurate. It means accurate enough with domain randomization and other tricks that our agents can implicitly sysid their current setting and act robustly. So far, this has worked across several different industries. I'd love to give examples here, but this is unfortunately where exact client details get confidential. We need to be better about negotiating publicity, and we're starting to do that as Puffer gets bigger.
Another major flaw with slower and slower algorithms is that the core research loop also gets slower and slower. We sim mazes and 2048 at 10+m steps per second. Big deal, right? Those are easy. Wrong: algorithmic improvements on those envs have consistently predicted performance improvements on every single env in our test suite. Without this, we wouldn't have been able to release so many core breakthroughs in the last 2 years with a grand total of ~20 GPUs. We ran 20,000 experiments on ~12 of them in the 3 weeks leading up to the Puffer 4 launch. At traditional speeds, it would have taken Google-scale compute and an infra team.
So no, we're not going to step the real world at 20m sps, but assuming that matters (or at least that it is the only thing that matters) is where the field went wrong. /rant.
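For a rough sense of the experiment budget quoted above (20,000 runs on ~12 GPUs in 3 weeks), here is a back-of-the-envelope sketch. The 1000x slowdown is a purely hypothetical placeholder, not a figure from the thread, and the calculation assumes wall-clock time per experiment scales linearly with sim speed, which is a simplification.

```python
HOURS_PER_WEEK = 7 * 24


def gpu_hours(num_gpus: int, weeks: int) -> int:
    """Total GPU-hours available if every GPU runs around the clock."""
    return num_gpus * weeks * HOURS_PER_WEEK


def minutes_per_experiment(num_gpus: int, weeks: int, num_experiments: int) -> float:
    """Average GPU-minutes each experiment can use within the budget."""
    return gpu_hours(num_gpus, weeks) * 60 / num_experiments


def gpus_needed_at_slowdown(num_gpus: int, slowdown: int) -> int:
    """GPUs needed to finish the same runs in the same calendar time
    if each run took `slowdown` times longer (assumes linear scaling)."""
    return num_gpus * slowdown


if __name__ == "__main__":
    # Figures from the thread: ~12 GPUs, 3 weeks, 20,000 experiments.
    print(gpu_hours(12, 3))                       # total GPU-hours
    print(minutes_per_experiment(12, 3, 20_000))  # GPU-minutes per run
    # Hypothetical slowdown factor, chosen only for illustration.
    print(gpus_needed_at_slowdown(12, 1000))
```

Under these assumptions the budget works out to about 6,000 GPU-hours, or roughly 18 GPU-minutes per experiment. A hypothetical 1000x slower stack would need on the order of 12,000 GPUs to fit the same sweep into the same 3 weeks.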
In 2016, at an AI conference in NYC, I explained artificial consciousness, world models, predictive coding, and science as data compression in less than 10 minutes. I happened to be in town, walked in without being announced, and ended up on their panel. It was great fun.
Organizer: @davidchalmers42
Distinguished panel members:
@kahneman_daniel
Susan Schneider
@GaryMarcus
Jaan Tallinn
Original YouTube video:
https://t.co/bbqzd4yR5a
29:40-39:15
Two relevant pages from the AI Blog:
1990: planning & reinforcement learning with recurrent world models and artificial curiosity. This yields a simple explanation of consciousness and self-awareness: https://t.co/OBoGmuZWab
1991: first very deep learning with self-supervised pre-training. A conscious chunker recurrent neural net (RNN) attends to unexpected events that surprise a lower-level subconscious automatiser RNN. The automatiser uses neural network distillation to compress and absorb the formerly conscious insights and behaviours of the chunker, thus making them subconscious: https://t.co/PzDWRXKzlV
Interview:
2016: J. Carmichael. Artificial Intelligence Gained Consciousness in 1991. Why A.I. pioneer Jürgen Schmidhuber is convinced the ultimate breakthrough already happened. Inverse, 2016. https://t.co/388KN8ga5D
Let us learn to be rich in a different way: more attentive to relationships, more intent on valuing the common good, more attached to the local area, more grateful in welcoming and integrating those who come to live with us.
@smdiehl I doubt it will give us a better fb or google, but it can give us programmable money, which has not existed before. This capability has been co-opted by the Chinese eCNY. You can argue this is not desirable, but it's quite remarkable that such a thing can work at all
@smdiehl nah, bitcoin's contribution is that it shows decentralization has a chance to compete with centralized institutions at global scale. It's unfortunate that bitcoin does not solve p2p payment, but its intellectual children will.
@fst_nml Sharding can increase capacity by around 20x, then there will be all kinds of L2s. Ethereum has better decentralization than any of the high throughput chains. One exception is Algorand.
Fire the generals, think tankers, and NGOs and hire OJ instead. He understands sovereignty is ultimately rooted in power, not wishful thinking. A disciple of Schmitt.