The purpose of taxation is to give the government money to do things. The government can't do everything itself (via direct employees) so it contracts out work to the private sector. SpaceX provides services to the government for dramatically cheaper rates than its competition, so it wins a lot of contracts.
If you don't want your tax dollars going to things like space exploration, you can try to elect representatives who don't allocate money towards such contracts. You might not win, and that's democracy.
@iamsahaj_xyz One way for this to be true, by the way, is if you have worse taste than the models.
Personally I find that they usually won't write the way I want in the first or second pass but I can get them to do it, and once you establish a good architecture they follow it.
@QiaochuYuan I disagree with the premise. Why would we expect one to have pulled away? The IP moat isn't that deep; it's mostly about logarithmic compute scale.
It's sorta reaching, but what if you did use something like Rust and the Dioxus hotpatcher? I'm not sure that this would have tangible benefits to pi, but I think it would be a pretty cool experiment. I think "agents self-modifying their harness" is a very cool concept that it would be sad to just limit to interpreted languages.
PV can't get THAT much more efficient, even in theory. We have single-digit percentage points left in monocrystalline (~22% to 29%), and perovskite can in theory get us to 43%. But these gains are not fast; they are slogs, and it'll be a decade or two to get to 40% commercial. If we suspend realistic constraints and say you can get to 99%, that's not even a 5x, but we want power to 10x or even 100x.
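The headroom arithmetic is quick to check. A minimal sketch, assuming ~22% (the low end of the monocrystalline range quoted above) as the baseline and using only the figures from the post; the names are made up for illustration:

```python
# Efficiency figures quoted in the post, expressed as fractions.
BASELINE = 0.22  # assumed baseline: low end of the ~22% to 29% monocrystalline range

CEILINGS = {
    "monocrystalline_limit": 0.29,
    "perovskite_limit": 0.43,
    "hypothetical_99": 0.99,  # realistic constraints suspended
}


def gain(target: float, baseline: float = BASELINE) -> float:
    """Multiplier on output per unit of panel area, all else held equal."""
    return target / baseline


# Round to two decimals for display.
gains = {name: round(gain(eff), 2) for name, eff in CEILINGS.items()}

for name, multiple in gains.items():
    print(f"{name}: {multiple}x")
```

Even the 99% case lands at roughly 4.5x over the assumed baseline, so cell efficiency alone can't supply a 10x, let alone a 100x.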
@goodside @aidan_mclau I have the impression they're not bottlenecked on training compute at this point. I've assumed that they are bottlenecked on useful things to do in post-training that lead to better generalization.
Doesn't seem plausible to me that they wouldn't try. If one lab did, they'd be able to tout that their LLM is the best at chess, a pretty good proxy for the ability to reason, intuit, switch between the two, consistently over 100+ steps without making mistakes or losing track.
It would also make a really good benchmark because it could be Elo-scored instead of a %, which naturally has to saturate eventually.
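For reference, the standard Elo update is tiny. A minimal sketch; the K-factor of 32 and the function names are assumptions picked for illustration, not anything a lab has published:

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Expected score for A against B (1 = win, 0.5 = draw, 0 = loss)."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def update(rating_a: float, rating_b: float, score_a: float, k: float = 32.0) -> float:
    """Return A's new rating after one game; k is an assumed K-factor."""
    return rating_a + k * (score_a - expected_score(rating_a, rating_b))


# Two equally rated players: the winner gains half of K.
print(update(1500, 1500, 1.0))

# A 400-point favorite is expected to score about 10/11 of the points.
print(expected_score(1900, 1500))
```

Ratings are relative and have no ceiling: a model that keeps beating stronger opponents keeps gaining points, whereas a percentage score tops out at 100.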
@mitsuhiko Kinda confused on their strategy because they must know this is a losing battle long term. They would need to go full DRM/anti-cheat, which is not practical for the kinds of environments they want to ship to.