Inadvertent disengagement of FSD at highway speeds on a curve is a super dangerous case: the immediate loss of steering input straightens the trajectory, and you depart your lane before you realize what is happening. Accidentally tapping the brake or jostling the steering wheel will trigger a disengagement as well. I warned my wife about this recently after experiencing it myself.
@martin_casado Even better, they can play hide and seek: a good model introduces a subtle vuln in a large codebase, and you RL the next model against the hard samples that the original model can’t find.
Something I've been thinking about - I am bullish on people (empowered by AI) increasing the visibility, legibility and accountability of their governments.
Historically, it is governments that act to make society legible (e.g. "Seeing Like a State" is the common reference), but with AI, society can dramatically improve its ability to do this in reverse. Government accountability has not been constrained by access (the various branches of government publish an enormous amount of data); it has been constrained by intelligence - the ability to process a lot of raw data, combine it with domain expertise and derive insights. As an example, the 4000-page omnibus bill is "transparent" in principle and in a legal sense, but certainly not in a practical sense for most people. There's a lot more like it: laws, spending bills, federal budgets, Freedom of Information Act responses, lobbying disclosures... Only a few highly trained professionals (e.g. investigative journalists) could historically process this information. This bottleneck might dissolve - not only are the professionals further empowered, but a lot more people can participate.
Some examples to be precise: Detailed accounting of spending and budgets, diff tracking of legislation, individual voting trends w.r.t. stated positions or speeches, lobbying and influence (e.g. graph of lobbyist -> firm -> client -> legislator -> committee -> vote -> regulation), procurement and contracting, regulatory capture warning lights, judicial and legal patterns, campaign finance... Local governments might be even more interesting because the governed population is smaller so there is less national coverage: city council meetings, decisions around zoning, policing, schools, utilities...
Certainly, the same tools can easily cut the other way and it's worth being very mindful of that, but I lean optimistic overall that added participation, transparency and accountability will improve democratic, free societies.
(the quoted tweet is half-ish related, but inspired me to post some recent thoughts)
@karpathy @yacineMTB One of the best uses of AI is to make complicated systems dramatically more accessible and governments are some of the most complicated systems around.
A CEO from one of our portfolio companies shared this with their team. I’m re-sharing it with their permission, because it resonated and reflects what all founders and CEOs should be communicating.
--
We are living through a period of compounding change. And in moments like this, the biggest risk is no longer making the wrong decision. It is moving too slowly while the world moves around you.
There are two paths. We can play defense:
- Protect what we have
- Optimize what works
- Wait for clarity
It feels safe. It isn’t.
Or we can play offense:
- Learn faster than the environment changes
- Use new tools to solve old problems in better ways
- And create entirely new strategies and businesses
That’s where the opportunity is.
Challenge yourself to do things faster and better than you have ever attempted. Stay uncomfortable. Stay on the front foot.
I love my Cybertruck, but I have noticed that FSD disengagement at speed in a turn is somewhat dangerous because the wheel immediately straightens out. If the disengagement is accidental or unexpected, or the driver is otherwise unprepared to immediately match the previous turning force, the trajectory of the vehicle will quickly deviate from nominal.
@mattpocockuk I think the ambiguity is due to the tech landscape evolving faster than their internal biz/product strategy can keep up, especially since there are likely conflicting interests within the company on this.
@karpathy How to organize this so that the meat computers still have a chance of understanding/learning which research directions were fruitful? Some master list of research topics and associated experiments seems useful in the fully distributed version.
@ibuildthecloud Do you think that most of the coding agent gains in the last 6 months are coming from lab RL on traces from their own agents, or from someplace else? We are still in the early days of this feedback loop, but it will accelerate. Both the tech and the biz strategy push labs in this direction.
Thanks, I definitely see the value for dev workflow. The per-workspace agentic chat and attendant service UI component here might generalize to all sorts of interesting biz workflows. (I am building these sorts of chat+app vertical micro apps for customers now.) It would be interesting if the same framework could be used by both developers and users with different configs. Biz agents might just be code agents with the sharp edges removed and different tools and skills.
My concern is that agents are a system problem, and the system is the combination of the harness and the model. The harness introduces and presents abstractions for the model (tools, skills, etc.) that models will use with varying degrees of accuracy. Foreign harnesses will achieve some baseline accuracy in feature usage, maybe 85%, but models RL-trained on native harness traces (e.g. Claude Code on Claude or Codex on Codex) should achieve much better usage accuracy on native harness features on an ongoing basis.