I look forward to participating in the Verification Summit (https://t.co/wVitrsgYfq) and sharing my perspective on Physical AI safety. I strongly agree that verification and validation are key frontiers for unlocking Physical AI in high-stakes, high-reliability applications, from autonomous cars to industrial robotics!
@fv_summit @khoslaventures @PramaanaLabs @boldcapfund
5 days to go!
14 speakers. One stage. One question: Can we hold AI accountable where it matters the most?
Join the room where AI stops saying 'Sorry'!
It's time to verify AI!
@khoslaventures @PramaanaLabs @boldcapfund
Outputs that sound right are not the same as outputs we can prove right.
The next phase of AI runs on verified systems: formal methods, theorem provers, runtime checks, provable agents.
It's time to verify AI!
Join us on the 10th of June in SF!
@PramaanaLabs @boldcapfund
We can push the frontier on AI only when we know automatically what works and what doesn't. Verification is the tool for supercharging AI.
Join us at https://t.co/3EdFBkBEjH to verify!
Autoformalization is the next critical frontier unlock to get us to ASI!! We are excited to co-host the inaugural Verification Summit with @PramaanaLabs and @vkhosla on June 10th. If you are a researcher, founder or investor interested in the frontier, you should be here. @boldcapfund
We are hosting the launch edition of the Verification Summit.
A technical-first gathering for engineers, founders, researchers and practitioners building verified AI & formalization for real world domains.
Apply: https://t.co/o7N6xUQMEw
https://t.co/GxmlDsYrAY
Q.E.D.
Using Lean 4 to identify contradictions in laws.
Very exciting work by Pramaana Labs https://t.co/zl239Thp7L. They have built a DSL called LegalLean to formalise US tax codes.
Everyone says they’re building a world model. Very few actually are.
Most AI learns to see the world.
A world model learns to predict what happens in it — specifically, what happens when people do things. That’s a different problem.
Seeing is passive. Consequences require understanding cause and effect. Any AI can learn to read a scene. A world model learns INTUITION about what changes it. @gen_intuition
Had fun presenting at the @OpenAI demo party hosted by Boldcap.
Shared a sneak peek into our work on verification - the key missing piece to unlocking ASI across all domains. More soon :)
Appreciate the fantastic organization, @siddharth_ram !
@karpathy Lean has a real case to be the target language.
You can take a production function f(a, b, c), fix a and b, leave c free, and prove behavior and optimizations without rewriting the code. SMT can handle constraint solving on top of Lean. We are building such systems at Pramaana.
@dwarkesh_sp I believe superintelligence would be solved by making domains computable. We are seeing signs of it in math, like Erdős problems being solved in Lean. Once we create this underlying domain infrastructure, we can discover superhuman solutions reliably.
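A minimal Lean 4 sketch of the "fix a and b, leave c free" idea. The function `f`, the specialisation `g`, and the chosen constants are hypothetical stand-ins for illustration, not Pramaana's code, and the second proof assumes a recent Lean 4 toolchain where the `omega` tactic is available in core:

```lean
-- Hypothetical stand-in for a production function (not from any real codebase).
def f (a b c : Nat) : Nat := a * c + b

-- Fix a = 2 and b = 3, leave c free. `f` itself is not rewritten.
def g (c : Nat) : Nat := f 2 3 c

-- Behaviour of the specialised function, stated for every c.
-- Both sides unfold to the same term, so `rfl` closes the goal.
theorem g_eq (c : Nat) : g c = 2 * c + 3 := rfl

-- A simple property that holds for all c: the result is always positive.
theorem g_pos (c : Nat) : 0 < g c := by
  unfold g f
  omega
```

The point of the sketch is only that the theorems quantify over the free argument `c`, so they cover every input rather than a sample of test cases.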
Sharing an interesting recent conversation on AI's impact on the economy.
AI has been compared to various historical precedents: electricity, industrial revolution, etc. I think the strongest analogy is that of AI as a new computing paradigm (Software 2.0) because both are fundamentally about the automation of digital information processing.
If you were to forecast the impact of computing on the job market in the ~1980s, the most predictive feature of a task/job you'd look at is to what extent the algorithm of it is fixed, i.e. are you just mechanically transforming information according to rote, easy to specify rules (e.g. typing, bookkeeping, human calculators, etc.)? Back then, this was the class of programs that the computing capability of that era allowed us to write (by hand, manually).
With AI now, we are able to write new programs that we could never hope to write by hand before. We do it by specifying objectives (e.g. classification accuracy, reward functions), and we search the program space via gradient descent to find neural networks that work well against that objective. This is my Software 2.0 blog post from a while ago. In this new programming paradigm then, the new most predictive feature to look at is verifiability. If a task/job is verifiable, then it is optimizable directly or via reinforcement learning, and a neural net can be trained to work extremely well. It's about to what extent an AI can "practice" something. The environment has to be resettable (you can start a new attempt), efficient (a lot of attempts can be made), and rewardable (there is some automated process to reward any specific attempt that was made).
The more a task/job is verifiable, the more amenable it is to automation in the new programming paradigm. If it is not verifiable, it has to fall out from the neural net magic of generalization (fingers crossed), or via weaker means like imitation. This is what's driving the "jagged" frontier of progress in LLMs. Tasks that are verifiable progress rapidly, including possibly beyond the ability of top experts (e.g. math, code, amount of time spent watching videos, anything that looks like puzzles with correct answers), while many others lag by comparison (creative, strategic, tasks that combine real-world knowledge, state, context and common sense).
Software 1.0 easily automates what you can specify.
Software 2.0 easily automates what you can verify.
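The three properties named above (resettable, efficient, rewardable) can be made concrete with a toy sketch. Everything here is a hypothetical illustration: the `SortingTask` class, the `practice` loop, and the two policies are invented names, and a real training setup would update the policy from the reward rather than just average it.

```python
import random


class SortingTask:
    """Hypothetical toy task: the 'job' is to sort a short list of integers."""

    def __init__(self, size=5, seed=0):
        self.size = size
        self._rng = random.Random(seed)
        self.items = []

    def reset(self):
        # Resettable: every call starts a fresh, independent attempt.
        self.items = [self._rng.randint(0, 99) for _ in range(self.size)]
        return list(self.items)

    def reward(self, attempt):
        # Rewardable: an automated check scores the attempt, no human in the loop.
        return 1.0 if attempt == sorted(self.items) else 0.0


def practice(task, policy, attempts):
    # Efficient: attempts are cheap, so many can be run and averaged.
    total = 0.0
    for _ in range(attempts):
        observation = task.reset()
        total += task.reward(policy(observation))
    return total / attempts


def good_policy(xs):
    # Always produces the correct answer.
    return sorted(xs)


def empty_policy(xs):
    # Always produces a wrong answer (an empty list).
    return []


if __name__ == "__main__":
    print(practice(SortingTask(), good_policy, 100))
    print(practice(SortingTask(), empty_policy, 100))
```

A task like "write a good strategy memo" has no cheap `reward` function of this kind, which is the gap the thread is pointing at.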
Last year, AlphaProof & AlphaGeometry reached a key landmark in AI by achieving silver medal level performance at the International Math Olympiad.
Today, @Nature is publishing the methodology behind our amazing agent AlphaProof! @GoogleDeepMind
Paper: https://t.co/eUGKeVrH3O
Great talking about a stealth startup turning AI outputs into machine checkable proofs with @ranjan_vittal and @krishnan_rag in Palo Alto. [Yes, Palo Alto.]