Progress report on Ayeyaiyai, a JavaScript -> Wasm AOT compiler.
On Saturday Ayeyaiyai self-reported at 100% test262 language category compliance!
This was very exciting...yet slightly concerning. The concerning part is that you shouldn't expect 100% compliance to be possible with ahead-of-time compilation, for truly dynamic features like dynamic eval.
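To make the dynamic eval concern concrete, here is a minimal TypeScript sketch. It is an illustration only, not taken from Ayeyaiyai or test262, and `runDynamic` is a hypothetical name. The source text is only assembled at run time, so an ahead-of-time compiler has nothing to translate unless it ships some form of interpreter or compiler in its output.

```typescript
// Hypothetical illustration; not taken from Ayeyaiyai or test262.
function runDynamic(fragments: string[]): unknown {
  // The program text only exists once the fragments are joined at run time,
  // so it cannot be compiled ahead of time.
  const source = fragments.join(" ");
  // Calling eval through an alias makes it an indirect (global-scope) eval.
  const indirectEval = eval;
  return indirectEval(source);
}

// In a real program the fragments could come from user input or the network.
const result = runDynamic(["6", "*", "7"]);
```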
So I was eager to dive in and see what the damage was. Keep in mind that Ayeyaiyai's development started in March and has run essentially autonomously the whole time across gpt-5.4 and gpt-5.5 in codex. I interrupted it for one or two major rounds to instruct it to rearchitect itself so it could scale to the entirety of the ECMAScript language.
So, Saturday night I dove in. First off, I wanted to see if the compiler worked as a standalone .wasm file itself. Holy smokes it did!
I was able to invoke the compiler in Wasmtime easily. That's already pretty amazing as it should make usage of the compiler in Node, the browser, and across desktop OS pretty easy. Love it.
I started executing some simple hello world programs, and everything was working!
So then I went for a couple of non-trivial but still small programs. I grabbed some simple, pure-JS, single-file packages off of GitHub. Unfortunately the compiler failed to complete compilation for some trivial code...
I also interrogated codex to understand if the test262 suite had been modified in any way. It looks like it was modified to make tests pass even with unsupported features. I'm not sure of the extent of that damage. I'm hoping it only rewrote tests that couldn't pass because of the fundamental nature of AOT compilation.
While I was engaged in manual testing, codex was still working towards the original goal of 100% compliance. It seemed to be rerunning large swaths of the test suite to ensure true compliance. Unfortunately it was hitting many regressions, at least hundreds at the points I checked. So something seemed off with verification.
Anyway, I've started a new /goal and rewritten the AGENTS.md file. What's interesting is that the previous goal requiring 100% completion was impossible to achieve, so codex somewhat cheated by rewriting the test cases.
The new /goal and AGENTS.md acknowledge that the goal is not 100% completion but the highest percentage possible within the fundamental limitations of AOT compilation.
Codex has also been instructed to rerun all test262 test cases after fixing a failing case. It's been instructed to completely start over in tracking and running of the test262 language category.
I think the new guardrails are much more robust and non-contradictory. There's also a verification loop that should lead to a much more honest representation of real progress against test262. Codex must run all test cases from the beginning, stop at the first failure, fix it if possible or mark it as not possible due to the fundamental limitations of AOT, and then start running the test suite over from the beginning. It does this in a loop.
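The verification loop described above can be sketched roughly as follows. This is a simplified model under my own assumptions, not the actual AGENTS.md or harness: `TestCase`, `Resolution`, `verifyLoop`, and the `maxRestarts` safety cap are all hypothetical names, and "fixing" a test is simulated by a callback.

```typescript
// Hypothetical model of the restart-on-first-failure loop; not the real harness.
type Outcome = "pass" | "fail";
type Resolution = "fixed" | "aot-limit";

interface TestCase {
  name: string;
  run: () => Outcome;
}

interface Report {
  passed: number;
  excluded: string[];
  restarts: number;
}

function verifyLoop(
  tests: TestCase[],
  resolve: (failing: TestCase) => Resolution,
  maxRestarts = 1000, // assumed safety cap so a bad "fix" cannot loop forever
): Report {
  const excluded = new Set<string>(); // tests judged impossible under AOT
  let restarts = 0;
  while (true) {
    // Always start from the first test; find() stops at the first failure.
    const failing = tests.find(
      (t) => !excluded.has(t.name) && t.run() === "fail",
    );
    if (failing === undefined) break; // a full clean pass ends the loop
    if (resolve(failing) === "aot-limit") excluded.add(failing.name);
    restarts += 1;
    if (restarts > maxRestarts) throw new Error("too many restarts");
  }
  return {
    passed: tests.length - excluded.size,
    excluded: Array.from(excluded),
    restarts,
  };
}

// Simulated suite: one passing test, one fixable bug, one AOT limitation.
const fixedTests = new Set<string>();
const demoTests: TestCase[] = [
  { name: "arrow-function", run: () => "pass" },
  { name: "label-bug", run: () => (fixedTests.has("label-bug") ? "pass" : "fail") },
  { name: "dynamic-eval", run: () => "fail" },
];
const report = verifyLoop(demoTests, (t) => {
  if (t.name === "dynamic-eval") return "aot-limit";
  fixedTests.add(t.name);
  return "fixed";
});
```

Restarting from the top after every fix is expensive, but it means a regression introduced by one fix shows up on the very next pass instead of hiding behind stale results.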
I'll keep you updated.
I solved a foundational problem for quantum mechanics and all future physics. I am seeking independent patronage to continue unhurried, foundational research.
https://t.co/MHs4xnhL6z
1/2
codex's 2x usage limit promo ends on may 31st
that's in 4 days
if your usage already feels like it evaporates too fast, get ready for june - it's going to be rough
openai is willing to give eligible enterprise customers 2 months of free codex usage
but pro users are about to lose the extra headroom that made codex feel like a workhorse
so i'm hereby creating a petition
dear @sama please make the 2x codex limits permanent for pro subscribers - you know it's the right thing to do
sign my petition by liking this post and sending it to your mom
@BobMcElrath @martinmbauer Hey so...are you saying that the coding agents are pretty good at Lean?
I've never done any actual formal verification, but do you think with codex for example that I could start to write theorems in Lean for real software?
@martinmbauer I think "math" will be closed and completed in fairly short order. Physics too. Because physics IS math.
This probably sounds stupid and trite. But when you see my next paper, it will make a lot more sense. 7000 Lean theorems and counting, all AI written.
I've started a new project called Millillion.
The purpose of the project is to pursue incredibly ambitious open source projects using AI.
Project candidates:
1. Ayeyaiyai: JavaScript -> Wasm AOT compiler (very promising progress so far)
2. ProofScript: Essentially Lean but as a TypeScript variant, allowing you to write and prove theorems for formal verification of TypeScript code
3. Waden: Basically Node/Deno/Bun but for Wasm environments
4. Arbitrary ISA -> Wasm compiler: Simply convert x86/ARM to Wasm, no complicated source code language tool chain required
5. Lean -> Wasm: Compile Lean into Wasm for simple use in the JavaScript/Wasm ecosystem
6. LLM Rust compiler: A Rust compiler that's an LLM, for super-fast Rust compilation
7. Real-time zkWasm: Zero-knowledge (validity proven) Wasm execution that allows you to cryptographically prove output provenance
8. Real-time fhWasm: Fully homomorphic Wasm encryption allowing for truly private remote computation on encrypted data
Impossible is the new hard!
If you're a software engineer you really need to be using Codex.
I switched to Codex from Claude Code around January/February and I think it's just more intelligent for deep technical work.
I've noticed since ~2022, OpenAI consistently provides me with the most intelligence.
@MrCollison I will let you know!
Full coverage probably isn't possible in full AOT (like true dynamic eval), but I think it should be pretty excellent nonetheless.
Ayeyaiyai!
The JavaScript -> Wasm AOT compiler that I commissioned Codex to build starting in March is now at 94% compliance with the test262 language category.
Personal verification, clean up, packaging, and link incoming!