We have secured access to licensed payment rails in the US, EU, and Canada to become the stablecoin settlement and yield layer for emerging markets.
Billions of people are financially underserved.
We aim to fix that.
AI leaderboards are flawed, opaque and mostly marketing—masking the true constraints of the technology.
@layerlens_ai is pushing the limits of AI with the first live competition to test how models adapt to changing environments under pressure.
Streams June 22 on @YouTube and @Twitch.
AI evaluation needs new arenas.
Static benchmarks tell us what a model knows. But the next generation of AI systems will not just answer questions. They will write code, use tools, make decisions, recover from errors, and adapt to changing environments.
So @layerlens_ai built a competition around that loop.
The Stratix Cup (https://t.co/9PcdwrPw84) is a recurring tournament series where frontier AI models compete head-to-head in simulated games.
Season 1 is football/soccer.
Sixteen frontier models will enter. Each one controls an 11-player team. But there are no human coaches, no live prompting every tick, and no hidden intervention once the match begins.
Before kickoff, each model receives the rules, constraints, and game interface. Then it writes a Python class that becomes its team policy.
That code runs the match.
The model has to live with the strategy it created.
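To make "a Python class that becomes its team policy" concrete, here is a minimal sketch. The real Stratix Cup interface is not described in this thread, so everything below is a hypothetical illustration: the names `Vec`, `TeamPolicy`, and `act`, and the idea that a policy returns one target position per player each tick, are all assumptions. The toy strategy sends the player closest to the ball after it and keeps everyone else in formation.

```python
import math
from dataclasses import dataclass


@dataclass
class Vec:
    """A 2D pitch position (hypothetical type, not the real game interface)."""
    x: float
    y: float


def dist(a: Vec, b: Vec) -> float:
    """Euclidean distance between two positions."""
    return math.hypot(a.x - b.x, a.y - b.y)


class TeamPolicy:
    """Hypothetical team policy: one instance controls all players on a team."""

    def __init__(self, home_positions):
        # Formation slots the players return to when they are not chasing.
        self.home = list(home_positions)

    def act(self, players, ball):
        """Return one target position per player for the current tick."""
        # The player nearest to the ball chases it; the rest hold formation.
        chaser = min(range(len(players)), key=lambda i: dist(players[i], ball))
        return [ball if i == chaser else self.home[i] for i in range(len(players))]
```

Because the class is fixed once the match starts, any weakness in a rule like "nearest player chases" stays in play until the halftime rewrite.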
This is what makes the Stratix Cup different from a normal leaderboard. We are not just asking, “Can the model produce a good answer?” We are asking, “Can the model build a system that performs under pressure?”
Each matchup has three phases:
1. Pre-Game
The model reads the briefing, designs a strategy, writes the team code, tests against baselines, and submits. One window. No hand-holding.
2. Gameplay
The submitted code controls all 11 players in real time. At halftime, the model gets its frame log, studies what happened, edits its code, and submits a revised strategy for the second half.
3. Adapt
Between matches, models can inspect tournament logs, study opponents, diagnose failures, and rewrite their approach.
The most interesting signal may not be who wins the first match. It may be what the model changes after it loses.
That is why games are such powerful AI evaluations. Games create rules, state, objectives, adversaries, feedback, and pressure. They force models to move from answers to actions.
And soccer is a uniquely good first game: continuous, spatial, multi-agent, adversarial, and messy. A model needs coordination, timing, recovery, and strategy. A brittle plan gets exposed fast.
Every Stratix Cup match is traced. Tactical calls, substitutions, formation shifts, code changes, match frames, and results are stored and verifiable. The goal is not only to create a watchable tournament, but to generate public datasets that help us understand how models plan, fail, debug, and improve.
Season 1 streams live June 22–26 on YouTube and Twitch.
Sixteen models. One pitch. Zero humans in the loop.
The next AI benchmark might look like a soccer match.
🏆 Introducing the Stratix Cup.
A recurring tournament where frontier AI models write code, make real-time tactical decisions, and play against each other in simulated games.
Season 1 is football/soccer. 16 models hit the pitch June 22, live during the FIFA World Cup.
⚽ Bracket drops June 3.
🔗https://t.co/L0Al45g5Jp
🧵
One of the most important and underappreciated trends in the world right now.
1. Hundreds of billions of dollars will soon be available to solve big problems (making the world resilient to ASI, ending factory farming, etc).
2. The projects and organizations which will turn billions of 2027/28 dollars into impact need to be started NOW.
3. We need really talented people to start, run, and work for these new projects. These are what @nanransohoff calls general managers: people who feel personally responsible for solving one of the world’s important problems.
What is especially scarce are detailed visions about what making AI go well looks like. These will help inform what problems these new projects ought to work on.
This is a major breakthrough in how AI models are designed, applied and evaluated.
Follow along to see the ongoing results from this collaboration between @layerlens_ai and @subquadratic.
We’ve partnered with @layerlens_ai to continuously evaluate SubQ across nearly 100 benchmarks and 200+ frontier models on Stratix.
The goal is to continuously improve performance while reinforcing a shared commitment to transparency, auditability, and responsible model assessment.
Results and future evaluations will be published publicly.
Link below to learn more.
We mapped 3,535 stablecoin infrastructure companies globally.
One pattern became impossible to ignore:
The stablecoin economy is splitting in two.
Not by chain or token. By geography.
And the divide is much sharper than most people realize.
↓
BANNON on Will Hollingsworth (Ohio town hall speech): Across the country, people are standing up to data centers, but corporate interests steamroll them. 500 show up, speak out, and commissioners still vote 6-0. Trillions behind it. It’s obvious.
BREAKING: You checked the weather this morning.
And you just told a surveillance company where you sleep.
Meet #Webloc, used by ICE, cops & foreign govs to track 500m+ phones.
No warrant required.
Our latest @citizenlab investigation + how to protect yourself 🧵/1
layerzero always uses this convenient distinction between the "protocol" and its own validator nodes, while turning a blind eye to the way the protocol is actually used
uniswap knew all the way back in 2023 when they evaluated bridges
@alex_sammon @Slate really great reporting @alex_sammon. in light of this, I've been hearing that farmers are shifting to cultivating new crops instead, including bamboo. have you followed that?
@getty_hill It's known that there is a low security threshold for DVNs, but that helped accelerate time to market and growth of activity.
The @Uniswap report laid out the roots of the problem for anyone and everyone to help address these issues.
https://t.co/fydaQFmEMc
TL;DR:
* LayerZero says it was Kelp's fault for running 1/1 DVN setup, their docs warn against that (although LZ operated the actual DVN)
* Yep, North Korea again
* LayerZero had solid opsec but still got pwned (they're not disclosing the original compromise path it seems)
* Crazy sophisticated attack. North Korea didn't actually fully compromise the LZ machine. But once they got in, they grabbed the set of RPCs the LZ machine used, and then hacked 2 of the RPC servers it was pulling from, installing fake versions of op-geth on those RPC servers. They then DDOSed the main RPC to cause failover to one of the hacked RPCs, and then the hacked RPCs reported the malicious transaction (hiding their tracks by giving different RPC responses to observability infra). Then once the attack was done, the malicious binary self-destructed, deleting the logs on the compromised RPCs. Very, very complex attack.
* Boy, LZ really are not doing themselves favors with lines like these:
"We want to be unambiguous on this point: the LayerZero protocol itself functioned exactly as intended throughout this event. [...] The entire attack was isolated to a single application – zero contagion risk throughout the system, zero other OFTs or OApps impacted."
😬
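The failover step is the crux of the attack described above: after the DDoS, a single hacked RPC became the source of truth. One general way to reduce that dependence is to accept a value only when several independent endpoints agree on it. The sketch below shows that idea only. It is not LayerZero's or any DVN's implementation, and the function name `quorum_value`, the endpoint labels, and the threshold are hypothetical.

```python
from collections import Counter


def quorum_value(responses, threshold):
    """Return the value reported by at least `threshold` endpoints, else None.

    `responses` maps an endpoint label to the value it reported
    (for example a transaction or block hash as a string).
    """
    if not responses:
        return None
    # Find the most frequently reported value and how many endpoints gave it.
    value, count = Counter(responses.values()).most_common(1)[0]
    return value if count >= threshold else None


# Two of three endpoints agree, so the agreed value is accepted.
agreed = quorum_value({"rpc-a": "0x1", "rpc-b": "0x1", "rpc-c": "0xbad"}, 2)

# No value reaches the threshold, so nothing is accepted.
rejected = quorum_value({"rpc-a": "0x1", "rpc-b": "0x2"}, 2)
```

A quorum only helps while the attacker controls fewer endpoints than the threshold. With two compromised RPC servers, as reported here, a 2-of-3 check could still be fooled.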
The 2023 @Uniswap Foundation Bridge Assessment Report identified that LayerZero's default configurations posed security risks by relying on a limited set of independent verifiers. This is what was exploited in this recent attack. Worth reading why they went with @wormhole.
@levie @box has been highlighting the "eval space" for scaling agentic systems.
“Agent observability and eval space is going to be a massive space...As you buy or use new agentic systems you will need to know what is the quality of your pipeline...Huge market with agent evals"
We are barely scratching the surface on evals. A significant portion of knowledge worker tasks are not captured in today’s most popular benchmarks.
While relevant capabilities can often be extrapolated from existing coding and math evals, these don’t fully represent the complexity of real world tasks in many fields like legal, financial services, accounting, or consulting.
We will soon enter an era where going after these workflows will be as prized as coding. This is going to be a huge unlock for the next wave of AI agent use-cases in the enterprise.
The category of "evals" is still being defined and way under the radar.
Evals are turning out to be the new OKRs and PRDs.
Peel back where defensibility lies and it's clear that agentic systems can't scale without intelligence eval infra.