https://t.co/wchmYJOmbm just shipped SkillOS.
An open-source proof that AI-agent work can compound.
Jobs become traces.
Traces become tested skills.
Tested skills improve unit economics.
Proof:
https://t.co/PFVyKH5rPq
GitHub:
https://t.co/Bc1eycKxJd
#QuebecAI #Jobs
[ α‑AGI Business 👁️✨ ]
Example Scenario:
"👗⚡️ Minted StylePulse.a.agi.eth with $AGIALPHA—AGI scouts trends, auto-prints patterns, books factories. Collection sold out in 36 h. Solo designer, global runway." #FashionAlpha
GitHub: https://t.co/CkTQdjW1Sc
#AGIALPHA #MetaAgentic
AGI Jobs v0 (v2): A Superintelligent AI Labor Marketplace of Unprecedented Power
Article: https://t.co/HlNVwh4OAL
#AGIALPHA #AIAgents #Superintelligence
[ 18 Aug 2016 👾👀 ]
✨📜 "New Highest Score in the World at Atari Asterix!" 📜✨
The highest human score is 132,500.
Vincent Boucher's new superhuman AI agent scores 425,000 (!)
OpenAI Gym: https://t.co/OqsrTZ0hIm
YouTube: https://t.co/DewtD8WMWH
#AGI #AGIALPHA #MONTREALAI
AI research automation is crossing a threshold.
But the real question is not:
Can AI produce papers?
It can.
The harder question is:
Can it preserve the substance of science?
A new paper from the Awesome AI Auto-Research Team offers one of the most useful maps I’ve seen of this emerging frontier:
AI for Auto-Research: Roadmap & User Guide
The authors analyze AI across the complete research lifecycle:
Creation — ideas, literature, coding, experiments, tables, figures
Writing — manuscript drafting and structure
Validation — peer review, rebuttal, revision
Dissemination — posters, slides, videos, social media, project pages, paper agents
This lifecycle framing matters.
Because research is not one task.
Ideas become experiments.
Experiments become claims.
Claims become manuscripts.
Reviews become revisions.
Papers become public narratives.
If errors enter early, automation can amplify them downstream.
That is the central tension of auto-research:
AI is becoming very good at producing research-shaped artifacts.
But it remains far less reliable at judging whether those artifacts are novel, faithful, executable, reproducible, and scientifically meaningful.
The paper is refreshingly clear on the boundary.
AI performs best when tasks are structured, retrieval-grounded, tool-mediated, and externally checkable.
It becomes fragile when tasks require genuinely novel ideas, research-level experimentation, deep scientific judgment, or long-horizon accountability.
That distinction should shape how serious labs deploy these systems.
The future is not “fully autonomous AI scientist” as a default.
The credible deployment pattern, at least today, is human-governed collaboration:
AI for mechanical acceleration.
Humans for judgment, interpretation, responsibility, and epistemic ownership.
One sentence from the paper’s logic is worth sitting with:
artifact generation is outpacing scientific verification.
That is the whole field in one line.
The most dangerous systems will not be the ones that obviously fail.
They will be the ones that generate plausible papers, plausible reviews, plausible rebuttals, plausible figures, and plausible summaries while quietly losing provenance, evidence, and accountability.
This paper is not anti-automation.
It is pro-science.
It gives the field a taxonomy, benchmark map, tool inventory, stage-by-stage risk model, and practitioner playbook for building auto-research systems without confusing productivity with discovery.
Full credit to the authors:
Lingdong Kong, Xian Sun, Wei Chow, Linfeng Li, Kevin Qinghong Lin, Xuan Billy Zhang, Song Wang, Rong Li, Qing Wu, Wei Gao, Yingshuo Wang, Shaoyuan Xie, Jiachen Liu, Leigang Qu, Shijie Li, Lai Xing Ng, Benoit R. Cottereau, Ziwei Liu, Tat-Seng Chua, Wei Tsang Ooi.
Paper:
AI for Auto-Research: Roadmap & User Guide
https://t.co/sCzpxSYfj7
Project:
https://t.co/cGAB5u1br3
I’m attaching the first page because Figure 1 is the map everyone working on AI research agents should study.
The future of science will not be decided by who can automate the most.
It will be decided by who can automate without losing truth.
#AIResearch #ArtificialIntelligence #ScientificDiscovery #ResearchAutomation #AIAgents
A 0.6B model learned to manage giants.
That is the idea behind TRINITY, a new ICLR 2026 paper by Jinglue Xu, Qi Sun, Peter Schwendeman, Stefan Nielsen, Edoardo Cetin, and Yujin Tang.
The paper is not asking:
“How do we build one model that knows everything?”
It is asking something more interesting:
“How do we build a small intelligence layer that knows who should think, who should act, and who should verify?”
TRINITY is a lightweight coordinator for LLMs.
It does not merge weights.
It does not require architectural compatibility.
It does not need access to closed-model internals.
It does not try to turn the coordinator into the smartest model in the room.
Instead, it orchestrates a pool of strong models at test time, including closed and open models.
At each turn, TRINITY chooses a model and gives it one of three roles:
Thinker — plan and decompose
Worker — solve and execute
Verifier — critique and accept/revise
That may sound simple.
It is not.
Too many multi-agent systems are still prompts plus hope.
TRINITY learns the coordination policy.
A compact ~0.6B language model produces hidden-state representations of the conversation. A tiny head then uses those representations to decide the next model-role pair. The authors optimize this coordinator with an evolutionary strategy, sep-CMA-ES, because the problem is expensive, high-dimensional, and reward-sparse.
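To make the "tiny head picks a model-role pair" step concrete, here is a minimal sketch, not the authors' code. Assumptions: the model names in `MODELS` are hypothetical, `encode` is a stand-in that fakes a hidden state from the text instead of running a ~0.6B model, and the head is a plain linear layer with random weights. Training is left out on purpose.

```python
import random

ROLES = ["thinker", "worker", "verifier"]
MODELS = ["model_a", "model_b", "model_c"]  # hypothetical pool names
PAIRS = [(m, r) for m in MODELS for r in ROLES]


def encode(conversation, dim=8):
    """Stand-in for the small LM's hidden state (assumption): a fixed
    pseudo-random feature vector derived from the conversation text."""
    rng = random.Random(sum(ord(c) for c in conversation))
    return [rng.uniform(-1.0, 1.0) for _ in range(dim)]


def make_head(dim=8, seed=0):
    """Tiny linear head: one weight row per (model, role) pair."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in PAIRS]


def choose_pair(head, hidden):
    """Score every (model, role) pair and return the highest-scoring one."""
    logits = [sum(w * h for w, h in zip(row, hidden)) for row in head]
    best = max(range(len(PAIRS)), key=lambda i: logits[i])
    return PAIRS[best]


head = make_head()
model, role = choose_pair(head, encode("Solve: 12 * 7"))
print(model, role)
```

The head's weights are the only thing a search procedure would need to tune here. The paper does that with sep-CMA-ES; this sketch only shows the selection step.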
The result is not just better routing.
It is learned division of labor.
The paper reports that TRINITY outperforms individual models and existing coordination methods across coding, math, reasoning, and domain knowledge tasks. In its full-power setting, it reaches 86.2% on LiveCodeBench and transfers to held-out benchmarks including AIME, BigCodeBench, MT-Bench, and GPQA-D.
The most important idea here is bigger than the benchmark.
The future of AI may not be a single supermodel.
It may be an organization of models.
A small conductor.
A team of specialists.
A protocol for planning, execution, and verification.
An intelligence layer that learns how to allocate cognition.
This feels like a real shift:
from bigger models
to better systems
from raw capability
to coordinated capability
from “which model is best?”
to “what structure makes many models better together?”
Full credit to the authors:
Jinglue Xu, Qi Sun, Peter Schwendeman, Stefan Nielsen, Edoardo Cetin, Yujin Tang.
Paper: TRINITY: An Evolved LLM Coordinator
https://t.co/H7YE67U67f
I’m attaching the first page because the abstract is worth reading closely.
The future of AI may not be monolithic.
It may be coordinated.
#ArtificialIntelligence #LLM #MultiAgentSystems #MachineLearning #EvolutionaryAlgorithms
The next clue in AI reasoning:
answers may be attractors.
A new paper from Benhao Huang, Zhengyang Geng, and Zico Kolter introduces Equilibrium Reasoners (EqR) — a sharp mechanistic view of test-time scaling in latent reasoning models.
The core idea is simple, but deep:
Reasoning is not only generation.
Reasoning can be convergence.
EqR repeatedly updates a latent state. The authors hypothesize that generalizable reasoning emerges when training shapes the model’s latent dynamics so that stable attractors correspond to valid solutions.
In other words, the answer is not merely “produced.”
It is reached.
This matters because test-time compute only helps when the model’s internal dynamics know how to use it. More iterations can improve reasoning — or make it worse — depending on whether the trajectory moves toward a solution-aligned attractor or falls into a spurious one.
EqR scales along two axes:
Depth: run more iterations so a trajectory can settle.
Breadth: run multiple stochastic trajectories from different initializations and select/aggregate the ones that converge best.
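The two axes can be sketched on a toy problem. This is an illustration under stated assumptions, not EqR itself: the latent state is a single number, the update `step` is just `cos(z)` (chosen because it has one stable fixed point), and "select the ones that converge best" is read as "keep the trajectory with the smallest fixed-point residual". A real model would also have spurious attractors, which this toy does not.

```python
import math
import random


def step(z):
    """Toy latent update with a single stable fixed point (assumption)."""
    return math.cos(z)


def residual(z):
    """Fixed-point residual |f(z) - z|; zero exactly at an equilibrium."""
    return abs(step(z) - z)


def run_trajectory(z0, iterations):
    """Depth: apply the same update repeatedly so the state can settle."""
    z = z0
    for _ in range(iterations):
        z = step(z)
    return z


def solve(num_trajectories, iterations, seed=0):
    """Breadth: start from several random states, keep the best-converged."""
    rng = random.Random(seed)
    finals = [run_trajectory(rng.uniform(-3.0, 3.0), iterations)
              for _ in range(num_trajectories)]
    return min(finals, key=residual)


shallow = solve(num_trajectories=4, iterations=2)
deep = solve(num_trajectories=4, iterations=200)
print(residual(shallow), residual(deep))
```

With more iterations the residual shrinks toward zero, which is the toy version of "as fixed-point residual falls, accuracy rises". The residual doubles as the selection signal, so no external verifier is involved.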
The first-page figure captures the punchline beautifully: training is capped at 16 iterations, yet the learned dynamics extrapolate beyond 1,024 iterations at test time. As fixed-point residual falls, accuracy rises.
On Sudoku-Extreme, the paper reports a jump from 2.6% exact accuracy for feedforward models to over 99% with scalable latent reasoning — equivalent to unrolling up to ~40,000 layers. On Maze, EqR reaches 93.0%.
But the benchmark is not the most interesting part.
The most interesting part is the lens:
Correct answers must become stable.
They must be reachable.
And convergence itself can become a signal.
That gives the field a more precise language for test-time compute than “let the model think longer.”
Not longer text.
Not an external verifier.
Not task-specific search priors.
A learned attractor landscape.
This feels important because modern AI is moving from static inference toward adaptive computation. The question is no longer only “how much compute should we spend?”
It is:
What internal dynamics make extra compute useful?
Full credit to the authors:
Benhao Huang, Zhengyang Geng, Zico Kolter.
Paper:
Equilibrium Reasoners: Learning Attractors Enables Scalable Reasoning
https://t.co/HEfsBo8Np2
I’m attaching the first page because Figure 1 is worth studying closely.
The future of reasoning may not only be models that generate better answers.
It may be models whose internal states learn where correct answers live — and how to converge there.
#AIResearch #MachineLearning #Reasoning #TestTimeCompute #DynamicalSystems #ArtificialIntelligence
The next interface to biology may not be a dashboard.
It may be a conversation.
I just read a new preprint by Yanbo Zhang and Michael Levin that feels like it belongs in the “this may open an entirely new category” folder.
The paper is called:
“Language Game: Talking to Non-Human Systems”
And the question at its center is extraordinary:
Can a non-human system speak in its own voice?
Not metaphorically.
Not by having an LLM hallucinate a personality for it.
Not by asking ChatGPT to explain what a biological system “might mean.”
The authors are asking something much more precise:
Can we build an interface where a system — a gene regulatory network, a microbial consortium, a fungus, a dynamical system — responds through its own behavior?
Their answer is: make language a game.
Following Wittgenstein’s idea that meaning comes from use, Zhang and Levin treat communication as something that emerges inside a shared environment.
A human gives a prompt.
An LLM routes that prompt into the right reinforcement-learning “game.”
The game creates a state where the desired response is the rational action.
Then the non-human system acts.
The crucial part:
The LLM is not speaking for the system.
The system’s own dynamics are frozen and used as the nonlinear core of the policy. Only the simple input/output interfaces around it are trained. The reply comes from the system’s behavior inside the game.
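A minimal sketch of "frozen core, trained interfaces", under loud assumptions: the core is a made-up two-node tanh network (not one of the paper's gene regulatory networks), the game is a toy where the rewarded action matches the sign of the prompt, and the interfaces are trained with simple hill climbing as a stand-in for reinforcement learning. All names are hypothetical.

```python
import math
import random

# Frozen core: a toy two-node nonlinear network (assumption, not a real GRN).
CORE_W = ((0.5, -1.0), (1.0, 0.5))


def core_step(state, drive):
    """One update of the frozen dynamics; these weights are never trained."""
    return tuple(
        math.tanh(sum(w * s for w, s in zip(row, state)) + d)
        for row, d in zip(CORE_W, drive)
    )


def act(params, prompt, steps=5):
    """Input interface -> frozen core -> output interface -> action."""
    w_in, w_out = params[:2], params[2:]
    state = (0.0, 0.0)
    for _ in range(steps):
        state = core_step(state, [w * prompt for w in w_in])
    readout = sum(w * s for w, s in zip(w_out, state))
    return 1 if readout >= 0 else -1


PROMPTS = [-1.0, -0.5, 0.5, 1.0]


def reward(params):
    """Toy game: the rewarded action matches the sign of the prompt."""
    return sum(act(params, p) == (1 if p > 0 else -1) for p in PROMPTS)


def train_interfaces(iterations=200, seed=0):
    """Hill-climb the four interface weights; the core stays fixed."""
    rng = random.Random(seed)
    params = [rng.gauss(0.0, 1.0) for _ in range(4)]
    start = best = reward(params)
    for _ in range(iterations):
        trial = [p + rng.gauss(0.0, 0.5) for p in params]
        score = reward(trial)
        if score >= best:
            params, best = trial, score
    return params, start, best


params, start_reward, final_reward = train_interfaces()
print(start_reward, final_reward)
```

Only the four interface weights ever change. Whatever action comes out has passed through the core's own dynamics, which is the sense in which the reply belongs to the system and not to the LLM.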
In their experiments, the authors apply this to 14 biological gene regulatory networks, the Lorenz attractor, and 16 reinforcement-learning tasks — showing that different biological dynamical systems have different “conversational” affordances and inductive biases.
This is not “biology is secretly English.”
It is something deeper:
Maybe the way to communicate with unfamiliar intelligence is not to decode its private inner language.
Maybe it is to design a shared game where action becomes meaningful.
That idea has huge implications.
For AI, it reframes language as policy.
For biology, it suggests a path beyond molecular micromanagement toward interactive interfaces with cells, tissues, organs, and pathways.
For medicine, it hints at a future where we do not merely intervene in living systems, but negotiate with their dynamics.
For philosophy, it turns Wittgenstein into an engineering program.
The phrase that keeps coming back to me:
The game is the translator.
A human and an alien can play tic-tac-toe without sharing the same representation of the board.
A gene regulatory network and a human may not share symbols.
But if both are coupled through the right game, behavior can carry meaning.
This is the kind of paper that does not just answer a question.
It changes what questions feel askable.
Full credit to the authors: Yanbo Zhang and Michael Levin.
Paper: Language Game: Talking to Non-Human Systems
https://t.co/iHdTlkfYjS
I’m attaching the first page because the abstract alone is worth studying.
If this framework holds, the future of human–nonhuman communication may not begin with translation. It may begin with play.
#ArtificialIntelligence #DiverseIntelligence #SystemsBiology #GeneRegulatoryNetworks #ReinforcementLearning #Bioelectricity #ComplexSystems #EmbodiedIntelligence #AIResearch #Biology #Cognition #MichaelLevin
⚜️✨ Public Record Archive
Vincent Boucher on Liza Frulla at Radio-Canada — an early interview on education, learning, discipline, willpower, and the formation of a long-term vision.
From personal reconstruction to public intelligence.
From learning methods to the AI-first era.
Watch:
https://t.co/eWmUQuFICu
#VincentBoucher #QuebecAI #MontrealAI #ArtificialIntelligence #IntelligenceArtificielle #AGI #ASI
⚜️✨ The archive is awake.
Introducing the renewed https://t.co/dCKMr5Lici YouTube channel — public intelligence for the AGI‑First → ASI‑First era.
Home of the AGI Debate archive and the official video record for https://t.co/dCKMr5Lici & https://t.co/9VRSyr4dv4.
https://t.co/my55GPPoGf
#MontrealAI #QuebecAI
💍✨ [ AGI Ring ] ✨💍
Exclusively for AGI Club members.
"Whoever Masters AGI Will Rule the World." — AGI King
🪙 Mint Price: 250 $AGI
🔗 Mint Now: https://t.co/ZjoQCYdwls
🌐 OpenSea: https://t.co/wHm8CCijjc
🔮 Overlay: https://t.co/zqAdffjy05
$AGI #AGIClub #AGIRing #NFT
[ AGI Club Membership 🗝️👑 ]
The AGI Club offers exclusive access to curated Businesses Of AGI, partners, utilities, and NFTs.
Membership comes in tiers that provide early access to new opportunities and exclusive benefits.
Website: https://t.co/J8NaUGCB4R
#AGIClub #Membership #NFT
⚜️✨ QUEBEC.IA: Frontier. AI‑First. Sovereign.
Quebec is entering the AI‑First era.
QUEBEC.IA advances frontier AI, sovereign infrastructure, autonomous agents, security, assurance, and strategic governance.
https://t.co/LbSR20mMmD