@mogmachine speaking in the value of agents in today’s world. I actually think Bittensor is the primary beneficiary of agentic capabilities. Agents lower the barrier of entry for mining in a way that’s hard to quantify the value of, but I believe it cannot be understated. You no longer need to be a fluent technical dev to contribute to the various competitions being built on the Subnets. You just need to be fluent in…English.
Today, we are launching the first stage of Project Orion.
Our early pre-training run of Orion-100B achieves upward of 65% of data-center training efficiency on hardware costing a fraction of the price.
Orion-100B is the first proof point for a simple idea: that underutilized compute around the world can be turned into frontier training capacity.
We believe that this work presents, for the first time, an economically compelling case for training large models using distributed approaches.
“You can train a model to look inside an LLM’s head and explain to you what’s happening at a certain point in time. When you understand the implications for AI alignment, you can see how this may lead to major breakthroughs”.
@macrocrux and @Austin_Aligned discuss the key research and aims behind their collaborative competition on Apex - drive AI alignment insights via autoencoders identifying patterns within an LLM’s structure, acting as an auditor.
We’re live on our Inventive Mechanisms podcast.
@macrocrux and @Austin_Aligned are discussing our upcoming competition.
This is a collaborative task with SN37, @AureliusAligned, launching on @Apex_SN1.
Join to learn more
https://t.co/hbJdTaW0nW
Reminder: @macrocrux and @Austin_Aligned will be live on our Inventive Mechanisms podcast today, walking through our upcoming competition on @Apex_SN1.
Together, we’ll be launching a challenge focused on AI alignment, in collaboration with Subnet 37, @AureliusAligned.
Join us to learn more about how this will unfold.
📍 Location: X livestream (on the @MacrocosmosAI X account)
📅 Date: Today (Thursday 28th May)
🕒 Time: 3pm UK time
While modern AI capabilities continue to grow, their thoughts remain opaque to us.
There’s a growing body of evidence which shows LLMs conceal their thoughts, and there are many alarming examples of deception towards humans.
A core part of our mission at Macrocosmos is to accelerate the development of safe AI, which is why we're launching a new competition aimed at probing the minds of modern LLMs.
To do this, we’re collaborating with Bittensor’s resident AI alignment team @AureliusAligned to launch a competition on @Apex_SN1.
Miners will compete by training small neural networks called sparse autoencoders to steer LLMs thoughts towards target concepts. By injecting them into the larger reference models, they modify the internal activations during model inference and teach us about how knowledge and behaviour are encoded.
One of the competition’s aims is to see if we’re able to reliably manipulate behavioural features such as deception or evaluation-awareness (alignment faking). If successful, we can train natural language autoencoders using these steering modules to explain when, and to what degree, models are misaligned.
@macrocrux and @Austin_Aligned will be walking through this challenge live on our Inventive Mechanisms podcast.
📍 Location: X livestream (on the @MacrocosmosAI X account)
📅 Date: Thursday 28th May
🕒 Time: 3pm UK time
@mikecontango@trishool Agreed, Bittensor is unique suited to perform safety/alignment research on models precisely because of the non-ownership over the foundation models. Status quo capitalism is working against AI safety currently. Happy to see Trishool performing this exact role.
This could signify the starting gun for the arms race between frontier models and the interpretability agents that represent our best shot of being capable of disentangling the inner workings of our most-capable models. The progression of interpretability research like NLAs is critical as models get bigger and more complex.
From @AnthropicAI's new NLA paper:
"unverbalized evaluation awareness — cases where Claude believed, but did not say, that it was being evaluated"
The models know they're being watched, always have.
Now we can prove it.
This is the exact problem @AureliusAligned was built to solve. Been working toward quantifying alignment faking since day one. This is huge validation and a massive new primitive to build on.
https://t.co/aVsDwQiLQt
On the whole, it certainly feels like we are moving towards the construction of black-box decoder agents that we will rely upon to interpret the alignment of frontier models, as described in "AI 2027".
We've been introducing the people behind Aurelius one post at a time. The full lineup now lives in one place: three co-founders and six advisors across alignment research, ethics, engineering, and law.
https://t.co/VWpc2V8Utd
Before limited-releasing Claude Mythos Preview, we investigated its internal mechanisms with interpretability techniques. We found it exhibited notably sophisticated (and often unspoken) strategic thinking and situational awareness, at times in service of unwanted actions. (1/14)
Claude Code leaked their source map, effectively giving you a look into the codebase.
I immediately went for the one thing that mattered: spinner verbs
There are 187