The Midas Project

@TheMidasProj

Watchdog nonprofit that monitors the practices of leading AI companies. Tracking safety updates @SafetyChanges Writing at

Joined October 2023

262 Following

4.5K Followers

545 Posts

Pinned Tweet

The Midas Project

@TheMidasProj

about 9 hours ago

https://t.co/sPYA9l3qsZ

43K

TheMidasProj retweeted

Tyler Johnston

@tyler_johnston

about 9 hours ago

A few months ago, I found an anonymous sockpuppet account linked to the OpenAI/a16z super PAC. Now, @TaylorLorenz and I have uncovered two more — and they're even more brazen than the first. https://t.co/TJHAABeq2A

TheMidasProj retweeted

Techmeme

@Techmeme

about 8 hours ago

Leading the Future, the pro-AI super PAC backed by Greg Brockman, appears to be linked to multiple sockpuppet accounts, including a purported anti-AI activist (@themidasproj) (Visit Techmeme dot com for the link and full context!)

TheMidasProj retweeted

Taylor Lorenz

@TaylorLorenz

about 8 hours ago

Leading the Future is linked to a sockpuppet meme account masquerading as an extreme doomer. Read @tyler_johnston and my full deep dive into the online meme marketing boosting the super PAC.

14K

The Midas Project

@TheMidasProj

5 days ago

One would think that a company aiming for a trillion-dollar IPO could afford better PR.

209

The Midas Project

@TheMidasProj

5 days ago

As part of @OpenAI’s effort to market ChatGPT as safe for teens, the company recently boasted on X and LinkedIn that it had the best score on the TeenAegis AI Model Danger Index. We took a closer look at the index, and much of it appears to be AI slop.🧵

https://t.co/Y4pa8BQj4R

834

The Midas Project

@TheMidasProj

5 days ago

The kicker is that OpenAI’s own tweet and LinkedIn posts promoting this were also detected as AI slop: 100% slop, according to Pangram.

https://t.co/EDJRTI2Cox

266

TheMidasProj retweeted

Tyler Johnston

@tyler_johnston

6 days ago

31 retweets on 3k views. One would think these people might learn...

The Midas Project

@TheMidasProj

6 days ago

If an AI model posed the risk of undermining human control, how confident would you want to be that it was safe before it was released? Pretty damn confident, one would think.

Last month, Google DeepMind updated its Frontier Safety Framework, committing to a risk management process around misalignment and loss of control. This was a positive step.

But its new policy doesn’t apply its most stringent safety standards even to models powerful enough that “absent additional mitigations, we cannot rule out the model significantly undermining human control.” Specifically, the risk of loss of control does not trigger writing a formal safety case (an argument showing how risks have been reduced to an acceptable level), even though other risks do.

If the threat of loss of control doesn’t demand a safety case, what does?

The Midas Project Watchtower

@SafetyChanges

7 days ago

Company: Google
Date: April 17

Google updated its Frontier Safety Framework from v. 3.0 to 3.1. The new version introduces “Tracked Capability Levels” (TCLs), covering risks at a lower level of capabilities than the FSF’s Critical Capability Levels (CCLs).

TCLs trigger risk assessments and mitigations, but don’t require formal safety cases like CCLs do.

A misalignment TCL is defined when models have enough situational awareness and stealth that “absent additional mitigations, we cannot rule out the model significantly undermining human control.”

It’s notable that this doesn’t rise to the level of a full CCL. Google is essentially saying that when a model reaches this risk threshold, if we don’t put additional safeguards in place, we might lose control of the model… but we’re not going to require a formal safety case for it.

Still, it’s an improvement over v. 3.0, which just described its misalignment CCLs as an “illustrative” example.

FSF v. 3.1 also includes a thin section on “Governance and Accountability,” which fails to name any specific governance or accountability mechanisms (though Google has said more on this elsewhere: https://t.co/uD4IglJkLl).

A full diff is available at our website: https://t.co/vIgWOAfrBz

954

The Midas Project

@TheMidasProj

7 days ago

Read our full report: https://t.co/ABgsYfZxRZ

262

The Midas Project

@TheMidasProj

7 days ago

Following the report we co-authored last week about xAI in light of its upcoming IPO, xAI updated its safety page. The new text gestures toward some industry-standard safety practices discussed in the report. Sadly, the new page is both seemingly inaccurate and AI-generated 🙄

https://t.co/pqoTGDilwO

The Midas Project

@TheMidasProj

7 days ago

Investors backing a company that aspires to develop models with 5.5 Pro/Mythos-class cyberoffensive capabilities, who are wondering whether the risks of such a model will be adequately managed, may desire more reassurance than AI-generated web copy can provide.

343
