Evals-Consensus.ai @EvalConsensusAI - Twitter Profile

1 day ago

The still-missing ingredient isn’t what to do (results forthcoming soon!). It’s getting buy-in from the stakeholders, so we're reaching out now and are happy to speak with relevant organizations. https://t.co/qxUo7bYVH0

Evals-Consensus.ai @EvalConsensusAI

8 days ago

In the coming weeks, anyone interested in endorsing the results should be in touch; we’ll be sharing results with potential endorsers, then will open endorsement widely to labs, auditors, industry, organizations, academics, and civil society groups.

Evals-Consensus.ai @EvalConsensusAI

1 day ago

Everyone says we need better evals - but why does a consensus process help? The short answer is that common knowledge encourages coordination, and makes bad practices more costly. The long answer is a thread.🧵

Evals-Consensus.ai @EvalConsensusAI

1 day ago

Our (imperfect) answer is to find consensus on practices that are seen as important across stakeholder groups. That answer won’t tell anyone whether an eval is good enough, much less great, but it will help raise the minimum bar for those practices which are broadly accepted.

Evals-Consensus.ai @EvalConsensusAI

7 days ago

In our round 1, many participating AI evaluation experts noted that typical methods for addressing evaluation awareness are not reliable, and thus not (yet) recommended. @GeorgeBalston, @Miles_Brundage, @charlotte_stix: it's great to see this, as fixing this gap will be critical!

Apollo Research

@apolloaievals

8 days ago

Black-box access may soon no longer be enough to robustly make or verify safety and security claims. Deeper, white-box access is a necessary update to counter 'evaluation awareness' and keep loss-of-control evaluations state of the art. A new policy blog explains why. 🧵

Evals-Consensus.ai @EvalConsensusAI

9 days ago

Preliminary results are starting to make that picture clearer. The interim results about what the process is surfacing are available to participants; if your organization has a stake in how AI evaluations are performed, your input is still welcome! https://t.co/LvL8D8xpxg

Evals-Consensus.ai @EvalConsensusAI

about 1 month ago

Here's a direct link to request to participate: https://t.co/hIvydDHlFW

Evals-Consensus.ai @EvalConsensusAI

9 days ago

Last week, @davidmanheim argued that AI evaluation results are becoming load-bearing. But that creates a problem: How do we arrive at better common practices without first deciding who is in charge, or trying to dictate it ourselves? 🧵 https://t.co/Tn6ITF8Shv

David Manheim

@davidmanheim

16 days ago

AI Evaluation was always critical for development, and marketing, but it’s increasingly load-bearing in policy, safety cases, and public discourse. Evals used for development just need to work, and marketing is marketing, but now, they must be more robust and communicate more.🧵

Evals-Consensus.ai @EvalConsensusAI

9 days ago

We're not interested in pretending consensus exists if it doesn't! But after round 1, many practices are broadly agreed to be important across labs, auditors, academia, civil society, and government; others are context-dependent, not widely accepted, or are disagreed upon.
