By combining advanced reasoning with the ability to act and call tools, frontier AI agents hold significant economic and scientific promise, yet also pose potential risks.
Our new issue brief identifies emerging practices for securing the agent stack:
https://t.co/LDxkuY4C2Q
As frontier AI capabilities evolve, information sharing, incident reporting, and incident response will play an important role in managing frontier AI risks. Our new issue brief gives an overview of each and highlights the FMF's ongoing information-sharing efforts:
https://t.co/qkXfPDjfZJ
New Issue Brief: Adversarial Distillation
Although authorized distillation has legitimate use cases, adversarial distillation can enable malicious actors to extract frontier model capabilities while bypassing built-in safety measures.
Read more here: https://t.co/HkwRpFNyui
As announced last March, the member firms of the FMF signed a first-of-its-kind agreement designed to facilitate voluntary information-sharing related to unique risks from frontier AI.
We are pleased to share a progress update here: https://t.co/xUPs2Tmaql
NEW: Technical Report on Managing Cybersecurity Risks in Frontier AI Frameworks
Building on our prior series on frontier AI frameworks, this technical report outlines emerging risk management practices in the cyber domain.
Read more: https://t.co/arieDQOY9p
🚨 NEW: Research Update on Frontier AI and Nuclear Security
Over the past year, the Frontier Model Forum partnered with nuclear security experts to conduct preliminary research.
Read our initial findings: https://t.co/kFAz0CxHja
New Publication: Chain of Thought Monitorability
Our latest issue brief explores how Chain of Thought monitoring can help prevent certain types of harm, and why it shows promise as a new layer of defense for frontier AI safety and security: https://t.co/c9jbUaX0ef
(3 of 3) As frontier AI systems become more powerful & widely deployed, advancing our understanding of them & building robust safety tools is essential. We are excited to support each of the AISF recipients & look forward to their scientific contributions.
🧵 (1 of 3) NEWS: A new cohort of 11 grantees has been awarded more than $5 million through the AI Safety Fund for projects in #Biosecurity, #Cybersecurity, & AI Agent Evaluation and Synthetic Content. https://t.co/D4GX067JM5
(3 of 3) We welcome engagement with these technical reports from across the frontier AI safety and security ecosystem. Please reach out if you are interested in further refining and harmonizing the implementation of frontier frameworks.
🧵 NEW TECHNICAL REPORT (1 of 3)
Our latest technical report outlines practices for implementing, where appropriate, rigorous, secure, and fit-for-purpose third-party assessments.
Read more here: https://t.co/mBXR9AEepY
(2 of 3) The technical report is the latest in our series on frontier AI frameworks. The first three can be found here:
Capability Assessments: https://t.co/OVPPH6OBvO
Risk Taxonomies and Thresholds: https://t.co/g4ZeR43aFv
Mitigations: https://t.co/tFIcrikfM8
Collectively, the issue briefs flesh out the core elements of frontier AI frameworks with respect to bio-related risks.
We welcome all feedback and look forward to further developing and refining risk management practices at the intersection of frontier AI and biology. (5/5)
🧵 NEW ISSUE BRIEF (1/5)
As frontier AI capabilities advance, it's crucial to develop robust risk management practices for AI and biology.
Our latest issue brief outlines the current landscape of AIxBIO safeguards, highlighting over 20 mitigations and related best practices.
This is our third technical report in a series on implementing frontier AI frameworks:
Capability Assessments: https://t.co/OVPPH6OBvO
Risk Taxonomies and Thresholds: https://t.co/g4ZeR43aFv
Our final report will cover third-party assessments.
🚨 NEW: Our latest frontier AI frameworks technical report. This one examines mitigations, including current common strategies to prevent misuse of frontier capabilities, effectiveness assessments, and areas of continued work.
Read more here: https://t.co/4DGcaHhzZl