Launch Announcement: Leading AI Research Organizations Create Forum of Independent Evaluators
Today, a group of leading AI researchers announced the creation of the AI Evaluator Forum (AEF), a network of independent organizations assessing AI capabilities and risks in the public interest. The forum will enable its members to develop common best practices and standards, foster collaborative research, and facilitate engagement with industry, governments, and civil society.
“Both governments and industry have again and again turned to independent evaluation as a means to address AI risks and promote a race to the top in the AI industry. But the evaluation ecosystem needs to come together to scale best practices that are still nascent and to maximize their impact,” said Conrad Stosz, former Acting Director of the U.S. Center for AI Standards and Innovation and an organizer of the forum.
Alongside the announcement, the Forum is also releasing:
- A public statement signed by over 40 leading voices from across the AI sector, arguing for greater transparency about third-party evaluations, including to aid external scrutiny and surface potential conflicts of interest.
- AEF-1, a detailed voluntary standard that helps set a baseline level of independence, transparency, and access for third-party AI evaluations, which forum members will begin adopting in relevant forthcoming evaluations.
"Without independent evaluation, AI companies grade their own homework. A robust and independent evaluation ecosystem is necessary to ensure that developers are held accountable by third parties in the public interest," said Sayash Kapoor, a researcher at Princeton University and co-author of the book AI Snake Oil.
The Forum’s founding members include Transluce, METR, the RAND Corporation, the Holistic Agent Leaderboard at Princeton University, SecureBio, the Collective Intelligence Project, Meridian Labs, and the AI Verification and Evaluation Research Institute. Together these organizations are advancing the science of AI evaluation and assessing a broad range of AI risks, from biosecurity to self-harm.
Additional quotes:
- “AI is too important of a technology to be developed and deployed without credible third-party evaluation. The formation of AEF is a key step towards a mature third-party evaluation ecosystem that can undergird societal trust in AI,” said Miles Brundage, Executive Director, AI Verification and Evaluation Research Institute (AVERI).
- “Independent AI evaluation is a critical part of responsible AI deployment: we can’t build trust in evaluations if they aren’t designed and carried out independently, and replicable by other third parties,” said Sarah Schwettmann, Co-founder and Chief Scientist at Transluce.
- "Third-party evaluators are proven to provide high quality insights into AI. Transparency and evaluator independence can only enhance trust in and quality of information we can gain as a field," said Irene Soleiman, who co-leads EvalEvals, a research community devoted to improving AI evaluations.
- “Robust third-party evaluations are required to create confidence in AI development and deployment. They create transparency on the security of AI systems for users and the broader public,” said Rajiv Dattani, co-founder of The Artificial Intelligence Underwriting Company, a company focused on providing AI insurance.
- “Frontier AIs keep surprising us, in their capabilities and weaknesses alike. As AI becomes more powerful, we cannot afford such surprises. A rigorous, independent community of evaluators is essential for understanding where we're taking this technology and where it's taking us,” said Seth Donoughe, Director of AI at SecureBio.