
Mandated Third-Party AI Audits are Coming—Addressing AI’s Socio-Technical Challenges Will Be Key

Brandie Nonnecke / Jul 16, 2024

National Institute of Standards and Technology (NIST) headquarters in Gaithersburg, Maryland. Shutterstock

Sen. John Hickenlooper's (D-CO) announcement of the Validation and Evaluation for Trustworthy (VET) AI Act marks a significant legislative effort to bring much-needed oversight to the rapidly advancing field of AI. This proposed legislation directs the National Institute of Standards and Technology (NIST) to develop guidelines for third-party evaluators, aiming to ensure AI systems are developed and deployed responsibly. However, several concerns must be addressed to make this initiative effective.

The bill proposes a framework where independent evaluators, similar to those in the financial industry, work with AI developers and deployers to verify compliance with established guardrails, such as data protection and cybersecurity standards. NIST, in collaboration with the Department of Energy and the National Science Foundation, would create voluntary specifications for internal assurance and external verification regarding the development, testing, and deployment of AI systems.

While the bill is promising, responsible AI is not just a technical challenge but a socio-technical one. AI is driven by human assumptions and behavior. For example, how a developer categorizes data could inadvertently cause discrimination, or how the public interacts with an AI system could cause it to perform in unexpected ways. Further, how a third-party auditor operationalizes potential AI risks directly affects how those risks are defined, evaluated, and addressed, allowing uninvestigated risks to slip under the radar. Humans are fallible, and by default, so are AI audits.

Sen. Hickenlooper is clear that we must act quickly to address AI harms: “AI is moving faster than any of us thought it would two years ago,” he said. “But we have to move just as fast to get sensible guardrails in place to develop AI responsibly before it’s too late. Otherwise, AI could bring more harm than good to our lives.” Technical and socio-technical AI standards will be key to achieving this goal.

Standards bodies, such as IEEE, ISO, and CEN-CENELEC, are developing both technical and socio-technical standards for AI. However, there is a lack of consensus on the appropriate focus of these standards, and especially on their implementation. Even established socio-technical standards, such as ISO/IEC 27001:2022 on information security, cybersecurity, and privacy protection, are not foolproof: cybersecurity incidents are still on the rise. Developing AI standards, and implementing them well enough to address AI’s socio-technical challenges, will take years to iron out.

We’re still at the early stages of defining responsible AI concepts, like “trustworthiness” and “transparency.” Currently, these concepts are prone to varied interpretations, requiring careful consideration in relation to the AI technology and its application area. For instance, NIST has identified over 50 types of bias that plague AI. Addressing any one of these biases may inadvertently cause others to emerge. Thus, an audit addressing “AI bias” is not clear-cut: the auditor must be clear on which type(s) of bias were evaluated and whether addressing them caused others to emerge.
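To make that tradeoff concrete, here is a minimal, hypothetical sketch in Python, not drawn from NIST’s bias taxonomy or any real audit: two groups are scored by a toy model, and adjusting one group’s decision threshold to close a selection-rate gap opens a gap in true positive rates. All scores, labels, thresholds, and function names are invented for illustration.

```python
# Hypothetical illustration: closing one bias gap can open another.
# All data, thresholds, and names below are invented for this sketch.

def rates(y_true, y_pred):
    """Return (selection rate, true positive rate) for one group."""
    selection_rate = sum(y_pred) / len(y_pred)
    positives = [p for t, p in zip(y_true, y_pred) if t == 1]
    tpr = sum(positives) / len(positives)
    return selection_rate, tpr

# Toy model scores and ground-truth labels for two demographic groups.
scores_a = [0.9, 0.8, 0.45, 0.7, 0.3, 0.2]
labels_a = [1,   1,   1,    0,   0,   0]
scores_b = [0.9, 0.6, 0.45, 0.4, 0.3, 0.2]
labels_b = [1,   1,   1,    0,   0,   0]

def audit(threshold_a, threshold_b):
    """Report two common bias measures for a pair of decision thresholds."""
    pred_a = [1 if s >= threshold_a else 0 for s in scores_a]
    pred_b = [1 if s >= threshold_b else 0 for s in scores_b]
    sel_a, tpr_a = rates(labels_a, pred_a)
    sel_b, tpr_b = rates(labels_b, pred_b)
    print(f"selection-rate gap: {abs(sel_a - sel_b):.2f}   "
          f"true-positive-rate gap: {abs(tpr_a - tpr_b):.2f}")

# One shared threshold: the groups are selected at different rates,
# but qualified candidates in each group are found at the same rate.
audit(0.5, 0.5)    # selection-rate gap: 0.17   true-positive-rate gap: 0.00

# Lowering group B's threshold to equalize selection rates
# "fixes" the first gap while creating the second.
audit(0.5, 0.45)   # selection-rate gap: 0.00   true-positive-rate gap: 0.33
```

An auditor who checks only the first measure would certify the adjusted system as “fixed,” while the disparity the fix introduced goes unreported.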

The AI risk assessment landscape is rapidly growing, with industry, government, academia, and civil society all producing different assessment tools and strategies. Primary among them is NIST’s AI Risk Management Framework (AI RMF), which provides high-level guidance on identifying and managing AI risks, including socio-technical risks. Yet its flexibility, while important for keeping pace with technical advancements and differing use cases of AI, simultaneously leaves implementers unclear on best practices.

NIST’s Assessing Risks and Impacts of AI (ARIA) program seeks to address this challenge by providing independent assessments of societal risks via real-world testing scenarios. And NIST’s US AI Safety Institute is convening experts from over 200 organizations to collaboratively develop science-based guidelines and standards for AI measurement and policy. While these initiatives are promising, AI risk assessments like the NIST AI RMF are already a key component of many AI companies' responsible AI strategies. As a result, these early audits could set well-intended but inappropriate precedents as we work to define and understand the socio-technical risks of AI.

Historically, third-party auditors in other industries have faced challenges that should serve as cautionary tales. In the financial sector, conflicts of interest and insufficient oversight have led to catastrophic failures. The same pitfalls could plague AI assurance if stringent criteria and transparent processes are not enforced.

To address this challenge, one of the bill's key features is establishing a collaborative advisory committee to review and recommend criteria for certifying individuals or organizations capable of conducting AI assurance. NIST would also study the AI assurance ecosystem, examining current capabilities, methodologies, necessary resources, and market demand. In doing so, NIST would help provide much-needed clarity on how to appropriately vet third-party AI auditors and their practices.

While the VET AI Act represents a proactive step towards responsible AI development, its success will largely depend on the robustness of responsible AI standards, including best practices for operationalizing them, and on ensuring that third-party auditors are thoroughly vetted, unbiased, and highly qualified. As the Senate considers this legislation, it is critical to take into account the socio-technical challenges of AI. Without this, the bill's well-intentioned goals will fall short, leaving the door open for the very harms it seeks to prevent.

Authors

Brandie Nonnecke
Brandie Nonnecke, PhD is Founding Director of the CITRIS Policy Lab, headquartered at UC Berkeley. She is an Associate Research Professor at the Goldman School of Public Policy (GSPP) where she directs the Tech Policy Initiative, a collaboration between CITRIS and GSPP to strengthen tech policy educ...
