Role of Standards

March 27, 2024
Earned Trust through AI System Assurance

It was an uncontroversial point in the comments that international technical standards are vitally important196 and may be necessary for defining the methodology for certain kinds of audits.197 Developing technical standards for emerging technologies is a core Administration objective.198 The current dearth of consensus technical standards for use in AI system evaluations is a barrier to assurance practices. This barrier may be especially pronounced for evaluation of foundation models.199 Compounding the challenge of standards development is the reality that AI is being developed, deployed, and advanced across many different sectors, each with its own applications, risks, and vocabulary, and that the AI community has yet to coalesce on fundamental questions of terminology.200 Under-developed standards mean uncertainty for companies seeking compliance, diminished usefulness of audits, and reduced assurance for customers, government, and the public.201

The issues for which commenters wanted standards and benchmarks, for both internal and external evaluation and other assurance practices, included:

  • AI risk hierarchies, acceptable risks, and tradeoffs;
  • Performance of AI models, including for fairness, accuracy, robustness, reproducibility, and explainability;
  • Data quality, provenance, and governance;
  • Internal governance controls, including team compositions and reporting structures;
  • Stakeholder participation;
  • Security;
  • Internal documentation and external transparency; and
  • Testing, monitoring, and risk management.

Here, we stress the need for accelerated international standards work and provide further justification for expanding participation in technical standards and standards-setting processes. The comments yielded three important caveats about conventional technical standards: the relative immaturity of the AI standards ecosystem, its largely non-normative character, and the dominance of industry relative to other stakeholders. Addressing these critiques will improve AI accountability.

Standards-setting organizations publish requirements and guidelines (alongside other types of documents not pertinent here). Requirements contain “shall” and “shall not” statements, while guidelines tend to contain “should,” “should not,” or “may” statements.202 Leading commentary on standards for AI audits is supportive of guidelines that can be more flexible than requirements, and of standards that focus on processes as well as outputs.203 Nevertheless, it is important to recognize that guidelines do not constitute compliance regimes. Technical standards-setting organizations hesitate – and may not be equipped – to settle policy and values debates on their own.204 Non-prescriptive standards – for instance, providing ways to measure risk without identifying a threshold beyond which risk is unacceptable – help with future-proofing. However, such flexibility means that governments, the public, and downstream users of the technology cannot assume that compliance with such standards means that risks have been acceptably managed. Separate legal or regulatory requirements are needed to set norms and compel adherence.205

We are cognizant of the critique that non-prescriptive stances have sometimes impeded efforts to ensure that standards respect human rights.206 Others also worry that, as in cybersecurity, overreliance on voluntary, non-prescriptive standards will fail to create the necessary incentives for compliance.207 One of the key ways to continue expanding standards work and to address those critiques is to build out additional participation mechanisms in the guidance and standardization process. There should be concerted efforts to include experts and stakeholders as non-prescriptive guidance comes to develop normative content and/or binding force. The inclusion of experts and stakeholders in standards development is particularly important given the centrality of normative concepts such as freedom from harmful discrimination and disinformation in standards work. Civil society and industry echo this sentiment, emphasizing the need for more inclusion – beyond AI actors – in crafting and assessing standards, profiles, and best practices.208

Accessibility of industry standards and of the associated development processes is one hurdle to meaningful participation by experts and stakeholders. We counted at least one AI assurance standard that cannot be viewed during its development without existing membership in ISO/IEC or access via a country’s ISO national member (e.g., ANSI in the U.S.).209 This document, and other standards like it, may represent fundamental milestones in the field of AI assurance; while the development processes of established standards organizations are mature and ultimately accessible with effort, we acknowledge the real financial and logistical barriers to simply observing a standard as it takes shape. Further, while many frameworks and documents may be free to download, many industry technical standards require a paid license to view.210 As the state of the art advances, regular updates to these and other publications will impose new costs and access barriers.

Traditional, formal standards-setting processes, on their own, may not yield standards for AI assurance practices sufficiently rapidly, transparently, inclusively, and comprehensively, and may lag behind technical developments.211 Several commenters recommended that government develop a taxonomy or hierarchy of AI risks to shape how AI actors prioritize risk.212 Others requested government help in devising assurance methodologies that take equity and public participation seriously.213 We note that NIST is already leading, and encouraging community leaders to develop, a series of AI RMF “profiles” that will provide more detailed guidance on applying the NIST AI RMF in different domains.214 For example, the Department of Labor’s Office of Disability Employment Policy (ODEP) is working with key partners to create a Profile for Inclusive Hiring. This policy framework aims to guide employers in practicing disability inclusion and accessibility when they decide to use AI in talent acquisition.

Looking ahead, there is a question about how standards will evolve globally to keep pace with technological development and societal needs. Several key questions will inform that evolution:

  • Whether current standards continue to develop alongside AI implementations at an appropriate pace and with appropriate scope;215
  • Whether competing standards emerge inadvertently, creating perverse incentives for stakeholders and opportunities for arbitrage; and
  • Whether future industry standards foster a sufficiently large marketplace of certification, auditing, and compliance entities to ensure appropriate levels of compliance.216

Commenters have suggested governmental actions to support the development and adoption of AI standards, including, as one commenter suggested, supporting research on data quality benchmarks and data commons for AI companies.217 For at least some AI technologies, government has already played a significant role in the actual testing of systems and publication of results. Since 2002, for instance, NIST’s Face Recognition Vendor Tests have assessed the accuracy of privately developed facial recognition technology. This research has not only demonstrated the overall degree of accuracy of the tested algorithms, but has also identified common challenges across algorithms, such as accuracy differentials based on race or gender.

Generally, government can foster the utility of standards for accountability purposes by (a) encouraging and fostering participation by diverse stakeholders, including civil society, non-industry participants, and those involuntarily affected by AI systems; (b) helping to improve and expand access to standards publications for traditionally under-represented parties; (c) supporting methods to align industry standards with societal values; and (d) in appropriate circumstances, developing guidelines or other resources that contribute toward standards development.218

We also note that, while international standards development is critical, national standards might also be necessary to protect national security interests.

 


196 See MITRE Comment at 8 (“Common terminology is critical for any field’s advancement as it enables every professional to represent, express, and communicate their findings in a manner that is effectively and accurately understood by their peers”); Engine Advocacy Comment at 6; Intel Comment at 5; Palantir Comment at 21-22; GovAI Comment at 9. But cf. Google DeepMind Comment at 14-15 (while recognizing that baseline definitions for AI accountability terms are useful, “applying these terms is likely to vary based on the jurisdiction, sector, as well as use case, and definitions will require room to evolve as the technology changes.”).

197 See, e.g., PwC Comment at A3 (“Use of the term ‘audit’ without reference to a generally accepted body of standards fails to convey the level of effort applied, the scope of procedures performed, the level of assurance provided over the findings, or the qualifications of the provider, among other shortcomings”); Open MIC Comment at 25 (“Without mandatory standards for AI audits and assessments … there is an incentive for companies to ‘social wash’ their AI assessments; i.e. give investors and other stakeholders the impression that they are using AI responsibly without any meaningful efforts to ensure this”); Salesforce Comment at 11 (“If definitions and methods were standardized, audits would be more consistent and lead to more confidence.”); Global Partners Digital Comment at 16 (“The lack of measurable standards or benchmarks creates the risk of rendering impact assessments as unproductive exercises by providing an appearance of accountability but not enough to achieve it effectively”); BSA | The Software Alliance Comment at 2 (“Without common [auditing] standards, the quality of any audits will vary significantly because different audits may measure against different benchmarks, undermining the goal of obtaining an evaluation based on an objective benchmark.”).

198 See The White House, United States National Standards Strategy for Critical and Emerging Technology (USG NSS CET) (May 2023).

199 See, e.g., Information Technology Industry Council (ITI), supra note 78, at 10 (citing Rishi Bommasani, Percy Liang, and Tony Lee, Language Models are Changing AI: The Need for Holistic Evaluation, Center for Research on Foundation Models, Stanford HAI (2021)) (recommending investment in developing metrics to quantify and evaluate bias in AI systems and metrics to measure foundation model performance); Microsoft Comment at 12 (noting the need for investment in international AI standards to underpin an assurance ecosystem).

200 Engine Advocacy Comment at 6-7.

201 See generally Credo AI Comment at 6.

202 See, e.g., International Organization for Standardization/International Electrotechnical Commission (ISO/IEC) 23894:2023 Guidance on risk management for AI (containing “should” statements, such as “top management should consider how policies and statements related to AI risks and risk management are communicated to stakeholders”). But see ISO/IEC 17065, Requirements for bodies certifying products, processes and services (stating that “Interested parties can expect or require the certification body to meet all the requirements of this International Standard.…”).

203 See, e.g., Raji et al., Change from the Outside, supra note 184, at 16 (recommending “standards as guidelines, not deployment checklists” and “standards for processes, not only for outcomes”).

204 See CDT Comment at 28 (“Such standards will often embody policy and value judgments: standards for an audit designed to evaluate whether a system is biased, for example, may have to set forth how much variation in performance, if any, is permissible across race, gender, or other lines in order to still be considered unbiased.”).

205 See NIST AI RMF at 7 (recognizing the need for guidance on risk tolerances from “legal or regulatory requirements”). See also Center for AI and Digital Policy (CAIDP) Comment at 4 (“Credible assurance of AI systems could be through certification programs under Federal AI legislation based on … established governance frameworks” and noting that the AI RMF “is voluntary which does not set adequate and appropriate incentives for accountability.”).

206 See Corinne Cath, The Technology We Choose to Create: Human Rights Advocacy in the Internet Engineering Task Force, Telecommunications Policy, Vol. 45, No. 6 (2021), at 102144. See also Michael Veale, Kira Matus, and Robert Gorwa, AI and Global Governance: Modalities, Rationales, Tensions, Annual Review of Law and Social Science, Vol. 19 (2023).

207 See John J. Chung, Critical Infrastructure, Cybersecurity, and Market Failure, 96 Or. L. Rev. 441, 459-62 (2018) (explaining why the NIST cybersecurity framework relies on voluntary recommendations rather than prescriptive standards); Robert Gyenes, A Voluntary Cybersecurity Framework Is Unworkable – Government Must Crack the Whip, 14 Pgh. J. Tech. L. & Pol’y 293 (2014) (explaining how reliance on voluntary cybersecurity standards leads to repeated harms, and how prescriptive standards could prevent those harms and help inoculate other parties against future data exploits).

208 See, e.g., CDT Comment at 29; FPF Comment at 7; Leadership Conference Comment at 5; Google DeepMind Comment at 2.

209 ISO, ISO/IEC CD 42005: Information technology – Artificial intelligence – AI system impact assessment. The public may offer comments on draft standards once those standards reach the enquiry stage; see ISO, Get involved.

210 At the time of writing, access to the standards cited by commenters from ISO/IEC and the Institute of Electrical and Electronics Engineers Standards Association (IEEE SA) would cost over $1,700. See ISO Store (combining prices for ISO/IEC 17011:2017 Requirements for accreditation bodies accrediting conformity assessment bodies ($174); ISO/IEC 17020:2012 Requirements for the operation of various types of bodies performing inspection ($110); ISO/IEC 17021-15:2023 Requirements for bodies providing audit and certification of management systems ($48); ISO/IEC 17025:2017 General requirements for the competence of testing and calibration laboratories ($174); ISO/IEC 17065:2012 Requirements for bodies certifying products, processes and services ($174); ISO/IEC 22989:2022 Artificial intelligence concepts and terminology ($223); ISO/IEC 23894:2023 Artificial intelligence – Guidance on risk management ($148); ISO/IEC 42010:2023 Software, systems and enterprise – Architecture description ($223); ISO/IEC 42006 Information technology – Artificial intelligence – Requirements for bodies providing audit and certification of artificial intelligence management systems ($74); ISO/IEC FDIS 5339 Information technology – Artificial intelligence – Guidance for AI applications ($174)); IEEE SA Standards Store (IEEE 1012-2016: Standard for System, Software, and Hardware Verification and Validation ($196)). Note that the ISO/IEC prices were converted to USD from Swiss francs and may vary over time given changing currency exchange rates.

211 See, e.g., MITRE Comment at 8. See also ISO/IEC (last visited Jan. 18, 2024) (stating that an ISO/IEC standard usually takes roughly three years from first proposal to publication).

212 See, e.g., Credo AI Comment at 4-5; Centre for Information Policy Leadership Comment at 8; Center for American Progress Comment at 4, 12-13.

213 See, e.g., Data & Society Comment at 9 (urging government research support for participatory assessments and context-dependent assessments); Global Partners Digital Comment at 18 (urging government investment in the production of guidelines and best practices for “meaningful multi-stakeholder participation in the AI assessment process.”).

214 NIST, NIST AI Public Working Groups.

215 See USG NSS CET, at 11 (“The number of standards organizations and venues has increased significantly over the past decade, particularly with respect to [critical and emerging technologies]. Meanwhile the U.S. standards workforce has not kept pace with this growth.”).

216 See, e.g., GovAI Comment at 5 (“[T]here are only a few individuals and organizations with the expertise to audit cutting-edge AI models.”).

217 See Global Partners Digital Comment at 14.

218 See generally NIST, U.S. Leadership in AI, supra note 57.