
Dual Use Foundation Artificial Intelligence Models with Widely Available Model Weights

Docket Number
240216-0052
Earned Trust through AI System Assurance

SUMMARY

On October 30, 2023, President Biden issued an Executive Order on “Safe, Secure, and Trustworthy Development and Use of Artificial Intelligence,” which directed the Secretary of Commerce, acting through the Assistant Secretary of Commerce for Communications and Information, and in consultation with the Secretary of State, to conduct a public consultation process and issue a report on the potential risks, benefits, other implications, and appropriate policy and regulatory approaches to dual-use foundation models for which the model weights are widely available. Pursuant to that Executive Order, the National Telecommunications and Information Administration (NTIA) hereby issues this Request for Comment on these issues. Responses received will be used to submit a report to the President on the potential benefits, risks, and implications of dual-use foundation models for which the model weights are widely available, as well as policy and regulatory recommendations pertaining to those models.

DATES

Written comments must be received on or before March 27, 2024.

ADDRESSES

All electronic public comments on this action, identified by Regulations.gov docket number NTIA–2023–0009, may be submitted through the Federal e-Rulemaking Portal. The docket established for this request for comment can be found at www.Regulations.gov, NTIA–2023–0009. To make a submission, click the ‘‘Comment Now!’’ icon, complete the required fields, and enter or attach your comments. Additional instructions can be found in the “Instructions” section below, after “Supplementary Information.”


FOR FURTHER INFORMATION

Please direct questions regarding this Request for Comment to:

Travis Hall, National Telecommunications and Information Administration, using the subject line ‘‘Openness in AI Request for Comment’’.

If submitting comments by U.S. mail, please address questions to:

Bertram Lee,
National Telecommunications and Information Administration, U.S. Department of Commerce,
1401 Constitution Avenue NW, Washington, DC 20230.

Questions submitted via telephone should be directed to:

(202) 482-3522.

Please direct media inquiries to NTIA’s Office of Public Affairs:

Telephone: (202) 482-7002, or by email to NTIA’s Office of Public Affairs.

SUPPLEMENTARY INFORMATION: 

Background and Authority 

Artificial intelligence (AI)1 has had, and will have, a significant effect on society, the economy, and scientific progress. Many of the most prominent models, including the model that powers ChatGPT, are “fully closed” or “highly restricted,” with limited or no public access to their inner workings. The recent introduction of large, publicly available models, such as those from Google, Meta, Stability AI, Mistral, the Allen Institute for AI, and EleutherAI, however, has fostered an ecosystem of increasingly “open” advanced AI models, allowing developers and others to fine-tune models using widely available computing.2

Dual-use foundation models with widely available weights (referred to here as open foundation models) could play a key role in fostering growth among less resourced actors, helping to widely share access to AI’s benefits.3 Small businesses, academic institutions, underfunded entrepreneurs, and even legacy businesses have used these models to innovate, advance scientific knowledge, and gain potential competitive advantages in the marketplace. The concentration of access to foundation models in a small subset of organizations poses the risk of hindering such innovation and advancement, a concern that could be lessened by the availability of open foundation models. Open foundation models can be readily adapted and fine-tuned to specific tasks and may make it easier for system developers to scrutinize the role foundation models play in larger AI systems, which is important for rights- and safety-impacting AI systems (e.g., healthcare, education, housing, criminal justice, online platforms).4 These open foundation models have the potential to help scientists make new medical discoveries or even make mundane, time-consuming activities more efficient.5
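To illustrate the kind of adaptation described above, the following sketch fine-tunes an openly released language model with low-rank adaptation (LoRA), the technique cited in footnote 4. It is a minimal sketch, assuming the Hugging Face transformers and peft libraries; the model identifier is an illustrative choice, and any model with publicly downloadable weights could be substituted.

    # Minimal LoRA adaptation sketch (assumes: pip install transformers peft torch).
    from transformers import AutoModelForCausalLM, AutoTokenizer
    from peft import LoraConfig, get_peft_model

    model_name = "EleutherAI/pythia-1b"  # illustrative open-weight model
    tokenizer = AutoTokenizer.from_pretrained(model_name)  # used to prepare training data (training loop omitted)
    model = AutoModelForCausalLM.from_pretrained(model_name)

    # LoRA freezes the original weights and trains only small low-rank update
    # matrices, which is why adaptation fits on widely available hardware.
    config = LoraConfig(r=8, lora_alpha=16, lora_dropout=0.05, task_type="CAUSAL_LM")
    model = get_peft_model(model, config)
    model.print_trainable_parameters()  # typically well under 1% of all parameters

Because only the small adapter matrices are trained, a model of this size can be adapted on a single consumer GPU, which illustrates the declining fine-tuning costs noted elsewhere in this notice.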

Open foundation models have the potential to transform research, both within computer science6 and in supporting other disciplines such as medicine, pharmaceuticals, and the sciences more broadly.7 Historically, widely available programming libraries have given researchers the ability to both run and understand algorithms created by other programmers. Researchers and journals have supported the movement towards open science,8 which includes sharing research artifacts such as the data and code required to reproduce results.

Open foundation models can provide more transparency and broader access, enabling greater oversight by technical experts, researchers, academics, and the security community.9 Foundation models with widely available model weights could also promote competition in downstream markets for which AI models are a critical input, allowing smaller players to add value by adjusting models originally produced by large developers.10 The accessibility of open foundation models also provides tools for individuals and civil society groups to resist authoritarian regimes, furthering democratic values and U.S. foreign policy goals.

While open foundation models potentially offer significant benefits, they may pose risks as well. Foundation models with widely available model weights could engender substantial harms to security, equity, civil rights, and other interests due to, for instance,11 affirmative misuse, failures of effective oversight, or a lack of clear accountability mechanisms.12 Others argue that these open foundation models enable the development of attacks against proprietary models due to similarities in the data sets used to train them.13 The wide availability of dual-use foundation model weights and the continually shrinking amount of compute necessary to fine-tune these models together create opportunities for malicious actors to use such models to engage in harm.14 The lack of monitoring of open foundation models may worsen existing challenges, for example, by easing the creation of synthetic non-consensual intimate images or enabling mass disinformation campaigns.15

On October 30, 2023, President Biden signed the Executive Order on “Safe, Secure, and Trustworthy Development and Use of Artificial Intelligence.”16 Noting the importance of maximizing the benefits of open foundation models while managing and mitigating the attendant risks, section 4.6 of the Executive Order tasked the Secretary of Commerce, acting through NTIA and in consultation with the Secretary of State, with soliciting feedback “from the private sector, academia, civil society, and other stakeholders through a public consultation process on the potential risks, benefits, other implications, and appropriate policy and regulatory approaches related to dual-use foundation models for which the model weights are widely available.”17 As required by the Executive Order, the Secretary of Commerce, through NTIA, and in consultation with the Secretary of State, will author a report to the President on the “potential benefits, risks, and implications of dual-use foundation models for which the model weights are widely available, as well as policy and regulatory recommendations pertaining to those models.”18

In particular, the Executive Order asks NTIA to consider the risks and benefits of dual-use foundation models with weights that are “widely available.”19 “Openness” and “wide availability” of model weights are terms without clear definition or consensus. There are gradients of “openness,” ranging from fully “closed” to fully “open.”20 More information is also needed on the relationship between openness and the wide availability of both model weights and open foundation models more generally. This could include, for example, information about what types of licenses and distribution methods are or could be available for open foundation models, and how such licenses and distribution methods fit within an understanding of openness and wide availability.21

NTIA also requests input on any potential regulatory models, whether voluntary or mandatory, that could maintain and potentially increase the benefits and/or mitigate the risks of dual-use foundation models with widely available model weights. We seek input on the different kinds of regulatory structures that could address not only the large scale of these foundation models, but also the declining level of computing resources needed to fine-tune and retrain them.

Definitions 

This Request for Comment uses the terms defined in Sec. 3 of the Executive Order. In addition, we use broader terms interchangeably for both ease of understanding and clarity, as set forth below. “Artificial intelligence” or “AI” refers to a machine-based system that can, for a given set of human-defined objectives, make predictions, recommendations, or decisions influencing real or virtual environments.22 Artificial intelligence systems use machine- and human-based inputs to perceive real and virtual environments, abstract such perceptions into models through analysis in an automated manner, and use model inference to formulate options for information or action.

Foundation models are typically defined as “powerful models that can be fine-tuned and used for multiple purposes.”23 Under the Executive Order, a “dual-use foundation model” is “an AI model that is trained on broad data; generally uses self-supervision; contains at least tens of billions of parameters; is applicable across a wide range of contexts; and that exhibits, or could be easily modified to exhibit, high levels of performance at tasks that pose a serious risk to security, national economic security, national public health or safety, or any combination of those matters….”24 Both the definition of “foundation model” and that of “dual-use foundation model” highlight the key trait of these models: they can be used in a number of ways.25

Generative AI “can be understood as a form of AI model specifically intended to produce new digital material as an output (including text, images, audio, video, software code), including when such AI models are used in applications and their user interfaces.”26 The term “generative AI” refers to a class of AI models built on foundation models “that emulate the structure and characteristics of input data in order to generate derived synthetic content.”27 Chatbots like ChatGPT, large language models like BLOOM, and image generators like Midjourney are all examples of generative AI.
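As a minimal illustration of this definition, the sketch below uses a small, openly released member of the BLOOM family mentioned above to generate new text from a prompt. It assumes the Hugging Face transformers library; the specific checkpoint is an illustrative choice, and any open generative model could be substituted.

    # Minimal generative AI sketch (assumes: pip install transformers torch).
    from transformers import pipeline

    generator = pipeline("text-generation", model="bigscience/bloom-560m")
    result = generator("Open foundation models are", max_new_tokens=30)
    print(result[0]["generated_text"])  # newly synthesized text continuing the prompt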

This Request for Comment is particularly focused on the wide availability, such as being publicly posted online, of foundation model weights. “Model weights” are “numerical parameter[s] within an AI model that help [. . .] determine the model’s output in response to inputs.”28 In addition to model weights, there are other “components” of an AI model, such as training data and code, that are involved in its development or use and may or may not be made widely available.
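As a concrete picture of what model weights are, the sketch below builds a toy network and lists its parameters as named numerical arrays; a foundation model differs mainly in scale, not in kind. This is a minimal sketch assuming PyTorch, with an arbitrary file name.

    # Model weights are named arrays of numbers (assumes: pip install torch).
    import torch
    import torch.nn as nn

    # A toy model standing in for a foundation model, which has billions of
    # such parameters rather than hundreds.
    model = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 4))

    for name, tensor in model.state_dict().items():
        print(name, tuple(tensor.shape))  # each entry is a numerical parameter array

    # "Widely available weights" means files like this one are publicly
    # downloadable; anyone holding the file can reload, run, or modify the model.
    torch.save(model.state_dict(), "weights.pt")
    model.load_state_dict(torch.load("weights.pt"))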

The Executive Order directs NTIA to focus on dual-use foundation models that were trained on broad data; generally use self-supervision; contain at least tens of billions of parameters; are applicable across a wide range of contexts; and exhibit, or could be easily modified to exhibit, high levels of performance at tasks that pose a serious risk to security, national economic security, national public health or safety, or any combination of those matters.29 NTIA also remains interested in discussion of models that fall outside of the scope of this Request for Comment, in order to better understand the current landscape and the potential impact of regulatory or policy actions.

Instructions for Commenters

Through this Request for Comment, we hope to gather information on the following questions. These are not exhaustive, and commenters are invited to provide input on relevant questions not asked below. Commenters are not required to respond to all questions. When responding to one or more of the questions below, please note in the text of your response the number of the question to which you are responding. Commenters should include a page number on each page of their submissions. Commenters are welcome to provide specific actionable proposals, rationales, and relevant facts.

Please do not include in your comments information of a confidential nature, such as sensitive personal information or proprietary information. All comments received are a part of the public record and will generally be posted to Regulations.gov without change. All personal identifying information (e.g., name, address) voluntarily submitted by the commenter may be publicly accessible.

Questions

1. How should NTIA define “open” or “widely available” when thinking about foundation models and model weights?
  a. Is there evidence or historical examples suggesting that weights of models similar to currently closed AI systems will, or will not, likely become widely available? If so, what are they?
  b. Is it possible to generally estimate the timeframe between the deployment of a closed model and the deployment of an open foundation model of similar performance on relevant tasks? How do you expect that timeframe to change? Based on what variables? How do you expect those variables to change in the coming months and years?
  c. Should “wide availability” of model weights be defined by level of distribution? If so, at what level of distribution (e.g., 10,000 entities; 1 million entities; open publication; etc.) should model weights be presumed to be “widely available”? If not, how should NTIA define “wide availability”?
  d. Do certain forms of access to an open foundation model (web applications, Application Programming Interfaces (API), local hosting, edge deployment) provide more or less benefit or more or less risk than others? Are these risks dependent on other details of the system or application enabling access?
    i. Are there promising prospective forms or modes of access that could strike a more favorable benefit-risk balance? If so, what are they?

2. How do the risks associated with making model weights widely available compare to the risks associated with non-public model weights?
  a. What, if any, are the risks associated with widely available model weights? How do these risks change, if at all, when the training data or source code associated with fine-tuning, pretraining, or deploying a model is simultaneously widely available?
  b. Could open foundation models reduce equity in rights- and safety-impacting AI systems (e.g., healthcare, education, criminal justice, housing, online platforms)?
  c. What, if any, risks related to privacy could result from the wide availability of model weights?
  d. Are there novel ways that state or non-state actors could use widely available model weights to create or exacerbate security risks, including but not limited to threats to infrastructure, public health, human and civil rights, democracy, defense, and the economy?
    i. How do these risks compare to those associated with closed models?
    ii. How do these risks compare to those associated with other types of software systems and information resources?
  e. What, if any, risks could result from differences in access to widely available models across different jurisdictions?
  f. Which of the risks described above are the most severe, and which the most likely? How do these risks relate to each other, if at all?

3. What are the benefits of foundation models with model weights that are widely available as compared to fully closed models?
  a. What benefits do open model weights offer for competition and innovation, both in the AI marketplace and in other areas of the economy? In what ways can open dual-use foundation models enable or enhance scientific research, as well as education and training in computer science and related fields?
  b. How can making model weights widely available improve the safety, security, and trustworthiness of AI and the robustness of public preparedness against potential AI risks?
  c. Could open model weights, and in particular the ability to retrain models, help advance equity in rights- and safety-impacting AI systems (e.g., healthcare, education, criminal justice, housing, online platforms)?
  d. How can the diffusion of AI models with widely available weights support the United States’ national security interests? How could it interfere with, or further, the enjoyment and protection of human rights within and outside of the United States?
  e. How do these benefits change, if at all, when the training data or the associated source code of the model is simultaneously widely available?

4. Are there other relevant components of open foundation models that, if simultaneously widely available, would change the risks or benefits presented by widely available model weights? If so, please list them and explain their impact.

5. What are the safety-related or broader technical issues involved in managing risks and amplifying benefits of dual-use foundation models with widely available model weights?
  a. What model evaluations, if any, can help determine the risks or benefits associated with making weights of a foundation model widely available?
  b. Are there effective ways to create safeguards around foundation models, either to ensure that model weights do not become available, or to protect system integrity or human well-being (including privacy) and reduce security risks in those cases where weights are widely available?
  c. What are the prospects for developing effective safeguards in the future?
  d. Are there ways to regain control over and/or restrict access to and/or limit use of weights of an open foundation model that, either inadvertently or purposely, have already become widely available? What are the approximate costs of these methods today? How reliable are they?
  e. What, if any, secure storage techniques or practices could be considered necessary to prevent unintentional distribution of model weights?
  f. Which components of a foundation model need to be available, and to whom, in order to analyze, evaluate, certify, or red-team the model? To the extent possible, please identify specific evaluations or types of evaluations and the component(s) that need to be available for each.
  g. Are there means by which to test or verify model weights? What methodology or methodologies exist to audit model weights and/or foundation models?

6. What are the legal or business issues or effects related to open foundation models?
  a. In which ways is open-source software policy analogous (or not) to the availability of model weights? Are there lessons we can learn from the history and ecosystem of open-source software, open data, and other “open” initiatives for open foundation models, particularly the availability of model weights?
  b. How, if at all, does the wide availability of model weights change the competition dynamics in the broader economy, specifically looking at industries such as, but not limited to, healthcare, marketing, and education?
  c. How, if at all, do intellectual property-related issues—such as the license terms under which foundation model weights are made publicly available—influence competition, benefits, and risks? Which licenses are most prominent in the context of making model weights widely available? What are the tradeoffs associated with each of these licenses?
  d. Are there concerns about potential barriers to interoperability stemming from different incompatible “open” licenses, e.g., licenses with conflicting requirements, applied to AI components? Would standardizing license terms specifically for foundation model weights be beneficial? Are there particular examples in existence that could be useful?

7. What are the current or potential voluntary, domestic regulatory, and international mechanisms to manage the risks and maximize the benefits of foundation models with widely available weights? Who has the power to enact such mechanisms?
  a. What security, legal, or other measures can reasonably be employed to reliably prevent wide availability of access to a foundation model’s weights, or limit their end use?
  b. How might the wide availability of open foundation model weights facilitate, or else frustrate, government action in AI regulation?
  c. When, if ever, should entities deploying AI disclose to users or the general public that they are using open foundation models either with or without widely available weights?
  d. What role, if any, should the U.S. government take in setting metrics for risk, creating standards for best practices, and/or supporting or restricting the availability of foundation model weights?
    i. Should other government or non-government bodies, currently existing or not, support the government in this role? Should this vary by sector?
  e. What should the role of model hosting services (e.g., Hugging Face, GitHub, etc.) be in making dual-use models with open weights more or less available? Should hosting services host models that do not meet certain safety standards? By whom should those standards be prescribed?
  f. Should there be different standards for government as opposed to private industry when it comes to sharing model weights of open foundation models or contracting with companies who use them?
  g. What should the U.S. prioritize in working with other countries on this topic, and which countries are most important to work with?
  h. What insights from other countries or other societal systems are most useful to consider?
  i. Are there effective mechanisms or procedures that can be used by the government or companies to make decisions regarding an appropriate degree of availability of model weights in a dual-use foundation model or the dual-use foundation model ecosystem? Are there methods for making effective decisions about open AI deployment that balance both benefits and risks? This may include responsible capability scaling policies, preparedness frameworks, et cetera.
  j. Are there particular individuals or entities who should or should not have access to open-weight foundation models? If so, why and under what circumstances?

8. In the face of continually changing technology, and given unforeseen risks and benefits, how can governments, companies, and individuals make decisions or plans today about open foundation models that will be useful in the future?
  a. How should these potentially competing interests of innovation, competition, and security be addressed or balanced?
  b. Noting that E.O. 14110 grants the Secretary of Commerce the capacity to adapt the threshold, is the amount of computational resources required to build a model, such as the cutoff of 10^26 integer or floating-point operations used in the Executive Order, a useful metric for thresholds to mitigate risk in the long term, particularly for risks associated with wide availability of model weights? (An illustrative calculation of this threshold appears after this list.)
  c. Are there more robust risk metrics for foundation models with widely available weights that will stand the test of time? Should we look at models that fall outside of the dual-use foundation model definition?
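For a rough sense of the 10^26-operation cutoff referenced in question 8.b, the sketch below applies the common rule-of-thumb estimate that training a dense transformer costs roughly six floating-point operations per parameter per training token. The model sizes and token counts are illustrative assumptions, not figures from the Executive Order or any particular model.

    # Back-of-the-envelope training-compute estimate (rule of thumb: ~6 * N * D
    # FLOPs, where N is the parameter count and D the number of training tokens).
    THRESHOLD = 1e26  # operations cutoff cited in E.O. 14110

    def training_flops(params: float, tokens: float) -> float:
        """Approximate total training compute for a dense transformer."""
        return 6 * params * tokens

    # Illustrative (params, tokens) pairs, chosen only to show the scale involved.
    for params, tokens in [(7e9, 2e12), (70e9, 2e12), (1e12, 15e12)]:
        flops = training_flops(params, tokens)
        status = "above" if flops > THRESHOLD else "below"
        print(f"{params:.0e} params, {tokens:.0e} tokens -> {flops:.1e} ops ({status} 1e26)")

Under this estimate, even a hypothetical one-trillion-parameter model trained on fifteen trillion tokens would require about 9 x 10^25 operations, just under the cutoff, which bears on how long a fixed compute threshold remains a meaningful risk metric.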

9. What other issues, topics, or adjacent technological advancements should we consider when analyzing risks and benefits of dual-use foundation models with widely available model weights?

Stephanie Weiner,
Chief Counsel, National Telecommunications and Information Administration.

 


1 Artificial Intelligence (AI) “has the meaning set forth in 15 U.S.C. 9401(3): a machine-based system that can, for a given set of human-defined objectives, make predictions, recommendations, or decisions influencing real or virtual environments. Artificial intelligence systems use machine- and human-based inputs to perceive real and virtual environments; abstract such perceptions into models through analysis in an automated manner; and use model inference to formulate options for information or action.” See Executive Office of the President, Safe, Secure, and Trustworthy Development and Use of Artificial Intelligence, 88 Fed. Reg. 75191 (November 1, 2023). “AI Model” means “a component of an information system that implements AI technology and uses computational, statistical, or machine-learning techniques to produce outputs from a given set of inputs.” See id.

2 See, e.g., Zoe Brammer, How Does Access Impact Risk? Assessing AI Foundation Model Risk Along a Gradient of Access, The Institute for Security and Technology (December 2023); Irene Solaiman, The Gradient of Generative AI Release: Methods and Considerations, arXiv:2302.04844v1 (February 5, 2023).

3 See, e.g., Elizabeth Seger et al., Open-Sourcing Highly Capable Foundation Models, Centre for the Governance of AI (2023).

4 See, e.g., Executive Office of the President: Office of Management and Budget, Proposed Memorandum for the Heads of Executive Departments and Agencies (November 3, 2023); Cui Beilei et al., Surgical-DINO: Adapter Learning of Foundation Model for Depth Estimation in Endoscopic Surgery, arXiv:2401.06013v1 (January 11, 2024) (using low-rank adaptation, or LoRA, in a foundation model to help with surgical depth estimation for endoscopic surgeries).

5 See, e.g., Shaoting Zhang, On the Challenges and Perspectives of Foundation Models for Medical Image Analysis, arXiv:2306.05705v2 (November 23, 2023).

6 See, e.g., David Noever, Can Large Language Models Find and Fix Vulnerable Software?, arXiv:2308.10345 (August 20, 2023); Andreas Stöckl, Evaluating a Synthetic Image Dataset Generated with Stable Diffusion, Proceedings of Eighth International Congress on Information and Communication Technology Vol. 693 (July 25, 2023).

7 See, e.g., Kun-Hsing Yu et al., Artificial intelligence in healthcare, Nature Biomedical Engineering Vol. 2, 719-731 (October 10, 2018); Kevin Maik Jablonka et al., 14 examples of how LLMs can transform materials science and chemistry: a reflection on a large language model hackathon, Digital Discovery 2 (August 8, 2023).

8 See, e.g., Harvey V. Fineberg et al., Consensus Study Report: Reproducibility and Replicability in Science, National Academies of Sciences (May 2019); Nature, Reporting standards and availability of data, materials, code and protocols; Science, Science Journals: Editorial Policies; Edward Miguel, Evidence on Research Transparency in Economics, Journal of Economic Perspectives Vol. 35, No. 3 (2021).

9 See, e.g., Rishi Bommasani et al., Considerations for Governing Open Foundation Models, Stanford University Human-Centered Artificial Intelligence (December 2023).

10 See, e.g., Jai Vipra and Anton Korinek, Market concentration implications of foundation models: The Invisible Hand of ChatGPT, Brookings Inst. (2023).

11 Id.

12 Id.

13 For example, researchers have found ways to get both black-box large language models as well as more open models to produce objectionable content through adversarial attacks. See, e.g., Andy Zou et al., Universal and Transferable Adversarial Attacks on Aligned Language Models, arXiv:2307.15043 (July 27, 2023) (“Surprisingly, we find that the adversarial prompts generated by our approach are quite transferable, including to black-box, publicly released LLMs . . . . When doing so, the resulting attack suffix is able to induce objectionable content in the public interfaces to ChatGPT, Bard, and Claude, as well as open source LLMs such as LLaMA-2-Chat, Pythia, Falcon, and others.”).

14 See, e.g., Zoe Brammer, How Does Access Impact Risk? Assessing AI Foundation Model Risk Along a Gradient of Access, The Institute for Security and Technology (December 2023).

15 Id.; see also, e.g., Pranshu Verma, The rise of AI fake news is creating a ‘misinformation superspreader’, Washington Post (December 17, 2023).

16 Exec. Order No. 14110, 88 Fed. Reg. 75191 (November 1, 2023).

17 Id.

18 Id.

19 Exec. Order No. 14110, 88 Fed. Reg. 75191 (November 1, 2023).

20 See, e.g., Irene Solaiman, The Gradient of Generative AI Release: Methods and Considerations, arXiv:2302.04844v1 (February 5, 2023); Bommasani et al., supra note 9.

21 See, e.g., Carlos Munoz Ferrandis, OpenRAIL: Towards open and responsible AI licensing frameworks, Hugging Face Blog (August 31, 2022); Danish Contractor et al., Behavioral Use Licensing for Responsible AI, arXiv:2011.03116v2 (October 20, 2022).

22 Exec. Order No. 14110, 88 Fed. Reg. 75191 (November 1, 2023).

23 See, e.g., “A foundation model is any model that is trained on broad data (generally using self-supervision at scale) that can be adapted (e.g., fine-tuned) to a wide range of downstream tasks[.]” Rishi Bommasani et al., On the Opportunities and Risks of Foundation Models, arXiv:2108.07258v3 (July 12, 2022).

24 Exec. Order No. 14110, 88 Fed. Reg. 75191 (November 1, 2023).

25 Id.

26 G7 Hiroshima Process on Generative Artificial Intelligence (AI): Towards a G7 Common Understanding on Generative AI, Organisation for Economic Co-operation and Development (OECD) (September 7, 2023).

27 Exec. Order No. 14110, 88 Fed. Reg. 75191 (November 1, 2023).

28 Id.

29 Id.

Date
February 21, 2024