Artificial Intelligence (AI)

Inventory of NARA Artificial Intelligence (AI) Use Cases

Artificial Intelligence (AI) promises to drive the growth of the United States economy and improve the quality of life of all Americans. Executive Order (EO) 13960, Promoting the Use of Trustworthy Artificial Intelligence in the Federal Government, directed federal agencies to inventory their AI use cases and share their inventories with other government agencies and the public. As stated in the Executive Order, federal applications of AI should benefit the US economy and improve the quality of life of all Americans. The growing adoption of AI must therefore be accompanied by practices that ensure AI is deployed in a manner that fosters public trust and protects the rights and values of the American people.

 

The National Archives and Records Administration AI Strategic Goals and Objectives as of 2024

The National Archives and Records Administration AI Compliance Plan for OMB

 

In alignment with Executive Order 13960 of December 8, 2020, the National Archives and Records Administration (NARA) has prepared an inventory of AI use cases, including current and planned uses, consistent with the agency's mission. The links below the following table summarize updates to the inventory of AI use cases by year.

 

NARA AI Use Case Inventory Descriptions
Each entry below lists the AI Use Case Name, the AI Use Case Description, the Current Status, and, where provided, an answer to "What specific AI techniques are used?"

AI Use Case Name: AI Pilot Project to Screen and Flag for Personally Identifiable Information (PII) in Digitized Archival Records
AI Use Case Description: The National Archives and Records Administration (NARA) is piloting an AI-powered solution to automatically identify and redact Personally Identifiable Information (PII) from digitized archival records. This initial phase focuses on records already accessible in the National Archives Catalog and those awaiting inclusion, using a weighted algorithm to prioritize documents containing the most sensitive PII. NARA's long-term goal is to refine this prototype into a user-friendly tool for preliminary scans of unpublished records, with the added capability of detecting custom-defined entities.
Current Status: Pilot (in-progress)
What specific AI techniques are used? The AI-powered PII detection and redaction project is underway, with a custom AWS model currently in development and being prepared for user acceptance testing. A parallel effort is evaluating Google Cloud Platform's out-of-the-box PII detection service. A final solution will be determined based on testing results and a comparative analysis of both options.

AI Use Case Name: Freedom of Information Act (FOIA) Discovery AI Pilot
AI Use Case Description: NARA aims to employ AI to streamline the Freedom of Information Act (FOIA) request process. The AI system would use natural language processing (NLP) to search records based on content similarity to the FOIA query. Additionally, the AI would automatically redact sensitive information from the records, such as personal information or other details, depending on the nature of the request.
Current Status: Pilot (in-progress)
What specific AI techniques are used? The AI-powered FOIA processing project at NARA is currently in the pilot phase. Initial development of the system, which incorporates NLP-based search and automated redaction capabilities, has been completed. Testing is underway to evaluate the system's effectiveness and identify areas for improvement.

AI Use Case Name: Auto-fill of Descriptive Metadata for Archival Descriptions
AI Use Case Description: The National Archives and Records Administration (NARA) is exploring the use of AI to automate the creation of archival descriptions, also known as self-describing records. This process involves filling out metadata fields like summaries and authorities to make records easily searchable in the National Archives Catalog. Currently, most records have minimal metadata due to the labor-intensive nature of manual entry. The AI system would analyze the content of documents and existing metadata from the records management system to predict and populate these descriptive fields, improving the discoverability of millions of records for the public.
Current Status: Pilot (in-progress)
What specific AI techniques are used? An AI-powered solution to automate the creation of archival descriptions (self-describing records) is currently in pilot. It uses machine learning to analyze document content and existing metadata to generate descriptive fields like summaries and authorities, improving the discoverability of records in the National Archives Catalog. Business users are evaluating the quality and accuracy of the AI-generated metadata.

AI Use Case Name: AI-based Semantic Search for National Archives Catalog
AI Use Case Description: The National Archives and Records Administration (NARA) aims to enhance the search functionality of its vast catalog by implementing semantic search. This AI-powered technique goes beyond keyword matching, understanding the user's intent and the contextual meaning behind their search terms. By providing more accurate and relevant results, semantic search will streamline the research process for historians, researchers, and the general public, making it easier and faster to discover critical records and documents within the NARA catalog.
Additionally, semantic search can help to identify the relationships between records and documents in the NARA catalog. This can provide a more comprehensive understanding of the historical events and processes represented in the records and can facilitate new insights and discoveries.
Current Status: Pilot (in-progress)
What specific AI techniques are used? The semantic search pilot project at NARA is currently evaluating various options. We have explored both open-source models and the AWS Titan model, as well as semantic search capabilities using Google Vertex AI with the Gemini large language model. Based on initial findings, the pilot team has decided to proceed with Vertex AI for the semantic search functionality.

AI Use Case Name: Create an AI-based knowledge articles chat interface for working with CRG documents
AI Use Case Description: This project aims to develop an AI-powered chat interface to assist the National Personnel Records Center (NPRC) customer service staff in efficiently querying the Case Reference Guide (CRG). The AI assistant chat interface will leverage cloud-based solutions for indexing and searching different types of case reference documents (e.g., PDF, HTML, and images) and provide relevant information, guidance, and support from a curated knowledge base of articles, policies, and procedures related to NPRC/CRG records.
Current Status: Pilot (in-progress)
What specific AI techniques are used? The AI-based knowledge chat interface pilot is currently evaluating various options for indexing and querying search results, including Amazon Kendra and Amazon Q Business. These solutions allow users to verify the information presented by providing links to reference documents.

AI Use Case Name: Generative AI for Google Workspace for Internal Employees
AI Use Case Description: NARA has initiated a pilot project to evaluate the potential of Generative AI (GenAI) to enhance workplace productivity.
- Create workforce tools that support employees
- Draft, reply to, summarize, and prioritize messages in Gmail
- Write and refine content in Gmail and Google Docs
- Create original images from text, right within Google Slides
- Transform raw data into insights and analysis via auto completion, formula generation, and contextual categorization in Sheets
- Generate and capture notes in Meet
- Enable workflows for getting things done in Chat
Current Status: Pilot (in-progress)
What specific AI techniques are used? The NARA pilot project is leveraging Google's "Gemini" AI model, integrated into Google Workspace applications (Gmail, Docs, Slides, Sheets, Drive, Chat, and Meet), to enhance workplace productivity. The pilot involves approximately 50 users from various NARA organizations, each testing GenAI use cases relevant to their roles. This collaborative effort aims to identify how GenAI can best streamline workflows, automate tasks, and improve overall employee productivity.

AI Use Case Name: AI Pilot for National Declassification Center (NDC) at NARA
AI Use Case Description: The National Declassification Center (NDC) at NARA is exploring the use of AI to automate and streamline the declassification process for classified documents. The AI system would identify and mark declassifiable content, redact sensitive information, and auto-fill metadata fields to track the process. The AI would leverage standard machine learning techniques to predict metadata values and natural language processing to identify sensitive content. This implementation aims to enhance the efficiency and accuracy of the declassification process at the NDC.
Current Status: Planned for future pilot (TBD)

AI Use Case Name: Develop a Custom Large Language Model (LLM) to Power Generative AI Capabilities
AI Use Case Description: This pilot aims to train a custom Large Language Model (LLM) on NARA's digital data to improve efficiency and user experience in generative AI applications across NARA's digital systems.
Current Status: Planned for future pilot (TBD)

AI Use Case Name: Develop a Natural Language Based Chat Interface (like ChatGPT) to Interact With the Archival Documents
AI Use Case Description: This pilot AI initiative aims to develop an AI-powered chat interface, similar to ChatGPT, that enables natural language interaction with documents within the NARA digital systems. The chat interface will leverage advanced natural language processing (NLP) and information retrieval techniques to understand user queries, retrieve relevant information from the documents, and provide accurate and contextually relevant responses in a conversational manner.
Current Status: Planned for future pilot (TBD)

AI Use Case Name: Topic Summarizer and Entity Extraction using AI
AI Use Case Description: This pilot project aims to leverage AI to automatically generate descriptive metadata, such as content summaries and scope notes, for digital objects within the Descriptive Authority Service (DAS) system. By analyzing the content of digital objects, the AI model will extract key information and create concise, informative descriptions, significantly reducing the manual effort required for metadata creation and improving the discoverability of digital assets.
Current Status: Planned for future pilot (TBD)

AI Use Case Name: Automated Data Discovery and Classification Pilot
AI Use Case Description: NARA is planning a pilot project to test AI/ML-based automated data discovery and classification using public/mock-up datasets. The pilot will explore both supervised and unsupervised AI/ML techniques, utilizing a Software as a Service (SaaS) solution that enables document-level search and discovery. If the SaaS solution doesn't recognize a specific document type, it can be trained using a learning set of examples to identify all documents of that type.
Current Status: Planned for future pilot (TBD)
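
The PII screening pilot listed in the inventory above mentions a weighted algorithm that prioritizes documents containing the most sensitive PII. As a rough illustration of that idea only, the Python sketch below scores documents with regular-expression patterns and per-category weights, then ranks them. The patterns, the weights, and the names score_document, prioritize, and redact are all assumptions made for this example; they do not describe NARA's actual solution, which the inventory identifies as a custom AWS model and a Google Cloud Platform service under evaluation.

```python
import re
from dataclasses import dataclass

# Hypothetical PII categories: (pattern, sensitivity weight).
# Both the patterns and the weights are illustrative assumptions.
PII_PATTERNS = {
    "ssn": (re.compile(r"\b\d{3}-\d{2}-\d{4}\b"), 10.0),
    "phone": (re.compile(r"\b\d{3}-\d{3}-\d{4}\b"), 3.0),
    "email": (re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b"), 2.0),
}


@dataclass
class ScoredDocument:
    doc_id: str
    score: float   # sum of weight * match count over all categories
    counts: dict   # matches found per category


def score_document(doc_id, text):
    """Count pattern matches and combine them into one weighted score."""
    counts = {}
    score = 0.0
    for label, (pattern, weight) in PII_PATTERNS.items():
        n = len(pattern.findall(text))
        if n:
            counts[label] = n
            score += weight * n
    return ScoredDocument(doc_id, score, counts)


def prioritize(documents):
    """Rank documents so the highest weighted PII score comes first."""
    scored = [score_document(doc_id, text) for doc_id, text in documents.items()]
    return sorted(scored, key=lambda d: d.score, reverse=True)


def redact(text, placeholder="[REDACTED]"):
    """Replace every matched PII span with a placeholder."""
    for pattern, _weight in PII_PATTERNS.values():
        text = pattern.sub(placeholder, text)
    return text


if __name__ == "__main__":
    docs = {
        "a": "Contact jane.doe@example.gov for details.",
        "b": "SSN 123-45-6789, phone 202-555-0100.",
        "c": "No sensitive content.",
    }
    for doc in prioritize(docs):
        print(doc.doc_id, doc.score, doc.counts)
    print(redact(docs["b"]))
```

A machine-learning detector would replace the regular expressions in practice, but the ranking step would look similar: each detected entity type contributes to a document's score in proportion to how sensitive it is judged to be.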

 

2024 NARA AI Inventory (updated September 2024)
2023 NARA AI Inventory (updated September 2023)

 

Questions regarding the AI inventory can be directed to NARA's Responsible AI Officials (RAIO) at raio@nara.gov.
