The National Archives Catalog

Contribution Type: Optical Character Recognition (OCR) Transcription Validation

Mandatory: No
Repeatable: No
Edit Type: Editable
Data Type: Variable Character Length (2 GB)
Source: Catalog Community Member; NARA Staff Member; NARA Partner
Level Available: File Unit; Item; Item AV; Digital Object
Public Element: Yes

 

Definition:

Optical character recognition (OCR) transcription text is a machine-generated transcription of a typewritten or handwritten document, produced by NARA or a NARA partner. OCR tools transcribe the words as they are written or typed in or on the document; however, the output is not always accurate. OCR transcription validation allows NARA staff or community members to edit, correct, or validate the OCR transcription.

 

Purpose: To allow Catalog users to edit and correct the NARA-generated OCR transcription of a document, thereby enhancing the searchability and discoverability of digital objects in the Catalog.

 

Relationship: Transcription validations have an attribution type modifier and may have a related contributor name.
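
As an illustration only, the element's attributes and the relationship above could be modeled as a small record type. The sketch below is not NARA's implementation: the class name, field names, and the reading of "2 GB" as 2 * 1024**3 bytes of UTF-8 text are assumptions made for the example, and the attribution type value shown in the usage lines is a placeholder, since this entry does not list the permitted values.

```python
from dataclasses import dataclass
from typing import Optional

# Values copied from the element's attribute listing.
ALLOWED_SOURCES = {"Catalog Community Member", "NARA Staff Member", "NARA Partner"}
ALLOWED_LEVELS = {"File Unit", "Item", "Item AV", "Digital Object"}

# Assumption: "2 GB" is interpreted as 2 * 1024**3 bytes of UTF-8 text.
MAX_BYTES = 2 * 1024**3


@dataclass
class TranscriptionValidation:
    """Hypothetical record for one validated OCR transcription."""

    text: str                               # the edited or validated transcription
    source: str                             # who contributed the validation
    level: str                              # description level it is attached to
    attribution_type: str                   # modifier every validation carries
    contributor_name: Optional[str] = None  # related contributor, if any

    def __post_init__(self) -> None:
        # Reject values outside the lists given for this element.
        if self.source not in ALLOWED_SOURCES:
            raise ValueError(f"unknown source: {self.source!r}")
        if self.level not in ALLOWED_LEVELS:
            raise ValueError(f"level not available: {self.level!r}")
        if not self.attribution_type:
            raise ValueError("attribution type modifier is required")
        if len(self.text.encode("utf-8")) > MAX_BYTES:
            raise ValueError("transcription exceeds the 2 GB limit")


# Usage: the contributor name may be omitted; the attribution type may not.
record = TranscriptionValidation(
    text="We the People of the United States",
    source="Catalog Community Member",
    level="Digital Object",
    attribution_type="placeholder-attribution-type",  # hypothetical value
)
```

The contributor name is optional in the sketch because the relationship says a validation "may have" one, while the attribution type is checked as required.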

 

Guidance:

OCR transcriptions are generated automatically for textual digital objects during processing and preparation for online publication. Catalog users, including Citizen Archivists, staff, and NARA partners, voluntarily review, edit, and validate OCR transcriptions, following the guidance provided on the Resources page and the Citizen Contributions Policy on Archives.gov.

 



Lifecycle Data Requirements Guide
