National Archives News

1950 Census Release Will Offer Enhanced Digital Access, Public Collaboration Opportunity

By Victoria Macchi | National Archives News

Unused Form 17Fld-1, Portfolio Control Label for the 17th Decennial Census, 1950. The document was folded in half, and due to adhesive on the blank reverse side, the two halves adhered together.

WASHINGTON, December 14, 2021 — With the scheduled April 1, 2022, release of 1950 Census records a little more than three months away, the National Archives is completing efforts to digitize those records and using technology to make them more accessible than ever.

“Employees from across the agency have worked on digitizing and indexing the records and developing and testing a new, dedicated 1950 Census website,” said Project Manager Carol Lagundo, who leads the 1950 Census project at National Archives. “It’s taken innovation and creativity to keep this project on track throughout the pandemic and to continue to meet our project milestones. We hope the public will benefit from our hard work.”

The new website will include a name search function powered by a tool that combines Artificial Intelligence/Machine Learning (AI/ML) and Optical Character Recognition (OCR) technology. This is important for genealogists and other researchers who rely on census records for new information about the nation’s past.

“The OCR being used to transcribe the handwritten names from the census rolls is about as good as the human eye,” said Project Management Director Rodney Payne. “Some of the pages are legible, and others are difficult to decipher. So, the National Archives developed a transcription tool to enable users to submit name updates. This will allow other users to find specific names more easily, and it provides an opportunity for the public to help the agency share these records with the world.”

National Archives officials are encouraging interested members of the public to use the transcription tool and help the agency make the records as accurate as possible.

“This is an exciting project for the National Archives, and we know it is important information for so many Americans. We are looking forward to collaborating with the public to refine and enhance the first draft of OCR-created names. This is a great example of automating as much as we can and then collaborating with the public to make access happen,” said Chief Innovation Officer Pamela Wright.

The website is currently in development and will undergo rigorous testing in the coming months to ensure a successful launch.

The National Archives is also working to provide bulk download access to the full 1950 Census dataset on launch day. This will be of interest to digital humanists, web developers, social scientists, and anyone wanting to explore aggregations of the records. Other organizations and companies will be able to use this functionality to provide 1950 Census data on their own websites.

When made available on the Amazon Web Services Registry of Open Data, the 1950 Census dataset—over 165 terabytes of data—will include the metadata index, the population schedules, the enumeration district maps, and the enumeration district descriptions for the 1950 Census records. This is approximately 10 times the size of the 1940 Census dataset.

Included in the dataset are approximately:

  • 6.5 million digital TIFF images and corresponding JPEG derivative images of the microfilmed “1950 Census of Population and Housing” forms for U.S. states and territories
  • 33,215 TIFF images and corresponding JPEG derivative images of the original paper “1950 Census of Population and Housing: Indian Reservation Schedule” forms
  • 9,600 digitized images of the 1950 Census Enumeration District Maps, which are annotated maps of counties, cities, and other minor civil divisions that show enumeration districts, census tract, and related boundaries and numbers used for each census
  • 63,000 digitized images of the 1950 Census Enumeration District Descriptions, which are written descriptions of geographic areas included within enumeration districts
  • 232,000 1950 Census Enumeration District Descriptions, which were produced by generating OCR output of the Enumeration District Description images. More than 25 NARA staff reviewed and cleaned up the OCR output.

For more information, see NARA’s 1950 Census blog posts on History Hub.

Resources for earlier censuses as well as tips for searching these records are available on Archives.gov.

Reports and statistics from the 1950 Census are available through the U.S. Census Bureau.