
Vol. 25:3 ISSN 0160-8460 Fall 1997
Electronic Technologies Projects Make Connections
by Joyce M. Ray
The NHPRC has been supporting research on the preservation of electronic records since 1980. Beginning in 1990, the Commission has also funded experimental projects that are using electronic technologies, including CD-ROM and the World Wide Web, to publish collections of significant historical documents. The Commission's electronic records research program focuses on the archival preservation of records originally created in electronic form, while the electronic publishing projects deal with the digital conversion of historical documents created on paper. These projects collectively contribute to a better understanding of how digital information of long-term value can be maintained and provided to users.
Researchers for NHPRC-supported electronic records projects have developed guidelines and models for the design and evaluation of electronic recordkeeping systems, proposed standards for preserving and managing digital information, and produced tools to improve electronic records management. Their work has significant implications for how electronic records contained in databases, word processing documents, and electronic mail will be managed in the future. Because of the growing interest of archivists and records managers in electronic records issues, reports from NHPRC-sponsored research projects have become popular session topics at professional archival meetings. Debate on different approaches is increasing because archival institutions large and small are facing difficult policy and resource decisions about the historically valuable electronic records that fall within their domains.[1] At the 1997 meeting of the Society of American Archivists in Chicago, for example, reports on NHPRC research projects were presented by representatives of the New York State Archives and Records Administration, Indiana University, the City of Philadelphia, WGBH Foundation, and the Mississippi Department of Archives and History. The widespread availability of the World Wide Web has also made it possible for projects to share their findings more quickly than in the past, so research results now reach decision makers sooner [see sidebar for Web site addresses of NHPRC-supported electronic records research projects].
NHPRC electronic records research grants have supported projects to:
- Develop model records management guidelines for Federal and State Web sites (Syracuse University).
- Promote development and acceptance of a Universal Preservation Format for audio and video digital recordings (WGBH Foundation).
- Develop and test model requirements for new electronic recordkeeping systems and explore ways to improve the recordkeeping functionality of existing information systems (University of Pittsburgh, Indiana University, City of Philadelphia, State University of New York - Albany, Delaware Public Archives).
- Test different approaches to the archival management of electronic records (New York State Archives and Records Administration, Vermont State Archives, Kansas State Historical Society, South Carolina Department of Archives and History, Mississippi Department of Archives and History).
At the same time, historical editing projects have begun laying the groundwork for electronic delivery of historical source material for educational and research purposes [see sidebar for Web site addresses of NHPRC-supported editing projects]. NHPRC electronic publishing grants have supported:
- Optical imaging of surviving records of the War Department, 1784-1800, for publication on CD-ROM. The War Department's records for this period were destroyed in a warehouse fire; the collection has been partially reconstructed from outgoing correspondence.
- Optical imaging of the legal papers of Abraham Lincoln for publication on CD-ROM. The records contain valuable information about Lincoln's early professional life and provide a rare insight into the workings of a small mid-nineteenth century legal office.
- Development of guidelines and demonstration models for the electronic publication of historical documents. The Model Editions Partnership is a consortium of seven historical documentary editions, in partnership with leaders of the Text Encoding Initiative, that is collaborating to develop methods for creating and delivering historical editions on the World Wide Web and CD-ROM. The editions participating in the partnership, all of which are also supported by individual NHPRC grants, are: the Documentary History of the First Federal Congress, the Documentary History of the Ratification of the Constitution and the Bill of Rights, the Papers of General Nathanael Greene, the Papers of Henry Laurens, the Lincoln Legal Papers, the Papers of Margaret Sanger, and the Papers of Elizabeth Cady Stanton and Susan B. Anthony.
There are many unknowns about electronic publishing, and even more unknowns about the publication of historical documentary sources. Publishers do not yet know if CD-ROM versions of these materials will be commercially successful. In many cases it is unclear who will take responsibility for long-term preservation of the electronic formats, yet the cost of digital conversion and migration to new hardware and software as technologies change is too high to ignore. Retrieval strategies must also be developed that can transcend hardware and software changes. Finally, it is not yet known whether users will find electronic formats, whether CD-ROM or the World Wide Web, sufficiently "friendly" to justify conversion and long-term maintenance costs.
The answers to these questions have important implications for how historians and other researchers will use all kinds of historical resources in the future. Archivists are exploring ways to provide access to their holdings on the World Wide Web through the creation of electronic finding aids and digital libraries. Although a consensus definition of the term "digital library" has not yet emerged, for purposes of this discussion a digital library is assumed to be a collection of digital images, texts, and/or other objects accumulated for informational rather than evidential value and organized by its creator for access by remote users. Digital libraries can provide online access to historical materials such as photographs, high-use documents, and even electronic databases. These digital collections must be organized and presented in logical structures for retrieval. For this reason, the development of digital libraries has much in common with electronic publication of historical texts.
A report released this year by the Commission on Preservation and Access, SGML as a Framework for Digital Preservation and Access, illustrates the relationship between electronic text publishing and the creation of electronic finding aids and digital libraries for archival materials.[2] SGML (Standard Generalized Markup Language) is an international standard (ISO 8879) for the coding of electronic text. Its use has been promoted by the archival profession for the creation of electronic finding aids, and it has also been endorsed by the publishing industry for electronic publications.[3] The National Archives and Records Administration has recently issued revised regulations for the transfer of permanent electronic records to the National Archives which specify that an agency may transfer electronic textual documents with SGML tags to preserve the structure of the records. In the area of electronic publishing, the Internet prototypes being developed for the Model Editions Partnership reflect a variety of approaches that rely on SGML in varying degrees. The Lincoln Legal Papers and the Margaret Sanger Papers mini-editions use images of original manuscripts, while the other sample editions use transcriptions of original manuscripts. For the Lincoln project, SGML markup provides a gateway to a relational database of images and content information developed by the editors. For the Sanger project, SGML markup creates "envelopes" that include both manuscript images and content information. For the other projects that present transcriptions, SGML markup is used to describe the text itself and to link the text to content information. The new Commission on Preservation and Access report advocates the further use of SGML as a standard to facilitate discovery and retrieval of documents contained in digital libraries.[4]
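To make the idea of descriptive markup concrete, the sketch below shows how tags that describe a transcribed letter, and point to an image of its manuscript, let a program retrieve specific content information. It is an illustration only: the sample letter, the names in it, and every element and attribute name are hypothetical and are not drawn from any project's DTD. The sample is also written as well-formed XML rather than SGML, because Python's standard library includes an XML parser but no SGML parser.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup for one transcribed letter. Element and attribute
# names are invented for illustration; they do not come from any real DTD.
SAMPLE = """
<letter id="doc-001">
  <header>
    <author>Jane Example</author>
    <recipient>John Sample</recipient>
    <date when="1785-03-14">14 March 1785</date>
  </header>
  <body>
    <p>I have received the returns from <place>Fort Example</place>.</p>
    <p>Please forward them to <person>Mr. Sample</person> without delay.</p>
  </body>
  <image href="doc-001-p1.tif"/>
</letter>
"""


def describe(xml_text):
    """Pull content information out of one marked-up letter."""
    root = ET.fromstring(xml_text.strip())
    return {
        "id": root.get("id"),
        "author": root.findtext("header/author"),
        # The machine-readable date lives in an attribute, so a search
        # does not depend on how the date is spelled in the transcription.
        "date": root.find("header/date").get("when"),
        "places": [el.text for el in root.iter("place")],
        "images": [el.get("href") for el in root.iter("image")],
    }


record = describe(SAMPLE)
print(record)
```

Because the tags name what each piece of text is rather than how it should look, the same file could be searched by author, date, or place, and the image reference could be followed to the manuscript page.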
Archives now face the need to preserve not only the electronic records created by others and designated for permanent archival retention because of their historical significance, but also the digital libraries they themselves are creating to enhance access to their holdings. In many cases the original formats of these historical materials must be preserved for their intrinsic value in addition to the digital versions, thus creating new maintenance costs. For digital imaging, it is necessary to find a reasonable balance between the desire to produce the highest quality images possible and the need to hold down imaging, storage, and migration costs, all of which rise in proportion to image quality. It is also important to identify appropriate levels of description and markup for textual materials and finding aids that will provide adequate access to holdings and yet be cost-effective.
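The storage side of this trade-off can be illustrated with simple arithmetic. The sketch below estimates the uncompressed size of one scanned page at several resolutions. The page size (8.5 by 11 inches), the 8-bit grayscale depth, and the chosen resolutions are assumptions made for the example, not figures from any NHPRC project, and compression would reduce the totals.

```python
def uncompressed_bytes(width_in, height_in, dpi, bits_per_pixel):
    """Estimate the raw (uncompressed) size of one scanned page in bytes."""
    pixels = round(width_in * dpi) * round(height_in * dpi)
    return pixels * bits_per_pixel // 8


# Assumed example: a letter-size page scanned in 8-bit grayscale.
for dpi in (150, 300, 600):
    size = uncompressed_bytes(8.5, 11, dpi, 8)
    print(f"{dpi} dpi: {size / 1_000_000:.1f} MB")
```

Under these assumptions a 300 dpi scan takes about 8.4 MB, and doubling the resolution to 600 dpi quadruples that figure, because the pixel count grows with the square of the resolution. Multiplied across a large collection, and repeated at each migration, the choice of resolution becomes a significant budget decision.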
Fortunately, the NHPRC is not the only funding source for digital library research and development. The National Science Foundation and the Institute of Museum and Library Services each sponsor Federal grant programs for digital library research. The National Endowment for the Humanities supports work in the electronic delivery of humanities texts, and the Library of Congress has contributed substantially to the digital library knowledge base through its American Memory Project, which is digitizing materials from the Library's own holdings, and through a competitive grants program sponsored by the Ameritech Corporation.
The NHPRC remains the only Federal grants program with a specific focus on the management of evidential records originally created in electronic form, and it is one of the few programs supporting electronic publication of historical documentary texts. Because of its belief in the importance of both these programs, the Commission has designated research and development on appraising, preserving, disseminating and providing access to important documentary sources in electronic form as a top-level priority in its strategic plan that takes effect in fiscal year 1999. NHPRC grants for electronic records research and for electronic publishing are contributing to a whole spectrum of knowledge about digital information, from its creation as an original record or as a digitized document, to its preservation in a recordkeeping system or digital library, and finally to its delivery to users.
NHPRC grants for electronic publishing and for electronic records research are helping to increase our capability to preserve important documentary sources in electronic form for future use. They may even help to discover new uses for these materials, such as techniques for distance education or innovative approaches to classroom teaching. Documentary editors can learn a great deal from archivists about long-term preservation and maintenance strategies for digital information, while archivists can learn much from documentary editors about the delivery of historical resources in digital form to users. It is those users - historians, teachers, students, and perhaps new audiences now unknown - who will be the ultimate beneficiaries of both electronic publishing and electronic records research projects.
[1] The National Archives and Records Administration, for example, has held an internal series of discussions on electronic records issues, the National Archives of Canada has experimented with various approaches to the management of electronic records with historical value, and the European Union has recently completed a survey of plans for electronic records management by the national archives of member countries. Likewise, several state archives have undertaken consulting and planning projects, some with NHPRC support, to develop strategies for managing electronic records.
[2] James Coleman and Don Willis, SGML as a Framework for Digital Preservation and Access. Commission on Preservation and Access, Washington, DC, 1997.
[3] The emerging standard for Encoded Archival Description is one of the many SGML-based Document Type Definitions (DTDs) developed under the aegis of the Text Encoding Initiative for the markup of various types of documents; other DTDs have been developed for historical documents such as letters, diaries, and essays.
[4] The report notes, however, that another markup language, XML (Extensible Markup Language), shows promise as a potential bridge between SGML, which provides detailed structure to documents but can be difficult to create programs for, and the World Wide Web's HTML, which is widely available and relatively simple but which is not rich enough to allow documents to be searched or managed in precise ways.
