Center for Digital Antiquity

Resources

How to use tDAR

tDAR is designed to serve the needs of a wide range of archaeologists, researchers, organizations, and institutions who use or manage archaeological resources.

tDAR can assist users in:

  • Managing a wide variety of archaeological information in one place.
  • Organizing documents, data sets, and images.
  • Downloading reports or bibliographies anywhere.
  • Sharing current research materials with partners.
  • Publishing data associated with articles and books.
  • Protecting confidential materials.
  • Preserving their legacy and contributing to the discipline.
  • Guarding against data loss; preserving documents, data, and images.
  • Fulfilling NSF, NEH, and other data management plan requirements.
  • Complying with NHPA, ARPA, and 36 CFR 79.

For information on how to use tDAR, please visit: https://www.tdar.org/using-tdar/

Publications

This report by Cultural Heritage Partners, PLLC describes and analyzes federal requirements for the access to and long-term preservation of digital archaeological data. We conclude that the relevant federal laws, regulations, and policies mandate that digital archaeological data generated by federal agencies must be deposited in an appropriate repository with the capability of providing appropriate long-term digital curation and accessibility to qualified users.

Download

Reports in Digital Archaeology is an online publication series devoted to issues regarding research and practice in digital archiving of archaeological materials and archaeologically related data. If you are interested in submitting a manuscript to Reports in Digital Archaeology, please contact Digital Antiquity.

  1. Building tDAR: Review, Redaction, and Ingest of Two Reports Series, Joshua Watts (June 2011)
  2. Policies, Preservation, and Access to Digital Resources: The Digital Antiquity 2010 National Repositories Survey, Joshua Watts (September 2011)
  3. The Digital Archive of Huhugam Archaeology: Crowd Sourcing User Needs, Keith Kintigh (August 2018)
  4. The US Air Force CRM Program Meets the Challenges of Digital Data Curation: A Case Study Using tDAR, Francis P. McManamon et al. (March 2019)

Digital Antiquity’s Focus on Data Security

The Center for Digital Antiquity (Digital Antiquity) strives to protect and preserve the archaeological and cultural heritage data and information that is deposited in tDAR. We focus on preserving, curating, and maintaining these data. We accept this responsibility as one of our primary missions. This document outlines the various approaches we take to store and secure digital information. We also describe tDAR features that allow data contributors to control and manage access to information that they place in the repository.

Physical Security

Digital Antiquity’s offices are located in Hayden Library on the main campus of Arizona State University (ASU) in Tempe. Access to the offices during business hours is controlled by Digital Antiquity staff. The office area is locked when staff are not present. Computers within the offices are password protected. Access to the data is limited to designated Digital Antiquity data curation staff and management during their time of employment. Access is provided via a secure connection.

Technical Aspects to tDAR’s Security

Digital Antiquity employs multiple strategies to ensure the security of digital files stored in the repository. tDAR uses 256-bit TLS 1.1 encryption throughout the website and application to secure information. After a user registers and logs in to tDAR, all actions (e.g., uploading files, making purchases, viewing resources) occur over a secure channel. Non-registered users can only search and view basic metadata for resources and collections in tDAR; they cannot access data files.

Files are stored on servers at ASU’s data center. A suite of tests is run every time Digital Antiquity makes any modification to tDAR’s source code. In addition to the testing by Digital Antiquity technical staff for each release, ASU’s data center and Digital Antiquity run audits using a suite of common intrusion testing tools to identify potential vulnerabilities.

Files stored in tDAR are protected by multiple, redundant security measures. Physical access to the data center where tDAR’s files are stored is restricted and monitored. Data center staff do not have sign-on permissions to tDAR or the virtual machines that run it. In addition, a firewall has been constructed to prevent tDAR’s database and backend file store from communicating with any host other than the tDAR webserver.

To protect files from catastrophic loss, Digital Antiquity maintains two backup procedures for tDAR’s data. Both procedures employ strong encryption. One set of backups (updated biweekly) is kept in a secure storage area in the Phoenix area. The second set of backups is maintained in Virginia and uses Amazon’s Glacier storage service.

Security and Access Control for Confidential Files

Contributors to tDAR can control and limit access to files they place in the repository. The metadata about those files are always public, which means that anyone can learn about the existence of the resource. Public metadata in tDAR records, such as title and description, but not exact site location or files, are exposed to search engines (e.g., Google, Bing) for indexing. Digital files in tDAR can be marked as public, confidential, or embargoed. When a digital file is marked “public,” anyone who is a registered tDAR user and logged in may download the file. A file marked “confidential” is inaccessible to users who have not been explicitly granted access to that file by the individual who has this authority. Records in tDAR that have attached confidential or embargoed files provide a link that can be used to request access. Each request generates an email to the record owner, who decides whether to provide access. Digital Antiquity facilitates communication between the record owner and the individual requesting access, but does not grant or deny such requests. Embargoed files are treated as confidential files for a user-designated period, after which the file becomes publicly accessible. Contributors may change the access designation at any time.
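The download rules described above can be summarized in a short Python sketch. This is an illustration only, not tDAR's actual code: the names `Access`, `FileRecord`, and `can_download` are hypothetical, and treating the embargo end date itself as the first public day is an assumption.

```python
from dataclasses import dataclass, field
from datetime import date
from enum import Enum
from typing import Optional, Set


class Access(Enum):
    PUBLIC = "public"
    CONFIDENTIAL = "confidential"
    EMBARGOED = "embargoed"


@dataclass
class FileRecord:
    """Hypothetical stand-in for one file attached to a tDAR record."""
    access: Access
    embargo_ends: Optional[date] = None  # only meaningful for EMBARGOED files
    authorized_users: Set[str] = field(default_factory=set)  # granted by the record owner


def can_download(record: FileRecord, user: Optional[str], today: date) -> bool:
    """Return True if `user` (None means not logged in) may download the file."""
    if user is None:
        return False  # visitors who are not logged in see metadata only
    if user in record.authorized_users:
        return True  # explicitly granted access by the record owner
    if record.access is Access.PUBLIC:
        return True
    if record.access is Access.EMBARGOED and record.embargo_ends is not None:
        # Assumption: the file becomes public on the end date itself.
        return today >= record.embargo_ends
    return False  # confidential, or embargoed with the embargo still running
```

Note that metadata visibility is deliberately left out of the sketch: as described above, metadata are always public regardless of the file's designation.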

The metadata describing tDAR records allow contributors to designate UTM coordinates or specific site locations on a map. If these spatial designations are smaller than one square mile, the tDAR software will obfuscate the spatial data when displaying this information to users who have not logged in or have not been explicitly granted access. When obfuscated, spatial designations will be randomized and will display an area greater than one square mile.
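One way such obfuscation could work is sketched below in Python. This is not tDAR's actual algorithm: the `Box` class, the `obfuscate` function, the 1.5-mile minimum side length, the flat 69-miles-per-degree conversion, and the choice to keep the true location somewhere inside the displayed box are all assumptions made for illustration.

```python
import math
import random
from dataclasses import dataclass

MILES_PER_DEG_LAT = 69.0  # rough average; adequate for a sketch


@dataclass(frozen=True)
class Box:
    """A latitude/longitude bounding box in decimal degrees."""
    south: float
    west: float
    north: float
    east: float

    def size_miles(self):
        """Approximate (height, width) of the box in miles."""
        mid_lat = math.radians((self.south + self.north) / 2)
        height = (self.north - self.south) * MILES_PER_DEG_LAT
        width = (self.east - self.west) * MILES_PER_DEG_LAT * math.cos(mid_lat)
        return height, width

    def area_sq_miles(self):
        height, width = self.size_miles()
        return height * width


def obfuscate(box: Box, min_side_miles: float = 1.5, rng=random) -> Box:
    """Return `box` unchanged if it covers at least one square mile; otherwise
    return a larger, randomly shifted box that still contains it."""
    if box.area_sq_miles() >= 1.0:
        return box
    height, width = box.size_miles()
    mid_lat = math.radians((box.south + box.north) / 2)
    # Padding (in miles) needed on each axis to reach the minimum side length.
    pad_h = max(min_side_miles - height, 0.0)
    pad_w = max(min_side_miles - width, 0.0)
    # Split the padding randomly between opposite sides so the true location
    # does not sit at the centre of the displayed box.
    split_h, split_w = rng.random(), rng.random()
    deg_lat = 1.0 / MILES_PER_DEG_LAT
    deg_lon = 1.0 / (MILES_PER_DEG_LAT * math.cos(mid_lat))
    return Box(
        south=box.south - pad_h * split_h * deg_lat,
        west=box.west - pad_w * split_w * deg_lon,
        north=box.north + pad_h * (1 - split_h) * deg_lat,
        east=box.east + pad_w * (1 - split_w) * deg_lon,
    )
```

A caller would apply `obfuscate` only when rendering the location for users who are not logged in or have not been granted access; authorized users would see the original box.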

Many clients choose to provide publicly appropriate redacted versions of digital files to upload to tDAR along with full confidential versions containing sensitive information. Digital Antiquity provides redaction services for clients who wish to take advantage of this option. Using professional redaction tools, data curators permanently remove confidential or sensitive information (e.g. archaeological site locations or other information as designated by the client) from a copy of the complete file. This service produces an edited version of the report that is appropriate for access by registered tDAR users.

For an outside evaluation of tDAR’s security, please review a 2014 report compiled by Sara Rivers Cofield, Curator of the Maryland Archaeological Conservation Laboratory (MAC Lab), who received a Department of Defense Legacy Grant to evaluate tDAR as a repository for digital portions of collections held at the MAC Lab for Defense agencies. Section 5.2 of the report (pp. 54-56) addresses questions related to data security using tDAR. The full report or a one-page fact sheet can be accessed here.

Download 

Introduction

In the United States, federal and other public agencies have legal responsibilities to care for archaeological collections resulting from investigations that they conduct or require. Digital data and records are a part of these collections and also must be curated properly (for a detailed legal analysis of this topic see Cultural Heritage Partners 2012). In addition, a series of Executive Orders, guidelines, and policies require federal agencies to make the results of research that they conduct or fund more easily accessible to the public (e.g., Holdren 2013).

All, or nearly all, of these archaeological research results are in digital formats. Digital files require different care and procedures than physical collections to ensure that they are properly preserved and accessible for appropriate uses. The nature of digital curation is not necessarily more complicated or expensive than the curation of physical collections. However, it is specialized, and agencies are obligated to take affirmative steps to ensure that the archaeological data about their resources and from their projects are deposited in an archive or repository where the expert care, principles, standards, and techniques of digital curation are followed (ADS and Digital Antiquity 2013; Richards et al. 2010; McManamon 2014; Kintigh et al. 2015).

Careful curation of digital files is important because most data and information created by contemporary research in many subjects is created, stored, and most easily shared in digital formats. Even documents and other texts that are still published on paper are most commonly accessed and shared electronically as digital files. Digital data and documents are far easier to share than the same information in paper format. However, like paper records, although in different ways, digital files can be damaged or destroyed if handled inappropriately. Importantly, unlike paper records, unless properly curated, digital data can become obsolete and inaccessible rapidly.

In order to take advantage of the wealth (or “deluge,” to cast it in a different light) of digital data, appropriate procedures are needed to care for the data after it is created (e.g., see Hey and Trefethen 2003; Lord et al. 2004; Seltzer and Zhang 2009; and Faniel and Zimmerman 2011). The appropriate activities and procedures can be grouped under the general heading of “digital curation.” Good digital curation is not simply a legal and regulatory requirement in many circumstances. By making digital data easily discoverable and more accessible, digital curation greatly enhances the ability of other researchers to test and build upon work that has been done by others. Replication or refinement of research results by subsequent studies is a hallmark of scientific knowledge. Studies invariably build upon what has been learned from research done in the past. By improving discoverability and access to existing data, digital curation also enables current research projects to avoid unnecessarily redundant studies and to build upon results that are available (Center for Digital Antiquity 2015).

For these reasons, agencies, foundations, and other funders of archaeological research commonly require a data management plan (DMP) as part of proposals seeking funding (e.g., National Science Foundation 2010). The DMP template provided in this guide will help you complete your data management plan. For example, you can use the template to address the requirement for such plans for NEH and NSF proposals, or as part of a CRM or public archaeology proposal to address requirements to provide for the curation of digital data generated by the project.

A Data Management Plan Template for Projects Using tDAR

Below is a template that describes how data that are proposed to be collected or created for particular projects will be curated when placed in the tDAR repository. The estimated number of files and overall file size should be filled in when the template is used as part of a proposal for a grants program or in responding to Requests for Proposals.
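For projects whose files already exist, one way to arrive at the two estimates is to count them directly. The Python sketch below assumes the project files sit under a single local directory; the function name `estimate_deposit` is hypothetical, and sizes are reported in decimal gigabytes (10^9 bytes).

```python
from pathlib import Path
from typing import Tuple


def estimate_deposit(root: Path) -> Tuple[int, float]:
    """Return (file count, total size in GB) for everything under `root`."""
    files = [p for p in root.rglob("*") if p.is_file()]
    total_bytes = sum(p.stat().st_size for p in files)
    return len(files), total_bytes / 1e9
```

For a proposal, the figures returned for existing material would still need to be adjusted upward for files the project has yet to create.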

Data created by this project will be deposited for long-term access and preservation in the tDAR digital repository. We estimate up to xxx files and xx GB of data being added to tDAR as part of the project. Within tDAR we will organize the data (documents, data sets, images, and other types of digital files) as projects and collections with sub-collections, as appropriate. These organizational tools will enable both project researchers and outside investigators to easily access and report on their digital research products.

Project datasets and records will be uploaded and thoroughly documented with technical and semantic metadata using tDAR’s interactive Web forms. Standard metadata categories used by tDAR include numerous cultural, geographic, temporal, material, bibliographic and other archaeological fields. Dataset columns are individually documented in terms of their type and content; nominal fields are documented value by value.

In tDAR all digital files are stored in the original submission format and, as needed, in a preservation format to maintain stability through long-term migration. Preservation formats are determined in consultation with the Library of Congress and the Archaeology Data Service procedures. Files may also be copied into a dissemination format to facilitate usability. tDAR’s digital objects are thoroughly documented by administrative and technical metadata for preservation, descriptive metadata for effective resource discovery, and detailed semantic metadata needed to permit sensible scientific reuse of the data. Web forms guide data contributors through a comprehensive process of metadata entry and file upload.

tDAR’s descriptive metadata includes general and bibliographic components incorporating Dublin Core and the Library of Congress’s MODS metadata standards. It also includes fields covering information unique to archaeology, such as site types, investigation types, and detailed column-by-column and table-by-table information documenting datasets. tDAR’s administrative and technical metadata use components of the Library of Congress’s PREMIS metadata standard for capturing technical, preservation, and rights information.

Access to tDAR content is provided through a Web interface with basic and advanced (including spatial) search capabilities. tDAR content is indexed by Google and other major search engines. Excepting legally protected confidential data and data temporarily embargoed by their contributors, all tDAR data are freely available over the web. Files deposited in tDAR are all accompanied by DOIs and persistent URLs. Data may be reused, redistributed, or transformed, subject only to the provision of appropriate credit to the data creators and indication of any changes made (Creative Commons Attribution 3.0 Unported License). All data downloads include appropriate citation information.

tDAR is served by Linux virtual machines operating within Arizona State University’s Server-On-Demand facility. Each virtual machine has fully resilient data storage on a redundant disk behind enterprise-class Network Attached Storage filers. Daily backups are maintained in a separate building to provide protection against, and rapid restoration after, any data loss; they are retained for two weeks. Regular offsite backups (biweekly and quarterly) are also performed. Additional secure, encrypted offsite backups of tDAR use Amazon Web Services’ Glacier Cloud Storage.

Should Digital Antiquity at any time cease to exist, the ASU University Library has agreed to continue to provide access to tDAR data. tDAR operates under the organizational umbrella of the Center for Digital Antiquity, a multi-institutional organization (based at ASU) designed to ensure the long-term financial, technical, and social sustainability of tDAR. It is governed by an independent Board of Directors and supported by an external Professional Advisory Panel. Start-up funding for tDAR has been provided by the National Science Foundation (0433959 and 0624341) and the Andrew W. Mellon Foundation.

Additional guidance on good digital curation practices and data management can be found in Inter-university Consortium for Political and Social Research (2012).

References can be found in the PDF

Download

Problems of Access, Preservation, and Storage

The Phoenix Area Office (PXAO) of the Bureau of Reclamation (BoR) is responsible for extensive data and records collected and created since the 1970s as part of archaeological work carried out on the lands and canal reaches managed by the federal agency. These paper and digital reports, other documents, images, and related data occupy considerable space in PXAO BoR offices and are difficult to locate, search, use, and share among BoR staff, contractors, researchers, and the interested public. Locating and retrieving documents and data is inefficient and requires an investment of time from both PXAO BoR staff and the person seeking the item. Furthermore, since the reports and other paper records exist in limited numbers, maintaining a single copy of these records in one place raises risks for preservation and the potential for catastrophic loss. Increasingly, new projects include digital as well as paper records, but, without an appropriate tool to manage the digital files, they are either stored on a shelf or printed, adding to the existing problems.

In sum, PXAO faced two problems in managing the archaeological data and information for which it is responsible: (1) how to make the large amount of legacy data and information useful and ensure its long-term preservation economically; and (2) how to treat new data and information in a way that would make it immediately useful, within a system that would ensure its preservation.

Why tDAR?

To solve these problems, PXAO turned to the Center for Digital Antiquity (DA; www.digitalantiquity.org) and its repository, tDAR (the Digital Archaeological Record; www.tdar.org). A 2011 cooperative agreement between DA, located administratively at Arizona State University, and PXAO allows PXAO to use DA’s services and tDAR’s features to enhance access to valuable archaeological information and to guarantee the long-term preservation of the digital archaeological materials that the agency manages.

tDAR as a Solution

tDAR is a digital archive and repository designed for archaeological documents, images, data sets, and other digital resources. tDAR was developed and is maintained by DA. Users of tDAR can search for documents, images, data sets, and other materials from archaeological projects throughout the world. tDAR users can easily deposit documents, images, and other data. Uploading data to tDAR facilitates broader and easier access and sharing for future uses, such as analysis for decision-making, background studies, public interpretation and outreach, project management and research. These access and preservation functions were just what PXAO was seeking. tDAR provides for archaeologically specific metadata to help users manage and locate information efficiently and quickly. It also meets the occasional challenges of restricting access to confidential information, such as specific site locations. By placing data into tDAR, users can ensure that the information will be preserved and accessible in the future as new technologies replace current platforms. Learn more about tDAR’s functionality and design at www.tdar.org.

The PXAO has committed to digitizing its legacy records as a necessary first step in reducing on-site paper storage and making its records easier to discover. The ongoing project with the PXAO has three primary phases: 1) digitization of existing paper records, cataloging, and data entry; 2) development of digital curation strategies and standards for on-call and future contracts; and 3) management of the digital archive in tDAR.

Part 1: Dealing with the Legacy Data and Information

The PXAO digitized many of the paper records related to its older archaeological projects, in particular those from various parts of the Central Arizona Project. It was decided to deposit these digitized records in tDAR where they could be organized and managed more easily. The intent is to ensure that the digital materials are curated to meet the needs of PXAO cultural resource staff to care for these resources, as well as to comply with federal regulations governing preservation and management of archaeological data, such as 36 CFR 79 (see the DA briefing statement on this topic and the Cultural Heritage Partners 2012 report describing the requirement at http://www.tdar.org/why-tdar/compliance/). The PXAO archaeological collections include some of the most extensive and important archaeological work performed within Arizona during the past century, including:

  • The Lower Verde Archaeological Project
  • Salt-Gila Aqueduct Archaeological Project
  • Roosevelt Dam Archaeological Studies
  • Historical Archaeology of Dam Construction Camps in Central Arizona
  • Ak Chin Farm Archaeological Project
  • Tucson Aqueduct Archaeological Project

Data curators at Digital Antiquity worked with PXAO staff to check the digital documents for completeness and appropriate formatting, accurately describe each document, and organize the digital archive so that documents could be easily retrieved for various uses.

DA and PXAO staff created a simple workflow to manage the deposit of records into tDAR. Digital documents were sent to DA curators, who reviewed them for accuracy, appropriate formatting, and accessibility. They also ensured that the PDF files were passed through an optical character recognition (OCR) program so that they could be easily searched. Curators also reviewed the texts and illustrations to identify confidential information that might need to be redacted or designated as “confidential.” When encountered, this information, mainly detailed site location information, was removed from a copy of the report using Adobe Acrobat Pro’s redaction tools. The full report was marked as confidential in tDAR, and access was restricted to users authorized by PXAO. Finally, digital curators created appropriate descriptive metadata for each file and then uploaded the files to tDAR as “Drafts” for the PXAO to review. Once the PXAO reviewed and signed off on the tDAR records, curators changed the records’ status to “active.”

The Lower Verde Archaeological Project (LVAP) provides an example of the contents of the PXAO digital archive. This project was a large-scale, four-year data recovery project conducted during the 1990s. The project archives include 38 separate documents: chapters from a final, synthetic volume; detailed, technical chapters from three data reporting volumes; five full technical reports; and several appendices that further document the extensive excavations and analysis. These records are available at http://core.tdar.org/project/5831. In some instances, the BoR designated digital files associated with some projects as restricted. The metadata for the restricted files are still visible to all tDAR users, but the file itself is marked as “confidential” to limit access to qualified, PXAO-approved professionals.

DA continues to work with the BoR to curate additional projects and associated digital documents, data sets, and images from their legacy collections.

Part 2: Managing Data and Information from New Investigations

In May 2012, at the request of the PXAO cultural resource staff, DA curators began to provide assistance for the organization, creation of new tDAR records, and uploading of digital files to tDAR from new BoR archaeological projects. The new documents, images, and data sets were created by three cultural resource management firms as part of an “on-call” contract established by PXAO. The terms of this contract include a digital curation stipulation regarding guidelines for creating digital records and metadata for deposit into tDAR as part of the execution of the archaeological studies. As a result, future work carried out on the lands and canal reaches managed by the agency will be deposited automatically to tDAR, organized in a way that is useful to PXAO internal functions, and preserved for long term access and use by PXAO staff, contractors, researchers, and the interested public. Metadata creation by the same individuals who produce the digital records, in this case the staff of the cultural resource management (CRM) firms doing the archaeology, is an effective way to produce accurate and detailed metadata for the tDAR records. This work not only continues to populate the archive, but ensures that staff time and funds are spent on more critical tasks. By requiring contractors to enter the data, and placing the requirement in the on-call contract, the PXAO is able to reduce the amount of staff time and cost over time. This arrangement also eliminates any additional growth in a backlog of data not properly curated.

Part 3: Use of Records in tDAR and Long-term Preservation

PXAO archaeological records archived in tDAR are organized in a way that makes sense to BoR staff and can easily be discovered, located, and utilized by contractors, researchers, and the interested public. For example, the Lower Verde Archaeological Project record in tDAR has received nearly 800 views since being added to tDAR. Clearly, many more people are able to access and make use of the digital records in tDAR than the paper records in the PXAO offices.

The digital records entrusted to tDAR are properly preserved by following a number of procedures:

  • regularly and systematically checking the files in the tDAR repository to ensure that no deterioration has occurred.
  • taking action to remedy deterioration if it is detected.
  • migrating and/or refreshing digital files to provide for their long-term accessibility and preservation.
  • planning for obsolete technology.
  • maintaining files in open and preferable formats, accommodating new industry standards for archaeological information.
  • storing the rich, descriptive metadata with the digital objects to which they are related.

All of these procedures are a regular part of the ongoing services provided by DA for the digital data deposited in tDAR.
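The first two procedures, checking files for deterioration and flagging anything that needs a remedy, are commonly implemented as checksum (fixity) checks. The Python sketch below shows the general technique using SHA-256 digests; it is not a description of tDAR's internal tooling, and the function names and in-memory manifest are assumptions.

```python
import hashlib
from pathlib import Path
from typing import Dict, List


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large files need not fit in memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root: Path) -> Dict[str, str]:
    """Record a checksum for every file under `root` at deposit time."""
    return {
        str(p.relative_to(root)): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def find_damage(root: Path, manifest: Dict[str, str]) -> List[str]:
    """Re-hash the files and report any that are missing or have changed."""
    problems = []
    for name, expected in manifest.items():
        target = root / name
        if not target.is_file():
            problems.append(f"missing: {name}")
        elif sha256_of(target) != expected:
            problems.append(f"changed: {name}")
    return problems
```

In practice the manifest would be stored separately from the files it describes, and any file reported by `find_damage` would be restored from one of the backup copies.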

Download

Background

In the United States, federal and other public agencies have legal responsibilities to care for archaeological collections resulting from investigations that they conduct or require. Digital data and records are a part of these collections and also must be curated properly (for a detailed legal analysis of this topic see Cultural Heritage Partners 2012). In addition, a series of Executive Orders, guidelines, and policies require federal agencies to make the results of research that they conduct or fund more easily accessible by the public (e.g., Holdren 2013).

All, or nearly all, of these archaeological research results are in digital formats. Digital files require different care and procedures than physical collections to ensure that they are properly preserved and accessible for appropriate uses. The nature of digital curation is not necessarily more complicated or expensive than the curation of physical collections. However, it is specialized, and agencies are obligated to take affirmative steps to ensure that the archaeological data about their resources and from their projects are deposited in an archive or repository where the expert care, principles, standards, and techniques of digital curation are followed (Richards et al. 2010; McManamon 2014; Kintigh et al. 2015).

Digital files have become a focus for curation because most data and information created by contemporary research in all subjects is created, stored, and most easily shared in digital formats. Even documents and other texts that are still published on paper are most commonly accessed and shared electronically as digital files. Digital data and documents are far easier to share than the same information in paper format. However, like paper records, although in different ways, digital files can be damaged or destroyed if handled inappropriately. Importantly, and unlike paper records, digital data can rapidly become obsolete and inaccessible unless properly curated.

In order to take advantage of the wealth (or “deluge,” to cast it in a different light) of digital data, appropriate procedures are needed to care for the data after it is created (e.g., see Hey and Trefethen 2003; Lord et al. 2004; Seltzer and Zhang 2009; and Faniel and Zimmerman 2011). The appropriate activities and procedures can be grouped under the general heading of “digital curation.” Good digital curation is not simply a legal and regulatory requirement in many circumstances. By making digital data easily discoverable and more accessible, digital curation greatly enhances the ability of other researchers to test and build upon work that has been done by others. Replication or refinement of research results by subsequent studies is a hallmark of scientific knowledge. Studies invariably build upon what has been learned from research done in the past. By improving discoverability and access to existing data, digital curation also enables current research projects to avoid unnecessarily redundant studies and to build upon results that are available.

Agencies and other organizations need to adopt guidelines like those summarized in this document and in the references that it cites in order to ensure that the archaeological data for which they are responsible are cared for properly.

Guidelines for Good Digital Curation

Agency guidance on proper curation of its digital archaeological files should be strongly stated and apply to data from all of the archaeological activities in which the agency is involved. The guidance needs to include a description of the kinds of activities, procedures, and standards associated with appropriate digital curation, such as those provided in this section. The agency guidance should contain a requirement that digital data and information produced by agency-related archaeological investigations be deposited in a digital archive or repository. Digital archaeological data include documents (e.g., field/lab notes, interim/final reports, specialist reports, correspondence), images (e.g., maps, drawings, photographs), data sets, geospatial data, scanned data files (e.g., 3D, LiDAR), and other types of digital files.

The process of digital archiving for archaeological data can be divided into two general sets of actions. The first includes activities to be taken by the individual(s) or organization(s) that create the digital data, documents, images, or other types of digital files. The second set of activities includes those undertaken by the digital archive or repository where the digital data are deposited.

For Digital Data Creators
Actions by digital data creators (depending on the agency, these may include agency archaeologists, contractors working for the agency, contractors working for other organizations that are required by the agency to conduct archaeological investigations, or some combination of these categories):

  1. Plan for the creation and subsequent management of the digital resources as part of archaeological investigations.
  2. Produce the digital resources as part of the project, creating administrative, substantive and technical descriptions of the digital objects, commonly referred to as “metadata.”
  3. Provide the means for others to access and make use of the digital files.
  4. Evaluate the continuing importance of digital files during the project and select objects that merit long-term preservation for future uses, in consultation with the appropriate agency official.
  5. Dispose of the digital files not selected for long-term preservation, in consultation with the appropriate agency official.
  6. Deposit the digital files selected for long-term preservation in a digital repository where the data can be discovered, accessed (with appropriate controls), and preserved for future uses.

For Digital Data Repositories
A digital repository is one established and operated for the express purpose of providing access to and long-term preservation of digital data. Digital files related to archaeological resources or studies can be curated by the organization that generates the files or by a different domain or institutional digital repository, so long as the required criteria for services are met. Such a repository is organized so that it can be sustained and function in its curation role indefinitely. A digital archive or repository has a professional staff that carries out activities necessary to ensure the long-term preservation of and appropriate access to the digital files it curates. More detailed descriptions of the activities, policies, procedures, and standards of such archives or repositories are available from the Digital Curation Centre (2010) and the Center for Research Libraries and Online Computer Library Center (2007).

Actions by digital repositories that hold archaeological data should include:

  1. Upon deposit of the digital files, test the objects to ensure that they have the characteristics described by the creators and can be adequately preserved in the repository.
  2. Implement the policies and procedures necessary for the long-term public access and preservation of digital records (Cultural Heritage Partners 2012:2-11) as provided in federal curation regulations (36 C.F.R. Part 79) and records management requirements (e.g., 36 C.F.R. Part 1220), such as the following:
    • Regularly and systematically checking the files in the repository to ensure that no deterioration has occurred.
    • Taking actions to remedy deterioration if it is detected.
    • Periodically migrating the digital files to new file types in order to conform to new hardware and/or software standards.
    • Regularly backing up the files and storing them in multiple locations for security.
  3. Ensure cross-referencing between physical collections and digital records.
  4. Provide the means for access and use of the digital files, within any constraints placed on use by the depositors (e.g., allowing access to “confidential” digital data to be restricted).
  5. Enable data depositors to easily manage their data in the repository.
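
The regular checking for deterioration described in item 2 is often done by comparing checksums. The following is a minimal Python sketch under stated assumptions, not a description of how any particular repository operates: it assumes the files sit in an ordinary directory tree and uses SHA-256 from the Python standard library. The function names `sha256_of`, `build_manifest`, and `find_problems` are hypothetical.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Compute the SHA-256 checksum of one file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root: Path) -> dict:
    """Record a checksum for every file under root at the time of deposit."""
    return {
        p.relative_to(root).as_posix(): sha256_of(p)
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }


def find_problems(root: Path, manifest: dict) -> dict:
    """Compare current files with the manifest; report missing or changed files."""
    problems = {}
    for name, expected in manifest.items():
        path = root / name
        if not path.is_file():
            problems[name] = "missing"
        elif sha256_of(path) != expected:
            problems[name] = "changed"
    return problems
```

In this sketch the manifest is built once at deposit and `find_problems` is run on a schedule. Any file it reports could then be restored from one of the backup copies held in another location.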

Procedures for Ensuring the Deposit of Data in an Appropriate Digital Curation Repository

In the United States, most Requests for Proposals (RFPs), Scopes of Work (SOWs), contracts, and other types of agreements for archaeological investigations undertaken, funded, or required by federal agencies require the curation of physical collections (including associated documents) that these investigations create. The ever-increasing amount of digital data generated by such projects (ADS and Digital Antiquity 2013:9-20) emphasizes the importance of effective curation for these data in digital repositories where they can be accessed, cared for, and preserved properly for future uses. This is essential because there may be no other record of the investigations and the archaeological resources investigated. Digital files also are far easier to share. They can be made available widely (if appropriate) via the Internet, which enables the public agencies responsible for the data to meet their requirement to make such information easily available for the public (Cultural Heritage Partners 2012).

To implement policies for proper digital curation, it is essential that RFPs, SOWs, contracts, and other agreement documents require that, as part of archaeological projects, digital data generated by the investigation be placed in an appropriate digital archive or repository.

To ensure the accessibility and preservation of digital archaeological data, agency officials preparing RFPs, SOWs, contracts, and other types of agreements must include specific requirements to ensure that digital curation is an explicit project deliverable, along with the curation of artifacts and other physical materials generated by the project.

Current and future RFPs, SOWs, contracts, and other agreements must specify the requirements for digital curation. Digital data, such as field records, images, laboratory records, data sets resulting from field and laboratory analyses, and Geographic Information System (GIS) files and maps, should not be stored on CDs or other digital media as the sole or primary means of providing access and preservation within a physical curatorial facility that focuses on curating material remains. The digital records cannot be treated the same as paper records and artifacts. Such curation practices neither preserve digital data nor make them accessible. CDs and other digital media deteriorate over time. Data stored on these media are not readily discoverable or accessible to users and will eventually become obsolete as computer hardware and software change (Digital Curation Centre 2010; ICPSR 2012; ADS and Digital Antiquity 2013).

The archaeologists responsible for the investigations that create the digital data should incorporate activities that are necessary to ensure that good data curation can be undertaken easily at the end of their projects. Such actions include: planning from the beginning of each project for the creation and management of the digital data; providing clear descriptions of the digital files (the “metadata”); using digital file formats that are standard and open; evaluating the likely future importance of various types of data and selecting for curation those files that merit long-term access and preservation, in consultation with agency officials; and, finally, depositing the files selected for long-term preservation in a digital repository.

In order to ensure proper curation of the digital products from archaeological investigations, the data should be deposited in a repository that has appropriate staff expertise, computer hardware and software capacity, and established digital curation policies and procedures. Specific functions and services of such repositories include the following:

  1. Overall Repository Functions
    • Long-term preservation
      • The repository has a sound business plan to ensure its continuation into the long-term future.
    • Access to the deposited files into the future
      • The repository ensures that deposited files will be findable. The archival content of the repository should be discoverable and accessible not only within the repository itself, but also from outside the repository via general search engines. The repository should have in place procedures to ensure that deposited files will remain readable into the long-term future.
  2. Active File Maintenance Services
    • Authentication
      • The repository monitors all digital files and can guarantee that the file a user sees is the same as the file you deposited.
    • Permanent identifier
      • The repository assigns a unique identifier to each file which allows users to both access and accurately and uniquely cite the work (e.g., persistent URLs; digital object identifiers; or other).
    • Maintain files in open and preferable formats
      • The repository stores the original file, but also has policies and systems in place to create new derivative files (copies) in open, non-proprietary formats to ensure accessibility into the future.
    • Rich administrative, descriptive, and technical metadata are linked to the deposited files
      • The repository has an easy to use, but robust system of text, keywords, and other descriptive information to ensure that your files will be found when users perform a web search.
    • Indexed by major search engines for discoverability
      • The repository content should be indexed by major search engines, such as Google, to ensure that your files are discoverable.
    • Collect file and project statistics and metrics
      • The repository collects usage statistics and other metrics (e.g., page views, downloads) for deposited files to allow tracking of content usage.
    • Provide options to manage privacy and confidentiality of files
      • The repository provides services such as embargo dates, or marking files “private” to allow depositors to manage access to files in the event that some files require limited access.
    • Regularly Check File Integrity
      • The repository has automated and/or human supervised systems in place to regularly open and review stored files, much like a museum periodically inspects physical objects to prevent decay.
  3. Long-term Preservation Services and Planning
    • Migrate file formats as they become obsolete
      • The repository stores the original file you contribute but creates a derivative copy as needed to keep up with changing software and hardware.
    • Plan for obsolete technology
      • The repository monitors changing software and hardware systems and updates its systems to stay current. The repository can retrieve files stored on obsolete media and/or in obsolete software systems.
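
An official drafting an RFP or SOW could turn the functions listed above into a simple checklist for comparing candidate repositories. The Python sketch below is one hypothetical way to do so. The capability labels are shorthand invented here for the items in the list, and `missing_capabilities` is an illustrative helper, not part of any existing tool.

```python
# Shorthand labels (assumptions) for the functions and services listed above.
REQUIRED_CAPABILITIES = (
    "long_term_business_plan",
    "findable_and_readable_files",
    "authentication",
    "permanent_identifiers",
    "open_format_derivatives",
    "rich_metadata",
    "search_engine_indexing",
    "usage_statistics",
    "privacy_controls",
    "integrity_checks",
    "format_migration",
    "obsolescence_planning",
)


def missing_capabilities(offered: set) -> list:
    """Return the required capabilities a candidate repository does not offer."""
    return [name for name in REQUIRED_CAPABILITIES if name not in offered]
```

An empty result means the candidate meets every item in the list. Any names returned identify gaps to raise before the repository is named in a solicitation.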

Agreement Documents, Scopes of Work, or Project Deliverables for Digital Curation

The following is example language on digital curation that can be included in an agreement, contract, RFP, or SOW for archaeological investigations or other types of cultural resource studies. This example requires use of tDAR (the Digital Archaeological Record). A modified version of this text could serve for seeking general archaeological repository services, in the event that the use of tDAR cannot be a specific requirement. However, if an agency is unable to require use of a specific digital curation repository, a more general reference to the required capabilities of the potential digital repository must be included in the solicitation. In such cases, officials drafting the RFP or proposed SOW must include sufficient description of the characteristics of an appropriate digital archive or repository. These characteristics are summarized in the two preceding sections.

  1. [Name of entity conducting the archaeological work] shall deposit the digital data listed as deliverables for this project in [location of description of digital project deliverables in RFP, scope of work, contract, etc.], in tDAR (the Digital Archaeological Record, www.tdar.org).
  2. [Name of entity conducting the archaeological work] shall thoroughly document all digital data with the following archaeological, administrative, and technical metadata, using the tDAR metadata creation and file upload web pages available at: http://www.tdar.org/why-tdar/contribute/.
  3. [Name of agency/office] will not consider the project complete until the project’s digital records in tDAR have been reviewed by [name of agency official and/or position title], approved, and made active.
  4. Any file containing information that is “confidential,” for example as defined in Section 9 of the Archaeological Resources Protection Act (16 U.S.C. 470hh), or “restricted,” as defined in consultation with [name of agency/office] during the execution of this project shall be deposited in its complete form and marked in tDAR as confidential and shall also be deposited in a redacted, public form, with redactions of all confidential information identified.
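
Item 4 requires each confidential file to be accompanied by a redacted public version. A reviewing official could check a project's deposit list for that pairing with a short script. The following Python sketch assumes a hypothetical `Deposit` record kept by the project team; it does not use or represent any tDAR interface.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Deposit:
    """Hypothetical entry in a project's list of deposited files."""

    title: str
    confidential: bool
    # Title of the complete file that this file is a redacted version of, if any.
    redacted_version_of: Optional[str] = None


def confidential_without_redaction(deposits: list) -> list:
    """Return titles of confidential files that have no public redacted version."""
    covered = {
        d.redacted_version_of
        for d in deposits
        if d.redacted_version_of is not None and not d.confidential
    }
    return [d.title for d in deposits if d.confidential and d.title not in covered]
```

Any title the function returns marks a confidential file for which the redacted public form still has to be deposited.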

Using tDAR (the Digital Archaeological Record) for Digital Curation

The Digital Archaeological Record (tDAR) is a digital repository for archaeological information with a focused and skilled professional staff. The Center for Digital Antiquity, a unit of Arizona State University with an independent Board of Directors, developed and maintains tDAR. tDAR has a growing number of registered users (over 8,000 at present) and virtual visitors (about 50,000 page views and over 1,000 downloaded files per month). Content of tDAR includes nearly 30,000 documents, images, data sets, 3D/scan and geospatial data, as well as over 365,000 citation records incorporated from the National Archaeological Database [NADB] with enhanced metadata.

For archaeological data from the United States and many international contexts, there is no viable alternative to tDAR as a disciplinary digital repository. At the University of York in England, the Archaeology Data Service (ADS) maintains an archaeological digital repository, but it includes only data from United Kingdom (UK) archaeological contexts or data generated by UK researchers. ADS and tDAR do not compete and have partnered on several projects.

There are general-purpose, “institutional” digital repositories, including those operated by universities for data their faculty or students create or use. However, many of these either do not accept or do not effectively document the more complex data types that archaeologists collect. Because of their general-purpose nature, these repositories cannot offer the functionality or the discipline-specific metadata that tDAR provides for archaeological data. While they maintain standard technical metadata, they include only very general substantive metadata, seriously limiting both information discovery and reuse. tDAR, on the other hand, allows for the inclusion of detailed substantive metadata specifically tailored for archaeology and for the administrative and management needs of the federal agency. This metadata is essential for data discovery, reuse, and preservation, especially for systematically recorded databases. tDAR structures information and provides a user interface designed for archaeologists and the managers of archaeological information.

Digital data that are not curated effectively are highly fragile, subject to complete loss due to media degradation, software and hardware evolution, and inadequate metadata. Few if any traditional artifact curation facilities are providing or capable of providing anything like the federally mandated level of digital data curation and access. tDAR was explicitly designed to fill this void; records in tDAR are preserved and made accessible in accordance with federal laws and regulations (Cultural Heritage Partners 2012).

References included in PDF

Download

There is currently no system at the DoD Service or Command levels for preserving and disseminating digital data generated by archaeological work on military installations. Records of archaeological investigations increasingly are created and stored in digital form only. Archaeological curation repositories are not able to act as digital archives. Digital files are vulnerable to corruption, hardware failure, and format obsolescence if not properly maintained, preserved, and migrated. Without suitable management and preservation of digital data, the results of expensive archaeological work may be lost altogether, wasting money and leaving installations unable to factor significant archaeological resources into their activities, developments, and training plans.