This paper outlines the development of an interactive data set of sites for the Pais Caranqui. In workshops on cyberinfrastructure, panelists commonly call for case studies that highlight the value of developing and contributing to cyberinfrastructure. The data set discussed in this paper is intended to meet this need: because its data were obtained under a variety of circumstances, its development required consideration of a wide range of issues. Common problems identified in workshops and publications are discussed, along with the methods used to address them. Issues affecting both new and legacy data sets are examined: the author’s own digitization procedures are discussed alongside the methods developed to incorporate data sets that are no longer actively used. Issues related to the use of CRM-generated data sets are also touched upon. The value of contributing to cyberinfrastructure is then demonstrated through an examination of trade patterns in the Pais Caranqui. The author hopes that the methods outlined will serve as a model for other researchers interested in developing large settlement pattern data sets.

I. Introduction
    A. Paper has two purposes
         a. Detail my own digitization procedures for dissertation data
         b. Outline the initial steps taken in the development of a method to create an interactive regional data set
    B. Goal: develop a method using this small-scale case study that can be applied elsewhere, to both large- and small-scale projects
    C. Valuable example because it deals with issues of preservation for both new and legacy data sets
II. My original research
    A. Study Area
    B. Relevant aspects of methodology
         a. Coordinate system
         b. Transect size
         c. Site definition
    C. Data and Metadata
         a. In-field note-taking procedures
              i. Site Record Form
              ii. Additional Notes/Observations
         b. Digitizing data in lab
              i. GIS procedures
              ii. GPS spreadsheets
    D. Accessibility of data
         a. Moving beyond simply answering requests for data
         b. Online archive
    E. Setting a standard for digitization and availability of future research
III. Developing an interactive regional data set
     A. Brief description of other surveys done in the Pais Caranqui
         a. Tamara Bray’s dissertation
         b. CRM work
         c. Non-profit regional preservation research
     B. Compatibility Issues
         a. Differences in methodology
              i. Transect size
              ii. Site definitions
              iii. Level of coverage
         b. Differences in data
              i. Coordinate system
              ii. Changes due to shifting usage of the landscape
         c. Differences in metadata
              i. Differing amounts (and types)
              ii. Access issues
         d. Drawing the line: how much information should be included?
              i. How much information is useful beyond site size and location?
              ii. Just because I have the information, should it be included?
         e. To standardize or not to standardize?
     C. Access Issues
         a. Gaining access to other researchers’ data
         b. Who should be given access?
     D. Encouraging future additions to the data set
     E. Method of combining data sets
         a. Solutions to (and justifications for) issues outlined in previous sections
         b. Long-term considerations
              i. File types – GIS file types are commercial
              ii. Storage issues
IV. Conclusion
     A. Example of usefulness: study of trade in the region
     B. Model for development of larger settlement pattern data sets
     C. Unresolved issues
     D. Future directions

