An Early Prototype of the Comprehensive Extensible Data Documentation and Access Repository (CED2AR)

Block, William C., Jeremy Williams, John M. Abowd, Lars Vilhuber, and Carl Lagoze. An Early Prototype of the Comprehensive Extensible Data Documentation and Access Repository (CED2AR). Cornell University Preprint 1813:30922, 2012, available at http://hdl.handle.net/1813/30922.
An Early Prototype of the Comprehensive Extensible Data Documentation and Access Repository (CED2AR) Block, William C.; Williams, Jeremy; Abowd, John M.; Vilhuber, Lars; Lagoze, Carl This presentation will demonstrate the latest DDI-related technological developments of Cornell University’s $3 million NSF-Census Research Network (NCRN) award, dedicated to improving the documentation, discoverability, and accessibility of public and restricted data from the federal statistical system in the United States. The current internal name for our DDI-based system is the Comprehensive Extensible Data Documentation and Access Repository (CED²AR). CED²AR ingests metadata from heterogeneous sources and supports filtered synchronization between restricted and public metadata holdings. Currently-supported CED²AR “connector workflows” include mechanisms to ingest IPUMS, zero-observation files from the American Community Survey (DDI 2.1), and SIPP Synthetic Beta (DDI 1.2). These disparate metadata sources are all transformed into a DDI 2.5 compliant form and stored in a single repository. In addition, we will demonstrate an extension to DDI 2.5 that allows for the labeling of elements within the schema to indicate confidentiality. This metadata can then be filtered, allowing the creation of derived public use metadata from an original confidential source. This repository is currently searchable online through a prototype application demonstrating the ability to search across previously heterogeneous metadata sources. Presentation at the 4th Annual European DDI User Conference (EDDI12), Norwegian Social Science Data Services, Bergen, Norway, 3 December, 2012