64th ISI World Statistics Congress - Ottawa, Canada

Description of uses of DDI at the US Bureau of Labor Statistics

Abstract

DDI-Lifecycle and DDI-CDI (Cross Domain Integration) are being used at the US Bureau of Labor Statistics to describe some surveys and the public-use microdata they produce. The largest effort so far involves the Consumer Expenditure Surveys (CE), and serious planning is underway to use DDI standards to describe a new cohort of the National Longitudinal Survey of Youth (NLSY).

For CE, the main issues have been relating similar metadata across surveys (CE is based on two surveys), over time (incorporating changes to the surveys), and through processing (four distinct processing systems are in place). Historically, descriptions of the data were kept separately along each of those dimensions. The public-use microdata had a 100-page PDF documentation file associated with it. A separate database described the data in each of the processing systems. Many inconsistencies and inefficiencies resulted. The current system vastly improves this situation. Details will be discussed.

NLSY is a typical longitudinal survey in that new waves are conducted periodically for the current cohorts, and new variables are created each time to hold the new data. The current user interface lets those looking for NLSY data find variables by subject, but a search sometimes returns over 100 variables that all mean essentially the same thing. Further, some event data, such as those for employment or education, might be structured more effectively in an event history format. There are other issues as well. Decisions have not yet been made; however, we will show the complexity of the 1997 cohort to illustrate the problems.