WSS Seminar on Administrative Records for Best Possible Estimates
- Date/Time: September 18, 2014, 1:00 pm - 4:30 pm
- Location: Bureau of Labor Statistics Conference Center
- WebEx event address for attendees: https://dol.webex.com/dol/j.php?MTID=m8fd58b3edba64f3bfaf63eeb8e63edfe
- For audio:
Call-in toll-free number (Verizon): 1-866-747-9048 (US)
Call-in number (Verizon): 1-517-233-2139 (US)
Attendee access code: 938 454 2
- Note: Particular computer configurations might not be compatible with WebEx.
- Presentation materials:
Graton Gathright, Census
Phil Kott, RTI
Shelly Martinez, OMB
Shawn Bucholtz, HUD
Connie Citro, CNSTAT
Amy O'Hara, Census
Schedule
Time | Speaker | Point of Contact |
1:00 | Mike Fleming, WSS | charles.fleming@bhox.com |
1:10 | Connie Citro, CNSTAT | CCitro@nas.edu |
1:35 | Phil Kott, RTI | pkott@rti.org |
2:00 | Graton Gathright, Census | graton.m.gathright@census.gov |
2:25 | Intermission | |
2:45 | Shelly Martinez, OMB | rmartinez@omb.eop.gov |
3:10 | Shawn Bucholtz, HUD | shawn.j.bucholtz@hud.gov |
3:35 | Amy O'Hara, Census | amy.b.ohara@census.gov |
4:00 | Questions and Answers | |
Abstracts
From Multiple Modes for Surveys to Multiple Data Sources for Estimates: The Role of Administrative Records in Federal Statistics
Users, funders, and providers of federal statistics want estimates that are wider, deeper, quicker, better, cheaper (channeling Tim Holt, former head of the UK Office for National Statistics), to which I would add more relevant and less burdensome. Each of these adjectives poses challenges and opportunities for those who produce statistics. Since World War II, we have relied heavily on the probability sample survey as the best we could do (and that best has been very good indeed) to meet these goals for estimates in many areas, including household income and unemployment, self-reported health status, time use, crime victimization, business activity, commodity flows, consumer and business expenditures, and so on. Faced with secularly declining unit and item response rates and evidence of reporting error, we have responded in many ways, including the use of multiple survey modes, more sophisticated weighting and imputation methods, adaptive design, cognitive testing of survey items, and other means to maintain data quality. For statistics on the business sector, in order to reduce burden and costs, we long ago moved away from relying solely on surveys to produce needed estimates and have made extensive use of administrative records, but, to date, we have not done that for household surveys. I argue that we can and must move from a paradigm of producing the best estimates possible from a survey to that of producing the best possible estimates to meet user needs from multiple data sources. Such sources include administrative records, as well as, increasingly, transaction and Internet-based data. I provide several examples, including household income and household plumbing facilities, to illustrate my thesis. I conclude by suggesting ways to inculcate a culture of federal statistics that focuses on the end result of relevant, timely, accurate, and cost-effective statistics and treats surveys, along with administrative records and other data sources, as means to that end.
– Constance F. Citro, Director, Committee on National Statistics, The National Academies
Presentation Materials
A Different Paradigm Shift: Combining Administrative Data and Survey Samples for the Intelligent User
In her startling WSS President's Invited Lecture, Connie Citro called for the slow and careful implementation of a paradigm shift in the way government agencies produce federal statistics. She provided a number of reasons for a shift away from a primary reliance on survey sampling; chief among them were unaddressed measurement error, decreasing response rates, and increasing cost, both financial and psychic (e.g., dealing with irate Congressmen complaining about the burden of government surveys on their constituents).
Measurement error often results from attempting to measure quantities on a survey that the sample respondents cannot adequately provide. This is not a new problem, but a growing one as the demands for more policy-relevant data increase. Still, surveys aimed at collecting specific information are often better suited for providing relevant data than external sources like administrative data.
Measuring and removing the potential biases from increasing nonresponse rates is a concern that, it seems to me, has been adequately addressed in the survey sampling literature, although theory is not always quickly translated into practice. That leaves the cost of sample surveys in an era of tight budgets and increasing demands for data products. By itself, the cost of providing defensible estimates at ever lower levels of aggregation provides reason enough to replace or at least modify the current paradigm. This talk will review recent literature on using simple linear models together with probability-sampling principles when combining administrative data with survey samples to produce useful estimates at levels of aggregation where using the latter alone would be inadequate. Multiple imputation fails as a method of variance estimation in this context. Jackknife variance estimation can be used in its place, but a jackknife requires the generation of multiple data sets, many more than the standard five used with multiple imputation.
Since the approach described above depends on a model, statistical tests will be proposed to assess the viability of the model and also to inform users of potential biases in the estimates. Issues of Type 1 and Type 2 error, which often separate survey sampling from the rest of statistics, need to be conveyed to the user, as do all the other problems associated with the estimation process. Indeed, that is the paradigm shift I am proposing: government statistical agencies should stop treating most users like they are dumber than dirt and cater more to intelligent users of their statistics.
– Phillip S. Kott, RTI International
Presentation Materials
The Role of Linked Administrative Data in the Evaluation and Improvement of the Survey of Income and Program Participation
The Census Bureau uses Federal and State administrative data to evaluate and improve the quality of data from the Survey of Income and Program Participation (SIPP). This talk will describe how the SIPP program uses administrative data to investigate bias from unit and item nonresponse, to validate survey responses, to improve imputation of missing data, and to augment the SIPP data.
I will discuss the challenges that have been faced in accessing, linking, interpreting, and protecting the administrative data; what has been learned about SIPP data quality and what SIPP improvements have been achieved; and the potential new uses of administrative data for SIPP improvement that are currently being investigated.
– Graton Gathright, Social, Economic, and Housing Statistics Division, U.S. Census Bureau
Presentation Materials
One Piece of the Multiple Data Sources Paradigm Shift: New Policy on Accessing and Using Administrative Data for Statistical Purposes
In her 2013 WSS President's Invited Lecture, Connie Citro called for implementation of a paradigm shift in the way government agencies produce federal statistics. She discussed a number of challenges to making this shift. This talk describes one set of challenges, those of accessing and successfully using government administrative data in statistical programs, and how 2014 guidance issued by OMB is designed to address them. The talk will also discuss some agency implementation activities to date, and some early lessons learned from those efforts.
–Shelly Wilkie Martinez, Statistical and Science Policy, Office of Management and Budget
Presentation Materials
Integrating Administrative Records and Commercial Data Sources into HUD's Housing Surveys: Past, Present, and Future
HUD, in partnership with the Census Bureau, is redesigning the American Housing Survey and the Rental Housing Finance Survey for 2015. A major part of the redesign effort is to maximize the use of administrative records and commercial data. This talk will: (1) summarize how HUD and Census have used HUD administrative records in prior American Housing Surveys; (2) summarize our current research into administrative records and commercial data for sample frame improvement, imputation, response replacement, and question replacement; and (3) discuss challenges and unresolved issues in integrating administrative records and commercial data into our housing surveys.
–Shawn Bucholtz, Director, Housing and Demographic Analysis, Office of Policy Development and Research, U.S. Department of Housing and Urban Development
Presentation Materials
Fully Leverage External Data Sources: A Census Bureau Change Principle
The Census Bureau is investigating the strategic reuse of Administrative Records and Third Party Data to improve data quality, reduce costs and respondent burden, and develop new data products. This talk will describe how administrative records are featured in the Census Bureau's Strategic Plan, addressing Connie Citro's call for vision and long-term planning. Efforts to integrate administrative records into frame construction, contact approaches, adaptive design, and imputation are underway for the decennial census and multiple household surveys. This talk discusses progress on these uses, and highlights the need to understand data quality, design data systems to accommodate "big" data, and collaborate with program and statistical agencies.
–Amy O'Hara, Center for Administrative Records Research and Applications, U.S. Census Bureau
Presentation Materials