
WSS Seminar on Administrative Records for Best Possible Estimates


Time  Speaker                   Point of Contact
1:00  Mike Fleming, WSS         charles.fleming@bhox.com
1:10  Connie Citro, CNSTAT      CCitro@nas.edu
1:35  Phil Kott, RTI            pkott@rti.org
2:00  Graton Gathright, Census  graton.m.gathright@census.gov
2:25  Intermission
2:45  Shelly Martinez, OMB      rmartinez@omb.eop.gov
3:10  Shawn Bucholtz, HUD       shawn.j.bucholtz@hud.gov
3:35  Amy O'Hara, Census        amy.b.ohara@census.gov
4:00  Questions and Answers


From Multiple Modes for Surveys to Multiple Data Sources for Estimates: The Role of Administrative Records in Federal Statistics

Users, funders, and providers of federal statistics want estimates that are wider, deeper, quicker, better, cheaper (channeling Tim Holt, former head of the UK Office for National Statistics), to which I would add more relevant and less burdensome. Each of these adjectives poses challenges and opportunities for those who produce statistics. Since World War II, we have relied heavily on the probability sample survey as the best we could do, and that best being very good, indeed, to meet these goals for estimates in many areas, including household income and unemployment, self-reported health status, time use, crime victimization, business activity, commodity flows, consumer and business expenditures, and so on. Faced with secularly declining unit and item response rates and evidence of reporting error, we have responded in many ways, including the use of multiple survey modes, more sophisticated weighting and imputation methods, adaptive design, cognitive testing of survey items, and other means to maintain data quality. For statistics on the business sector, in order to reduce burden and costs, we long ago moved away from relying solely on surveys to produce needed estimates and have made extensive use of administrative records, but, to date, we have not done that for household surveys. I argue that we can and must move from a paradigm of producing the best estimates possible from a survey to that of producing the best possible estimates to meet user needs from multiple data sources. Such sources include administrative records, as well as, increasingly, transaction and Internet-based data. I provide several examples, including household income and household plumbing facilities, to illustrate my thesis. I conclude by suggesting ways to inculcate a culture of federal statistics that focuses on the end result of relevant, timely, accurate, and cost-effective statistics and treats surveys, along with administrative records and other data sources, as means to that end.

– Constance F. Citro, Director, Committee on National Statistics, The National Academies
Presentation Materials

A Different Paradigm Shift: Combining Administrative Data and Survey Samples for the Intelligent User

In her startling WSS President's Invited Lecture, Connie Citro called for the slow and careful implementation of a paradigm shift in the way government agencies produce federal statistics. She provided a number of reasons for a shift away from a primary reliance on survey sampling; chief among them were unaddressed measurement error, decreasing response rates, and increasing cost, both financial and psychic (e.g., dealing with irate Congressmen complaining about the burden of government surveys on their constituents).

Measurement error often results from attempting to measure quantities on a survey that the sample respondents cannot adequately provide. This is not a new problem, but a growing one as the demands for more policy-relevant data increase. Still, surveys aimed at collecting specific information are often better suited for providing relevant data than external sources like administrative data.

Measuring and removing the potential biases from increasing nonresponse rates is a concern that, it seems to me, has been adequately addressed in the survey sampling literature, although theory is not always quickly translated into practice. That leaves the cost of sample surveys in an era of tight budgets and increasing demands for data products. By itself, the cost of providing defensible estimates at ever lower levels of aggregation provides reason enough to replace or at least modify the current paradigm. This talk will review recent literature on combining simple linear models and probability-sampling principles when combining administrative data with survey samples to produce useful estimates at levels of aggregation where using the latter alone would be inadequate. Multiple imputation fails as a method of variance estimation in this context. Jackknife variance estimation can be used in its place, but a jackknife requires the generation of multiple data sets, many more than the standard five used with multiple imputation.

Since the approach described above depends on a model, statistical tests will be proposed to assess the viability of the model and also to inform users of potential biases in the estimates. There are issues of Type 1 and Type 2 error, which often separate survey sampling from the rest of statistics, that need to be conveyed to the user, as do all the other problems associated with the estimation process. Indeed, that is the paradigm shift I am proposing: government statistical agencies should stop treating most users like they are dumber than dirt and cater more to intelligent users of their statistics.

– Phillip S. Kott, RTI International
Presentation Materials

The Role of Linked Administrative Data in the Evaluation and Improvement of the Survey of Income and Program Participation

The Census Bureau uses Federal and State administrative data to evaluate and improve the quality of data from the Survey of Income and Program Participation (SIPP). This talk will describe how the SIPP program uses administrative data to investigate bias from unit and item nonresponse, to validate survey responses, to improve imputation of missing data, and to augment the SIPP data.

I will discuss the challenges that have been faced in accessing, linking, interpreting, and protecting the administrative data; what has been learned about SIPP data quality and what SIPP improvements have been achieved; and the potential new uses of administrative data for SIPP improvement that are currently being investigated.

– Graton Gathright, Social, Economic, and Housing Statistics Division, U.S. Census Bureau
Presentation Materials

One Piece of the Multiple Data Sources Paradigm Shift: New Policy on Accessing and Using Administrative Data for Statistical Purposes

In her 2013 WSS President's Invited Lecture, Connie Citro called for implementation of a paradigm shift in the way government agencies produce federal statistics. She discussed a number of challenges to making this shift. This talk describes one set of challenges, those of accessing and successfully using government administrative data in statistical programs, and how 2014 guidance issued by OMB is designed to address those challenges. The talk will also discuss some agency implementation activities to date, and some early lessons learned from those efforts.

– Shelly Wilkie Martinez, Statistical and Science Policy, Office of Management and Budget
Presentation Materials

Integrating Administrative Records and Commercial Data Sources into HUD's Housing Surveys: Past, Present, and Future

HUD, in partnership with the Census Bureau, is redesigning the American Housing Survey and the Rental Housing Finance Survey for 2015. A major part of the redesign effort is to maximize the use of administrative records and commercial data. This talk will: (1) summarize how HUD and Census have used HUD administrative records in prior American Housing Surveys; (2) summarize our current research into administrative records and commercial data for sample frame improvement, imputation, response replacement, and question replacement; and (3) discuss challenges and unresolved issues in integrating administrative records and commercial data into our housing surveys.

– Shawn Bucholtz, Director, Housing and Demographic Analysis, Office of Policy Development and Research, U.S. Department of Housing and Urban Development
Presentation Materials

Fully Leverage External Data Sources: A Census Bureau Change Principle

The Census Bureau is investigating the strategic reuse of Administrative Records and Third Party Data to improve data quality, reduce costs and respondent burden, and develop new data products. This talk will describe how administrative records are featured in the Census Bureau's Strategic Plan, addressing Connie Citro's call for vision and long-term planning. Efforts to integrate administrative records into frame construction, contact approaches, adaptive design, and imputation are underway for the decennial census and multiple household surveys. This talk discusses progress on these uses, and highlights the need to understand data quality, design data systems to accommodate "big" data, and collaborate with program and statistical agencies.

– Amy O'Hara, Center for Administrative Records Research and Applications, U.S. Census Bureau
Presentation Materials