Washington Statistical Society Seminar Archive: 1998
BLS STATISTICAL SEMINAR
Title: Reinterviews and Reconciliation using CAPI
- Speaker: Nancy Bates, Survey Statistician, Office of the Director, U.S. Census Bureau
- Date/Time: Monday, January 12, 1998, 12:30 - 2:00 p.m.
- Location: BLS Cognitive Lab
Postal Square Building, Room 2990
2 Massachusetts Avenue, NE
Washington, DC
(Metro Red Line - Union Station). Enter at 1st and Massachusetts Avenue.
(Note: please read BLS Building Visitors in the January, 1998 newsletter)
CASIC technology can be applied to improve the quality of survey data. This is particularly true in reinterview surveys. A computerized instrument makes it easy to "hide" original interview data and reveal them only at the appropriate time. Additionally, the computer helps relieve the interviewer of the time-consuming and error-prone task of comparing answers between the first and second data collection. This presentation describes the Integrated Coverage Measurement (ICM) survey, a CAPI reinterview survey with on-the-spot reconciliation. Using behavior coding of tape-recorded interviews, we evaluated the question wording and flow of the instrument to help uncover design flaws that might affect response and/or reconciliation bias. The evaluation led to several recommendations. First, the instrument should be programmed with quality checks to increase the accuracy of information matched against the original data. Second, the instrument should contain a "freeze" feature prohibiting interviewers from changing their answers once the original data are revealed. Finally, the reconciliation should flow naturally as part of the interview by using an instrument-driven indirect reconciliation approach that allows most discrepancies to be resolved before the interviewer or respondent is aware that they exist.
* NOTE TO NON-BLS EMPLOYEES WHO WANT TO ATTEND THIS LECTURE:
Special security procedures are in effect for non-BLS employees. Non-BLS employees who want to attend this lecture should call Karen Jackson (202-606-7524) in order to be placed on the visitor's list. Please call by noon of Friday, January 9 (please provide the date/title of lecture, your name, and name of the organization where you work). Return to top
Topic: Estimating Small Sample Bias in Two Price Index Formulas
- Speakers: Robert McClelland, Bureau of Labor Statistics; Marshall Reinsdorf, FDIC
- Discussant: Janice Lent, Bureau of Labor Statistics
- Chair: Linda Atkinson, Economic Research Service
- Date/Time: Tuesday, January 13, 1998, 12:30 - 2:00 p.m.
- Location: BLS Cognitive Lab
Postal Square Building, Room 2990
2 Massachusetts Avenue, NE
Washington, DC
(Metro Red Line - Union Station). Enter at Massachusetts Avenue and North Capitol Street. Call Linda Atkinson (202-694-5046) by January 9 to be placed on the visitors list.
(Note: please read BLS Building Visitors in the January, 1998 newsletter)
- Sponsor: Economics Section
BLS is able to estimate expenditures during the base period on individual items in particular stores, but a lack of information on base period prices prevents it from using these expenditures to find the quantity weights called for by the CPI's Laspeyres index formula. Using the later "link month" price both as the starting point for measuring price change and as a proxy for the unknown base period price leads to a positive covariance between errors in weighting and price changes. This effect has become known as "formula bias." Avoiding formula bias when estimating the price index for a particular item stratum in a particular area requires the use of a non-linear estimator whose expected value rises as the sample size falls. To get empirical evidence on formula bias and small sample bias, we draw samples from a simulated "population" formed by pooling CPI sample data from many areas. For long run indexes, the expected value of the geometric mean index is much more sensitive to sample size than the expected value of the "seasoned" index formula that BLS has adopted. Return to top
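As a rough illustration of the formula-bias mechanism described in the abstract above, the sketch below (Python with NumPy; the equal expenditures, flat true prices, and noise level are our own assumptions, not the pooled-CPI simulation used by the authors) holds every true price fixed, adds sampling noise to the observed link-month prices, and computes a modified-Laspeyres relative whose implied quantity weights use those same noisy link prices, together with a geometric mean relative for comparison. Because the noisy link price enters both the weight and the base of the price relative, the Laspeyres-type estimator shows a spurious price increase even though nothing changed.

    import numpy as np

    rng = np.random.default_rng(0)
    n_items, n_reps = 5, 20000               # small within-stratum sample, many replications

    lasp, geo = [], []
    for _ in range(n_reps):
        p_link = np.exp(rng.normal(0.0, 0.10, n_items))  # noisy observed link-month prices
        p_t = np.ones(n_items)                           # true prices unchanged since link month
        expend = np.ones(n_items)                        # equal base-period expenditures
        q_hat = expend / p_link                          # implied quantity weights
        lasp.append((q_hat * p_t).sum() / (q_hat * p_link).sum())
        geo.append(np.prod((p_t / p_link) ** (expend / expend.sum())))

    print(f"mean modified-Laspeyres relative: {np.mean(lasp):.4f}")  # above 1.0 despite flat prices
    print(f"mean geometric-mean relative:     {np.mean(geo):.4f}")   # closer to 1.0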
Topic: An Application of Multiple-List-Frame Sampling for Multi-Purpose Surveys
- Discussant: Pedro Saavedra, MACRO International
- Chair: Karol Krotki, Education Statistics Services Institute (ESSI)
- Date/Time: Wednesday, January 14, 1998, 12:30 - 2:00 p.m.
- Location: BLS Cognitive Lab
Postal Square Building, Room 2990
2 Massachusetts Avenue, NE
Washington, DC
(Metro Red Line - Union Station). Enter at 1st and Massachusetts Avenue. Call Karen Jackson (202-606-7524) at least 2 days before talk to be placed on the visitor list.
(Note: please read BLS Building Visitors in the January, 1998 newsletter)
- Sponsor: Methodology Section
The National Agricultural Statistics Service (NASS) conducts a quarterly survey to estimate various crop acreages, production, and stocks. This multi-purpose survey estimates aggregates for different sets of crops each quarter. NASS is experimenting with a new estimation strategy that begins with a set of overlapping list frames. Each frame is associated with a specific crop or type of crop and consists of farms having had one or more of the associated crops in the past. A Poisson sample is drawn from each frame using the same assignments of Permanent Random Numbers (PRNs) in every frame. The PRN technique increases the likelihood that the same farms will be sampled from each frame, while Poisson sampling makes it possible to target samples from different frames in different quarters as needed. The use of calibrated weights rather than simple expansion weights removes the statistical inefficiency often associated with Poisson sampling. NASS conducted a parallel test of this new estimation strategy last June. In most cases, it produced estimates with smaller variances than the "priority stratification" strategy currently in place. Return to top
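A minimal sketch of the PRN/Poisson selection mechanism described above (Python with NumPy; the frame size and inclusion probabilities are illustrative assumptions, not NASS values): each farm keeps a single permanent random number, and a frame's Poisson sample consists of the farms whose PRN falls below their inclusion probability for that frame, so the samples drawn from overlapping frames share as many farms as possible.

    import numpy as np

    rng = np.random.default_rng(42)
    n_farms = 1000
    prn = rng.uniform(size=n_farms)              # Permanent Random Number, fixed once per farm

    # hypothetical inclusion probabilities for two overlapping crop frames
    pi_corn = rng.uniform(0.01, 0.20, n_farms)
    pi_wheat = rng.uniform(0.01, 0.20, n_farms)

    # Poisson sampling: farm i enters a frame's sample when its PRN falls below pi_i.
    # Reusing the same PRN in every frame means farms with small PRNs tend to be selected
    # in every frame in which they appear, which maximizes the overlap between samples.
    in_corn = prn < pi_corn
    in_wheat = prn < pi_wheat

    print(f"corn sample: {in_corn.sum()}, wheat sample: {in_wheat.sum()}, "
          f"in both: {(in_corn & in_wheat).sum()}")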
Topic: Generalized Variance Functions - An Establishment Survey Application
- Authors: Chris Moriarity, NCHS, Sarah Gousen, NCHS, David Chapman, FDIC
- Speaker: Chris Moriarity, NCHS
- Discussant: Richard Valliant, BLS
- Chair: Brenda Cox, Mathematica
- Date/Time: Thursday, January 22, 1998, 12:30 - 2:00 p.m.
- Location: BLS Cognitive Lab, Postal Square Building, Room 2990
2 Massachusetts Avenue, NE
Washington, DC
(Red Line--Union Station). Enter at 1st and Massachusetts Avenue. Call Karen Jackson (202-606-7524) or send e-mail to jackson_k@bls.gov at least 2 days before talk to be placed on the visitor list.
(Note: please read BLS Building Visitors in the January, 1998 newsletter)
- Sponsor: Methodology Section
The National Center for Health Statistics (NCHS) conducted the National Employer Health Insurance Survey (NEHIS) in 1994. The 1994 NEHIS is an establishment survey that collected data on the health insurance provided by employers to their employees, including the number and types of plans offered (if any), the number of employees eligible and enrolled in a health plan, and specific plan characteristics (e.g., premiums and covered services).
The first NCHS publication that contains NEHIS estimates has been prepared. Generalized variance functions (GVFs) are included in the publication so that data users can obtain rough approximations of the variances of survey estimates presented in the report.
Our presentation will describe our research on various alternatives for providing GVFs for the NEHIS publication. Our research began with an investigation of the applicability of the GVF model typically used for demographic survey totals,
relvar(x) = a + b/x.
When attempts to fit this model to NEHIS estimates of totals were unsuccessful, we explored other models, such as those provided in Chapter 5 of Kirk Wolter's 1985 text, Introduction to Variance Estimation. We eventually chose two basic models, one for approximating variances for totals, and the other for approximating variances for proportions. In addition to describing these models and the development process, we discuss the criteria we used to select these models over competing ones. Return to top
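For readers unfamiliar with how such a GVF is produced, the sketch below (Python with NumPy; the data are synthetic stand-ins, not NEHIS estimates, and the fitting method is ordinary least squares on the regressors 1 and 1/x rather than whatever weighting the authors used) fits the model relvar(x) = a + b/x to a set of direct relative-variance estimates and then shows how a data user would apply the fitted curve to approximate the variance of a published total.

    import numpy as np

    rng = np.random.default_rng(1)
    # hypothetical direct estimates of totals and their estimated relative variances
    totals = rng.uniform(1e3, 1e6, 50)
    a_true, b_true = 0.0002, 40.0
    relvar = a_true + b_true / totals + rng.normal(0, 5e-5, 50)

    # fit relvar(x) = a + b/x by regressing the direct relvariances on (1, 1/x)
    X = np.column_stack([np.ones_like(totals), 1.0 / totals])
    (a_hat, b_hat), *_ = np.linalg.lstsq(X, relvar, rcond=None)
    print(f"fitted GVF: relvar(x) = {a_hat:.6f} + {b_hat:.1f}/x")

    # a data user approximates the variance of a published total x_pub from the GVF
    x_pub = 2.5e5
    approx_se = np.sqrt((a_hat + b_hat / x_pub) * x_pub**2)
    print(f"approximate standard error at x = {x_pub:.0f}: {approx_se:.0f}")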
Title: Matlab and Octave: Matrix Manipulation Languages and Their Application
- Speaker: Przemek Klosowski, National Institute of Standards and Technology
- Chair: Mike Fleming, National Agricultural Statistics Service
- Date/Time: Wednesday, January 28, 1998, 12:30 - 2:00 p.m.
- Location: BLS Cognitive Lab, Postal Square Building, Room 2990
2 Massachusetts Avenue, NE
Washington, DC
(Red Line--Union Station). Enter at 1st and Massachusetts Avenue. Call Karen Jackson (202-606-7524) or send e-mail to jackson_k@bls.gov at least 2 days before talk to be placed on the visitor list.
(Note: please read BLS Building Visitors in the January, 1998 newsletter)
Matlab is a language for linear algebra and matrix computations, written by the numerical mathematician Cleve Moler. It is commercially available on most platforms (Unix/Windows/VMS) and quite popular. Octave is a freely available (Unix and Windows) Matlab clone.
Their primary strength is their concept of a numerical array as a first-class data object; all linear algebraic operations are defined on those objects. For instance, one can write expressions like (A'*A)^(-1) * A', which involves two matrix multiplications and computation of an inverse matrix. This abstraction turns out to be quite convenient for describing numerical algorithms from almost every branch of mathematics. Because of this high-level abstraction, Matlab/Octave are quite efficient even though they are interpreted rather than compiled: the actual computations are performed on binary data at machine-language speeds. The Matlab matrix language is well-designed, elegant, and quite popular. In addition to built-in functionality, there are several user-contributed algorithm libraries, including many routines for statistics.
See the following references for further details:
- http://www.mathworks.com/
- http://www.che.wisc.edu/octave/
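For readers who work in Python rather than Matlab or Octave, the quoted expression has a direct NumPy analogue. The sketch below (our own illustrative data) transcribes (A'*A)^(-1) * A' applied to a response vector, which is the textbook least-squares estimator, and compares it with NumPy's least-squares solver, the numerically preferable route.

    import numpy as np

    rng = np.random.default_rng(0)
    A = rng.normal(size=(100, 3))                                 # design matrix
    y = A @ np.array([1.0, -2.0, 0.5]) + rng.normal(0, 0.1, 100)  # synthetic response

    # literal transcription of the Matlab/Octave expression (A'*A)^(-1) * A'
    beta_direct = np.linalg.inv(A.T @ A) @ A.T @ y

    # in practice a solver is preferred for numerical stability
    beta_lstsq, *_ = np.linalg.lstsq(A, y, rcond=None)

    print(beta_direct)
    print(beta_lstsq)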
Topic: Application of GEE Procedures for Sample Size Calculations in Repeated Measures Experiments
- Speaker: James Rochon, The Biostatistics Center, George Washington University, Rockville, MD 20852
- Date/Time: Wednesday, February 4, 1998, 11:00 a.m. - 12:00 p.m.
- Location: Conference Room G, Executive Plaza North (EPN), 6130 Executive Blvd, Rockville, MD
- Sponsor: WSS and DCPC
Derivation of the minimum sample size is an important consideration in an applied research effort. When the outcome is measured at a single time point, sample size procedures are well known and widely applied. The corresponding situation for longitudinal designs, however, is less well developed. In this paper, we adapt the generalized estimating equation (GEE) approach of Liang and Zeger (1986) to sample size calculations for discrete and continuous outcome variables. The non-central version of the Wald chi-squared test is considered. The damped exponential family of correlation structures described in Muñoz et al. (1992) is used for the "working" correlation matrix among the repeated measures. This model provides a rich family of correlation structures and includes as special cases the exchangeable correlation pattern and the autoregressive model of order 1. Several examples are discussed, and extensions accounting for unequal allocation, staggered entry and loss to follow-up are considered. Return to top
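A small sketch of the damped exponential working correlation family mentioned above (Python with NumPy; the function name and example values are ours): the correlation between measurements at times t_j and t_k is taken to be rho^(|t_j - t_k|^theta), which reduces to the first-order autoregressive pattern at theta = 1 and to the exchangeable pattern at theta = 0.

    import numpy as np

    def damped_exponential_corr(times, rho, theta):
        # working correlation: corr(Y_j, Y_k) = rho ** (|t_j - t_k| ** theta)
        # theta = 1 gives AR(1); theta = 0 gives the exchangeable pattern
        t = np.asarray(times, dtype=float)
        lag = np.abs(t[:, None] - t[None, :])
        R = rho ** (lag ** theta)
        np.fill_diagonal(R, 1.0)
        return R

    print(damped_exponential_corr([0, 1, 2, 3], rho=0.6, theta=1.0))  # AR(1)
    print(damped_exponential_corr([0, 1, 2, 3], rho=0.6, theta=0.0))  # exchangeable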
Topic: Examining Geographic Patterns of U.S. Mortality Rates
- Speakers: Linda W. Pickle and Catherine Cubbin, National Center for Health Statistics, CDC
- Chair: Trena Ezzati-Rice, National Center for Health Statistics, CDC
- Date/Time: Thursday, February 5, 1998; 2:00 - 3:30 p.m.
- Location: National Center for Health Statistics, Presidential Building, 11th Floor Auditorium, Room 1110, 6525 Belcrest Road, Hyattsville, MD (Metro: Green line, Prince Georges Plaza, then approximately 2 blocks).
- Sponsor: Public Health & Biostatistics
The National Center for Health Statistics has produced an Atlas of United States Mortality which includes maps of rates for 18 leading causes of death in the United States for the period 1988-92. Many aspects of statistical mapping have been re-examined through cognitive experimentation to maximize the Atlas' effectiveness in conveying accurate mortality patterns to public health practitioners. Multiple maps are included for each cause of death to answer different questions posed by the reader. These include maps of rates adjusted for age differences, smoothed rates for specific age groups, and statistical significance. Mortality data from the new Atlas are being layered with data representing socio-demographic, environmental, and behavioral risk factors using ArcView to create a geographic information system (GIS) that will allow researchers to explore correlations between the patterns of mortality and those of suspected risk factors. The usefulness of mapping in public health research will be discussed and the new GIS system will be demonstrated using examples from the new Atlas for illustration.
POC: Trena M. Ezzati-Rice (tme1@cdc.gov) Return to top
Topic: An Evaluation of the SIPP's Oversampling Design
- Speakers: Vicki J. Huggins and Karen Ellen King, Demographic Statistical Methods Division, U.S. Bureau of the Census
- Discussant: Ramal Moonesinghe, Westat
- Chair: Brenda Cox, Mathematica
- Date/Time: Wednesday, February 11, 1998; 12:30 - 2:00 p.m.
- Location: BLS Cognitive Lab, Postal Square Building, Room 2990, 2 Massachusetts Avenue, NE, Washington, DC (Metro Red Line - Union Station). Enter at 1st and Massachusetts Avenue. Call Karen Jackson (202-606-7524) at least 2 days before talk to be placed on the visitor list.
- Sponsor: Methodology Section
The 1996 panel of the Survey of Income and Program Participation (SIPP) oversamples near-poverty cases, defined as households with total income less than or equal to 150% of the poverty threshold for the unit. The 1996 sample was selected five years ago using characteristics of households collected in the 1990 Decennial Census of Population and Housing that were highly correlated with poverty. With six years between the Census collection and the 1996 SIPP collection, we expected some changes in the characteristics of the households residing in sample addresses, but hoped the movement would not substantially degrade the oversample. This paper presents results of an evaluation comparing the actual deterioration of the oversample with the expected deterioration. Preliminary results suggest that the oversample remained intact and provided more near-poverty cases than a self-weighting design would have. Return to top
Topic: Reducing Nonresponse in Business Surveys
- Speaker: Young Chun, BLS
- Discussant: Carol House, NASS
- Chair: Karol Krotki, Education Statistics Services Institute (ESSI)
- Date/Time: Wednesday, February 18, 1998; 12:30 - 2:00 p.m.
- Location: BLS Cognitive Lab, Postal Square Building, Room 2990, 2 Massachusetts Avenue, NE, Washington, DC (Metro Red Line - Union Station). Enter at 1st and Massachusetts Avenue. Call Karen Jackson (202-606-7524) at least 2 days before talk to be placed on the visitor list.
- Sponsor: Methodology Section
Efforts to reduce nonresponse errors require an understanding of the process of survey compliance. Useful theories of business survey compliance should consider a set of micro-to-macro factors that influence the extent to which an informant complies.
A large scale randomized experiment (n = 6,000) conducted in 1995 by the Bureau of Labor Statistics showed that the combined use of advance letters and reminder/thank you letters significantly reduced nonresponse in a typical establishment survey.
We also learned that both early additional contacts helped identify nonviable business units and refusals. It is intuitive that informed resource allocation will make a survey more cost effective. To investigate this concept, we examined a term we call the Information Rate (IR). The information (e.g., early identification of ineligibles, refusals, and wrong addresses) is valuable as it helps us allocate resources up-front in a more knowledgeable, effective manner in order to focus on establishments that are still eligible to respond. An evaluation of the index shows that the short-term stimulus provided by the additional contacts has a wider impact on the survey than we originally thought.
Analyses of both the response rate and the IR are discussed at the industry and employment size level, the two most important correlates of compliance. The statistically significant results from this study allowed the Bureau to redesign the data collection process accordingly. Return to top
Topic: Modeling Dietary Measurement Error and Its Impact on the Analysis of Nutritional Epidemiological Studies
- Speaker: Victor Kipnis, Biometry Branch, Division of Cancer Prevention, National Cancer Institute, NIH
- Date/Time: Tuesday, March 3, 1998, 11:00 a.m. - 12:00 p.m.
- Location: Conference Room G, Executive Plaza North (EPN), 6130 Executive Blvd, Rockville, MD
- Sponsor: DCPC
Dietary measurement is subject to substantial error that can have a profound impact on the estimated effect of an exposure variable on disease. Recently suggested methods of correction for measurement error, such as the regression calibration approach, rely on the availability of a validation/calibration data set in which exposure is measured without error. If such a "gold standard" is itself imperfect and contains error, the regression calibration method remains valid if any such error meets the requirements of the "classical" measurement error model, i.e., is independent of the true value and of the error in the primary instrument used in the main study. In nutritional epidemiology, growing evidence indicates that this simple model for the reference instrument, such as a multiple-day food record or 24-hour recall, may break down for at least two reasons: (i) systematic underreporting depending on a person's body mass index (BMI); and (ii) a subject-specific component of bias, so that the error structure is the same as in a one-way random-effects model. Our results demonstrate that while systematic bias due to BMI appears to have little effect, the subject-specific bias may have a potentially very important impact on the overall results. However, this impact is shown via examples to depend on the data set being used. Indeed, some of our validation/calibration data sets suggest that dietary measurement error may be masking a strong effect of fat on breast cancer risk, while for some other data sets this masking is not so clear. Until further understanding of dietary measurement is available, measurement error corrections must be done on a study-specific basis, sensitivity analyses should be conducted, and even then results of epidemiologic studies relating diet to disease risk should be cautiously interpreted. Return to top
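In notation adopted here purely for illustration (not taken from the paper), the contrast is between the classical error model for the reference instrument and a one-way random-effects alternative with a subject-specific bias term:

    classical model:            Q_ij = T_i + e_ij            (e_ij independent of T_i)
    random-effects alternative: Q_ij = T_i + r_i + e_ij       (r_i a subject-specific bias)

Under the classical model, repeated reference measurements on the same person are unbiased for that person's true intake T_i and their errors are independent; under the alternative they all share the person-level bias r_i, and it is this shared component that can distort regression-calibration corrections based on the reference instrument.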
Title: Applications of Dynamic Statistical Graphics within a GIS and a VR Environment
- Speaker: Juergen Symanzik, George Mason University Center for Computational Statistics
- Chair: Mike Fleming, National Agricultural Statistics Service
- Date/Time: Wednesday, March 4, 1998, 12:30 - 2:00 p.m.
- Location: BLS Cognitive Lab, Postal Square Building, Room 2990, 2 Massachusetts Avenue, NE, Washington, DC (Red Line--Union Station). Enter at 1st Street and Massachusetts Avenue. Please call Karen Jackson at 202-606-7524 or send e-mail to jackson_karen@bls.gov to be placed on the visitor's list.
- Sponsor: Statistical Computing Section
This talk describes two current projects in highly interactive dynamic statistical graphics: (1) The ArcView/XGobi/XploRe environment links the Geographic Information System (GIS) ArcView, the dynamic statistical graphics program XGobi, and the statistical computing environment XploRe and allows spatial data analysis within one single software environment. (2) The C2, a highly immersive Virtual Reality (VR) environment, has been adapted for the visualization and manipulation of high-dimensional statistical data in a geographic framework. Return to top
Topic: Using A Time Use Approach to Measure Non-market Work: Problems and Solutions
- Speaker: Linda Stinson, Bureau of Labor Statistics
- Chair: Michael Horrigan, Bureau of Labor Statistics
- Date/Time: Thursday, March 5, 1998, 12:30 - 2:00 p.m.
- Location: Room 2990, Bureau of Labor Statistics, Postal Square Building, 2 Massachusetts Avenue, N.E., Washington, D.C. (Red Line - Union Station). Call Glenda Driggers (606-7391) or Yvonne Jenkins (606-6402) to be placed on the visitors' list.
- Sponsor: Social and Demographic Section
Recently, the Bureau of Labor Statistics began to investigate the feasibility of estimating the amount of unpaid, non-market work in the United States using a time-use approach. Certain characteristics, however, seem to typify much of non-market work. In many cases, it tends to (1) occur frequently, if not daily (e.g., food preparation, making beds), (2) be routine and almost automatic (e.g., putting food away after a meal), and (3) take place almost continuously or at least simultaneously with other activities (e.g., caring for children while cooking, cleaning, or shopping). As a result, these activities are sometimes overlooked by respondents and under-reported even within time-use surveys.
The first step in our project was to conduct a series of cognitive interviews. The results of the cognitive interviews are presented here. Based on this exploratory work, we then developed an "enhanced" interview protocol that specifically asked respondents to report simultaneous activities and included an additional probe question for child care activities.
The next step was a field experiment based on the comparison of two versions of the interview guide. In the control condition, we used a "standard" time-use interview, asking respondents to report the starting/stopping times and locations of their activities yesterday and who was with them. The "standard" approach was contrasted with the results from our "enhanced" interview protocol. This paper will also present the dependent variables by which we judged the two experimental conditions. Return to top
Topic: The Measurement of Poverty
- Speaker: Robert Michael, University of Chicago
- Chair: Michael Horrigan, Bureau of Labor Statistics
- Date/Time: Thursday, March 12, 1998, 12:30-2:00 p.m.
- Location: Room 2990, Bureau of Labor Statistics, Postal Square Building, 2 Massachusetts Avenue, N.E., Washington, D.C. (Red Line - Union Station). Call Glenda Driggers (606-7391) or Yvonne Jenkins (606-6402) to be placed on the visitors' list.
- Sponsor: Social and Demographic Section
The current official measure of poverty in the US has been criticized as inadequate on several fronts. The basic data requirements for measuring the concept of poverty adequately are among the reasons that the official measure has not been replaced or modified over the past few decades. The talk will review some of the data needs that could provide a more adequate measure of poverty today and will encourage a discussion of what existing data might address those needs. Return to top
Topic: Graphical Comparisons of Subpopulations Based on Complex Sample Survey Data
- Speaker: John L. Eltinge, Texas A&M University
- Discussant: Alan Dorfman, BLS
- Chair: Brenda Cox, Mathematica
- Date/Time: Tuesday, March 17, 1998, 12:30 - 2:00 p.m.
- Location: BLS Cognitive Lab, Postal Square Building, Room 2990, 2 Massachusetts Avenue, NE, Washington, DC (Metro Red Line - Union Station). Enter at 1st and Massachusetts Avenue. Call Karen Jackson (202-606-7524) at least 2 days before talk to be placed on the visitor list.
- Sponsor: Methodology Section
Analyses of complex survey data often involve formal tests that compare means, regression coefficients, or similar parameters across subpopulations. However, these tests are often motivated by substantive interest in broader questions regarding the comparability of subpopulation-level distributions. In many cases, we can address these questions through the use of quantile plots, other graphical methods, and related formal tests. Implementation of this idea involves technical issues related to several components of error. Depending on the specific comparisons of interest, the proposed methods can be applied either to the distributions of individual observations, or to the distributions of related regression residuals. These ideas are illustrated with applications to the Third National Health and Nutrition Examination Survey (NHANES III). We close with some comments on the interpretation of results obtained from graphical methods and other exploratory analyses of complex survey data. Return to top
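One simple building block for the comparisons described above is a survey-weighted quantile function; the sketch below (Python with NumPy; the data and weights are synthetic, and it deliberately ignores the complex-design variance issues that the talk addresses) computes weighted quantiles for two subpopulations, which can then be plotted against one another as a quantile-quantile comparison.

    import numpy as np

    def weighted_quantiles(y, w, probs):
        # invert the weighted empirical CDF at the requested probabilities
        order = np.argsort(y)
        y_s, w_s = np.asarray(y, float)[order], np.asarray(w, float)[order]
        cdf = np.cumsum(w_s) / np.sum(w_s)
        return np.interp(probs, cdf, y_s)

    rng = np.random.default_rng(3)
    probs = np.linspace(0.05, 0.95, 19)
    # hypothetical outcome values and survey weights for two subpopulations
    y1, w1 = rng.lognormal(3.0, 0.5, 400), rng.uniform(1, 5, 400)
    y2, w2 = rng.lognormal(3.1, 0.6, 300), rng.uniform(1, 5, 300)

    q1 = weighted_quantiles(y1, w1, probs)
    q2 = weighted_quantiles(y2, w2, probs)
    # plotting q1 against q2 gives a quantile-quantile comparison of the two
    # subpopulation distributions; points near the 45-degree line indicate agreement
    for p, a, b in zip(probs, q1, q2):
        print(f"p={p:.2f}  subpop 1: {a:7.2f}  subpop 2: {b:7.2f}")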
Topic: Information Seeking Behavior on Statistical Websites: Theoretical and Design Implications
- Speakers: Gary Marchionini, University of Maryland; Carol A. Hert, Indiana University
- Discussant: Mick Couper, JPSM
- Chair: Karol Krotki, Education Statistics Services Institute (ESSI)
- Date/Time: Tuesday, March 24, 1998, 12:30 - 2:00 p.m.
- Location: BLS Cognitive Lab, Postal Square Building, Room 2990, 2 Massachusetts Avenue, NE, Washington, DC (Metro Red Line - Union Station). Enter at 1st and Massachusetts Avenue. Call Karen Jackson (202-606-7524) at least 2 days before talk to be placed on the visitor list.
- Sponsor: Methodology Section
The Federal Government is increasingly interested in providing access to data via the World Wide Web. This has resulted in an increased interest in understanding how the public seeks and uses that data in order to enhance the design of various Websites. In 1996, the two presenters were contracted by the Bureau of Labor Statistics to investigate these questions in the context of the BLS website, the CPS website co-sponsored by BLS and the Bureau of the Census, and the FedStats website sponsored by the Interagency Council on Statistical Policy. The project is now in its second year.
This presentation will report preliminary results and describe current activities of the study, with a particular focus on the implications for statistical website design, organizational change, and statistical literacy. The findings of the first year are detailed in the final report of the first stage, available at http://www.glue.umd.edu/~dlrg/blsreport/mainbls.html. Return to top
Topic: Data Mining for Dummies
- Speaker: Daryl Pregibon, AT&T Labs-Research; Chair, Statistical Computing Section of ASA
- Chair: Robert W. Jernigan, American University
- Date/Time: Thursday, April 2, 1998; 12:30 - 2:00 p.m.
- Location: BLS Cognitive Lab, Postal Square Building, Room 2990, 2 Massachusetts Avenue, NE, Washington, DC. (Metro Red Line - Union Station). Enter at 1st and Massachusetts Avenue. Call Karen Jackson (202-606-7524) at least 2 days before talk to be placed on the visitor list.
- Sponsor: Statistical Computing Section
Data mining is the buzzword throughout business and in many scientific communities. Is data mining = statistics + hype? Or is there more to the topic that deserves serious consideration from the statistical community? I will introduce the topic and discuss the types of problems being attacked, the people involved, and what's at stake for the field of statistics. Return to top
Topic: The Use of a Variant of Poisson Sampling to Reduce Sample Size in a Multiple Product Price Survey
- Speakers: Pedro J. Saavedra, Macro International Inc.; Paula Weir, Energy Information Administration
- Discussant: Phil Kott, US Department of Agriculture
- Date/Time: Tuesday, April 7, 1998; 12:30 - 2:00 p.m.
- Location: BLS Cognitive Lab, Postal Square Building, Room 2990, 2 Massachusetts Avenue, NE, Washington, DC. (Metro Red Line - Union Station). Enter at 1st and Massachusetts Avenue. Call Karen Jackson (202-606-7524) at least 2 days before talk to be placed on the visitor list.
- Sponsor: Agriculture & Natural Resources
The EIA survey that produces monthly petroleum product prices has been based on a stratified sample with linked selection and simulated probabilities of selection for the last 12 years. With declining budgets, the focus of the design was forced to shift from fixed Coefficients of Variation (C.V.) with resulting sample allocations and total sample sizes to fixed sample sizes with resulting C.V.s. Using a fixed sample size variant of Poisson Sampling which yields a Probability Proportional to Size (PPS) approximation, and simulating estimates of C.V.s, the possible sampling scenarios could be studied more easily than in the historical design. More significantly, the new design led to a major reduction in sample size for the given historical targeted C.V.s with only minor exceptions. Return to top
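The abstract does not name the particular fixed-sample-size variant of Poisson sampling used; one commonly cited variant is sequential Poisson sampling, sketched below (Python with NumPy; the frame, size measure, and sample size are illustrative assumptions). Units are ranked by their permanent random number divided by their normalized size measure, and the n smallest ranks are taken, which yields a fixed sample size with selection probabilities approximately proportional to size.

    import numpy as np

    rng = np.random.default_rng(7)
    N, n = 500, 60                          # frame size and fixed sample size
    size = rng.lognormal(5.0, 1.0, N)       # hypothetical size measure (e.g., sales volume)
    prn = rng.uniform(size=N)               # permanent random numbers

    # sequential Poisson sampling: rank by PRN / (n * normalized size) and keep the n smallest
    target_prob = n * size / size.sum()     # approximate inclusion probabilities
    rank_key = prn / target_prob
    sample = np.argsort(rank_key)[:n]

    print(f"selected {sample.size} units; largest selected size measure: {size[sample].max():.0f}")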
Topic: What Have We Learned from Constructing Panels of Tax Returns?
- Speaker: John Czajka, Mathematica Policy Research
- Discussant: TBA
- Date/Time: Wednesday, April 15, 1998; 12:30 - 2:00 p.m.
- Location: Room 2990, Bureau of Labor Statistics, 2 Massachusetts Ave. NE. Visitors from outside BLS, please call Karen Jackson at (202) 606-7524 to have your name placed on the guard's list for admittance.
- Sponsor: Economics Section
Earlier this year, Mathematica Policy Research (MPR) delivered two longitudinal databases to the Statistics of Income (SOI) Division of the Internal Revenue Service (IRS). Each of these databases contains records abstracted from the tax returns filed by taxpayers who were selected more than a decade ago from a nationally representative sample of tax returns. The Sales of Capital Assets (SOCA) Panel started with 13,000 returns filed in 1986 while the Family Panel was initiated with a sample of 90,000 returns filed in 1988. Each database includes all returns filed through 1994 by the members of each respective panel and any separately filing spouses. The Family Panel also includes the returns filed by dependents of panel members.
These panels were designed to serve the needs of tax policy analysts in the Treasury Department. While their research with these two files is just beginning, a number of useful findings have emerged from the effort to construct these massive files. This presentation reviews what we have learned about the methodology of panel data design and construction in this particular context and what we have learned about taxpayer behavior. Topics that will be addressed include the quality of the social security numbers (SSNs) provided for filers and their dependents; other problems associated with the use of SSNs to track a sample of persons; changes in the income distribution of the panel and some of the consequences of stratifying on income; declining representativeness of the panel samples over time; results of our efforts to link the returns of separately filing spouses and to link the returns of "parents" and the dependents that they claim. Return to top
Topic: Social Security: Privatization and Progressivity
- Speakers: Jan Walliser and Kent Smetters, Congressional Budget Office
- Chair: Michael Horrigan, Bureau of Labor Statistics
- Date/Time: Thursday, April 16, 1998, 12:30 - 2:00 p.m.
- Location: Room 2990, Bureau of Labor Statistics, Postal Square Building, 2 Massachusetts Avenue, N.E., Washington, D.C. (Red Line - Union Station). Call Glenda Driggers (606-7391) or Yvonne Jenkins (606-6402) to be placed on the visitors' list.
- Sponsor: Social and Demographic Section
This paper uses an enhanced version of the Auerbach-Kotlikoff Dynamic Life Cycle Model to simulate the economic effects of privatizing social security in the United States. The model's enhancements include intragenerational heterogeneity, kinked budget constraints, and a more realistic formulation of income taxation. The privatization of social security is modeled as a policy that a) forces workers to contribute to private accounts, b) gives retirees and workers social security benefits equal to only those they have accrued as of the time of the reform, and c) finances social security benefits during a transition period using either wage taxation, income taxation, consumption taxation, or a combination of income taxation and deficit finance.
In the absence of deficit finance, all methods of financing the transition leave the economy in the same long-run position: one with a 40 percent higher capital stock, a 7 percent larger supply of labor, a 14 percent higher level of per capita income, a 7 percent higher real wage, and a 19 percent lower real interest rate. Although the long-run positions of the economy end up the same under the three financing alternatives, the short-run positions are quite different. Consumption-tax finance produces much more rapid economic gains than does either wage- or income-tax finance. Indeed, income-tax finance actually reduces output per person in the first decade of the transition.
Notwithstanding the elimination of social security's highly progressive benefit schedule, all members of future generations, poor and rich alike, gain substantially from privatization. These long-run gains come at the cost of welfare losses experienced by those generations that are elderly or middle-aged at the time of the reform. Return to top
Topic: The Great American Time Paradox
- Speaker: John Robinson, University of Maryland
- Discussant: Barbara Forsyth, Westat
- Date/Time: Thursday, April 16, 1998; 12:30 - 2:00 p.m.
- Location: Bureau of Labor Statistics, BLS Training and Conference Center, Room 6, Postal Square Building, 2 Massachusetts Avenue, NE, Washington, DC. Please call Karen Jackson at (202) 606-7524 to be placed on the visitors' list. Guards will ask for a photo ID and must see your name on the list before allowing admittance.
- Co-Sponsors: WSS Data Collection Methods Section and the American Association of Public Opinion Research, Washington/Baltimore Chapter
Is it possible that Americans have more free time than they did thirty years ago? Research based on careful records of how we actually spend our time shows that Americans have almost 5 hours more free time per week than in the 1960s. Time-use expert John Robinson explains this surprising trend and how it has come about, drawing from his recent book (co-authored with Geoffrey Godbey), Time for Life: The Surprising Ways Americans Use Their Time. He argues that our sense of "time famine" stems from the increased emphasis on the "consumption" of experiences and from the phenomenon of "time deepening," doing more and doing things more quickly and simultaneously. His unique source of time-use information, the Americans' Use of Time Project, is the only such detailed historical data archive in the U.S. Every ten years the project has been asking thousands of Americans to report their daily activities on an hour-by-hour basis in time diaries. He discusses the numerous time paradoxes facing Americans, such as feeling more rushed and stressed when we actually have more free time, having free time in periods when it is least useful, and investing time in activities that bring us minimal enjoyment or fulfillment (e.g., television watching). Overall, he assures us that we indeed have "time for life."
Barb Forsyth will reflect on Robinson's work in the context of time diaries as a survey method, drawing from her own work as a cognitive psychologist. Return to top
Topic: Access to Federal Statistical Data: Reaching Customers in the Next Millennium
- Speakers: Mike Fortier, Census Bureau; Alan Tupek, National Science Foundation; Forrest Williams, Department of Commerce
- Discussant: Rebecca Sutterlin, American Association of Retired Persons
- Chair: Ed Spar, COPAFS
- Date/Time: Thursday, April 16, 1998, 3:00 - 4:30 p.m.
- Location: Meeting Rooms #1 and #2, Conference Center, Postal Square Building, 2 Massachusetts Avenue, N.E. Washington, D.C. (Red Line - Union Station). Call Ed Spar (703-836-0404) to be placed on the visitors list.
- Sponsor: Statistics and Public Policy Section
How the federal government disseminates its data in the next century may change dramatically from current approaches. To ensure that user needs are met, there are many questions that have to be answered.
For example, agencies will need to know:
- Who are the future customers?
- Will new systems requirements and capabilities match those of the future customers?
- Will anyone be left out?
- Who will pay for the systems of the future?
- What will be the best mix of delivery systems?
- How do we ensure that the data are used properly?
- Where will metadata fit in?
- What's the role (if any) of the private sector?
- Will the emphasis be on data tables or data records?
- How will confidentiality be ensured?
- What's the role of geography in future systems?
Topic: What Does Financial Planning Software Say about Americans' Preparedness for Retirement?
- Speakers: Mark J. Warshawsky and John Ameriks, TIAA-CREF, New York American University
- Discussant: Eric Engen, Federal Reserve Board
- Chair: Arthur Kennickell, Federal Reserve
- Date/Time: Thursday, April 23, 1998; 12:30 - 2:00 p.m.
- Location: Room B-3234, Eccles Building, Federal Reserve, 20th and C Streets, NW. People outside the Federal Reserve Board who are interested in attending, please call Linda Pitts at (202) 736-5617 at least a week before the seminar to have your name included on the guard's list for admission.
- Sponsor: Economics Section
In the new age of "individual responsibility," a question has been asked frequently by policy makers, the media, and the public: "Will current generations of workers be able to retire comfortably?" Some prominent economists have recently answered this question in the negative with reference to models of rational life cycle saving and various sources of data on household saving, net worth, and pension coverage. The answer, however, has also generated controversy and debate.
This paper takes a more prosaic approach to the question. We employ almost all of the features of the Quicken Financial Planner, a comprehensive software package available to the general public which has achieved some commercial success, as our assessment "model." We then pass through the Planner the most comprehensive and representative source of data available on American households' financial and economic status--the Survey of Consumer Finances.
The main objective of the paper is to assess the preparedness of the public for retirement. Because the primary methodology employed in this assessment is the Quicken Financial Planner model, we also comment on the strengths and weaknesses of the Planner, particularly as compared to more traditional economic models. Finally, we suggest questions to be added to surveys of household economic status and attitudes to improve the conduct of future studies. Return to top
Topic: Outliers in Survey Sampling
- Speaker: Hyunshik Lee, Westat
- Discussant: Ron Fecso, National Science Foundation
- Chair: Brenda Cox, Mathematica
- Date/Time: Wednesday, April 29, 1998; 12:30 - 2:00 p.m.
- Location: BLS Cognitive Lab, Postal Square Building, Room 2990, 2 Massachusetts Avenue, NE, Washington, DC (Metro Red Line - Union Station). Use the First St. NE entrance. Call Karen Jackson (202-606-7524) at least 2 days before talk to be placed on the visitor list and bring photo id.
- Sponsor: Methodology Section
The problem of outliers in survey sampling is quite different from, and in general more difficult to handle than, the problem encountered in other statistical disciplines. This is owing to two distinct features of design-based survey sampling: the finite nature of survey populations and the lack of parametric model assumptions about the population distribution. Because of this difficulty, we try to prevent the occurrence of outliers by using an efficient sample design such as size stratification. Complete prevention, however, is not feasible, and it is necessary to provide sound methodology for handling outliers in sample surveys. This presentation reviews the unique nature of the outlier problem in survey sampling and existing methods for handling it. Recent developments in this area are also reviewed. Return to top
POSTPONED UNTIL A LATER DATE
Topic: Longitudinal surveys: Why are they different from all other surveys?
- Speaker: David Binder, Statistics Canada
- Discussant: Graham Kalton, Westat
- Chair: Karol Krotki, Education Statistics Services Institute
- Date/Time: TBA
- Location: BLS Cognitive Lab, Postal Square Building, Room 2990, 2 Massachusetts Avenue, NE, Washington, DC (Metro Red Line - Union Station). Use the First St. NE entrance. Call Karen Jackson (202-606-7524) at least 2 days before talk to be placed on the visitor list and bring photo id.
- Sponsor: Methodology Section
We review the current status of various aspects of the design and analysis of studies where the same units are investigated at several points in time. These studies include longitudinal surveys, and longitudinal analyses of retrospective studies and of administrative or census data. The major focus is the special problems posed by the longitudinal nature of the study. We discuss four of the major components of longitudinal studies in general; namely, Design, Implementation, Evaluation and Analysis. Each of these components requires special considerations when planning a longitudinal study. Some issues relating to the longitudinal nature of the studies are: concepts and definitions, frames, sampling, data collection, nonresponse treatment, imputation, estimation, data validation, data analysis and dissemination. Assuming familiarity with the basic requirements for conducting a cross-sectional survey, we highlight the issues and problems that become apparent for many longitudinal studies. Return to top
Topic: Genocide and Population Statistics: The case of the Holocaust and the Nuremberg Trials
- Speaker: William Seltzer, Senior Research Fellow, Institute for Social Research, Department of Sociology and Anthropology, Fordham University
- Discussants: Sybil Milton, Vice President, Independent Experts Commission and Independent Historian and Jean-Claude Milleron, International Monetary Fund
- Chair: Tom Jabine, Statistical Consultant
- Date/Time: Tuesday, May 12, 1998, 12:30 to 2:00 p.m.
- Location: BLS Cognitive Lab, Postal Square Building, Room 2990, 2 Massachusetts Avenue, NE, Washington, DC. (Metro Red Line - Union Station). Enter at 1st and Massachusetts Avenue. Call Karen Jackson (202-606-7524) at least 2 days before talk to be placed on the visitor list.
- Sponsor: Social and Demographic Statistics Section
Drawing on primary and other sources, the first part of this presentation discusses how population statistics were used by the Nazis in planning and implementing the Holocaust and how the data systems that gathered these statistics and other information were also employed to assist in carrying out the Holocaust. The review covers experience in Germany, Poland, France, the Netherlands, and Norway. Attention is also given to the role played in this work by some of those then professionally active in demography and statistics. The next section examines the use and impact of perpetrator-generated Holocaust mortality data and other estimates of Jewish losses at the Nuremberg Trials. The last sections discuss several present-day implications of the historical experience reviewed earlier. These issues include: (a) lessons, if any, for formulating prudent national statistical policies; (b) approaches to investigating future genocides and prosecuting those believed to be responsible; (c) the need for increased attention by statisticians and demographers to the ethical dimensions of their work; and (d) the importance of further research in this area. Return to top
Topic 1: The Census Bureau's Business Register: Basic Features and Quality Issues
Topic 2: Quality of the BLS Business Establishment List as a Sampling Frame
- Speaker 1: Ed Walker, Census Bureau
- Speakers 2: Michael Searson and Tracy Farmer, Bureau of Labor Statistics
- Chair: Brenda G. Cox, Mathematica Policy Research
- Date/Time: Wednesday, May 13, 1998; 12:30 - 2:00 p.m.
- Location: BLS Cognitive Lab, Postal Square Building, Room 2990, 2 Massachusetts Avenue, NE, Washington, DC. (Metro Red Line - Union Station). Enter at 1st and Massachusetts Avenue. Call Karen Jackson (202-606-7524) at least 2 days before talk to be placed on the visitor list.
- Sponsor: Methodology Section
Abstract 1:
This paper describes the Census Bureau's business register, also known as the Standard Statistical Establishment List (SSEL). An initial section will present an overview covering basic topics such as purpose, scope, statistical units, information content, data sources, and maintenance procedures; further, it will highlight selected characteristics of the United States business population, as the register represents it, to illustrate key aspects of SSEL composition and structure. A second section will identify and briefly discuss business register quality issues, including coverage, accuracy of classifications and other critical content, timeliness, relevance, accessibility, and cost.
Abstract 2:
The Bureau of Labor Statistics uses the administrative records of the Unemployment Insurance system of the State Employment Security Agencies as the primary component of its Business Establishment List. This paper will discuss the strengths and weaknesses of this approach and the BLS efforts to enhance and improve the quality of these data for use as a sampling frame. Topics that will be addressed are the use of standardized program software in the states; quality assurance procedures in the industrial coding of establishments; supplemental data collection efforts for multiple-establishment firms within a state; and establishing and maintaining a uniform set of standards for the states. Return to top
Topic: The National Compensation Survey
- Speakers: Stephen Cohen, Jason Tehonica, Lawrence Ernst, Bureau of Labor Statistics
- Discussant: Easley Hoy, Census Bureau
- Chair: Karol Krotki, Education Statistics Services Institute
- Date/Time: Wednesday, May 20, 1998; 12:30 - 2:00 p.m.
- Location: BLS Cognitive Lab, Postal Square Building, Room 2990, 2 Massachusetts Avenue, NE, Washington, DC (Metro Red Line - Union Station). Use the First St. NE entrance. Call Karen Jackson (202-606-7524) at least 2 days before talk to be placed on the visitor list and bring photo id.
- Sponsor: Methodology Section
The National Compensation Survey (NCS) is the new BLS survey for measuring employee wages by skill level. The NCS will replace the Occupational Compensation Survey Program (OCSP) and integrate it with the Employment Cost Index (ECI), which measures quarterly change in employer compensation costs, and the Employee Benefits Survey (EBS), which measures participation rates and gives details of employer provided benefits. This seminar discusses the integration of the surveys and design changes, such as selecting a pps sample of occupations in each establishment and matching skill level to generic definitions rather than surveying a fixed list of benchmark occupations as was done in the OCSP. When fully integrated, ECI will also be able to measure quarterly change in employer costs by occupational skill level in addition to industrial and occupational detail and EBS will link participation and provisions data with employer cost.
Also addressed is the sample design and estimation process for the NCS. This covers the following three stages of sample selection and the weights associated with each of these stages: the area based PSUs, the establishments, and the occupations. The weighting discussion includes a new weight at the occupation level which will produce estimates reflecting the current employment, instead of a weight, like the weight currently used in the ECI, that reflects employment at the time of initiation. Nonresponse adjustment for establishment and occupation nonresponse will be addressed. Also included are data collection issues at the establishment and occupation levels that require weight adjustments, as for example when data is collected from a unit that differs from the assigned establishment. Return to top
Topic: Proposed Change in the Population Standard for Age Adjusting Death Rates and Plan to Implement ICD-10 for Mortality
- Speakers: Harry M. Rosenberg, Ph.D., Chief of the Mortality Statistics Branch, Division of Vital Statistics, National Center for Health Statistics, NCHS, and Robert N. Anderson, Ph.D., Statistician, Mortality Statistics Branch, Division of Vital Statistics, NCHS
- Chair: Joe Fred Gonzalez, Jr., NCHS, CDC
- Date/Time: Wednesday, May 27, 1998, 2:30 - 4:00 p.m.
- Location: National Center for Health Statistics, Presidential Building, 11th Floor Auditorium, Room 1110, 6525 Belcrest Rd, Hyattsville, MD (Metro: Green Line, Prince George's Plaza, then approximately 2 blocks)
- Sponsor: Public Health & Biostatistics
- POC: Joe Fred Gonzalez, Jr. (jfg2@cdc.gov)
A proposal is being made to change the standard population used to age-adjust death rates from the 1940 population, widely used by CDC, the states, and other agencies, to the year 2000 population. The change would be made for data on deaths occurring in 1999. Also being proposed is the uniform adoption of this standard within the Department of Health and Human Services for routine presentation of mortality statistics. Age-adjusted death rates are one of the key measures used in mortality statistics to take into account the changing age distribution of the population and thereby to make meaningful comparisons of mortality risk over time and among groups. In this seminar we discuss the process by which these proposals were developed, the rationale for the proposed change, and issues associated with implementing the change. In addition to the proposed change in the population standard effective with 1999 mortality data, the United States will be implementing ICD-10 for mortality, also effective with the 1999 data year. We briefly describe the ICD-10 implementation process that will occur for mortality within the Department and the states. Return to top
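For readers unfamiliar with the mechanics, direct age adjustment simply weights each age-specific rate by the standard population's share in that age group and sums; the sketch below (Python with NumPy; the rates and population shares are made-up illustrative numbers, not the actual 1940 or 2000 standards) shows why moving to a standard population with an older age distribution raises the adjusted rate for causes of death concentrated at older ages.

    import numpy as np

    # hypothetical age-specific death rates (per 100,000) for five broad age groups
    rates = np.array([30.0, 80.0, 250.0, 900.0, 4500.0])
    # illustrative standard-population shares only; not the official 1940 or 2000 standards
    std_a = np.array([0.30, 0.28, 0.22, 0.14, 0.06])   # younger age distribution
    std_b = np.array([0.21, 0.26, 0.25, 0.18, 0.10])   # older age distribution

    # direct age adjustment: weighted sum of age-specific rates using the standard's shares
    adj_a = np.sum(std_a * rates)
    adj_b = np.sum(std_b * rates)
    print(f"age-adjusted rate, younger standard: {adj_a:.1f} per 100,000")
    print(f"age-adjusted rate, older standard:   {adj_b:.1f} per 100,000")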
Topic: Collecting Information on Disabilities in the 2000 Census: An example of Interagency Cooperation
- Speakers: Panel discussion by Terry DeMaio, Census Bureau; Louisa Miller, Census Bureau; Scott Brown, Department of Education; Bob Clark, Department of Health and Human Services; and Michele Adler, Social Security
- Chair: Nancy Kirkendall, OMB
- Date/Time: Thursday, May 28, 1998, 12:30 to 2:00 p.m.
- Location: BLS Cognitive Lab, Postal Square Building, Room 2990, 2 Massachusetts Avenue, NE, Washington, DC (Metro Red Line - Union Station). Enter at 1st and Massachusetts Avenue. Call Karen Jackson (202-606-7524) at least 2 days before talk to be placed on the visitor list.
- Sponsor: Social and Demographic Statistics Section
Various legislative initiatives affecting a broad range of federal agencies require data on persons with disabilities which can only be collected in a large survey such as the decennial census. The data required to implement the legislation and regulatory requirements are not uniform and, in fact, in some cases they may represent conflicting requirements. The questions originally planned for measuring disability in the 2000 census did not meet those needs. This is a panel discussion to review the process that agency representatives went through under the leadership of OMB to study and revise the disability questions for the year 2000 census. The process was unusual for government agencies -- responsive and cooperative in the face of competing needs. The panel members will talk about the needs, the research that was done, the cooperative effort, and the final resolution. Return to top
Topic: Evaluating the age to begin periodic breast cancer screening using data from a few regularly scheduled screens
- Speaker: Stuart Baker, National Cancer Institute, NIH
- Date/Time: Wednesday, June 3, 1998, 11:00 a.m.
- Location: Conference Room G, Executive Plaza North (EPN), 6130 Executive Blvd, Rockville
- Sponsor: DCPC
To evaluate various ages to begin periodic breast cancer screening, we propose a method of analysis which can be applied to either a nonrandomized or a randomized study involving only a few screens at regular intervals. For the analysis of data from a nonrandomized study, we assume that (i) once breast cancer can be detected on screening and confirmed by biopsy, it will stay that way; (ii) given age, the probability of breast cancer detection does not depend on year of birth; and (iii) subjects who refuse screening have the same rates of breast cancer mortality following diagnosis as screened subjects would have had had they not received screening. The key idea is that older screened subjects are controls for younger screened subjects. For the analysis of data from a randomized study, we relax assumption (iii). Based on the HIP randomized trial and assumptions (i) and (ii), we estimate that starting periodic breast cancer screening with mammography and physical examination at age 40 instead of age 50 reduces breast cancer mortality by 14 per 10,000, with a 95% confidence interval of (-3/10,000, 33/10,000). This must be weighed against an estimated increase of 580 per 10,000 in the number of biopsies that do not detect cancer, with a 95% confidence interval of (520/10,000, 650/10,000). Return to top
Topic: Building a statistical metadata repository at the U.S. Bureau of the Census
- Speakers: Daniel Gillman, Martin Appel and Samuel Highsmith, Jr., U.S. Bureau of the Census
- Discussant: Sameena Salvucci, Synectics
- Chair: Karol Krotki, Education Statistics Services Institute
- Date/Time: Wednesday, June 17, 1998, 12:30 - 2:00 p.m.
- Location: BLS Cognitive Lab, Postal Square Building, Room 2990, 2 Massachusetts Avenue, NE, Washington, DC (Metro Red Line - Union Station). Use the First St. NE entrance. Call Karen Jackson (202-606-7524) at least 2 days before talk to be placed on the visitors' list and bring photo id.
- Sponsor: Methodology Section
This paper describes the results of continuing research at the U.S. Bureau of the Census (BOC) into the content, design, population, query, maintenance, and implementation of a statistical metadata repository. The goals of the research are many, but the ultimate goal is to establish the need for, and the feasibility of, agency-wide statistical metadata management at the BOC. In support of this goal, a multi-dimensional effort has been launched. The major parts of this effort include the development of detailed models for describing the content and organization of a statistical metadata repository; building an agency standard for statistical metadata; development of tools for the collection, registration, and query of metadata; and the integration of a repository into other statistical information systems. This paper will describe the models which were developed, the BOC statistical metadata standard, the tools which are under development, the efforts to link the repository with other statistical information systems, and the efforts to build prototypes. Return to top
Topic: Poverty Measurement Research: Defining Thresholds and Family Resources
- Speakers: Thesia Garner and Kathleen Short
- Chair: Michael Horrigan, Bureau of Labor Statistics
- Date/Time: Tuesday, June 23, 1998; 12:30 - 2:00 p.m.
- Location: Room 2990, Bureau of Labor Statistics, Postal Square Building, 2 Massachusetts Avenue, N.E., Washington, D.C. (Red Line - Union Station). Call Glenda Driggers (606-7391) or Yvonne Jenkins (606-6402) to be placed on the visitors' list.
- Sponsor: Social and Demographic Section
Poverty Measurement Research: Defining Thresholds
Thesia I. Garner, Research Economist, Bureau of Labor Statistics
In its 1995 report, the National Academy of Sciences Panel on Poverty and Family Assistance made recommendations to revise the official poverty measure. The report included suggestions for defining the reference threshold using Consumer Expenditure Survey (CEX) data, updating the threshold over time, accounting for households with varying compositions, and adjusting for inter-area price differences. In this presentation Thesia Garner will describe the procedure used by BLS researchers to estimate thresholds using CEX data, focusing on these recommendations.
Poverty Measurement Research: Family Resources
Kathleen Short, Chief, Poverty and Health Statistics Branch, Census Bureau
Measuring poverty consists of comparing available family resources to a poverty threshold in order to determine the family's ability to meet basic needs. In this presentation, Kathleen Short will describe the National Academy of Sciences recommendations on measuring family resources and discuss measurement issues encountered in that effort. Poverty rates based on these experimental measures will be presented. She will also discuss the Census Bureau's plans for an upcoming report on this topic. Return to top
Topic: Methodological issues surrounding the application of cognitive psychology in survey research
- Speaker: Clyde Tucker, U.S. Bureau of Labor Statistics
- Discussant: Elizabeth Martin, U.S. Bureau of the Census
- Chair: Brenda Cox, Mathematica Policy Research
- Date/Time: Wednesday, June 24, 1998, 12:30 - 2:00 p.m.
- Location: BLS Cognitive Lab, Postal Square Building, Room 2990, 2 Massachusetts Avenue, NE, Washington, DC (Metro Red Line - Union Station). Use the First St. NE entrance. Call Karen Jackson (202-606-7524) at least 2 days before talk to be placed on the visitors' list and bring photo id.
- Sponsor: Methodology Section
In the rush to apply cognitive psychology and its methods to survey research over the last decade, not enough attention has been given to scientific principles. This paper provides a framework for correcting this problem by focusing on methods to improve both the validity and the reliability of data gathered using cognitive psychology. These methods include better experimental designs and measurement techniques. Experimental designs which facilitate comparisons between alternative cognitive procedures and different experimenters are presented.
Another feature of these designs is that they will require more care in the development of the experimental protocol. As for measurement, techniques for making the use of qualitative data more systematic will be discussed, and methods for constructing ordinal and interval indicators will be offered. Return to top
Topic: Approximations to mean squared errors of estimated best linear unbiased predictors in small area estimation with an application to median income for states
- Speaker: Gauri S. Datta, University of Georgia, BLS, and USBC
- Discussant: David Marker, Westat
- Chair: Karol Krotki, Education Statistics Services Institute
- Date/Time: Wednesday, July 8, 1998, 12:30 - 2:00 p.m.
- Location: BLS Cognitive Lab, Postal Square Building, Room 2990, 2 Massachusetts Avenue, NE, Washington, DC (Metro Red Line - Union Station). Use the First St. NE entrance. Call Karen Jackson (202-606-7524) at least 2 days before talk to be placed on the visitors' list and bring photo id.
- Sponsor: Methodology Section
There is a growing demand by many U.S. and other government agencies to produce reliable statistics for various subgroups of a population. It is now widely recognized that direct survey estimates for these subgroups are likely to have unacceptably large standard errors, since only small samples for the subgroups can be obtained from the surveys. Model-based inference is gaining popularity in small area estimation because it can effectively use information from various sources in conjunction with the survey data. In this talk, we consider second-order accurate approximations to the mean squared errors of estimated best linear unbiased predictors (EBLUP) of mixed effects in a normal mixed linear setup. This setup covers many important small area models, including the nested error regression model of Battese, Harter, and Fuller (BHF) (1988). We extend the BHF, Kackar-Harville, and Prasad-Rao approximations to the MSE of the EBLUP when variance components are estimated by maximum likelihood and residual maximum likelihood methods. As an alternative to estimated MSE based on Prasad-Rao-type approximations, we propose, following Laird and Louis (1987) and Butar and Lahiri (1997), a bootstrap approximation to the estimated MSE. This result is similar in spirit to the work of Fuller (1989). A naive bootstrap in an unbalanced Fay-Herriot model fails to account for the bias in variance estimation. We demonstrate the effectiveness of our procedure in estimating the median income of four-person families for the 50 states and the District of Columbia. Return to top
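[Editor's note: the sketch below is not from the paper abstracted above. It is a minimal Python illustration of the EBLUP for a basic area-level (Fay-Herriot) model of the kind discussed, using simulated data and treating the model variance as known; the talk itself concerns ML/REML estimation of the variance components and second-order MSE approximations, which are not shown here.]

```python
import numpy as np

def fay_herriot_eblup(y, X, psi, sigma2_v):
    """EBLUP of area means theta_i under the area-level model
    y_i = theta_i + e_i, theta_i = x_i'beta + v_i, with known sampling
    variances psi_i and (here, assumed known) model variance sigma2_v."""
    gamma = sigma2_v / (sigma2_v + psi)            # shrinkage weights
    w = 1.0 / (sigma2_v + psi)                     # GLS weights
    XtW = X.T * w
    beta = np.linalg.solve(XtW @ X, XtW @ y)       # GLS estimate of beta
    synthetic = X @ beta
    return gamma * y + (1.0 - gamma) * synthetic   # combine direct and synthetic parts

# Simulated example: 51 areas (e.g., 50 states plus DC), one covariate.
rng = np.random.default_rng(0)
m = 51
X = np.column_stack([np.ones(m), rng.normal(size=m)])
psi = rng.uniform(0.5, 2.0, size=m)                # known sampling variances
sigma2_v = 1.0
theta = X @ np.array([10.0, 2.0]) + rng.normal(scale=np.sqrt(sigma2_v), size=m)
y = theta + rng.normal(scale=np.sqrt(psi))
print(fay_herriot_eblup(y, X, psi, sigma2_v)[:5])
```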
Topic: A Method to Analyze the Prostate Cancer Prevention Trial Adjusting for a Missing Binary Outcome, an Auxiliary Variable, and All-or-None Compliance
- Speaker: Stuart G. Baker, National Cancer Institute
- Date/time: Wednesday, September 2, 1998, 11:00 a.m.
- Location: Conference Room G, Executive Plaza North, 6130 Executive Blvd., Rockville, MD
- Sponsor: Biostatistics and Public Health Section
The National Cancer Institute is conducting a randomized trial to compare the effect of daily finasteride versus placebo on prostate cancer determined by biopsy. Investigators have scheduled a biopsy at the end of the trial in seven years or following a positive PSA (prostate-specific antigen) on annual screening. The analysis will need to adjust for two likely complications. First, some subjects will not receive a biopsy, partly depending on whether or not they had a positive PSA. The indicator of positive PSA is called an auxiliary variable, which is a variable observed after randomization and prior to outcome. Second, starting soon after randomization, some subjects randomized to finasteride will stop taking their tablets and some subjects randomized to placebo will obtain finasteride outside of the trial. This type of noncompliance is called all-or-none. To adjust for these complications we formulate the appropriate likelihoods and obtain closed-form maximum likelihood estimates and variances. Without these adjustments, estimates may be biased, two-sided type I errors may exceed nominal levels, and coverage of confidence intervals may fall below nominal levels. Return to top
U.S. BUREAU OF THE CENSUS
STATISTICAL RESEARCH DIVISION SEMINAR SERIES
Topic: Building an Automated Computer Assisted Personal Interviewing Environment to Support Current Surveys and Integrated Coverage Measurement
- Speakers:
Samuel N. Highsmith/Statistical Research Division
Steve Tornell/Technologies Management Office (U.S. Bureau of Census)
- Date/Time: Wednesday, September 2, 1998, 10:30 - 11:30 a.m.
- Location: U.S. Bureau of the Census, 4700 Silver Hill Road, Suitland, Maryland - the Morris Hansen Auditorium, Bldg. 3. Enter at Gate 5 on Silver Hill Road. Please call Barbara Palumbo at (301) 457-4892 to be placed on the visitors' list. A photo ID is needed for security purposes.
In February of 1994, the Field Division began using a new Computer Assisted Personal Interviewing system for the Current Population Survey. The Field Representatives (FR) set up their laptops at home to automatically connect to the communications server located at headquarters. FRs began using these laptops and CASES survey software to perform interviews rather than paper and pencil. All completed cases were then transmitted to headquarters and any new software and cases were downloaded to the FR laptop.
The initial system was a remarkable advance but depended on retransmissions in the event the server could not be reached during the evening. We had little redundancy to protect us in the event of power or telephone outages, network difficulties, hardware failures, or other problems. A team was formed to research building fault tolerance into the system and ensuring FR phone calls succeeded.
A communications management system from XcelleNet was procured and pilot tested in 1995. A complete server environment was then designed and implemented. In the process, we redesigned the CAPI applications to take advantage of the XcelleNet product.
Beginning in 1997, the FRs were moved from the old to the new CAPI environment automatically, in stages. The move was completed and the old server retired in May of 1998. This software is planned to support the 15,000 ICM FRs working on the Decennial Census.
This talk will discuss the selection of a commercial replacement CAPI communications management system and the design and implementation of an automated, fault-tolerant client/server system. Discussion items will include the trials and tribulations of the client/server environment, the many capabilities now available to support CAPI surveys, security, error logging and reporting, encryption, automatic survey failover, pager notification of error conditions, and scalability.
This program is physically accessible to persons with disabilities. Requests for sign language interpretation or other auxiliary aids should be directed to Barbara Palumbo (SRD), (301) 457-4892 (v), (301) 457-3675 (TDD). Return to top
STATISTICAL RESEARCH DIVISION SEMINAR SERIES
PART 1 OF 3
Topic: Designing the User Interface: the Case for Information Visualization
- Speaker: Ben Shneiderman, Human-Computer Interaction Laboratory, University of Maryland
- Date/Time: Wednesday, September 9, 1998, 10:30 - 11:30 a.m.
- Location: U.S. Bureau of the Census, 4700 Silver Hill Road, Suitland, Maryland - the Morris Hansen Auditorium, Bldg. 3. Enter at Gate 5 on Silver Hill Road. Please call Barbara Palumbo at (301) 457-4892 to be placed on the visitors' list. A photo ID is needed for security purposes.
Designers are recognizing a grand opportunity to substantially improve contemporary user interfaces, thus dramatically altering computer-oriented workplaces on the world wide web and elsewhere. Improved development methodologies, expert review methods, and usability testing strategies coupled with novel information visualization strategies can greatly increase productivity, substantially reduce fatigue and errors, enable users to generate creative solutions to their problems, and lead to attractive products and services.
Human perceptual skills are remarkable, but largely underutilized by current graphical user interfaces. The next generation of animated GUIs and visual data mining tools can provide users with remarkable capabilities, if designers follow the Visual Information-Seeking Mantra:
Overview first, zoom and filter, then details-on-demand.
But this is only a starting point in the path to understanding the rich set of information visualizations that have been proposed. Two other landmarks are (1) direct manipulation: visual representation of the objects and actions of interest and rapid, incremental, and reversible operations; and (2) dynamic queries: user controlled query widgets, such as sliders and buttons, that update the result set within 100msec.
These principles will be discussed and are shown in the HomeFinder, FilmFinder, NASA EOSDIS (for environmental data), and Spotfire (for multi-dimensional data) applications.
Note: Paula Schneider, Principal Associate Director for Programs at the Bureau of the Census, will introduce this seminar which is the first one in a three-part series sponsored by the SRD Usability Laboratory. The series will inform attendees of the integral part of usability in information technology system development and transition. The second seminar will be presented by Kent Norman on Thursday, September 17 and the third seminar will be presented by Catherine Plaisant on Thursday, September 24.
This program is physically accessible to persons with disabilities. Requests for sign language interpretation or other auxiliary aids should be directed to Barbara Palumbo (SRD), (301) 457-4892 (v), (301) 457-3675 (TDD). Return to top
Topic: Latent Class Models-A New Tool to Evaluate Data Quality
- Speakers:
John M. Bushery, U.S. Bureau of the Census
Paul P. Biemer, RTI
Patrick E. Flanagan, U.S. Bureau of the Census
- Chair: Brenda G. Cox, Mathematica Policy Research, Inc.
- Date/Time: Wednesday, September 16, 1998; 12:30 - 2:00 p.m.
- Location: BLS Cognitive Lab, Postal Square Building, Room 2990, 2 Massachusetts Avenue, NE, Washington, DC (Metro Red Line-Union Station). Use the First St. NE entrance. Call Karen Jackson (202-606-7524) at least 2 days before talk to be placed on the visitor list and bring photo ID.
- Sponsor: Methodology Section
The Census Bureau has long used reinterview programs to measure the reliability (response variance) of survey and census data. But a fundamental limitation of reinterview programs is the inability to estimate bias accurately. Latent class modeling is a relatively new tool for evaluating data quality. By fitting models to estimate the probabilities of erroneous responses, one can estimate both reliability and bias. This paper describes the use of latent class models (LCM) to evaluate data quality in the Current Population Survey. We discuss issues such as model assumptions and validity, goodness of fit, and weighting and variance estimation. Return to top
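[Editor's note: the display below is not from the paper abstracted above. It is a generic statement of the latent class likelihood that underlies models of this kind, under the usual local independence assumption; the paper's particular parameterization and constraints may differ.]

```latex
% Latent class model for J dichotomous indicators Y_1,...,Y_J (e.g., original
% interview, reinterview, and a third measurement) of a latent true status
% with classes c = 1,...,C, assuming local independence:
\[
  P(Y_1 = y_1, \dots, Y_J = y_J)
    = \sum_{c=1}^{C} \pi_c \prod_{j=1}^{J} p_{jc}^{\,y_j} (1 - p_{jc})^{1 - y_j},
\]
% where \pi_c is the prevalence of latent class c and p_{jc} is the probability
% that indicator j gives a positive response in class c. The fitted error
% probabilities yield estimates of both response bias and reliability.
```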
Topic: Filling in the Gaps for a Partially Discontinued Data Series
- Speaker: James Knaub, Energy Information Administration
- Discussant: Philip Steel, U.S. Census Bureau
- Chair: Linda Atkinson, Economic Research Service
- Day/Time: Tuesday, September 22, 1998, 12:30 - 2:00 p.m.
- Location: BLS Cognitive Lab, Postal Square Building, Room 2990, 2 Massachusetts Avenue, NE, Washington, DC (Red Line -- Union Station). Enter at Massachusetts Avenue and North Capitol Street. Call Karen Jackson (202-606-7524) at least 2 days before talk to be placed on the visitors' list and bring photo id.
- Sponsor: Economics Section
Data on US coal production, imports, producer and distributor stocks, consumption, exports, consumer stocks, and, by default, losses and unaccounted for coal, have been collected, and, considering changes in stock levels, all data have been 'balanced' on a quarterly basis. That is, the relationship between these variates is written as a single equation. These data have been published in the Energy Information Administration (EIA) Quarterly Coal Report for more than sixteen years. Producer and distributor stocks (p/d stocks) will no longer be collected on a quarterly basis, due to budgetary constraints, but will be observed annually. The EIA still wishes to publish these data quarterly, with 'estimates' given for p/d stocks for the first, second and third quarters of each year, and the observed value given for the fourth quarter. (Note that these "estimates" are actually "predictions." However, unlike the usual case with predictions, no values will ever be observed for first, second or third quarter p/d stocks after 1997. Also, we are only interested in a value to approximate the current conditions for each of these publications, not forecasts of the future.) This paper explores the use of weighted linear regression modeling, prediction and the variance of the prediction error, and the combination of 'predictions' from different models, to help fill in these unobserved p/d stock quarterly values in a reasonable manner, and provide estimated standard errors. The procedures found here have substantial potential for use whenever one might consider reducing the frequency of periodic data collection for an established data series. Return to top
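[Editor's note: the sketch below is not from the paper abstracted above. It is a minimal Python illustration of weighted least squares prediction with an estimated variance of the prediction error, the basic ingredients the abstract describes; the choice of regressor, the data values, and the weights are all hypothetical.]

```python
import numpy as np

def wls_predict(x, y, w, x_new, w_new=1.0):
    """Weighted least squares fit of y on (1, x) with weights w, where
    Var(e_i) = sigma^2 / w_i. Returns the prediction at x_new and an
    estimated variance of the prediction error for a point with weight w_new."""
    X = np.column_stack([np.ones_like(x), x])
    XtW = X.T * w                                   # apply the weights to X'
    beta = np.linalg.solve(XtW @ X, XtW @ y)
    resid = y - X @ beta
    sigma2 = np.sum(w * resid**2) / (len(y) - X.shape[1])
    x0 = np.array([1.0, x_new])
    var_pred = sigma2 * (x0 @ np.linalg.solve(XtW @ X, x0) + 1.0 / w_new)
    return x0 @ beta, var_pred

# Hypothetical example: predict an unobserved quarterly p/d stock level from a
# related series that is still collected (all numbers are invented).
related   = np.array([31.2, 29.8, 33.5, 30.1, 32.4, 28.9])
pd_stocks = np.array([40.1, 38.7, 43.2, 39.0, 41.9, 37.5])
weights   = np.ones_like(pd_stocks)
pred, var = wls_predict(related, pd_stocks, weights, x_new=31.0)
print(pred, var ** 0.5)        # point prediction and estimated standard error
```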
Title: Current Internet Technology & Statistics - Blessing or Curse?
- Speaker: Juergen Symanzik, George Mason University Center for Computational Statistics
- Chair: Mike Fleming, National Agricultural Statistics Service
- Date/Time: Tuesday, October 6, 1998, 12:30 - 2:00 p.m.
- Location: BLS Cognitive Lab, Postal Square Building, Room 2990, 2 Massachusetts Avenue, NE, Washington, DC (Red Line--Union Station). Enter at 1st Street and Massachusetts Avenue. Please call Karen Jackson at 202-606-7524 or send e-mail to jackson_karen@bls.gov to be placed on the visitor's list.
- Sponsor: Statistical Computing Section
In this talk, we provide a general overview of the existing Internet technology that affects today's research and education in statistics. Examples include electronic journals, statistical software packages, and teaching software accessible through the WWW, as well as on-line access to data, organizations, and individuals, to mention only a few.
In the second part, we provide a critical discussion of the advantages, disadvantages, and even dangers of this technology. Return to top
Title: Estimating Incidence of Dementia Sub-Types: Assessing the Impact of Missed Cases
- Speaker(s):
Grant Izmirlian and Dwight Brock, Epidemiology, Demography, and Biometry Program, National Institute on Aging, National Institutes of Health
Lon White, Honolulu Asia Aging Study, Kuakini Medical Center, Honolulu
- Date/Time: Wednesday, October 7, 1998, 12:00 - 1:00 p.m.
- Location: Conference Room H, Executive Plaza North (EPN), 6130 Executive Blvd, Rockville, MD
- Sponsor: WSS and NIA
In many community-based studies on the incidence of dementia, a target population is screened and a sub-sample is clinically evaluated at baseline and follow-up. Incidence rates are affected by missed cases at both exams, and this complicates the estimation of these rates. Recent work proposes a regression-based technique for joint estimation of prevalence and incidence and suggests the use of surrogate information obtained on the entire cohort at both times to calculate the expected score equation contribution (mean score imputation, MSI) for individuals missing clinical exams at one or both times. This helps to quantify the impact of missed diagnoses upon the incidence estimates and their confidence intervals. We have extended this technique to sub-types of dementia for use in the Honolulu-Asia Aging Study, a three-year study of incident dementia.
We estimate the incidence of all-cause DSMIII-R dementia to be 16 cases per thousand per year, 95% CI (10, 24), overall, with the highest age group (85+) showing a significant increase over the others (<75, 75-80, 80-85). The incidence of pure DSMIII-R Alzheimer's disease was estimated to be 5.4 cases per thousand per year, 95% CI (2.8, 10), overall, with a significant difference in the highest age group. The incidence of pure DSMIII-R vascular dementia was estimated to be 4.1 cases per thousand per year, 95% CI (1.9, 9.0), overall, with no significant differences across the age groups due to the lack of cases.
Our estimates and confidence intervals give a more realistic picture of the level of certainty in these estimates, given the lack of perfect predictive power in screening; the intervals are about twice as large. In addition, comparing our model-based estimates, which use surrogate information, with crude rates calculated assuming perfect predictive values of screening reveals trends that give insight into the possible age patterns of missed diagnoses in a study of dementia incidence. Return to top
1998 Herriot Award Session
Seminar: Research on Racial Coding
- Chair: Daniel Kasprzyk, National Center for Education Statistics
- Speakers:
Nampeo McKinney, U.S. Bureau of the Census
Bob Groves, University of Maryland
Katherine Wallman, Office of Management and Budget
- Date/Time: Wednesday, October 14, 1998, 12:30 - 2:30 p.m. (note extended time)
- Location: BLS Cognitive Lab, Postal Square Building, Room 2990, 2 Massachusetts Avenue, NE, Washington, DC (Red Line -- Union Station). Enter at Massachusetts Avenue and North Capitol Street. Call Karen Jackson (202-606-7524) at least 2 days before talk to be placed on the visitors' list and bring photo id.
This year the two winners of the Roger Herriot Award for innovation in federal statistics are Roderick Harrison (Bureau of the Census) and Clyde Tucker (Bureau of Labor Statistics). They are being honored for their leadership role in the recently completed research that has resulted in a major reform of racial/ethnic coding in federal censuses and surveys.
The WSS session will summarize the research work which they led. A full program listing all the speakers will be provided next month. There will be a reception at the beginning of the session, so come hungry. Return to top
Topic: Measuring the Impact of Welfare Reform: The New Survey of Program Dynamics
- Speakers: Stephanie Shipp and Jennifer Hess (U.S. Bureau of the Census)
- Discussant: Barbara Gault (Institute for Women's Policy Research)
- Date/Time: Thursday, October 15, 1998, 12:30 - 2:00 p.m.
- Where: BLS Cognitive Lab, Room 2990, Postal Square Building, 2 Massachusetts Avenue, NE, Washington, DC (Metro Red Line-Union Station). Use the First St. NE entrance. Call Karen Jackson(202-606-7524) at least two days before talk to be placed on the visitor's list and bring photo ID.
- Sponsor: Data Collection Methods Section
The Census Bureau has begun the Survey of Program Dynamics (SPD), a large, longitudinal, nationally-representative survey designed to provide information on intervals of actual and potential program participation over a ten-year period and to examine the causes of program participation and its long-term effects on the well-being of individual recipients, their families, and their children. We will describe the survey and then address how the Bureau plans to construct a consistent longitudinal data series from three distinct survey instruments that use different recall periods, different accounting periods, and sometimes different questions. Preliminary results from the 1997 "Bridge" Survey will also be presented.
The 1998 SPD instruments include an interviewer-administered automated instrument for adults and a self-administered instrument for adolescents. We will describe the challenges we faced in developing, designing, and testing the SPD questionnaires and discuss how each was resolved. Development issues included defining the content of this omnibus survey to meet the needs of the legislation and limiting the scope so as not to overburden respondents and exceed budgetary constraints. Design issues included incorporating both household- and person-level questions to improve the efficiency of collecting income data, and administering questions for the adolescent questionnaire via audio-cassette player (with headphones) to ensure privacy for the adolescent in answering potentially sensitive questions. Testing issues included conducting cognitive interviews of the automated adult questionnaire on a compressed schedule, conducting cognitive interviews on an instrument designed to be administered by cassette player with adolescents and, after a field pretest, making recommendations for questionnaire revisions based on information from limited questionnaire evaluation sources. Return to top
The 1998 Morris Hansen Lecture
Topic: Some Current Trends in Sample Survey Theory and Methods
- Speaker: J. N. K. Rao, Carleton University
- Discussants: James M. Lepkowski, University of Michigan, and Robert E. Fay, U.S. Bureau of the Census
- Chair: Robert M. Groves, Joint Program in Survey Methodology, University of Maryland
- Date/Time: Monday, October 19, 1998, 3:30 - 5:30 p.m.
- Location: The Jefferson Auditorium, USDA South Building, between 12th and 14th Streets on Independence Avenue S.W., Washington DC. The Independence Avenue exit from the Smithsonian METRO stop is at the 12th Street corner of the building, which is also where the handicapped entrance is located. Please allow enough time to clear security after reaching the South Building. Present security procedures require everyone without a USDA employee ID to register when entering the building.
- Sponsors: The Washington Statistical Society, WESTAT, and the National Agricultural Statistics Service.
- Reception: The lecture will be followed by a reception from 5:30 to 6:30 in the patio of the Jamie L. Whitten Building, across Independence Ave.
Beginning with the pioneering contributions of Neyman, Hansen, Mahalanobis and others, a large part of survey sampling theory has been directly motivated by practical problems encountered in the design and analysis of large scale sample surveys. We have seen major advances in handling both sampling and nonsampling errors as well as data collection and processing. In this lecture, I will present some current trends in sample survey theory and methods. After a brief discussion of developments in survey design, data collection and processing, I will focus on inferential issues, resampling methods for analysis of survey data and small area estimation. I will demonstrate the advantages of a conditional design-based approach to inference that allows us to restrict the set of samples to a 'relevant' subset. Quasi-score tests of hypotheses based on the jackknife method will be presented. I will also discuss issues related to model-based methods for small area estimation. Return to top
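[Editor's note: the following is not from the lecture. It is a minimal Python sketch of a delete-one-PSU jackknife variance estimate for a weighted ratio, offered only as a concrete example of the resampling methods for survey data mentioned above; the data and the single-stratum design are hypothetical.]

```python
import numpy as np

def ratio_estimate(y, x, w):
    """Weighted ratio estimator R_hat = sum(w*y) / sum(w*x)."""
    return np.sum(w * y) / np.sum(w * x)

def jackknife_variance(y, x, w, psu):
    """Delete-one-PSU jackknife variance of the ratio estimator for a
    single-stratum design. The usual rescaling of the remaining weights
    cancels in a ratio, so it is omitted here."""
    full = ratio_estimate(y, x, w)
    labels = np.unique(psu)
    g = len(labels)
    reps = np.array([ratio_estimate(y[psu != k], x[psu != k], w[psu != k])
                     for k in labels])
    return (g - 1) / g * np.sum((reps - full) ** 2)

# Toy data (hypothetical): 8 PSUs with 5 observations each.
rng = np.random.default_rng(1)
psu = np.repeat(np.arange(8), 5)
x = rng.uniform(1.0, 2.0, size=40)
y = 3.0 * x + rng.normal(scale=0.2, size=40)
w = np.ones(40)
print(ratio_estimate(y, x, w), jackknife_variance(y, x, w, psu) ** 0.5)
```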
Learn more about the history of this lecture series along with information about previous lectures when you visit Morris Hansen Lectures. Return to top
Topic: Longitudinal Surveys: Why Are They Different from All Other Surveys?
- Speaker: David Binder, Statistics Canada
- Discussant: Graham Kalton, Westat
- Chair: Brenda G. Cox, Mathematica Policy Research, Inc.
- Date/Time: Wednesday, October 21, 1998; 12:30 - 2:00 p.m.
- Location: BLS Cognitive Lab, Postal Square Building, Room 2990, 2 Massachusetts Avenue, NE, Washington, DC (Metro Red Line-Union Station). Use the First St. NE entrance. Call Karen Jackson (202-606-7524) at least 2 days before talk to be placed on the visitor list and bring photo ID.
- Sponsor: Methodology Section
We review the current status of various aspects of the design and analysis of studies where the same units are investigated at several points in time. These studies include longitudinal surveys, and longitudinal analyses of retrospective studies and of administrative or census data. The major focus is on the special problems posed by the longitudinal nature of the study. We discuss four of the major components of longitudinal studies in general: namely, design, implementation, evaluation and analysis. Each of these components requires special considerations when planning a longitudinal study. Some issues relating to the longitudinal nature of the studies are: concepts and definitions, frames, sampling, data collection, nonresponse treatment, imputation, estimation, data validation, data analysis, and dissemination. Assuming familiarity with the basic requirements for conducting a cross-sectional survey, we highlight the issues and problems that become apparent for many longitudinal studies. Return to top
Topic: Exploratory Data Analysis of Property Value and Property Tax Data from the 1996 American Community Survey Test
- Speaker: Adeline Wilcox, U.S. Bureau of the Census
- Discussant: David Kellerman, U.S. Bureau of the Census
- Chair: Linda Atkinson, Economic Research Service
- Day/Time: Thursday, October 22, 1998, 12:30-2:00 p.m.
- Location: BLS Cognitive Lab, Postal Square Building, Room 2990, 2 Massachusetts Avenue, NE, Washington, DC (Red Line -- Union Station). Enter at Massachusetts Avenue and North Capitol Street. Call Karen Jackson (202-606-7524) at least 2 days before talk to be placed on the visitors' list and bring photo id.
- Sponsor: Economics Section
I began exploratory data analysis (EDA) to visually examine the effects of data editing. The data were collected by the U.S. Bureau of the Census in the 1996 American Community Survey (ACS) test. I used data from the four sites in the 1996 ACS test: Fulton County, PA; Rockland County, NY; Brevard County, FL; and Multnomah County, OR, including those portions of the City of Portland that extend into Washington and Clackamas Counties. I plotted residential property tax values against residential property values. My exploratory scatter plots revealed evidence of different tax rates among sites.
This EDA was made possible by a change in the way property value data were collected. In the 1990 decennial census, property value was collected as a categorical variable. To save space on the ACS 96 form, property value was collected as a continuous variable. That decision created new opportunities for EDA. I also plotted residuals, and made side-by-side box plots and wandering schematic plots. To maintain confidentiality, reasonable outlying values will not be shown. EDA led to ideas for confirmatory analysis and areas of study for improving the editing of these data. Return to top
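[Editor's note: the snippet below is not from the talk abstracted above. It is a minimal matplotlib sketch of the kind of exploratory plot described, property tax against property value with points distinguished by site, using synthetic data; the site names and tax rates are made up.]

```python
import numpy as np
import matplotlib.pyplot as plt

# Synthetic stand-in for the ACS test data: property value and property tax
# for a few sites with different effective tax rates (all values invented).
rng = np.random.default_rng(2)
sites = {"Site A": 0.010, "Site B": 0.015, "Site C": 0.022}
fig, ax = plt.subplots()
for name, rate in sites.items():
    value = rng.uniform(50_000, 300_000, size=200)
    tax = rate * value * rng.lognormal(sigma=0.1, size=200)
    ax.scatter(value, tax, s=8, alpha=0.5, label=name)
ax.set_xlabel("Residential property value ($)")
ax.set_ylabel("Residential property tax ($)")
ax.legend(title="ACS test site (hypothetical)")
plt.show()   # distinct slopes in the point clouds suggest different tax rates
```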
Title: Bayesian Analysis of Mortality Rates for the U.S. Health Service Areas
- Speaker: Joseph X. Sedransk, Case Western Reserve University, ASA/NSF/BLS Research Fellow
- Discussant: Michael P. Cohen, National Center for Education Statistics
- Chair: Stuart Scott, Bureau of Labor Statistics
- Date/Time: Tuesday, November 17, 1998; 12:30 - 2:00 p.m.
- Location: BLS Cognitive Lab, Postal Square Building, Room 2990, 2 Massachusetts Avenue, NE, Washington, DC (Red Line--Union Station). Enter at 1st Street and Massachusetts Avenue. Please call Karen Jackson at 202-606-7524 or send e-mail to jackson_karen@bls.gov to be placed on the visitor's list, and bring photo ID.
- Sponsor: Methodology Section
This talk summarizes research on alternative models for estimating age-specific and age-adjusted mortality rates for one of the disease categories, all cancer for white males, presented in the Atlas of United States Mortality, published in 1996. We use Bayesian methods, applied to four different hierarchical models. Each assumes that the number of deaths, d_ij, in health service area i and age class j has a Poisson distribution with mean n_ij * lambda_ij, where n_ij is the population at risk and lambda_ij is the underlying mortality rate. The alternative specifications differ in their assumptions about the variation in log lambda_ij over health service areas and age classes. We use several different methods to evaluate the concordance between the models and the observed data. These include cross-validation, graphical representations, and posterior predictive p-values. The models captured both the small area and regional effects sufficiently well that no remaining spatial correlation of the residuals was detectable, thus simplifying the estimation. We summarize by presenting point estimates, measures of variation, and maps. Return to top
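[Editor's note: the display below simply restates, in standard notation, the first stage shared by the four hierarchical models described above; the illustrative decomposition of the log rate is ours, not necessarily any of the speaker's four specifications.]

```latex
% Sampling model shared by the four hierarchical specifications: deaths d_{ij}
% in health service area i and age class j, with population at risk n_{ij}:
\[
  d_{ij} \mid \lambda_{ij} \sim \mathrm{Poisson}\!\left(n_{ij}\,\lambda_{ij}\right).
\]
% The specifications differ in the structure placed on \log\lambda_{ij}, for
% example additive area and age effects with or without an interaction term:
\[
  \log \lambda_{ij} = \mu + \alpha_i + \beta_j \;(+\,\gamma_{ij}).
\]
```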
Title: Statistics, Equity, and the Law
- Speakers:
Charles R. Mann, Charles R Mann Associates Inc
Phil Ross, Environmental Protection Agency
TerriAnn Lowenthal, Lowenthal Group
- Chair: Edward J. Spar, COPAFS
- Date/Time: Tuesday, November 17, 3:30-5:00 p.m.
- Location: BLS Cognitive Lab, Postal Square Building, Room 2990, 2 Massachusetts Avenue, NE, Washington, DC (Red Line--Union Station). Enter at 1st Street and Massachusetts Avenue. Please call Carolyn Shettle (202) 973-2820 or send e-mail to cshettle@erols.com to be placed on the visitor's list.
- Sponsor: Statistics and Public Policy Section, Carolee Bush (DOT) and N. Phillip Ross (EPA), Co-Chairs
Statisticians are often asked to work on equity issues that have important legal and social implications. What should the statistician's role be in such situations? How do we maintain professional neutrality when dealing with such emotionally charged issues? How can we be most useful to our legal and political colleagues? How do we measure equity? These questions will be discussed by representatives from the statistical, legal, and political communities, using examples in the areas of air pollution control and enforcement of Equal Economic Opportunity law. Return to top
Title: The Eyes Have It: User Interfaces for Information Visualization
- Speaker: Ben Shneiderman, Department of Computer Science and Director, Human-Computer Interaction Laboratory, University of Maryland
- Chair: Mike Fleming, National Agricultural Statistics Service
- Date/Time: Tuesday, December 1, 1998, 12:30-2:00 p.m.
- Location: A. V. Williams Building Room 3174, University of Maryland, College Park
- Parking: Go to Parking Garage 2 (bring lots of quarters). Walk 2 blocks down Stadium Drive to Paint Branch Drive. Turn left, and you will see the A. V. Williams Building on your right (4-story red brick).
- Sponsor: Statistical Computing Section
Abstract:
Human perceptual skills are remarkable, but largely underutilized by current graphical user interfaces. The next generation of animated GUIs and visual data mining tools can provide users with remarkable capabilities if designers follow the Visual Information-Seeking Mantra:
Overview first, zoom and filter, then details-on-demand.
But this is only a starting point in the path to understanding the rich set of information visualizations that have been proposed. Two other landmarks are: (1) direct manipulation: visual representation of the objects and actions of interest and rapid, incremental, and reversible operations; and (2) dynamic queries: user controlled query widgets, such as sliders and buttons, that update the result set within 100 msec. Both are shown in the FilmFinder, Visible Human Explorer (for the National Library of Medicine's anatomical data), NASA EOSDIS (for environmental data), and LifeLines (for medical records and personal histories).
As a guide to research, information visualizations can be categorized into 7 datatypes (1-, 2-, 3-dimensional data, temporal and multi-dimensional data, and tree and network data) and 7 tasks (overview, zoom, filter, details-on-demand, relate, history, and extract). Research directions include algorithms for rapid display update with millions of data points, strategies to explore vast multi-dimensional spaces of linked data, and design of advanced user controls.
We will demonstrate our visualizations of multi-dimensional data with Spotfire, temporal data with LifeLines, and tree structured data with treemaps, along with a geographic application.
After the presentation, there will be a tour of the Human-Computer Interaction Laboratory at UMD. See www.cs.umd.edu/hcil for our work and for detailed "Travel Directions" and maps.
Return to top
U.S. BUREAU OF THE CENSUS
STATISTICAL RESEARCH DIVISION SEMINAR SERIES
Topic: New SAS Procedures for Analysis of Sample Survey Data
- Speakers: Anthony An and Maura Stokes, SAS Institute
- Date/Time: Wednesday, December 2, 1998, 10:30 - 11:30 a.m.
- Location: U.S. Bureau of the Census, 4700 Silver Hill Road, Suitland, Maryland - the Morris Hansen Auditorium, Bldg. 3. Enter at Gate 5 on Silver Hill Road. Please call Barbara Palumbo at (301) 457-4974 to be placed on the visitors' list. A photo ID is needed for security purposes.
Researchers use sample surveys to obtain information on a wide variety of issues. Many of these sample surveys are based on probability-based complex sampling designs, including stratified selection, clustering, and unequal weighting. To make statistically valid inferences from the sample to the study population, researchers must analyze the data, taking into account the sampling design. In Version 7 of the SAS System, new SAS procedures are available for the analysis of data from complex sample surveys. The new SAS procedures use input describing the sampling design to produce the appropriate statistical analysis for the survey data.
Three SAS procedures for sample surveys will be presented. PROC SURVEYSELECT selects probability samples using various sampling designs, including stratified sampling and sampling with probability proportional to size. PROC SURVEYMEANS computes descriptive statistics for sample survey data, including means, totals, and their standard errors. PROC SURVEYREG fits linear regression models and produces hypothesis tests and estimates for survey data.
In addition, this talk includes a brief overview of other new features in SAS/STAT software available with Version 7.
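[Editor's note: the snippet below is not SAS code and is not taken from the talk. It is a small Python sketch of the design-based computation that a procedure such as PROC SURVEYMEANS performs for a stratified single-stage design: a weighted mean with a Taylor-linearization standard error (with-replacement approximation, no finite population correction). The data and weights are hypothetical.]

```python
import numpy as np

def stratified_mean_se(y, w, stratum):
    """Weighted mean of y and a Taylor-linearization standard error for a
    stratified single-stage design (with-replacement approximation, no FPC).
    A hand-rolled sketch of the computation, not the SAS algorithm itself."""
    wsum = w.sum()
    ybar = np.sum(w * y) / wsum
    z = w * (y - ybar) / wsum            # linearized values
    var = 0.0
    for h in np.unique(stratum):
        zh = z[stratum == h]
        nh = len(zh)
        var += nh / (nh - 1) * np.sum((zh - zh.mean()) ** 2)
    return ybar, np.sqrt(var)

# Hypothetical example: 3 strata with unequal weights.
rng = np.random.default_rng(3)
stratum = np.repeat([1, 2, 3], [30, 50, 20])
w = np.where(stratum == 1, 10.0, np.where(stratum == 2, 4.0, 25.0))
y = 100 + 5 * stratum + rng.normal(scale=10, size=100)
print(stratified_mean_se(y, w, stratum))
```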
This program is physically accessible to persons with disabilities. Requests for sign language interpretation or other auxiliary aids should be directed to Barbara Palumbo (SRD), (301) 457-4974 (v), (301) 457-3675 (TDD). Return to top
Topic: Statistical Methods for Soil Survey Updates
- Speaker: Jay Breidt*, Department of Statistics, Iowa State University, ASA/NSF BLS-Census Research Fellow
- Discussant: Mark Otto, Fish & Wildlife Service
- Chair: Stuart Scott, Bureau of Labor Statistics
- Date/Time: Wednesday, December 2, 1998; 12:30 - 2:00 p.m.
- Location: BLS Cognitive Lab, Postal Square Building, Room 2990, 2 Massachusetts Avenue, NE, Washington, DC (Metro Red Line-Union Station). Use the First St. NE entrance. Call Karen Jackson (202-606-7524) at least 2 days before talk to be placed on the visitor list and bring photo ID.
- Sponsor: Methodology and Agriculture & Natural Resources Sections
Soil mapping projects currently underway in two western Iowa counties involve four phases of data collection: a sample on which only soil names are recorded, a subsample on which surface horizon data are collected, a further subsample on which multiple horizons of data are observed, and a final subsample on which horizon-specific laboratory analyses are conducted. A "horizon" is a layer of soil which differs from the adjacent layers in physical, biological, or chemical properties. A horizon does not have a fixed depth. The number and type of horizons vary by location. Consequently, the amount of information obtained at a point varies.
A "profile" is a description of a soil characteristic (such as clay content) as it changes over depth. Estimation of soil profiles by soil type or by horizon series is a key analysis goal of soil survey updates. Two methods of estimating soil profiles will be described. The first is a partially parametric method which uses a linear measurement error model to relate laboratory and field measurements, and employs an imputation procedure to alleviate some of the difficulties of irregularly spaced, horizon-specific data. The second method uses a hierarchical model to attempt to account for all uncertainties, and obtains full posterior distributions of profiles via Markov Chain Monte Carlo.
* This is joint work with Pamela J. Abbitt, Wayne A. Fuller, Sarah M. Nusser, and others at Iowa State. Return to top
Special Presidential Address
Topic: Statistical Literacy and Statistical Competence in the 21st Century
- Speaker: David S. Moore, Purdue University; President, American Statistical Association
- Chair: Dwight B. Brock, National Institute on Aging; President, Washington Statistical Society
- Date/Time: Thursday, December 3, 1998, 4:00-5:30 p.m.
- Location: BLS Conference and Training Center G-440, Meeting Rooms 1 and 2, Postal Square building, 2 Massachusetts Avenue, NE, Washington, DC (Metro Red Line Union Station). Use the First Street, NE entrance. Please send e-mail to Karen Jackson at least two days before the talk to be placed on the visitor's list and bring a photo ID: jackson_karen@bls.gov
- Sponsor: The Washington Statistical Society
Abstract:
Educated people face a new environment at century's end: work is becoming intellectualized, formal higher education more common, technology almost universal, and information (as well as mis- and dis-information) a flood. In this setting, what is statistical literacy, what every educated person should know? What is statistical competence, roughly the content of a first course for those who must deal with data in their work? If the details are automated, are the concepts and strategies that guide us enough to maintain "statistics" as a distinct discipline?
[Editor's note: this is Professor Moore's ASA Presidential Address, originally delivered at the Joint Statistical Meetings in Dallas, Texas in August 1998. It is non-technical and should be accessible to a general audience. We encourage our members to invite their colleagues and friends to hear this outstanding speaker.]
Return to top
U.S. BUREAU OF THE CENSUS
STATISTICAL RESEARCH DIVISION SEMINAR SERIES
Topic: How the Political Process Dovetails with Strategic Planning for the Census Bureau
- Speaker: Margo Anderson, University of Wisconsin, Milwaukee, Woodrow Wilson Fellow
- Date/Time: Tuesday, December 8, 1998, 10:00 - 11:30 a.m.
- Location: U.S. Bureau of the Census, 4700 Silver Hill Road, Suitland, Maryland - The Morris Hansen Auditorium, Bldg. 3. Enter at Gate 5 on Silver Hill Road. Please call Barbara Palumbo at (301) 457-4974 to be placed on the visitors' list. A photo ID is required for security purposes.
The 2000 census count is fast approaching and faces a host of interrelated technical, political, and legal challenges. How those challenges will be addressed is far from clear. In some cases it is not clear who can, should, or is willing to address them. So what does the bureau do, and how does it respond, both as a corporate entity and in the actions of the individuals who make up the agency?
The speaker cannot claim to have sure answers to these questions, but she can provide context, some historical analysis and analogy, and thus, she hopes, add to the ongoing dialogue on these issues within the bureau. By retrieving past census controversies and challenges, and analyzing how they were addressed, the speaker hopes to provide some tools for dealing with the current situation.
Note: This is the second in a two-part series jointly sponsored by the Statistical Research Division Seminar Series and the 2010 Planning Staff. Jay Keller, the Assistant Division Chief of the 2010 Planning Staff, will introduce the purpose of the 2010 Planning Staff and The New Millennium Speakers Series. Ruth Ann Killion, the Division Chief of the Planning, Research, and Evaluation Division (PRED), will introduce Professor Anderson.
This program is physically accessible to persons with disabilities. Requests for sign language interpretation or other auxiliary aids should be directed to Barbara Palumbo (SRD), (301) 457-4974 (v), (301) 457-3675 (TDD). Return to top
Topic: The 1997 Survey of Minority-Owned Business Enterprises: Techniques for Constructing a Sampling Frame from Multiple Sources of Administrative Data
- Speaker: Richard A. Moore, Bureau of the Census
- Discussant: Ken Robertson, Bureau of Labor Statistics
- Chair: Stuart Scott, Bureau of Labor Statistics
- Date/Time: Tuesday, December 8, 1998; 12:30 - 2:00 p.m.
- Location: BLS Cognitive Lab, Postal Square Building, Room 2990, 2 Massachusetts Avenue, NE, Washington, DC, (Metro Red Line-Union Station). Use the First St. NE entrance. Call Karen Jackson(202-606-7524) at least 2 days before talk to be placed on the visitor list, and bring photo ID.
- Sponsor: Methodology Section
The 1997 Survey of Minority-Owned Business Enterprises (SMOBE) provides data on the number, receipts, payroll, and employment of minority-owned companies. This sample survey is based on a design that consists of stratifying businesses by state, industry, and race of the owner(s). Business registers contain the state and industry codes. However, very little accurate data is available on the race of owners, an obstacle to designing an efficient sample for this survey. Administrative information, obtained from 12 different sources, is used to infer the most likely race of the owner(s) of each business enterprise. The accuracy of each inference is then evaluated and incorporated into the sample design.
This talk will first examine the sources of administrative data available to the 1997 SMOBE, the methodology used to assess the reliability of each inference, and the alternative sample designs spawned by this approach. With simple examples, we will illustrate the major logistical and methodological concerns encountered while creating the 1997 SMOBE sampling frame. The talk will conclude with elementary examples of more complex data analysis techniques (automated record linkage, stepwise discrimination, logistic regression, variance replication) which we hope to incorporate into the frame construction of the next SMOBE. Return to top
Topic: Multiple Imputation of Income in the Consumer Expenditure Survey: Evaluation Of Statistical Inferences
- Speakers:
Trivellore E. Raghunathan, University of Michigan, ISR
Geoffrey D. Paulin, Bureau of Labor Statistics
- Chair: Arthur Kennickell, Federal Reserve Board
- Day/Time: Thursday, December 10, 1998, 12:30-2:00 p.m.
- Location: BLS, Postal Square Building, Room 4245, 2 Massachusetts Avenue, NE, Washington, DC (Red Line -- Union Station). Enter at Massachusetts Avenue and North Capitol Street. Email Karen Jackson (jackson_karen@bls.gov), or call 202-606-7524 if you don't have email, at least 2 days before talk to be placed on the visitors' list and bring photo id.
- Sponsor: Economics Section
Non-response is a problem common to many surveys. For example, many respondents fail to report incomes for some or all working members of their families in the U.S. Consumer Expenditure Interview Survey. Because these data are so important to economic and other analyses, a complete set of data is desirable. The U.S. Bureau of Labor Statistics and the U.S. Bureau of the Census have conducted investigations that use model-based, multiple imputation methods as a viable way of obtaining valid inferences when the data are subject to nonresponse. This paper compares the results of several econometric applications in which, first, only data from "valid" reporters (i.e., those who provide values for all sources of income reported) are used and, second, multiply imputed data are used. Many differences in outcomes are described.
Also investigated is the role of expenditures in imputing incomes. Because income is often used to explain expenditures, endogeneity arises as a theoretical concern. However, the evidence strongly indicates that omitting expenditures in this case results in biased data, the very outcome imputation is attempting to correct. Return to top
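[Editor's note: the sketch below is not from the paper abstracted above. It is a minimal Python statement of Rubin's combining rules for multiply imputed data, which underlie the inferences the paper compares; the numerical example is invented.]

```python
import numpy as np

def combine_mi(estimates, variances):
    """Rubin's rules: combine m completed-data estimates and variances.
    Returns the MI point estimate, its total variance, and Rubin's
    approximate degrees of freedom."""
    estimates = np.asarray(estimates, dtype=float)
    variances = np.asarray(variances, dtype=float)
    m = len(estimates)
    qbar = estimates.mean()                    # MI point estimate
    ubar = variances.mean()                    # within-imputation variance
    b = estimates.var(ddof=1)                  # between-imputation variance
    t = ubar + (1 + 1 / m) * b                 # total variance
    r = (1 + 1 / m) * b / ubar                 # relative increase in variance
    df = (m - 1) * (1 + 1 / r) ** 2            # approximate degrees of freedom
    return qbar, t, df

# Hypothetical example: a coefficient estimated from m = 5 multiply imputed
# datasets (numbers are made up).
print(combine_mi([0.52, 0.49, 0.55, 0.47, 0.51],
                 [0.010, 0.011, 0.009, 0.012, 0.010]))
```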
Topic: A Two-Step Smoothing Method for Varying Coefficient Models with Repeated Measurements
- Speaker: Colin Wu, Johns Hopkins University
- Date/Time: Thursday, December 10, 1998, 12:30 pm
- Location: Conference Room J, Executive Plaza North (EPN), 6130 Executive Blvd, Rockville, MD.
- Sponsor: WSS and NIH
Abstract:
Datasets involving repeated measurements over time are common in medical and epidemiological studies, such as the study of growth curves. The outcome variables and related covariates are usually observed from a number of randomly selected subjects, each at a set of possibly unequally spaced time points. The relationship between the dependent variable and the covariates is assumed to be linear at a specific time point, but the coefficients are allowed to change over time. We show that kernel estimators of the coefficient curves that are based on ordinary least squares may be subject to large biases and be extremely unreliable in practice when the covariates are time-dependent. As a modification, we propose a two-step kernel method that first centers the covariates and then estimates the curves based on local least squares criteria and the centered covariates. The practical superiority of the two-step kernel method over the ordinary least squares kernel method is shown through a study of ultrasound measurements of fetal growth and Monte Carlo simulations. Theoretical properties of both the two-step and ordinary least squares kernel estimators are developed through their large sample mean squared risks. Return to top
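[Editor's note: the display below is a generic statement of the varying coefficient model and the ordinary local least squares kernel estimator described above; the notation is ours, and the speaker's two-step modification (centering the covariates before the local fit) is indicated only in words.]

```latex
% Varying coefficient model for repeated measurements: subject i is observed
% at times t_{ij}, with covariate vector X_i(t) and outcome Y_i(t):
\[
  Y_i(t_{ij}) = X_i(t_{ij})^{\top}\,\beta(t_{ij}) + \varepsilon_i(t_{ij}),
\]
% where the coefficient curves \beta(t) are smooth functions of time. The
% ordinary kernel estimator solves, at each target time t, a local least
% squares problem with kernel K and bandwidth h; the two-step method first
% centers the covariates before carrying out this local fit:
\[
  \hat{\beta}(t) = \arg\min_{b}\;
    \sum_{i}\sum_{j} K\!\left(\frac{t_{ij}-t}{h}\right)
      \left\{\,Y_i(t_{ij}) - X_i(t_{ij})^{\top} b\,\right\}^{2}.
\]
```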
Topic: Forecasting Inflation: Experiences in Analyzing the CPI for Spain
- Speaker: Antoni Espasa, Department of Statistics and Econometrics, University Carlos III of Madrid, Spain (also visiting University of California, San Diego)
- Discussant: Jaime Marquez, Federal Reserve Board
- Chair: Stuart Scott, Bureau of Labor Statistics
- Date/Time: Monday, December 14, 1998; 12:30 - 2:00 p.m.
- Location: Conference Center, Meeting Room 2, Postal Square Building, 2 Massachusetts Avenue, NE, Washington, DC, (Metro Red Line-Union Station). Use the First St. NE entrance. Call Karen Jackson(202-606-7524) at least 2 days before talk to be placed on the visitor list, and bring photo ID.
- Sponsors: Methodology and Economics Sections
Nowadays, central banks, such as the Federal Reserve Board in the U.S., view control of inflation as a major aim in executing monetary policy. Issues and methods in forecasting a nation's consumer price index (CPI) will be presented. Both economic and statistical arguments will be given for forecasting inflation from component indexes, rather than directly from the overall index. The construction and use of "core" indexes will be described, along with considerations in the handling of seasonality. The methodology being presented has been effectively applied to the Spanish CPI since 1994 in the speaker's Bulletin of Inflation and Macroeconomic Analysis. This monthly bulletin also contains forecasts for the U.S.
With the January 1, 1999 implementation of the euro as the currency of selected European countries, the European Central Bank will have added responsibility for monetary policy. Eurostat will be providing an overall consumer price index (CPI) constructed as a weighted average of price information for the individual countries. However, this may be too aggregate a measure for effective analysis, since European markets are not yet strongly integrated. Consequently, forecasting from disaggregated inflation measures again appears attractive. Return to top
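[Editor's note: not from the talk. A toy Python illustration of the idea described above, forecasting the overall index as a weighted aggregate of component forecasts; the components, weights, and forecast values are invented.]

```python
# Forecasting headline inflation from component forecasts: the overall index
# forecast is the expenditure-weighted aggregate of the component forecasts.
# The component breakdown, weights, and numbers below are hypothetical.
components = {
    "processed food":   (0.17, 1.8),   # (CPI weight, forecast % change)
    "non-energy goods": (0.33, 1.2),
    "services":         (0.35, 3.1),
    "energy":           (0.09, -0.5),
    "unprocessed food": (0.06, 2.4),
}
total_weight = sum(w for w, _ in components.values())
headline = sum(w * f for w, f in components.values()) / total_weight
print(f"Aggregated headline inflation forecast: {headline:.2f}%")
```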