Washington Statistical Society Seminar Archive: 2002
Title: On Disclosure Protection for Non-Traditional Statistical Outputs
- Speaker: Arnold Reznek, U.S. Census Bureau
- Co-Author: David Merrell, U.S. Census Bureau
- Chair: Virginia DeWolf, National Academy of Sciences
- Discussant: Lawrence H. Cox, National Center for Health Statistics
- Date/Time: Tuesday, January 15, 2002, 12:30 to 2:00 p.m.
- Location: Bureau of Labor Statistics, Postal Square Building (PSB), Conference Center, Conference Room 1, 2 Massachusetts Ave., NE, Washington, DC. Please use the First Street entrance to the PSB.
- Sponsor: WSS Methodology Section
By law, U.S. Federal statistical agencies must protect the confidentiality of the microdata provided by respondents on surveys and censuses. These agencies' "traditional" data products are aggregates (e.g., tables of frequency counts or totals) or public use microdata files. Disclosure limitation methods for these products (e.g., cell suppression; perturbation; topcoding; recoding) are well developed.
Researchers at the Census Bureau's Center for Economic Studies (CES) and its Research Data Centers (RDCs) commonly generate "nontraditional" statistical output; e.g., from linear and nonlinear regression models, semi-parametric and non-parametric estimation models, and simulations. We think current disclosure limitation methods are less appropriate for this type of output, but the literature provides little guidance. This can make it difficult to balance releasing meaningful research results with maintaining confidentiality protection.
For these nontraditional outputs, we discuss disclosure risks and the appropriateness of existing disclosure limitation methods. We use simulation methods and examples from past disclosure review. We also pose questions for future research.
Topic: An Investigation of Response Rates in Random Digit Dialed Telephone Surveys
- Speakers:
Brenda Cox, Senior Vice-President, RoperASW
Daniel O'Connor, Mathematica Policy Research
Kathryn Chandler, NCES
- Discussant: Clyde Tucker, Bureau of Labor Statistics
- Date & Time: Wednesday, January 23, 2002, 12:30 - 2:00 p.m.
- Location: BLS Conference and Training Center (basement level), Rooms #9 & #10, Postal Square Building, 2 Massachusetts Ave., NE, Washington, DC (Enter on First St., NE, and bring a photo ID.) Metro: Union Station, Red Line.
- Co-sponsored by:
American Association for Public Opinion Research
Washington/Baltimore Chapter
& Washington Statistical Society Data Collection Methods Section
Conventional wisdom suggests that obtaining response in telephone surveys is becoming more difficult. In describing current problems, interviewers mention the increasing use of answering machines and caller ID as well as the frequency with which households receive sales calls. This perception of declining response rates provided the impetus for this investigation of whether the 1990s had witnessed a decline in response rates for random digit dialed (RDD) telephone surveys. Response rate results were compiled for publicly and privately sponsored RDD surveys conducted since 1990. To allow comparisons across surveys, each survey's response rate was recalculated using the definition provided by the Council of American Survey Research Organizations (CASRO). This presentation summarizes the results of that investigation, including the wide variety of definitions the surveys used in defining response rates, the variation in response rates across surveys, and potential correlates of response rate differences.
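As a concrete illustration of the recalculation step, here is a minimal Python sketch of a CASRO-style response rate, under the common convention that cases of unknown eligibility are discounted by the eligibility rate observed among resolved cases. The category names and counts are illustrative, not taken from the study.

```python
def casro_response_rate(completes, eligible_nonresponse,
                        ineligible, unknown_eligibility):
    """CASRO-style response rate: completes over estimated eligible cases.

    The eligibility rate e among cases of unknown status is estimated
    from the cases whose eligibility was resolved.
    """
    resolved_eligible = completes + eligible_nonresponse
    resolved = resolved_eligible + ineligible
    e = resolved_eligible / resolved          # estimated eligibility rate
    estimated_eligible = resolved_eligible + e * unknown_eligibility
    return completes / estimated_eligible

# Example: 800 completes, 400 eligible nonrespondents (refusals etc.),
# 300 ineligibles, 500 numbers never resolved (e.g., ring-no-answer).
print(round(casro_response_rate(800, 400, 300, 500), 3))  # -> 0.5
```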
Note: If you did not get an e-mail notice of this meeting but want one for future meetings, please contact dc-aapor.admin@erols.com.
Topic: Politeness and Cross-cultural Communication
- Speaker: Yuling Pan, Georgetown University
- Date & Time: January 23, 2002, 10:30 - 11:30 a.m.
- Location: U.S. Bureau of the Census, 4700 Silver Hill Road, Suitland, Maryland - the Morris Hansen Auditorium, FOB 3. Enter at Gate 5 on Silver Hill Road. Please call (301) 457-4974 to be placed on the visitors' list. A photo ID is required for security purposes.
- Sponsor: U.S. Bureau of the Census, Statistical Research Division
This presentation explores the issue of linguistic politeness and its role in cross-cultural communication. Linguistic politeness refers to the appropriate way of using language to communicate with others in a given situation. It involves not only the use of language, but also knowledge of the cultural norms and behavioral patterns that govern the use of politeness strategies. Because of cultural differences in history and worldview, socialization processes, and concepts of human relationships, different cultural groups use or prefer certain politeness strategies for communication in a given social setting. Failure to recognize or use the right politeness strategies in cross-cultural communication can cause misunderstanding and/or misjudgment among people from different cultural backgrounds.
In this talk, the speaker will focus on the role of language and the following issues: 1) types of linguistic politeness and their functions, and how the social dimensions of power, distance, affect and formality influence the use of politeness strategies; 2) the relationship between politeness strategies and the social factors of participants, setting, topic and function; 3) situational variation and cultural variation in the use of politeness strategies and their implication in cross-cultural communication.
The study of politeness is important not only for sociolinguists, but also for those professionals whose work requires a constant use of spoken or written language to communicate with people from different cultural groups. The speaker will draw on consulting experience to illustrate how the study of politeness can be applied to professional communication and business communication in multicultural settings.
This program is physically accessible to persons with disabilities. For interpreting services, contact Yvonne Moore at TTY 301-457-2540, 301-457-2853 (voice mail), or Sherry.Y.Moore@census.gov.
Title: Semiparametric Bayesian Techniques for Problems in Circular Data
- Speaker: Professor Kaushik Ghosh, Department of Statistics, George Washington University
- Date & Time: January 25, 2002, 11:00 a.m. - 12:00 p.m.
- Location: Funger Hall 321. 2201 G Street NW. Foggy Bottom metro stop on the blue and orange line.
- Sponsor: The George Washington University, Department of Statistics
Many scientific experiments generate observations that are two-dimensional directions or are periodic with a known period. Such data can be represented by points on a circle - hence the name circular data. In this work, we consider the problems of prediction and tests of hypotheses for circular data in a semiparametric Bayesian setup. Observations are assumed to be independently drawn from the von Mises distribution and uncertainty in the location parameter is modeled by a Dirichlet Process Prior. For the prediction problem, we present a method to obtain the predictive density of a future observation, and, for the testing problem, we present a method to obtain the posterior probabilities of the hypotheses under consideration. Incorporation of the semiparametric model gives us more flexibility and robustness against prior mis-specifications. While analytical expressions are intractable, the methods are easily implemented using the Gibbs sampler. We illustrate their use with examples from Medicine and Geology.
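A minimal sketch of the prediction step, assuming posterior draws of the location parameter are already available (in the paper they would come from the Gibbs sampler) and treating the concentration parameter kappa as known for brevity:

```python
import numpy as np
from scipy.special import i0  # modified Bessel function I_0

def vonmises_pdf(theta, mu, kappa):
    """Density of the von Mises distribution on the circle."""
    return np.exp(kappa * np.cos(theta - mu)) / (2 * np.pi * i0(kappa))

def predictive_density(theta, mu_draws, kappa):
    """Rao-Blackwellized predictive density: average the von Mises
    density over posterior draws of the location parameter."""
    return np.mean([vonmises_pdf(theta, mu, kappa) for mu in mu_draws])

# Illustrative posterior draws of mu (stand-ins for Gibbs output).
rng = np.random.default_rng(0)
mu_draws = rng.normal(loc=0.5, scale=0.1, size=1000) % (2 * np.pi)
grid = np.linspace(0, 2 * np.pi, 5)
print([round(predictive_density(t, mu_draws, kappa=2.0), 3) for t in grid])
```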
Note: For a complete list of upcoming seminars check the dept's seminar web site: http://www.gwu.edu/~stat/seminars/Fall2001.htm. The campus map is at: http://www.gwu.edu/Map/. The contact person is Reza Modarres at Reza@gwu.edu or 202-994-6359.
Topic: Record Linkage and Machine Learning
- Speaker: William E. Winkler, Statistical Research Division, U.S. Bureau of the Census, william.e.winkler@census.gov
- Date/Time: Tuesday, January 29, 2002, 10:30 - 11:30 a.m.
- Location: U.S. Bureau of the Census, 4700 Silver Hill Road, Suitland, Maryland - the Morris Hansen Auditorium, FOB 3. Enter at Gate 5 on Silver Hill Road. Please call (301) 457-4974 to be placed on the visitors' list. A photo ID is required for security purposes.
- Sponsor: U.S. Bureau of the Census, Statistical Research Division
The record linkage model of Fellegi and Sunter (1969) is equivalent to generalized Bayesian network models in machine learning (e.g., Mitchell 1997, Winkler 2000). The underlying computational model uses Maximum Entropy ideas (I.J. Good 1963, Dykstra 1985, Winkler 1990). The first part of this talk introduces an elementary version of the Fellegi-Sunter model that corresponds to naive Bayesian networks. For record linkage, the methods are extended to deal with approximate string comparison, automatic estimation of probabilities without training data, and missing or erroneous identifiers (Winkler 1988, 1990). Friedman (1997, 1999) presents related methods for Bayesian networks.
The second part of this talk provides methods for dealing with automatic estimation of probabilities when there is interaction between identifiers (Winkler 1989, 1990, 1993, Meng and Rubin 1993) and when affine constraints are used to predispose probabilities to certain regions of the parameter space. Efficient methods of accounting for 2-way interactions have been used for Bayesian networks (Sahami 1996, Dumais et al 1998, Sahami et al 1999). Record linkage has primarily been applied in situations where labeled training data are not available. Recent work has shown how general EM methods (Nigam et al. 2000, Winkler 2000) and general MCMC methods (Larsen and Rubin 2001) can yield suitable classification rules when combinations of labeled training and unlabeled test data are used for training.
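For readers new to the Fellegi-Sunter model, here is a minimal sketch of the naive-Bayes (conditional independence) match weight described in the first part of the talk; the m- and u-probabilities are illustrative and would in practice be estimated, e.g., by EM without training data:

```python
import math

# Illustrative m- and u-probabilities per identifier: the probability a
# field agrees for a true match (m) and for a non-match (u).
FIELDS = {"surname": (0.95, 0.02), "first": (0.90, 0.05),
          "dob": (0.85, 0.01), "zip": (0.90, 0.10)}

def match_weight(record_a, record_b):
    """Sum of log-likelihood ratios over fields: the naive Bayes /
    conditional-independence version of the Fellegi-Sunter weight."""
    w = 0.0
    for field, (m, u) in FIELDS.items():
        if record_a[field] == record_b[field]:
            w += math.log(m / u)                  # agreement weight
        else:
            w += math.log((1 - m) / (1 - u))      # disagreement weight
    return w

a = {"surname": "WINKLER", "first": "WILLIAM", "dob": "1950", "zip": "20746"}
b = {"surname": "WINKLER", "first": "BILL", "dob": "1950", "zip": "20746"}
print(round(match_weight(a, b), 2))  # compare to upper/lower link cutoffs
```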
This program is physically accessible to persons with disabilities. For interpreting services, contact Yvonne Moore at TTY 301-457-2540, 301-457-2853 (voice mail), or Sherry.Y.Moore@census.gov.
Title: Adaptive and Link-Tracing Sampling Designs for Surveys
- Speaker: Professor Steven K. Thompson, Pennsylvania State University
- Discussant: Monroe G. Sirken, Ph.D., Sr. Research Scientist, Office of the Director, National Center for Health Statistics (NCHS)
- Chair: Myron J. Katzoff, Ph.D., Office of Research and Methodology, NCHS
- Date & Time: January 31, 2002, 10:00 - 11:00 a.m.
- Location: NCHS Auditorium (11th floor), Metro III/Presidential Bldg., 6525 Belcrest Road, Hyattsville, MD
- Sponsor: NCHS and WSS Methodology Section
Adaptive sampling designs are those in which the procedure for selecting the sample depends on values of the variable of interest observed during the survey. They can be useful for surveys of populations or monitoring of health events that are highly clustered in space or time. For example, if cases of a rare, contagious disease are encountered in a survey unit, neighboring units can be added to the sample.
Link-tracing designs are used in studies of hidden human populations, such as populations of people at high risk for HIV infection or transmission. In such studies, social links are followed from one individual to another to add more members of the hidden population to the sample. Similarly, in national health surveys, link-tracing techniques could be used to increase the representation of underrepresented target groups.
In this talk, a variety of adaptive and link-tracing sampling methods will be described and their application to health surveys and surveillance programs discussed.
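A minimal sketch of the adaptive idea from the abstract (not Professor Thompson's actual estimators, which also require design-unbiased weighting): units whose observed value triggers the condition recruit their neighbors into the sample, as in the contagious-disease example above.

```python
import random

def adaptive_cluster_sample(grid, initial, condition=lambda y: y > 0):
    """Adaptive cluster sampling on a rectangular grid: whenever a sampled
    unit satisfies the condition, its four neighbors are added, and so on."""
    nrows, ncols = len(grid), len(grid[0])
    sampled, frontier = set(), list(initial)
    while frontier:
        i, j = frontier.pop()
        if (i, j) in sampled:
            continue
        sampled.add((i, j))
        if condition(grid[i][j]):   # e.g., a case of the rare disease found
            for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                ni, nj = i + di, j + dj
                if 0 <= ni < nrows and 0 <= nj < ncols:
                    frontier.append((ni, nj))
    return sampled

# A clustered population: two pockets of cases in an otherwise empty grid.
grid = [[0] * 6 for _ in range(6)]
grid[1][1] = grid[1][2] = grid[4][4] = 3
random.seed(1)
initial = random.sample([(i, j) for i in range(6) for j in range(6)], 4)
print(sorted(adaptive_cluster_sample(grid, initial)))
```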
Title: Estimation of the Self-Similarity Parameter When Data Has Finite Variance or is Heavy-Tailed
- Speaker: Vladas Pipiras, Boston University
- Time: Monday, February 4, 2002, 4:15 pm
- Place: Room 3206, Mathematics Building, University of Maryland College Park. For directions, please visit the Mathematics Web Site: http://www.math.umd.edu/dept/contact.html
- Sponsor: University of Maryland, Statistics Program, Department of Mathematics
The focus of this talk is on data whose statistical properties are preserved with respect to scaling in time and/or space. Such scaling (self-similar) datasets have been found in various applications, most notably in telecommunications several years ago. One of the key questions related to scaling data is estimation of its self-similarity parameter. We will discuss how this can be done by using wavelets. We will first go over wavelet-based estimation in finite variance data and then talk about extensions to data with heavy tails.
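A minimal sketch of the wavelet idea: for a self-similar signal, the log2 variance of the detail coefficients grows linearly in the level with slope 2H + 1, so a regression across levels recovers H. Brownian motion (H = 1/2) is used as a test signal; the Haar filter here is the simplest choice, not necessarily the one used in the talk.

```python
import numpy as np

def haar_detail_logvars(x, levels):
    """log2 variance of discrete Haar wavelet detail coefficients by level."""
    a = np.asarray(x, dtype=float)
    logvars = []
    for _ in range(levels):
        a, d = (a[0::2] + a[1::2]) / np.sqrt(2), (a[0::2] - a[1::2]) / np.sqrt(2)
        logvars.append(np.log2(np.var(d)))
    return np.array(logvars)

def estimate_H(x, levels=(3, 10)):
    """Fit the slope of log2 Var(d_j) against the level j and solve
    slope = 2H + 1 for H (an Abry-Veitch-style estimator)."""
    lv = haar_detail_logvars(x, levels[1])
    j = np.arange(levels[0], levels[1] + 1)
    slope = np.polyfit(j, lv[levels[0] - 1: levels[1]], 1)[0]
    return (slope - 1) / 2

rng = np.random.default_rng(42)
bm = np.cumsum(rng.standard_normal(2**16))  # Brownian motion: H = 0.5
print(round(estimate_H(bm), 2))             # should be close to 0.5
```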
Topic: GIS and Image Based Approaches to TIGER Enhancement
- Speakers:
Patricia Hu, Bruce Peterson, Demin Xiong
Center for Transportation Analysis
Oak Ridge National Laboratory
- Date/Time: February 6, 2002, 10:30 - 11:30 a.m.
- Location: U.S. Bureau of the Census, 4700 Silver Hill Road, Suitland, Maryland - the Morris Hansen Auditorium, FOB 3. Enter at Gate 5 on Silver Hill Road. Please call (301) 457-4974 to be placed on the visitors' list. A photo ID is required for security purposes.
- Sponsor: U.S. Bureau of the Census, Statistical Research Division
In this seminar, we will discuss technical approaches for improving and updating TIGER databases, focusing on the use of GIS-based software tools and remotely sensed data. We will first propose a distributed model of a geographic data system that is modular in space, scale, and function, and facilitates distributed responsibility for maintenance. We will then talk about a set of software tools that can potentially be used for the process. These tools will include: (1) image road network extraction tools for TIGER line file update; (2) map matching tools for road network data conflation and integration, address matching, and geographic boundary correlation; and (3) generalization tools for multi-scale centerline road network representations. More importantly, we would like to take this opportunity to learn more about existing capabilities and previous experience with TIGER enhancement, as well as ideas and requirements regarding future directions.
This program is physically accessible to persons with disabilities. For interpreting services, contact Yvonne Moore at TTY 301-457-2540, 301-457-2853 (voice mail), or Sherry.Y.Moore@census.gov.
Title: A Transaction Price Index for Air Travel
- Speaker: Dr. Janice Lent, U.S. Bureau of Labor Statistics (joint work with Alan Dorfman, U.S. Bureau of Labor Statistics)
- Date & Time: February 8, 2002, 11:00 a.m. - 12:00 p.m.
- Location: Funger Hall 321. 2201 G Street NW. Foggy Bottom metro stop on the blue and orange line.
- Sponsor: The George Washington University, Department of Statistics
We present research undertaken to develop a price index estimator based on data from the U.S. Transportation Department's (DOT's) Origin and Destination (O&D) Survey. Through this survey, the DOT collects prices actually paid by consumers for air travel; these may differ considerably from "list prices" (used in the official U.S. airfare CPI) due to the airlines' use of complex pricing structures. Since the O&D survey was not designed to provide data for price index estimation, however, the research involves testing unique imputation and across-time matching procedures. After a brief introduction to the general field of price index estimation, we describe our methodology and compare our experimental index series to the official airfare CPI series.
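The paper's imputation and matching procedures are specific to the O&D data; the sketch below shows only the generic chained matched-model mechanics with a Jevons (geometric mean) link and invented fares.

```python
from math import prod

def chained_price_index(periods):
    """Chained matched-model index: in each pair of adjacent periods,
    match items observed in both and take the geometric mean of their
    price relatives (a Jevons-type elementary index)."""
    index = [100.0]
    for prev, curr in zip(periods, periods[1:]):
        matched = set(prev) & set(curr)              # across-time matching
        relatives = [curr[i] / prev[i] for i in matched]
        link = prod(relatives) ** (1 / len(relatives))
        index.append(index[-1] * link)
    return index

# Toy fare data: route -> average transaction price, by quarter.
q1 = {"DCA-ORD": 210.0, "IAD-LAX": 320.0, "BWI-BOS": 150.0}
q2 = {"DCA-ORD": 220.5, "IAD-LAX": 304.0, "BWI-BOS": 153.0, "DCA-ATL": 180.0}
q3 = {"DCA-ORD": 231.5, "IAD-LAX": 310.1, "DCA-ATL": 171.0}
print([round(v, 1) for v in chained_price_index([q1, q2, q3])])
```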
Note: For a complete list of upcoming seminars check the dept's seminar web site: http://www.gwu.edu/~stat/seminars/spring2002.htm. The campus map is at: http://www.gwu.edu/Map/. The contact person is Reza Modarres at Reza@gwu.edu or 202-994-6359.
Title: Small Area Estimation for U.S. States, Counties, and School Districts
- Speaker: William R. Bell, Ph.D., Senior Mathematical Statistician, Small Area Estimation, SRD, Bureau of the Census, Washington, DC
- Time: Thursday, February 14, 2002, 12:10 pm - 1:00 pm
- Place: Room 1208, LeFrak Hall, University of Maryland College Park. For directions, please visit the Mathematics Web Site: http://www.math.umd.edu/dept/contact.html
- Sponsor: University of Maryland, Statistics Program, Department of Mathematics
In response to growing demand for small area estimates for public policy purposes, the Census Bureau has developed the Small Area Income and Poverty Estimates (SAIPE) program. This talk describes the SAIPE models and methods used to produce small area estimates of poor school-age children, which are used in allocations of over $7 billion of funds under Title I of the Elementary and Secondary Education Act. The state and county models use dependent variables obtained from direct poverty estimates from March Current Population Survey (CPS) data, and predictor variables formed from IRS tax return files, food stamp program data, 1990 census data, and updated Census Bureau population estimates. These models also allow for sampling error in the direct CPS estimates. Some statistical issues arising with these models are discussed. A synthetic procedure is used to produce school district estimates of children in poverty by applying school district-to-county shares from the 1990 census to the SAIPE county model-based estimates of children in poverty. Evaluations of the school district estimates are discussed.
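A minimal sketch of the synthetic school-district step with invented numbers:

```python
def synthetic_district_estimates(county_estimate, district_shares_1990):
    """Apply fixed 1990-census district-to-county shares to a model-based
    county estimate of poor school-age children (synthetic estimation)."""
    return {d: county_estimate * s for d, s in district_shares_1990.items()}

# Hypothetical county with a SAIPE-style model estimate of 12,400 poor
# school-age children, split by each district's 1990 share of the county.
shares = {"District A": 0.45, "District B": 0.35, "District C": 0.20}
print(synthetic_district_estimates(12400, shares))
```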
Title: Memorial Session: Wray Jackson Smith
- Speakers:
Bette S. Mahoney, Consultant
Dhirendra Ghosh and Sameena Salvucci, Synectics for Management Decisions, Inc.
Nancy Kirkendall, Energy Information Administration
John L. Czajka, Mathematica Policy Research, Inc.
- Chair: Elizabeth Margosches, Environmental Protection Agency
- Date/Time: Thursday, February 14, 2002, 12:30 - 2:00 p.m.
- Location: Bureau of Labor Statistics, Conference Center, Room 1, Postal Square Building (PSB), 2 Massachusetts Avenue, NE, Washington, DC. Please use the First St., NE, entrance to the PSB.
- Sponsors: WSS Public Policy Section, ASA Government Statistics Section, ASA Caucus on Women in Statistics
This session honors Wray Jackson Smith for his contributions to the practice of statistics within the federal government and his service to the profession. Dr. Smith, who died unexpectedly in May 2000, was an active member of the Washington Statistical Society, a Fellow of the American Statistical Association (ASA), a founding member of the Government Statistics Section, and a long-time supporter of the Women's Caucus. Dr. Smith is remembered best for his work in federal program evaluation and his role in coordinating or helping to launch several major surveys for the Department of Health and Human Services and the Department of Energy. After retiring from the federal government in 1983, he maintained a very active role in federal statistics from the private sector.
In the first paper, Bette Mahoney reviews some of Dr. Smith's contributions to the application of statistical research to social policy, including his participation in early evaluations of VISTA, the Job Corps, and Day Care and his role in some of the major social surveys of the 1970s. In the second paper, Dhirendra Ghosh and Sameena Salvucci discuss the influence of Dr. Smith's 1980 doctoral dissertation on the later work of Smith and others on the optimum periodicity of repeated surveys. Ghosh and Salvucci extend the generality of this work. The third paper, by Nancy Kirkendall, outlines a planned collaboration between herself and Dr. Smith on a text discussing the use of time series methods in periodic surveys. The paper reviews applications to assessing survey costs, analyzing survey design information, and designing edit and imputation procedures. The final paper, by Thomas Jabine and John Czajka, considers the problems that attend the use of periodic sample surveys as a data source in formulas for allocating federal program funds. The paper draws on Dr. Smith's work in the late 1970s as chair of the subcommittee that produced Statistical Policy Working Paper 1 for the Federal Committee on Statistical Methodology. In the month before his death, Dr. Smith revisited this topic with a background paper that he delivered to keynote a Workshop on Formulas for Allocating Program Funds. Jabine and Czajka explore alternative approaches to addressing these problems.
This session kicks off the new Wray Smith Scholarship, sponsored by the Government Statistics Section of the ASA, with support from a broad range of organizations, including the Washington Statistical Society. The scholarship is intended to reward promising young statisticians for their diligence and, thereby, encourage them to consider a future in government statistics. A reception will follow this session.
Topic: Masking and Re-identification Methods for Public-Use Microdata
- Speaker: William E. Winkler, Statistical Research Division, U.S. Bureau of the Census
- Date/Time: February 21, 2002, 10:30 - 11:30 a.m.
- Location: U.S. Bureau of the Census, 4700 Silver Hill Road, Suitland, Maryland - the Morris Hansen Auditorium, FOB 3. Enter at Gate 5 on Silver Hill Road. Please call (301) 457-4974 to be placed on the visitors' list. A photo ID is required for security purposes.
- Sponsor: U.S. Bureau of the Census, Statistical Research Division
This talk describes methods for masking public-use microdata. A primary concern (analytic validity) is whether the masked microdata will allow reproduction of some of the analyses that could be produced by the original unmasked data. In some situations, special software routines may be required to analyze the masked microdata. We describe re-identification methods that can be used to evaluate the confidentiality of the microdata.
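A minimal sketch of the two sides of the problem, using additive noise as a stand-in masking method (one standard baseline; not necessarily the methods discussed) and nearest-neighbor matching as the re-identification experiment:

```python
import numpy as np

rng = np.random.default_rng(7)
original = rng.lognormal(mean=10, sigma=1, size=(200, 3))  # e.g., payroll data

# Masking: independent additive noise, a common baseline method.
masked = original + rng.normal(scale=0.10 * original.std(axis=0),
                               size=original.shape)

def reidentification_rate(orig, mask):
    """Link each masked record back to its nearest original record;
    a high match rate signals disclosure risk."""
    z_orig = (orig - orig.mean(axis=0)) / orig.std(axis=0)
    z_mask = (mask - orig.mean(axis=0)) / orig.std(axis=0)
    hits = 0
    for i, row in enumerate(z_mask):
        nearest = np.argmin(((z_orig - row) ** 2).sum(axis=1))
        hits += (nearest == i)
    return hits / len(mask)

print(f"re-identified: {reidentification_rate(original, masked):.0%}")
```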
This program is physically accessible to persons with disabilities. For interpreting services, contact Yvonne Moore at TTY 301-457-2540, 301-457-2853 (voice mail), or Sherry.Y.Moore@census.gov.
Title: Global Atmospheric Changes: Statistical Trend Analyses of Ozone and Temperature Data
- Speaker: George C. Tiao, The University of Chicago
- Chair: David Findley, U.S. Census Bureau
- Date/Time: Monday, February 25, 2002, 12:30 to 2:00 p.m.
- Location: Bureau of Labor Statistics, Postal Square Building (PSB), Conference Center, Conference Room 7, 2 Massachusetts Ave., NE, Washington, DC. Please use the First Street entrance to the PSB.
- Sponsor: WSS Methodology Section
The ozone layer in the stratosphere plays an important role in the life cycle on earth. This is mainly because ozone absorbs the harmful ultraviolet radiation from the sun and prevents most of it from reaching the surface. In recent years, there has been considerable attention focused on the effect of the release of chlorofluoromethanes on the ozone layer. There has also been an intense interest in global warming due to man-made causes such as the burning of fossil fuels.
In this talk we present findings of an extensive statistical analysis of ozone and temperature data over the last thirty years from networks of ground stations and from satellites. The principal objectives of the analysis are (i) to assess trends in ozone and temperature, and (ii) to compare the estimated trends with predictions obtained from large scale chemical/dynamical models of the atmosphere. Some statistical issues related to trend detection and analyses will also be discussed.
Title: Maximum Likelihood Estimation for Fractional Diffusions
- Speaker: Dr. Jay Bishwal, Department of Mathematics
University of Cincinnati
- Date & Time: March 1, 2002, 2:00 - 3:00 p.m.
- Location: Funger Hall 307. 2201 G Street NW. Foggy Bottom metro stop on the blue and orange line.
- Sponsor: The George Washington University, Department of Statistics
Recently, it has been empirically found that log share prices exhibit long range dependence between returns on different days. In view of this, it becomes necessary to extend the diffusion models to processes having long range dependence. One way is to model these data by stochastic differential equations with a fractional Brownian motion (fBM) driving term, with Hurst index greater than 1/2. Since fBM is neither a Markov process nor a semi-martingale, except when the Hurst index equals 1/2, classical Ito calculus cannot be used to develop the theory. First, recent developments in fractional stochastic calculus: stochastic integral with respect to fBM, fractional Ito formula and fractional Girsanov formula will be reviewed. The use of Volterra and Dirichlet stochastic calculus will be emphasized. The long time asymptotic behaviour of the maximum likelihood estimator of the drift parameter in the nonlinear SDE driven by fBM will be studied. Some further problems on estimation in fractional diffusions based on discrete observations will be discussed.
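As background for the driving process, here is a minimal sketch that simulates fBM exactly on a grid via the Cholesky factor of its covariance function, and checks the positive autocorrelation of increments that appears when the Hurst index exceeds 1/2:

```python
import numpy as np

def fbm_path(n, H, T=1.0, seed=0):
    """Exact simulation of fractional Brownian motion on a grid via the
    Cholesky factor of its covariance R(s,t) = (s^2H + t^2H - |s-t|^2H)/2."""
    t = np.linspace(T / n, T, n)
    s, u = np.meshgrid(t, t)
    cov = 0.5 * (s**(2*H) + u**(2*H) - np.abs(s - u)**(2*H))
    L = np.linalg.cholesky(cov)
    z = np.random.default_rng(seed).standard_normal(n)
    return t, L @ z

t, x = fbm_path(n=500, H=0.7, seed=3)   # H > 1/2: long-range dependence
increments = np.diff(x)
# Lag-1 autocorrelation of the increments is positive for H > 1/2.
print(round(np.corrcoef(increments[:-1], increments[1:])[0, 1], 3))
```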
Note: For a complete list of upcoming seminars check the dept's seminar web site: http://www.gwu.edu/~stat/seminars/spring2002.htm. The campus map is at: http://www.gwu.edu/Map/. The contact person is Reza Modarres at Reza@gwu.edu or 202-994-6359.
Title: Estimating Output Growth with Labor Market Indicators: A Kalman Filter Approach to Interpolation and Prediction of GDP with Noisy Data
- Speaker: Mark French, Federal Reserve Board
- Discussant: Peter Zadrozny, Bureau of Labor Statistics
- Chair: Linda Atkinson, Economic Research Service, USDA
- Date & Time: Wednesday, March 6, 2002, 12:30 PM - 2:00 PM
- Location: Bureau of Labor Statistics, Conference Center Room 10, Postal Square Building (PSB), 2 Massachusetts Ave. NE, Washington, D.C. Please use the First St., NE, entrance to the PSB. To gain entrance to BLS, please see "Notice" at the top of this web page.
- Sponsor: WSS Economics Section
This paper uses monthly labor-market data to estimate the unobserved month-to-month path of GDP, and to predict near-term growth of quarterly GDP. Past studies typically used generalized least squares methods, which rely on the exogeneity of the indicator variables. However, indicator variables are typically not exogenous: in this paper, the indicator variables, production-worker hours and the unemployment rate, are determined simultaneously with GDP. This paper therefore uses a state-space/Kalman filter approach to the interpolation/prediction problem, extending Zadrozny's 1990 work in several ways to allow real-time analysis with noisy data. It incorporates the effects of data revisions, sampling error, and prior release of indicators relative to GDP, using several indicator variables. The resulting forecasts of GDP are marginally improved over least-squares estimates, and at the same time the model generates smoothed estimates of the indicator variables and of the unobserved level of monthly GDP.
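The paper's state-space model treats quarterly GDP as an aggregate of unobserved months and uses several simultaneous indicators; the sketch below shows only the basic filtering mechanics in a univariate local-level model where an accurate reading arrives every third period and a noisy indicator arrives monthly. All parameters and data are invented.

```python
import numpy as np

def kalman_interpolate(indicator, quarterly, q=0.5, r_ind=1.0, r_qtr=0.1):
    """Local-level Kalman filter: the latent monthly level follows a
    random walk; a noisy monthly indicator is always observed, and a more
    accurate 'GDP' reading arrives only in quarter-end months (else None)."""
    x, p = 0.0, 1e6                 # diffuse initialization
    estimates = []
    for t in range(len(indicator)):
        p += q                      # predict: random-walk state
        for obs, r in ((indicator[t], r_ind), (quarterly[t], r_qtr)):
            if obs is not None:     # measurement update per available series
                k = p / (p + r)
                x += k * (obs - x)
                p *= (1 - k)
        estimates.append(x)
    return estimates

rng = np.random.default_rng(5)
truth = np.cumsum(rng.normal(scale=0.7, size=24))
indicator = list(truth + rng.normal(scale=1.0, size=24))
quarterly = [truth[t] + rng.normal(scale=0.3) if t % 3 == 2 else None
             for t in range(24)]
est = kalman_interpolate(indicator, quarterly)
print(round(float(np.mean((np.array(est) - truth) ** 2)), 2))
```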
Topic: Overview of the Fellegi-Holt Model of Statistical Data Editing: Current Methods and Research Problems
- Speaker: William E. Winkler, Statistical Research Division, U.S. Bureau of the Census, william.e.winkler@census.gov
- Date/Time: March 13, 2002, 10:30 - 11:30 a.m.
- Location: U.S. Bureau of the Census, 4700 Silver Hill Road, Suitland, Maryland - the Morris Hansen Auditorium, Bldg. 3. Enter at Gate 5 on Silver Hill Road. Please call (301) 457-4974 to be placed on the visitors' list. A photo ID is required for security purposes.
- Sponsor: U.S. Bureau Of Census, Statistical Research Division
The editing paper of Fellegi and Holt (JASA 1976) provides a model that can be used for production edit/imputation systems. Two advantages of the model are that all edits are contained in easily modified tables and that each edit-failing record can be "corrected" in one pass through the data. Classic if-then-else systems cannot assure that a "corrected" record will satisfy all edits. They are difficult to maintain if there are many if-then-else edit rules in hundreds or thousands of lines of code. Progress on general purpose Fellegi-Holt systems has been slow because of the needed skills in operations research and computer science for the edit portion of systems and the lack of suitable general imputation software. This talk describes methods implemented in Canada and the U.S. and promising new methods being researched in the Netherlands and Italy. Some general background is given in research report srd98/01 at http://www.census.gov/srd/www/byyear.html.
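A minimal sketch of the Fellegi-Holt error-localization principle with toy edits and a brute-force search; production systems replace the brute force with the operations-research machinery the abstract alludes to.

```python
from itertools import combinations, product

# Toy edit rules (each must hold for a clean record), with small domains.
DOMAINS = {"age": range(0, 100), "marital": ["single", "married"],
           "income": range(0, 200_000, 10_000)}
EDITS = [lambda r: not (r["age"] < 15 and r["marital"] == "married"),
         lambda r: not (r["age"] < 14 and r["income"] > 0)]

def error_localization(record):
    """Fellegi-Holt principle: find a smallest set of fields that can be
    changed so the completed record satisfies every edit in one pass."""
    fields = list(record)
    for k in range(len(fields) + 1):
        for subset in combinations(fields, k):
            free = [DOMAINS[f] for f in subset]
            for values in product(*free):         # try all completions
                trial = dict(record, **dict(zip(subset, values)))
                if all(edit(trial) for edit in EDITS):
                    return subset, trial
    return None

failing = {"age": 12, "marital": "married", "income": 30_000}
subset, fixed = error_localization(failing)
print(subset, {f: fixed[f] for f in subset})   # ('age',) {'age': 15}
```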
PLEASE CALL (301) 457-4974 IF YOU PLAN TO ATTEND. A PHOTO ID IS REQUIRED FOR SECURITY PURPOSES.
This program is physically accessible to persons with disabilities. For interpreting services, contact Yvonne Moore at TTY 301-457-2540, 301-457-2853 (voice mail), or Sherry.Y.Moore@census.gov.
Title: Probabilistic Analysis of Algorithms by the Contraction Method
- Speaker: Professor Ralph Neininger, School of Computer Science, McGill University, Montreal
- Date/Time: Friday, March 15, 2002, 11:00 am - 12:00 pm
- Location: Funger Hall 308. 2201 G Street NW. Foggy Bottom metro stop on the blue and orange line.
- Sponsor: The George Washington University, Department of Statistics
The contraction method provides a framework to prove limit laws for sequences of random variables satisfying recurrence relations on the level of distributions as they arise for parameters of recursive algorithms or random tree structures. The name of the method refers to the characterization of the occurring limit distributions as fixed-points of maps between spaces of probability measures, which turn out to be contractions with respect to appropriate probability metrics. In this talk an overview of this method is given with particular emphasis on recent developments.
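A minimal numerical illustration using the best-known example, the Quicksort recurrence: iterating the fixed-point map on an empirical sample contracts it toward the limit law. The specific recursion below is the standard one from the literature, not necessarily an example from the talk.

```python
import numpy as np

rng = np.random.default_rng(11)

def quicksort_limit_map(sample, size=100_000):
    """One application of the Quicksort fixed-point map
    T(Y) = U*Y1 + (1-U)*Y2 + C(U), C(u) = 1 + 2u*ln(u) + 2(1-u)*ln(1-u),
    acting on an empirical sample (Y1, Y2 independent copies of Y)."""
    u = np.clip(rng.uniform(size=size), 1e-12, 1 - 1e-12)  # avoid log(0)
    y1 = rng.choice(sample, size)
    y2 = rng.choice(sample, size)
    c = 1 + 2 * u * np.log(u) + 2 * (1 - u) * np.log(1 - u)
    return u * y1 + (1 - u) * y2 + c

sample = np.zeros(100_000)        # start from the point mass at 0
for _ in range(15):               # iterate the contraction toward its fixed point
    sample = quicksort_limit_map(sample)
print(round(sample.mean(), 3), round(sample.std(), 3))
# mean -> 0; std -> sqrt(7 - 2*pi^2/3) ~ 0.65, the Quicksort limit law
```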
Note: For a complete list of upcoming seminars check the dept's seminar web site: http://www.gwu.edu/~stat/seminars/spring2002.htm. The campus map is at: http://www.gwu.edu/Map/. The contact person is Reza Modarres at Reza@gwu.edu or 202-994-6359.
Topic: Two-Sided Coverage Intervals For Small Proportions Based On Survey Data
- Speaker: Phil Kott, National Agricultural Statistics Service
- Chair: Mary Batcher, Ernst & Young
- Date/Time: Wednesday, March 20, 2002, 12:30 to 2:00 p.m.
- Location: Bureau of Labor Statistics, Postal Square Building (PSB), Conference Center, Conference Room 7, 2 Massachusetts Ave., NE, Washington, DC. Please use the First Street entrance to the PSB. To gain entrance to BLS, please see "Notice" at the top of this web page.
- Sponsor: WSS Methodology Section
Abstract:
The standard two-sided Wald coverage interval for a small proportion, P, may perversely include negative values. One way to correct this anomaly when analyzing unweighted data from a simple random sample is to compute an asymmetric Wilson (or score) coverage interval. This approach has proven not only theoretically satisfying but empirically effective.
When P is estimated with a weighted estimator, p, using data from a complex sample, some have suggested computing an ad-hoc Wilson coverage interval by replacing the actual sample size in the Wilson formula with the effective sample size. We will examine this approach and an alternative based on a proposal by Andersson and Nerman (at ICES II). Their method focuses on removing the impact of the correlation between the numerator and denominator of the pivotal quantity (p - P)/v, where v is the estimated randomization standard error of p. When p is unweighted and the data come from a simple random sample, the coverage interval generated by the Andersson-Nerman approach is asymptotically identical to the Wilson interval. Consequently, their approach appears to be a more theoretically grounded generalization of the Wilson method than replacing the sample size with the effective sample size.
An empirical study of weighted estimators under simple random sampling reveals that Andersson-Nerman coverage intervals are only slightly better than those derived using the ad-hoc Wilson approach. Both are much better than standard Wald intervals, but nowhere near as good as the Wilson approach for an unweighted estimator. An investigation into the stability of the two methods reveals why and suggests an ad hoc remedy: computing the effective degrees of freedom and constructing t-based intervals. The effective degrees of freedom is not the same as the effective sample size. In fact, for unweighted p under simple random sampling, the effective degrees of freedom are infinite, which is why the Wilson interval works so well.
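A minimal sketch of the two textbook intervals from the abstract, evaluated at an effective sample size (nominal n deflated by an assumed design effect); all numbers are invented. Note how the Wald interval dips below zero while the Wilson interval stays positive and asymmetric.

```python
from math import sqrt

Z = 1.96  # 95% coverage

def wald(p, n):
    half = Z * sqrt(p * (1 - p) / n)
    return p - half, p + half

def wilson(p, n):
    """Score interval: cannot go negative, asymmetric around small p."""
    center = (p + Z**2 / (2 * n)) / (1 + Z**2 / n)
    half = Z * sqrt(p * (1 - p) / n + Z**2 / (4 * n**2)) / (1 + Z**2 / n)
    return center - half, center + half

# Weighted estimate p = 0.005 from a complex sample, nominal n = 1200,
# design effect 2.0, so effective sample size n_eff = 1200 / 2 = 600.
p, n_eff = 0.005, 600
print("Wald:  ", tuple(round(x, 4) for x in wald(p, n_eff)))    # lower < 0
print("Wilson:", tuple(round(x, 4) for x in wilson(p, n_eff)))  # positive
```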
Topic: Machine Learning Methods for Text Classification
- Speaker: William E. Winkler, Statistical Research Division, U.S. Bureau of the Census, william.e.winkler@census.gov
- Date/Time: March 27, 2002, 10:30 - 11:30 a.m.
- Location: U.S. Bureau of the Census, 4700 Silver Hill Road, Suitland, Maryland - the Morris Hansen Auditorium, Bldg. 3. Enter at Gate 5 on Silver Hill Road. Please call (301) 457-4974 to be placed on the visitors' list. A photo ID is required for security purposes.
- Sponsor: U.S. Bureau of the Census, Statistical Research Division
Textual information consisting of words can be used for areas such as classification of documents into categories (e.g., industry and occupation coding), queries in web and library searches, and the record linkage of name and address lists. To use text effectively, the text may need to be cleaned to remove typographical errors, and the documents (records) given a mathematical representation in a probabilistic model. This talk describes an application of Bayesian networks to classify a collection of Reuters newspaper articles (Lewis 1992) into categories (Nigam, McCallum, Thrun, and Mitchell 2000, Winkler 2000). The results are indirectly compared with the current best-performing methods such as Support Vector Machines (Vapnik 1995, 2000) and Boosting (Schapire and Singer 2000, Friedman, Hastie, and Tibshirani 2000). For text classification, until five years ago, the best methods in computational linguistics outperformed the best machine learning methods. Without the need to build complicated semantic or syntactical representations, the best machine learning methods now outperform the best methods in computational linguistics. This makes the methods much more language independent.
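A minimal sketch of a multinomial naive Bayes classifier (the simplest Bayesian-network text model) with add-one smoothing, using invented two-category training documents in the spirit of the Reuters categories:

```python
import math
from collections import Counter, defaultdict

def train_nb(docs):
    """Multinomial naive Bayes: docs is a list of (label, words) pairs."""
    word_counts, class_counts, vocab = defaultdict(Counter), Counter(), set()
    for label, words in docs:
        class_counts[label] += 1
        word_counts[label].update(words)
        vocab.update(words)
    return word_counts, class_counts, vocab

def classify(model, words):
    """Pick the class maximizing log prior + smoothed log likelihoods."""
    word_counts, class_counts, vocab = model
    n_docs = sum(class_counts.values())
    best = None
    for c in class_counts:
        total = sum(word_counts[c].values())
        score = math.log(class_counts[c] / n_docs)
        for w in words:  # add-one (Laplace) smoothing per word
            score += math.log((word_counts[c][w] + 1) / (total + len(vocab)))
        best = max(best, (score, c)) if best else (score, c)
    return best[1]

docs = [("grain", "wheat corn export harvest".split()),
        ("grain", "corn crop yield export".split()),
        ("money-fx", "dollar yen exchange bank".split()),
        ("money-fx", "interest rate bank currency".split())]
model = train_nb(docs)
print(classify(model, "wheat export prices".split()))  # -> grain
```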
PLEASE CALL (301) 457-4974 IF YOU PLAN TO ATTEND. A PHOTO ID IS REQUIRED FOR SECURITY PURPOSES.
This program is physically accessible to persons with disabilities. For interpreting services, contact Yvonne Moore at TTY 301-457-2540, 301-457-2853 (voice mail), or Sherry.Y.Moore@census.gov.
Title: Weighted Likelihood, Mixture Models and Model Assessment
- Speaker: Marianthi Markatou, PhD, Statistics Program Director, Division of Mathematical Sciences, National Science Foundation, and Professor of Statistics, Columbia University
- Date/Time: Wednesday, April 3, 2002, 11:00 am
- Location: Executive Plaza North, Conference Room G, 6130 Executive Boulevard, Rockville, Maryland
We will discuss the methodology of weighted likelihood and its application to mixture models as well as issues associated with model assessment. A methodology for model assessment will be proposed.
For additional information, contact Susan Winer, Office of Preventive Oncology, 301-496-8640.
Title: The SAR Procedure: A Diagnostic Analysis of Heterogeneous Data
- Speaker: George C. Tiao, The University of Chicago
- Date/Time: Thursday, April 4, from 2:00 to 3:30 p.m.
- Location: Room 6200, Nassif (DOT) Building, 400 7th Street SW (see below for security instructions)
This paper presents a procedure for detecting heterogeneity in a sample with respect to a given model. It can be applied to find whether a univariate or multivariate sample has been generated by different distributions, or whether a regression equation is really a mixture of different regression lines. Based on some special features of cross-validating predictive distributions, the idea of the procedure is first to split the sample into more homogeneous groups and then to recombine the observations in order to form homogeneous clusters. The proposed procedure can be applied to find heterogeneity in any statistical model. The performance of the procedure is illustrated in univariate, multivariate, and linear regression problems.
Getting into the DOT building:
1. There are four primary entrances into the Department of Transportation Nassif Building, labeled NE, NW, SE, and SW according to their orientation. Only one will admit visitors: the SW entrance, on the 7th Street side.
2. Visitors will need to show a picture ID upon entrance.
3. Both Promod Chandhok and Jeremy Wu are points of contact for clearance and escort. Promod's phone number is (202)-366-2158; Jeremy's is (202)-366-4648. They and other staff members will serve as escorts at the entrance.
4. Please send names, federal agency (if a government worker) or picture ID such as a driver's license number (if not a government worker), and phone or email to Promod (Promod.Chandhok@bts.dot.gov) and Jeremy Wu (Jeremy.Wu@ost.dot.gov) to facilitate getting into DOT.
Title: Efficiency of Monte Carlo EM and Simulated Maximum Likelihood in Two-Stage Hierarchical Models
- Speaker: Wolfgang Jank, Department of Decision & Information Technologies, The Robert H. Smith School of Business, University of Maryland
- Date/Time: Thursday, April 4, 3:30 p.m.
- Location: Room 1313, Mathematics Building, University of Maryland. For directions, please visit the Mathematics Web Site: http://www.math.umd.edu/dept/contact.html
- Sponsor: University of Maryland, Statistics Program, Department of Mathematics
Likelihood estimation in hierarchical models is often complicated by the fact that the likelihood function involves an analytically intractable integral. Numerical approximation to this integral is an option but it is generally not recommended when the integral dimension is high. An alternative approach is based on the ideas of Monte Carlo integration, which approximates the intractable integral by an empirical average based on simulations. In this paper we investigate the efficiency of two Monte Carlo estimation methods, the Monte Carlo EM (MCEM) algorithm and simulated maximum likelihood (SML). We derive the asymptotic Monte Carlo errors of both methods and show that, even under the optimal SML importance sampling distribution, the efficiency of SML decreases rapidly (relative to that of MCEM) as the missing information about the unknown parameter increases.
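A minimal sketch of the SML side of the comparison for a two-stage logistic model, simulating from the random-effect distribution with common random numbers; the paper's asymptotic analysis, importance sampling, and MCEM comparison are beyond this sketch, and all parameters are invented.

```python
import numpy as np
from scipy.optimize import minimize_scalar

rng = np.random.default_rng(2)

# Two-stage model: y_ij ~ Bernoulli(expit(beta + u_i)), u_i ~ N(0, 1),
# i = 1..m clusters, j = 1..n observations; true beta = 0.5.
m, n, beta_true = 200, 10, 0.5
u = rng.standard_normal(m)
y = rng.random((m, n)) < 1 / (1 + np.exp(-(beta_true + u)[:, None]))
k = y.sum(axis=1)                  # successes per cluster

DRAWS = rng.standard_normal(500)   # common random numbers across beta values

def sml_negloglik(beta):
    """Simulated likelihood: the intractable integral over u is replaced
    by a Monte Carlo average over draws from the distribution of u
    (plain simulation; importance sampling would refine this)."""
    p = 1 / (1 + np.exp(-(beta + DRAWS)))                        # (500,)
    lik = np.mean(p**k[:, None] * (1 - p)**((n - k)[:, None]), axis=1)
    return -np.sum(np.log(lik))

res = minimize_scalar(sml_negloglik, bounds=(-2, 2), method="bounded")
print(round(res.x, 2))             # should land near beta_true = 0.5
```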
Title: Optimal Designs For Phase I Clinical Trials
- Speaker: Professor William F. Rosenberger, Department of Mathematics and Statistics, University of Maryland, Baltimore County
- Date/Time: Friday, April 5, 2002, 11:00 am - 12:00 pm
- Location: Funger Hall 321. 2201 G Street NW. Foggy Bottom metro stop on the blue and orange line.
- Sponsor: The George Washington University, Department of Statistics
A broad approach to the design of phase I clinical trials for the efficient estimation of the maximum tolerated dose is presented. The method is rooted in formal optimal design theory and involves the construction of constrained Bayesian c- and D-optimal designs. The imposed constraint incorporates the optimal design points and their weights and ensures that the probability that an administered dose exceeds the maximum acceptable dose is low. Results relating to these constrained designs for log doses on the real line are described and the associated equivalence theorem is given. The ideas are extended to more practical situations and specifically to those involving discrete dose spaces. In particular, a Bayesian optimal design scheme comprising a pilot study on a small number of patients followed by the allocation of patients to doses one-at-a-time is developed and its properties explored by simulation.
Note: For a complete list of upcoming seminars check the dept's seminar web site: http://www.gwu.edu/~stat/seminars/spring2002.htm. The campus map is at: http://www.gwu.edu/Map/. The contact person is Reza Modarres at Reza@gwu.edu or 202-994-6359.
Topic: Incentives in Internet Surveys
- Speaker: Robert Tortora, The Gallup Organization
- Date: Tuesday, April 9, 2002, 12:30 - 2:00 p.m.
- Location: Bureau of Labor Statistics Conference Room 7
- Sponsor: WSS Data Collection Methods Section and AAPOR-DC
This paper discusses the use of incentives for online studies. Two surveys are studied. The first study, a quarterly survey, uses a telephone interview to screen for adult internet users. Qualified adults are asked to complete a web survey about their Internet habits. Pre-paid cash incentives are offered. The impacts of various levels of cash are presented with respect to the length of the online questionnaire and types of respondents. The second study was a census of an email list. This study asked respondents to evaluate an online subscription (product) along with various price points. Respondents were promised varying degrees of free subscription access to the product once it was available. Results of the impact of the incentive on response rates and data are presented.
Title: Beyond Black-Scholes: Probability Distribution of Stock Price Changes in a Model with Stochastic Volatility
- Speaker: Adrian Dragulescu
- Date/Time: Thursday, April 11, 3:30 p.m.
- Location: Room 1313, Mathematics Building, University of Maryland. For directions, please visit the Mathematics Web Site: http://www.math.umd.edu/dept/contact.html
- Sponsor: University of Maryland, Statistics Program, Department of Mathematics
The geometric Brownian motion model proposed by Bachelier to describe the distribution of stock prices gives a Gaussian probability density for the stock price returns. This contradicts empirical findings, which show that the tails of the probability distribution are underestimated by the Gaussian distribution. Thus we study a generalized geometric Brownian motion process where the volatility is also a stochastic variable. The solution of the Fokker-Planck equation associated with this system of two stochastic variables has a path-integral representation. We were able to take the sum over all paths exactly and to write the final answer as an ordinary Fourier integral. The solution is characterized by four parameters and gives the probability distribution as a function of both price change and time. We took the integral analytically in the limit of long times. We also found that the asymptotic behavior of the probability distribution for large price changes is exponential for all times. The results are compared with the Dow Jones data.
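The talk's closed-form Fourier solution is the interesting part; the sketch below only verifies the qualitative claim by Euler-simulating a mean-reverting stochastic-volatility model (parameters invented) and checking that returns have fatter-than-Gaussian tails.

```python
import numpy as np

rng = np.random.default_rng(9)

def sv_daily_returns(n_days=100_000, dt=1/250,
                     gamma=4.0, vbar=0.04, kappa=0.6):
    """Euler scheme for a stochastic-volatility model: daily log returns
    r_t = sqrt(v_t * dt) * eps_t, with variance v_t mean-reverting,
    dv = -gamma*(v - vbar)*dt + kappa*sqrt(v)*dW (reflected to stay >= 0)."""
    v, out = vbar, np.empty(n_days)
    for t in range(n_days):
        out[t] = np.sqrt(v * dt) * rng.standard_normal()
        v = abs(v + gamma * (vbar - v) * dt
                + kappa * np.sqrt(v * dt) * rng.standard_normal())
    return out

r = sv_daily_returns()
z = (r - r.mean()) / r.std()
print("excess kurtosis:", round(float((z**4).mean()) - 3, 2))   # > 0: fat tails
print("P(|z| > 4):", float((np.abs(z) > 4).mean()), "vs 6.3e-05 for a Gaussian")
```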
Title: Asymptotics of Brownian and Diffusion Sample Paths
- Speaker: Dr. Srinivasan Balaji, Department of Mathematical Sciences, New Jersey Institute of Technology
- Date/Time: Friday, April 12, 2002, 11:00 am - 12:00 pm
- Location: Funger Hall 321. 2201 G Street NW. Foggy Bottom metro stop on the blue and orange line.
- Sponsor: The George Washington University, Department of Statistics
The study of stability properties like recurrence, transience, and positive recurrence of stochastic processes is of great importance in varied applications, including heavy traffic queuing networks, structural stability, and stochastic finance. In this talk we will focus our attention mainly on diffusion processes. Initially some elementary properties of Brownian motion, the basic diffusion process, will be discussed in detail. Conditions for stability of diffusions and reflecting diffusions will be obtained. Also the finiteness or infiniteness of passage time moments for multidimensional diffusions will be considered. Finally some interesting open problems and future directions will be discussed.
Note: For a complete list of upcoming seminars check the dept's seminar web site: http://www.gwu.edu/~stat/seminars/spring2002.htm. The campus map is at: http://www.gwu.edu/Map/. The contact person is Reza Modarres at Reza@gwu.edu or 202-994-6359.
Topic: Interface Design of Web-Based Surveys and Questionnaires
- Speaker: Kent Norman, Professor of Psychology, University of Maryland at College Park
- Date/Time: April 16, 2002, 10:30 - 11:30 a.m.
- Location: U.S. Bureau of the Census, 4700 Silver Hill Road, Suitland, Maryland - the Morris Hansen Auditorium, Bldg. 3. Enter at Gate 5 on Silver Hill Road. Please call (301) 457-4974 to be placed on the visitors' list. A photo ID is required for security purposes.
- Sponsor: U.S. Bureau Of Census, Statistical Research Division
On-line questionnaires, particularly on the World Wide Web, are increasing in popularity for many good reasons. Not surprisingly however, bad user interface designs are appearing along with many good surveys. Unfortunately, these designs are being copied as "models" and "templates" for other surveys, resulting in a snowball proliferation of bad survey design with the resulting consequences of non-response and unreliable data.
In this talk, a number of bad designs of Web-based surveys and questionnaires will be illustrated along with some guiding principles for good design from cognitive psychology and empirical research on human/computer interaction.
Research on interface design of Web-based surveys from our lab and others will be presented. These studies include comparisons of item-based (one question per screen) versus form-based (scrolling) presentations; alternative ways of partitioning the survey into sections; the use of navigational tools such as buttons, indexes, and search fields; methods of implementing conditional branching; dealing with respondent errors, edits, and corrections; and the effect of adding hypermedia to enrich and supplement survey items.
This seminar is physically accessible to persons with disabilities. For TTY callers, please use the Federal Relay Service at 1-800-877-8339. This is a free and confidential service. Requests for sign language interpreting services or other auxiliary aids should be directed to Yvonne Moore at (301) 457-2540 text telephone (TTY), 301-763-5113 (voice mail), or by e-mail to Sherry.Y.Moore@census.gov.
Title: Mean Squared Error of Empirical Predictor
- Speaker: Jiming Jiang, University of California, Davis
- Date/Time: Thursday, April 18, 3:30 p.m.
- Location: Room 1313, Mathematics Building, University of Maryland. For directions, please visit the Mathematics Web Site: http://www.math.umd.edu/dept/contact.html
- Sponsor: University of Maryland, Statistics Program, Department of Mathematics
The term empirical predictor refers to a two-stage predictor of a mixed effect, linear or nonlinear. In the first stage, a predictor is obtained, but it involves unknown parameters; thus, in the second stage, the unknown parameters are replaced by their estimators. In the context of small area estimation, Prasad and Rao (1990) proposed a method based on Taylor series expansion for estimating the mean squared error (MSE) of the empirical best linear unbiased predictor (EBLUP). The method is suitable for a special class of normal mixed linear models. In this talk I consider extensions of the Prasad-Rao approach in two directions. The first extension is to estimation of the MSE of the EBLUP in general mixed linear models, including mixed ANOVA models and longitudinal models. The second extension is to estimation of the MSE of the empirical best predictor in generalized linear mixed models for small area estimation. This talk is based on my joint work with Kalyan Das, Partha Lahiri and J. N. K. Rao.
Title: The Information Content of Trades: A Class of Market Microstructure Models
- Speaker: Dr. Anna Valeva, Department of Statistics and Applied Probability, University of California, Santa Barbara
- Date/Time: Thursday, April 18, 2002, 10:00 - 11:00 am
- Location: Funger Hall 321. 2201 G Street NW. Foggy Bottom metro stop on the blue and orange line.
- Sponsor: The George Washington University, Department of Statistics
Market microstructure is a relatively new field in Economics. It deals with the study of the process and outcomes of exchanging assets under explicit trading rules. We focus on a class of models in which the market specialist exploits the information content of trades in order to set the bid-ask spread for a given asset. The presence of asymmetric information is assumed, i.e., there are informed traders against which the market specialist loses on average, but he/she is able to offset the loss by trading against `noise' traders. Thus, asymmetric information alone explains the existence of a bid-ask spread, and provides insights into the adjustment process of prices. While the idea for such models dates back to the mid-eighties, we introduce a dynamic way of quantifying the information which informed traders use. We discuss how volume of trade conveys information about the true asset value to the market specialist. The model also explains some empirical facts described in the literature, namely, serial correlation in trades and serial correlation in squared price changes.
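A minimal static sketch of the core mechanism (a Glosten-Milgrom-style toy, not the dynamic volume-based model of the talk): with informed traders present, the specialist's bid and ask are conditional expectations of value given the direction of the next trade, so a spread arises from asymmetric information alone.

```python
# The specialist believes the asset is worth V_H or V_L; a fraction alpha
# of traders are informed (trade in the direction of the true value),
# the rest buy or sell at random. All numbers are invented.
V_H, V_L, alpha = 101.0, 99.0, 0.3

def quotes(p_high):
    """Bid and ask as conditional expectations of V given the next trade."""
    buy_h = alpha + (1 - alpha) / 2      # P(buy | V = V_H)
    buy_l = (1 - alpha) / 2              # P(buy | V = V_L)
    p_h_buy = p_high * buy_h / (p_high * buy_h + (1 - p_high) * buy_l)
    p_h_sell = (p_high * (1 - buy_h)
                / (p_high * (1 - buy_h) + (1 - p_high) * (1 - buy_l)))
    ask = p_h_buy * V_H + (1 - p_h_buy) * V_L
    bid = p_h_sell * V_H + (1 - p_h_sell) * V_L
    return bid, ask, p_h_buy, p_h_sell

p = 0.5
for trade in ["buy", "buy", "sell", "buy"]:        # observed order flow
    bid, ask, p_buy, p_sell = quotes(p)
    print(f"belief {p:.2f}  bid {bid:.2f}  ask {ask:.2f}")
    p = p_buy if trade == "buy" else p_sell        # Bayesian update
```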
Note: For a complete list of upcoming seminars check the dept's seminar web site: http://www.gwu.edu/~stat/seminars/spring2002.htm. The campus map is at: http://www.gwu.edu/Map/. The contact person is Reza Modarres at Reza@gwu.edu or 202-994-6359.
Topic: The Medicare Current Beneficiary Survey (MCBS)
- Speakers:
Dave Ferraro, Westat
Sophia Chan, Westat
Eileen Horan, Westat
Ravi Sharma, Westat
- Date: Tuesday, April 23, 2002, 12:30 - 2:00 p.m.
- Location: Bureau of Labor Statistics Conference Room 9
- Sponsor: WSS Data Collection Methods Section
Four papers will be presented by the panel members on the MCBS survey:
- 1. The Redesign of MCBS PSUs (David Ferraro)
- Since its inception in 1991, the MCBS has employed a multi-stage stratified sampling design, where the first-stage sample is a nationally representative sample of geographical areas referred to as primary sampling units (PSUs). To counter potential losses in efficiency, a decision was made in 2000 to redesign the MCBS PSU sample. This talk summarizes the sample design and the procedures used to select the new PSU sample.
- 2. Asking about Diseases: How Questionnaire Design Changes Impact Chronic Disease Data of the Medicare Current Beneficiary Survey Facility Instruments (Sophia Chan)
- The Facility questionnaires of the MCBS underwent significant redesign in 1997. The mode of administration was switched from PAPI to CAPI. The item wording, response categories, reference period, and skip patterns of the chronic disease questions have also been changed. This paper compares the distributions of the chronic disease variables before and after the redesign. The implications of these findings for questionnaire design and long-term care research are discussed.
- 3. MCBS Income Data Collection and Imputation (Eileen Horan and Hongji Liu)
- This presentation will first describe the MCBS' definition of income, income variables, and the current design used in collecting income data, along with fielding issues, followed by a discussion of nonresponse and item nonresponse rates for the income data and of the strategies and procedures used in income data imputation (hot-decking and GLM modeling).
- 4. Prescription Drug Coverage, Utilization and Spending by Medicare Beneficiaries with Heart Disease (Ravi Sharma and Hongji Liu)
- Data from the 1998 Medicare Current Beneficiary Survey indicate that for otherwise similar individuals with heart disease, the likelihood and extent of utilization of heart medications are independent of supplemental insurance and drug coverage, whereas total and out-of-pocket expenses are not. Yet, a large share of heart patients does not use heart medications, as many lack drug coverage. Nonusers without drug coverage are disproportionately represented in the subsample that reports a recent inpatient hospital stay for heart disease. This paper discusses these findings.
Topic: A Weighted Jackknife Method for the Fay-Herriot Model with an Application in the SAIPE Program
- Speaker: P. Lahiri, University of Nebraska-Lincoln & University of Maryland at College Park
- Date/Time: April 23, 2002, 10:30 - 11:30 a.m.
- Location: U.S. Bureau of the Census, 4700 Silver Hill Road, Suitland, Maryland - the Morris Hansen Auditorium, FOB 3. Enter at Gate 5 on Silver Hill Road. Please call (301) 457-4974 to be placed on the visitors' list. A photo ID is required for security purposes.
- Sponsor: U.S. Bureau of the Census, Statistical Research Division
We present a weighted jackknife method to estimate the mean squared error (MSE) of the empirical best linear unbiased predictor (EBLUP) of a small-area mean for the celebrated Fay-Herriot model. The proposed MSE estimator improves on the existing MSE estimators and is robust under a variety of situations. We illustrate our methodology for the U.S. Census Bureau's SAIPE program.
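A minimal sketch of the ingredients: a Fay-Herriot fit with a simple moment estimator of the model variance A (leverage correction omitted), the EBLUP, and a delete-one-area jackknife MSE in the Jiang-Lahiri-Wan form. The talk's weighted version refines the jackknife weights, and all numbers here are simulated.

```python
import numpy as np

def fit_fay_herriot(y, X, D):
    """Fay-Herriot fit: y_i = x_i'beta + v_i + e_i, v_i ~ N(0, A),
    e_i ~ N(0, D_i) known. A by a moment fixed point, beta by WLS."""
    m, p = X.shape
    A = 0.0
    for _ in range(100):
        w = 1.0 / (A + D)
        beta = np.linalg.solve(X.T @ (w[:, None] * X), X.T @ (w * y))
        A = max(0.0, (np.sum((y - X @ beta) ** 2) - np.sum(D)) / (m - p))
    return A, beta

def eblup_area(i, y, X, D, A, beta):
    g = A / (A + D[i])
    return g * y[i] + (1 - g) * (X[i] @ beta)

rng = np.random.default_rng(4)
m = 30
X = np.column_stack([np.ones(m), rng.random(m)])
D = rng.uniform(0.3, 1.0, m)
y = X @ np.array([1.0, 2.0]) + rng.normal(scale=np.sqrt(0.5), size=m) \
    + rng.normal(scale=np.sqrt(D))
A, beta = fit_fay_herriot(y, X, D)
g1 = lambda a: a * D[0] / (a + D[0])     # leading MSE term for area 0
theta0 = eblup_area(0, y, X, D, A, beta)

bias_sum, var_sum = 0.0, 0.0
for u in range(m):                       # delete-one-area jackknife
    keep = np.arange(m) != u
    A_u, beta_u = fit_fay_herriot(y[keep], X[keep], D[keep])
    bias_sum += g1(A_u) - g1(A)
    var_sum += (eblup_area(0, y, X, D, A_u, beta_u) - theta0) ** 2
mse0 = g1(A) - (m - 1) / m * bias_sum + (m - 1) / m * var_sum
print("area 0 EBLUP:", round(theta0, 2), " jackknife MSE:", round(mse0, 3))
```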
This seminar is physically accessible to persons with disabilities. For TTY callers, please use the Federal Relay Service at 1-800-877-8339. This is a free and confidential service. Requests for sign language interpreting services or other auxiliary aids should be directed to Yvonne Moore at (301) 457-2540 text telephone (TTY), 301-763-5113 (voice mail), or by e-mail to Sherry.Y.Moore@census.gov.
Title: Combination of Information from Several Sources: The Case of t and F Tests
- Speaker: Professor Benjamin Kedem, Chair of Statistics Program, University of Maryland, College Park, Maryland
- Chair: Jai Choi, PhD, mathematical statistician, Office of Research and Methodology, National Center for Health Statistics, (NCHS), 301-458-4144
- Date/Time: Wednesday, April 24, 2002, 10:00 a.m.- 11:30 a.m.
- Location: National Center for Health Statistics Auditorium, Room 1110
- Sponsor: WSS Public Health and Biostatistics Program and the Office of Research and Methodology, NCHS
We consider the following general problem. Suppose there are several sources of information regarding a certain quantity, where some of the sources are reliable and some are distorted. How can we combine all the data, reliable as well as distorted, to improve the reliability of the "good data"? A case in point is the classical analysis of variance. It will be demonstrated that the idea of combining poor and reliable data can improve and generalize the classical t and F tests without the usual normal assumption.
Topic: Including Families with Limited English Proficiency in the Early Childhood Longitudinal Study, Birth Cohort (ECLS-B)
- Speaker: Brad Edwards, Westat
- Date: Thursday, April 25, 2002, 12:30 - 2:00 p.m.
- Location: Bureau of Labor Statistics Conference Room 9
- Sponsor: WSS Data Collection Methods Section and AAPOR-DC
Language minority families present special challenges for the Early Childhood Longitudinal Study's Birth Cohort (ECLS-B). Data collection methods include CAPI interviews with parents, direct assessments of children, self-administered paper questionnaires for fathers, and CATI interviews with child care providers. In Round 1 of the study, data will be collected from about 1,800 Asian, 1,400 Hispanic, and 900 American Indian births, part of a national sample of about 13,000 children born throughout 2001. The approach to language minority issues is to make every reasonable effort to include these families in the study, to collect their data without compromising quality in any major way, and to be sensitive to cultural differences presented by these families. At the same time, fixed resources are available to the project and there are tradeoffs in reaching out to minority language families without jeopardizing the overall study design. Specific criteria and decision rules have been developed, so that the procedures for including language minority families are not arbitrary and their data are collected in a standardized manner. Although much of the focus in developing the ECLS-B language minority protocol has been on the first two data collection points, the general approach incorporates a longitudinal perspective, and this presentation addresses issues that are likely to occur over the course of all waves of data collection, ending when the children are in first grade.
Title: Monte Carlo Approximation and the Bootstrap
- Speaker: Jim Booth, Department of Statistics, University of Florida
- Date/Time: Thursday, April 25, 3:30 p.m.
- Location: Dean's Conference Room, Van Munching Hall 3300, University of Maryland. For directions, please visit http://www.rhsmith.umd.edu/visitors/planning.html
- Sponsor: University of Maryland, Statistics Program, Department of Mathematics
The bootstrap can be thought of as a simple plug-in rule that may be stated as follows: "Estimate any functional characteristic of an unknown distribution by the same characteristic of a fitted or empirical distribution". In particular, given an i.i.d. sample from an unknown distribution, the bootstrap can be used to estimate the bias, the variance and quantiles of the sampling distribution of any statistic. In most cases exact computation of bootstrap estimates is either analytically intractable or computationally infeasible. Thus, in practice bootstrap estimates are usually approximated by Monte Carlo methods. In this talk I will discuss the amount of Monte Carlo simulation necessary for accurate approximation of bootstrap standard errors and confidence intervals and argue that the answer is more than is generally thought.
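A minimal experiment in the spirit of the talk: repeat an entire bootstrap many times for several resample counts B and watch how slowly the Monte Carlo noise in the estimated standard error shrinks. The sample, statistic, and B values are illustrative.

```python
import numpy as np

rng = np.random.default_rng(8)
x = rng.exponential(size=50)       # the observed i.i.d. sample

def boot_se(data, stat, B):
    """Monte Carlo approximation to the bootstrap standard error of stat."""
    n = len(data)
    reps = [stat(data[rng.integers(0, n, n)]) for _ in range(B)]
    return np.std(reps, ddof=1)

# Repeat the whole bootstrap 200 times per B; the spread of the SE
# estimate is pure Monte Carlo error.
for B in (50, 200, 1000, 5000):
    ses = [boot_se(x, np.median, B) for _ in range(200)]
    print(f"B={B:5d}  mean SE={np.mean(ses):.4f}  MC sd of SE={np.std(ses):.4f}")
```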
Title: On the Correlation Structure of Transformed Gaussian Random Fields
- Speaker: Victor De Oliveira
- Date/Time: Thursday, April 25, 2002, 3:30 p.m.
- Location: Room 1313, Mathematics Building, University of Maryland. For directions, please visit the Mathematics Web Site: http://www.math.umd.edu/dept/contact.html
- Sponsor: University of Maryland, Statistics Program, Department of Mathematics
Transformed Gaussian random fields can be used to model continuous time series and spatial data when the Gaussian assumption is not appropriate. The main features of these random fields are specified in a transformed scale, while for modeling and parameter interpretation it is useful to establish connections between these features and those of the random field in the original scale. This work provides evidence that, for many `normalizing' transformations and under certain conditions, the correlation function of a transformed Gaussian random field depends little on the transformation used. Hence many commonly used transformations of correlated data have little effect on the original correlation structure. The property is shown to hold for some kinds of transformed Gaussian random fields, and a statistical explanation based on the concept of parameter orthogonality is provided. The property is also illustrated using two spatial data sets and several `normalizing' transformations. Some consequences of this property for modeling and inference are also discussed.
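As a quick numerical check of this claim, consider pairs of correlated Gaussians rather than a full random field (a toy illustration with invented parameters): exponentiating the pairs, i.e. undoing a log `normalizing' transform, changes the correlation only modestly.

    import numpy as np

    rng = np.random.default_rng(1)
    for rho in (0.2, 0.5, 0.9):
        z = rng.multivariate_normal([0, 0], [[1, rho], [rho, 1]], size=200_000)
        y = np.exp(0.5 * z)        # lognormal pairs
        r = np.corrcoef(y[:, 0], y[:, 1])[0, 1]
        print(f"Gaussian correlation {rho:.1f} -> lognormal correlation {r:.3f}")

With these parameters the lognormal correlations come out near 0.18, 0.47, and 0.89, echoing the "little effect" property the abstract describes. Return to top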
Title: Application of the Sanov Large Deviation Theorem to Density Estimation and Screening of Significant Factors
- Speaker: Mikhail B. Malioutov, Northeastern University, Boston
- Time: Thursday, May 2nd, 2002, 3:30 pm
- Place: Room 1313, Mathematics Building, University of Maryland College Park. For directions, please visit the Mathematics Web Site: http://www.math.umd.edu/dept/contact.html
- Sponsor: University of Maryland, Statistics Program, Department of Mathematics
Two remarkable applications of the Sanov theorem will be outlined. The first one deals with large Lp deviations of general regular density estimates. The exponential rate of the Lp decay turns out to be free of the underlying density function and the estimator. The second application proves the asymptotic optimality of the famous Jaynes principle in finding significant inputs of an unknown noisy function. Return to top
Topic: YOU ARE HERE: Information Architecture and Web Navigation
- Speaker: Jonathan Lazar, Professor of Computer and Information Sciences, Towson University
- Date/Time: May 8, 2002, 10:30 - 11:30 a.m.
- Location: U.S. Bureau of the Census, 4700 Silver Hill Road, Suitland, Maryland - the Morris Hansen Auditorium, FOB 3. Enter at Gate 5 on Silver Hill Road. Please call (301) 457-4974 to be placed on the visitors' list. A photo ID is required for security purposes.
- Sponsor: U.S. Bureau Of Census, Statistical Research Division
In large web sites, intranets, extranets, and other information spaces, users tend to get lost and disoriented among hundreds or thousands of web pages. It is frustrating to users when they cannot reach their task goal because they cannot find the content that they need. Information architecture and web navigation focus on structuring information so that users can find what they need with relative ease. With appropriate architecture and navigation, users are aware of what information is available on the web site and can reach the maximum amount of content with minimal effort. In turn, this will increase user satisfaction and productivity. This presentation will focus on what web designers need to know about information architecture and web navigation to design effective sites for users.
This seminar is physically accessible to persons with disabilities. For TTY callers, please use the Federal Relay Service at 1-800-877-8339. This is a free and confidential service. Requests for sign language interpreting services or other auxiliary aids should be directed to Yvonne Moore at (301) 457-2540 text telephone (TTY), 301-763-5113 (voice mail), or by e-mail to Sherry.Y.Moore@census.gov. Return to top
Topic: Survey Automation: The Promise and the Reality
- Speaker: Jesse Poore, Ericsson-Harlan D. Mills Chair in Software Engineering, University of Tennessee
- Date/Time: May 10, 2002, 3:00-4:30 p.m. (See below; RSVP by May 3rd required)
- Location: Auditorium at the National Academy of Sciences, 2100 C Street, NW, Washington, DC. Please arrive early, as parking is limited, and be prepared to show identification to enter the building. Please note that the entrance to the National Academy of Sciences building at 2101 Constitution Avenue, NW, is closed to the public. Guests wishing to take Metro to the seminar are encouraged to take the National Academy's shuttle, which departs from the Foggy Bottom/GWU Metro station every 30 minutes.
A tea from 2:30 to 3:00 p.m. will precede the afternoon session, which will begin with a discussion of recent developments in national statistics, followed by a seminar on the challenges of automating complex survey questionnaires and how statistical agencies may benefit from the computer sciences to make survey automation more efficient and effective. (The seminar is based on a recent CNSTAT workshop on survey automation, which brought together leading computer scientists and survey methodologists.) The seminar will include a brief overview of why the replacement of paper questionnaires by computerized instruments, so promising in theory, can be so difficult in practice, and will feature a presentation by Jesse Poore, Ericsson-Harlan D. Mills Chair in Software Engineering, University of Tennessee, on computer science tools for the management, documentation, and testing of complex software. Discussion will follow the presentation. A reception will follow from 4:30 to 5:15 p.m. in the Members' Room.
All are welcome, but for security purposes, you must RSVP by May 3rd. To RSVP, or if you need further information, please contact Danelle Dessaint at (202) 334-3096 or email ddessain@nas.edu. Return to top
Topic: The One-Way Fixed and Random Models under Heteroscedasticity
- Speaker: Aref N. Dajani, Statistical Research Division, U.S. Census Bureau
- Date/Time: May 14, 2002, 10:30 - 11:30 a.m.
- Location: U.S. Bureau of Census, 4700 Silver Hill Road, Suitland, Maryland - the Henry Gannett and Herman Hollerith Rooms, FOB 3. Enter at Gate 5 on Silver Hill Road. Please call Barbara Palumbo at (301) 457-4974 to be placed on the visitors' list. A photo ID is required for security purposes.
- Sponsor: U.S. Bureau Of Census, Statistical Research Division
For testing the equality of several treatment effects in a one-way fixed effects model, or for testing the significance of the treatment variance component in a one-way random effects model, the usual F test is appropriate when error variances are assumed to be equal. When this assumption is violated, the F test may not be appropriate.
Many alternative tests have been suggested in the literature. When applied to actual data, the different tests can yield drastically different p-values and opposing conclusions. This brings up the issue of which test should be chosen for practical use. To address this, the different tests are compared in terms of their Type I error probability and power, estimated by Monte Carlo simulation. It turns out that there are scenarios where many of the tests have Type I error probabilities far greater than the nominal level. Based on the numerical results, recommendations are made on the choice of the test for practical use.
For the one-way random model, a test is also derived for testing the more general hypothesis that the random effect variance component is below a known bound. Interval estimation is also addressed in this context. The results are applied to several examples.
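As one concrete instance of such a comparison (a sketch, not the talk's simulation design: Welch's 1951 statistic stands in for the alternative tests, and the group sizes and variances are invented), pairing the smallest group with the largest variance makes the classical F test badly liberal while the Welch test stays near the nominal level:

    import numpy as np
    from scipy import stats

    def welch_anova_p(groups):
        # Welch's (1951) heteroscedasticity-robust one-way test
        k = len(groups)
        n = np.array([len(g) for g in groups], float)
        m = np.array([g.mean() for g in groups])
        w = n / np.array([g.var(ddof=1) for g in groups])
        mw = (w * m).sum() / w.sum()
        tmp = (((1 - w / w.sum()) ** 2) / (n - 1)).sum()
        f = ((w * (m - mw) ** 2).sum() / (k - 1)) / (1 + 2 * (k - 2) / (k * k - 1) * tmp)
        return stats.f.sf(f, k - 1, (k * k - 1) / (3 * tmp))

    rng = np.random.default_rng(2)
    ns, sds, reps, alpha = [5, 10, 20], [4.0, 2.0, 1.0], 4000, 0.05
    rej = np.zeros(2)
    for _ in range(reps):
        g = [rng.normal(0.0, s, n) for n, s in zip(ns, sds)]   # H0 true: all means equal
        rej += [stats.f_oneway(*g).pvalue < alpha, welch_anova_p(g) < alpha]
    print("Estimated Type I error  classical F: %.3f   Welch: %.3f" % tuple(rej / reps))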
This seminar is physically accessible to persons with disabilities. For TTY callers, please use the Federal Relay Service at 1-800-877-8339. This is a free and confidential service. Requests for sign language interpreting services or other auxiliary aids should be directed to Yvonne Moore at (301) 457-2540 text telephone (TTY), 301-763-5113 (voice mail), or by e-mail to Sherry.Y.Moore@census.gov. Return to top
Topic: An "Optimal" Data Swapping Procedure
- Speakers:
Krish Muralidhar
School of Management
Gatton College of Business & Economics
University of Kentucky, Lexington KY 40506
Rathindra Sarathy
Department of Management
College of Business Administration
Oklahoma State University, Stillwater OK 74078
- Date/Time: May 20, 2002, 10:00 - 11:30 a.m.
- Location: U.S. Bureau of Census, 4700 Silver Hill Road, Suitland, Maryland - the Henry Gannett and Herman Hollerith Rooms, FOB 3. Enter at Gate 5 on Silver Hill Road. Please call Barbara Palumbo at (301) 457-4974 to be placed on the visitors' list. A photo ID is required for security purposes.
- Sponsor: U.S. Bureau Of Census, Statistical Research Division
Data swapping can be described in simple terms as a process by which the values of two records in the microdata are interchanged (or swapped). Reiss (1980) was one of the first proponents of data swapping. The objective of data swapping is to mask the original data while maintaining its characteristics. Compared to other methods of masking, data swapping provides two major advantages: (1) when analyzing a single masked attribute, data swapping preserves its statistical characteristics, while most other masking methods are subject at least to sampling error; (2) from a human perspective, it is likely to be more acceptable to users than other masking methods that involve use of noise, since data swapping uses only the original (true) values.
The two major objectives of masking procedures are accuracy and security. In broad terms, accuracy can be defined as the extent to which the masked values faithfully replicate the characteristics of the original values in the microdata set, while security can be defined as the extent to which a snooper can gain information about the confidential attributes and/or the identity of a particular record using the masked data. Ideally, an "optimal" masking procedure would replicate the information in the original data and would provide a snooper with no additional information. Most masking procedures have a theoretical basis for their implementation, enabling modifications that improve their performance. This is not the case with data swapping, although Moore (1996) provided some theoretical results regarding the efficacy of the rank-based proximity swap in achieving the two objectives of masking. This general lack of theory has limited the advancement of swapping techniques.
In this study, we propose a new data swapping procedure for continuous numerical data that is capable of achieving both objectives of masking, leading to an "optimal" masking procedure. The new approach has a strong theoretical basis, and theoretically achieves both the accuracy and security objectives. We illustrate the application of the new procedure by using simulated microdata sets having a multivariate normal distribution (with and without non-confidential categorical data) and other distributions (with and without non-confidential categorical data). We also hope to extend the results of this study and investigate the suitability of this approach for confidential categorical data as well.
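For orientation, here is a toy version of the rank-based proximity swap analyzed by Moore (1996); the "optimal" procedure proposed in the talk is different and is not reproduced here. Values change places only with values of nearby rank, so univariate characteristics are preserved exactly while record-level values are masked.

    import numpy as np

    def rank_proximity_swap(x, window=3, rng=None):
        # Swap each record's value with a record at most `window` ranks away.
        rng = rng or np.random.default_rng(0)
        n, order = len(x), np.argsort(x)
        y = np.asarray(x, dtype=float).copy()
        i = 0
        while i + 1 < n:
            j = min(i + int(rng.integers(1, window + 1)), n - 1)
            a, b = order[i], order[j]
            y[a], y[b] = y[b], y[a]
            i = j + 1
        return y

    x = np.random.default_rng(1).lognormal(size=1000)
    y = rank_proximity_swap(x)
    assert sorted(x) == sorted(y)     # same multiset of values: univariate statistics intact
    print(np.corrcoef(x, y)[0, 1])    # near 1: masked values stay close to the originals

Widening the rank window increases security (masked values drift further from the originals) at the cost of accuracy for multivariate statistics, which is exactly the tradeoff the abstract describes.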
This seminar is physically accessible to persons with disabilities. For TTY callers, please use the Federal Relay Service at 1-800-877-8339. This is a free and confidential service. Requests for sign language interpreting services or other auxiliary aids should be directed to Yvonne Moore at (301) 457-2540 text telephone (TTY), 301-763-5113 (voice mail), or by e-mail to Sherry.Y.Moore@census.gov. Return to top
Topic: Analyzing patterns of killings and migration flow in Kosovo, March-June 1999
- Speakers: Patrick Ball, American Association for the Advancement of Science
- Discussant: Mary Gray, American University
- Chair: Fritz Scheuren, Urban Institute
- Date: Friday, May 24th, 12:30-2:00 p.m.
- Location: Bureau of Labor Statistics Conference Rooms 7 and 8
- Sponsor: WSS Data Collection Methods Section, WSS Methodology Section and AAPOR-DC
During the conflict between NATO and Yugoslavia, thousands of people were killed and hundreds of thousands more fled their homes. Logically, NATO and Yugoslavia advanced quite different explanations for the violence. Yugoslavia claimed that the deaths and migration were the result of NATO's airstrikes and local actions by the ethnic Albanian insurgents (the KLA). NATO claimed that the deaths and migration were the result of a coordinated campaign by Yugoslav authorities to "ethnically cleanse" Kosovo of Albanians.
This report used techniques from historical demography as well as multiple systems estimation to model patterns of killing and migration flow. When killings and migration are compared to patterns of KLA activity and NATO airstrikes, the hypotheses advanced by the Yugoslav government are rejected. Key coincidences observed in the data are consistent with the hypothesis that Yugoslav forces were responsible for the violence.
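Multiple systems estimation builds on the two-list capture-recapture idea; a minimal two-list sketch with invented counts (the actual analysis used more lists and finer stratification):

    # Lincoln-Petersen two-list estimate with hypothetical counts
    n1, n2, m = 2500, 1800, 600   # deaths on list 1, on list 2, and on both
    N_hat = n1 * n2 / m           # estimated total, assuming the lists are independent
    print(round(N_hat))           # 7500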
This analysis was presented in the trial of Slobodan Milosevic at the International Criminal Tribunal for Former Yugoslavia (ICTY) in The Hague on 13-14 March 2002. Return to top
Topic: Why Are Semiconductor Prices Falling So Fast? Industry Estimates and Implications for Productivity Measurement
- Speaker: Ana Aizcorbe, Federal Reserve Board
- Discussant: Marshall Reinsdorf, Bureau of Economic Analysis
- Chair: Linda Atkinson, Economic Research Service, USDA
- Date/Time: Thursday, June 13, 2002; 12:30 - 2:00 p.m.
- Location: Bureau of Labor Statistics, Conference Center Room 2, Postal Square Building (PSB), 2 Massachusetts Ave. NE, Washington, D.C. Please use the First St., NE, entrance to the PSB. To gain entrance to BLS, please see "Notice" at the top of this page.
- Sponsor: Economics Section
Abstract:
By any measure, price deflators for semiconductors fell at a staggering pace over much of the last decade. These rapid price declines are typically attributed to technological innovations that lower constant-quality manufacturing costs. But, given Intel's dominance in the microprocessor market, those price declines may also reflect changes in Intel's profit margins. Disaggregate data on Intel's operations are used to explore these issues. There are three basic findings. First, the industry data show that Intel's markups from its microprocessor segment shrank substantially from 1993-99. Second, about 3-1/2 percentage points of the average 24 percent price decline in a price index for Intel's chips can be attributed to declines in these profit margins over this period. And, finally, the data suggest that virtually all of the remaining price declines can be attributed to quality increases associated with product innovation.
Return to top
WSS Annual Dinner
Statistics For A New Century: Meeting The Needs Of A World Of Data
- Speaker: Richard L. Scheaffer, Professor Emeritus, University of Florida, and Past ASA President
- Date/Time: June 18, 2002
- Location: Maggiano's Little Italy, 5333 Wisconsin Ave., N.W., Washington, DC.
Abstract:
The world is awash in data. Many are aware of the importance and power of data in their professional and personal lives, but few are educated in ways that would allow them to more fully comprehend the vast array of uses (and misuses) of data or to effectively use the quantitative information that confronts them daily. Even fewer are aware of the fact that formal study of statistics can serve to strengthen their own academic preparation for a wide variety of careers. Some successes are being achieved, however, through recent efforts to infuse statistics into the school (K-12) curriculum and to enhance opportunities for undergraduates to learn more statistics. The goals of these efforts are to empower students through improved quantitative literacy and to provide strong foundations for careers that depend increasingly on data.
Modern statistics education has generated terrific interest among educators and students at all levels; it now must prove itself by making effective use of this opportunity to produce new generations of graduates that will not drown in their world of data.
Return to top
Topic: Leonardo's Laptop: Human Needs and the New Computing Technologies
- Speaker: Ben Shneiderman, Professor of Computer Science, University of Maryland at College Park
- Date/Time: June 20, 2002, 10:30 - 11:30 a.m.
- Location: Bureau of the Census, 4700 Silver Hill Road, Suitland, Maryland - the Morris Hansen Auditorium, FOB 3. Enter at Gate 5 on Silver Hill Road. Please call (301) 457-4974 to be placed on the visitors' list. A photo ID is required for security purposes.
- Sponsor: U.S. Bureau Of Census, Statistical Research Division
Abstract:
The old computing was about what computers could do; the new computing is about what users can do. Attention is shifting from making computers intelligent to making users creative. Leonardo da Vinci could help as an inspirational muse for the new computing to push for improved quality through scientific study and more elegant design through visual thinking. We can follow Leonardo's example by integrating text and graphics, functionality and esthetics.
The new computing emphasizes empowerment and collaboration. We must reduce user frustration with annoying crashes, incomprehensible dialog boxes, and incompatible attachments. Then we can promote universal usability through interfaces that are more customizable for diverse users, more tailorable to a wide range of hardware, software, and networks, and designed to bridge the gap between what users know and what they need to know.
With these basics in place, the new computing principle is that human needs should shape technology. Four circles of human relationships and four human activities map out the human needs for mobility, ubiquity, creativity and community. Million-person communities will be accessible through desktop, palmtop and fingertip devices that support e-learning, e-business, e-healthcare, and e-government.
This talk will present an agenda of what is needed to bring about The New Computing (www.cs.umd.edu/hcil/newcomputing).
This seminar is physically accessible to persons with disabilities. For TTY callers, please use the Federal Relay Service at 1-800-877-8339. This is a free and confidential service. Requests for sign language interpreting services or other auxiliary aids should be directed to Yvonne Moore at (301) 457-2540 text telephone (TTY), 301-763-5113 (voice mail), or by e-mail to Sherry.Y.Moore@census.gov.
Topic: Bootstrap Approximation to Prediction MSE for State-Space Models with Estimated Parameters
- Speaker: Danny Pfeffermann, Professor of Statistics, Hebrew University and University of Southampton (joint work with Dr. Richard Tiller, Bureau of Labor Statistics)
- Date/Time: August 7, 2002, 10:30 - 11:30 a.m.
- Location: U.S. Bureau of Census, 4700 Silver Hill Road, Suitland, Maryland - the Morris Hansen Auditorium, FOB 3. Enter at Gate 5 on Silver Hill Road. Please call (301) 457-4974 to be placed on the visitors' list. A photo ID is required for security purposes.
- Sponsor: U.S. Bureau Of Census, Statistical Research Division
Abstract:
We propose a simple, but general method for approximating the prediction Mean Square Error (PMSE) of the state vector predictors in a state-space model when the unknown model parameters are estimated from the observed series. As is well known, substituting the model parameters with the sample estimates in the theoretical MSE expressions that assume known parameter values results in under-estimation of the true MSE. Methods proposed in the literature to deal with this problem are inadequate and may not even be operational when fitting complex models, or when some of the parameters are close to their boundary values. Application of the method to a model fitted to sample estimates of employment ratios in the U.S.A. that contains eighteen unknown parameters estimated by a three-step procedure yields accurate results. The method may be applied to a wide variety of problems, including many of the time series and mixed linear models used for Small Area Estimation problems. This will be illustrated using the Fay-Herriot model.
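A stripped-down illustration of both the problem and a bootstrap correction, using a local level model with invented values (a generic bootstrap bias correction is applied here; the estimator proposed in the talk is constructed differently):

    import numpy as np
    from scipy.optimize import minimize

    def kf(y, q):
        # Kalman filter for a local level model with observation variance 1 and
        # state innovation variance q; returns the log-likelihood and the naive
        # PMSE of the predictor of the next state.
        a, p, ll = 0.0, 1e7, 0.0
        for yt in y:
            f = p + 1.0
            v = yt - a
            ll -= 0.5 * (np.log(2 * np.pi * f) + v * v / f)
            k = p / f
            a, p = a + k * v, p * (1 - k) + q
        return ll, p

    def fit_q(y):
        # maximum likelihood for q, searched on the log scale
        res = minimize(lambda lq: -kf(y, np.exp(lq[0]))[0], [np.log(0.5)],
                       method="Nelder-Mead")
        return float(np.exp(res.x[0]))

    rng = np.random.default_rng(3)
    T, q_true = 100, 0.3
    y = np.cumsum(rng.normal(0, np.sqrt(q_true), T)) + rng.normal(0, 1, T)

    q_hat = fit_q(y)
    naive = kf(y, q_hat)[1]      # PMSE formula evaluated at the estimate: too small

    boot = []
    for _ in range(200):         # simulate from the fitted model and re-estimate
        y_b = np.cumsum(rng.normal(0, np.sqrt(q_hat), T)) + rng.normal(0, 1, T)
        boot.append(kf(y_b, fit_q(y_b))[1])
    print(f"naive PMSE {naive:.3f}   bias-corrected {2 * naive - np.mean(boot):.3f}")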
This seminar is physically accessible to persons with disabilities. For TTY callers, please use the Federal Relay Service at 1-800-877-8339. This is a free and confidential service. Requests for sign language interpreting services or other auxiliary aids should be directed to Yvonne Moore at (301) 457-2540 text telephone (TTY), 301-763-5113 (voice mail), or by e-mail to Sherry.Y.Moore@census.gov.
Return to top
Topic: Confidentiality Audit On Suppressed Entries in Multi-Dimensional Contingency Tables
- Speaker: Lawrence H. Cox, National Center for Health Statistics
- Discussant: Paul B. Massell, Bureau of the Census
- Chair: Virginia de Wolf
- Date/Time: Tuesday, July 16, 12:30 to 1:45 p.m.
- Location: Bureau of Labor Statistics, Conference Center, Conference Room 3, Postal Square Building (PSB), 2 Massachusetts Ave. NE, Washington, D.C. Please use the First St., NE, entrance to the PSB. To gain entrance to BLS, please see "Notice" at the top of this announcement.
- Sponsor: WSS Methodology Section
Abstract:
Disclosure limitation in contingency tables amounts to thwarting the ability of the data intruder to infer, or make narrow estimates of, small cell values. The Census Bureau adopts the base value five for "small"; the Statistics of Income Program and Statistics New Zealand prefer base value three. In two-dimensional tables, for disclosure limitation the statistical office traditionally has chosen either to round the counts to the base value, to perturb (add noise to) the counts, or to suppress small counts together with additional cell values known as complementary suppressions. In multiple dimensions, a suggested approach is massive suppression, such as suppressing all internal entries, leaving only (some) marginals. Suppressed values must be subjected to a confidentiality audit to ensure that confidentiality protection has been achieved. This amounts to computing, for every suppressed small value x, the interval [min(x), max(x)] subject to all released and suppressed cell values and marginal totals. This is easily accomplished in two dimensions using standard, efficient methods and software from linear programming. The purpose of this talk is to explore the difficulties of performing a confidentiality audit in multiple dimensions. Preliminaries on mathematical properties of multi-dimensional contingency tables will be introduced, followed by an example-based examination of the utility of linear programming for confidentiality audit in multi-dimensional contingency tables. The talk comprises examples illustrating good, bad, and ugly behaviors of contingency tables in two, three, and four dimensions.
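A minimal two-dimensional audit in the spirit described, on an invented 3-by-3 table: for each suppressed cell, two linear programs give the tightest bounds an intruder could deduce from the published cells and marginals.

    import numpy as np
    from scipy.optimize import linprog

    table = np.array([[2, 5, 3],      # invented counts; the four corner cells
                      [6, 4, 7],      # are suppressed, everything else (including
                      [1, 8, 2]])     # all marginal totals) is published
    suppressed = [(0, 0), (0, 2), (2, 0), (2, 2)]
    idx = {c: k for k, c in enumerate(suppressed)}

    # One equation per row and column: the suppressed cells in that line must
    # sum to the published marginal minus the published interior cells.
    A_eq, b_eq = [], []
    for axis in (0, 1):
        for i in range(3):
            line = [(i, j) if axis == 0 else (j, i) for j in range(3)]
            unknowns = [c for c in line if c in idx]
            if unknowns:
                row = np.zeros(len(idx))
                row[[idx[c] for c in unknowns]] = 1.0
                A_eq.append(row)
                b_eq.append(sum(table[c] for c in line)
                            - sum(table[c] for c in line if c not in idx))

    for cell in suppressed:           # audit every suppressed cell
        c = np.zeros(len(idx)); c[idx[cell]] = 1.0
        lo = linprog(c, A_eq=A_eq, b_eq=b_eq).fun        # cells are >= 0 by default
        hi = -linprog(-c, A_eq=A_eq, b_eq=b_eq).fun
        print(cell, f"true {table[cell]}, audit bounds [{lo:.0f}, {hi:.0f}]")

Here every suppressed cell is confined to an interval of width 3; the audit would compare such widths against the required protection level. As the abstract notes, nothing this clean is guaranteed once a third dimension is added.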
Return to top
Topic: Parameter Estimation in Logistic Regression -- Not an Easy Matter
- Speaker: Thomas P. Ryan, Consultant
- Date/Time: August 19, 2002, 10:30 - 11:30 a.m.
- Location: U.S. Bureau of Census, 4700 Silver Hill Road, Suitland, Maryland - Room 3225, FOB 4. Enter at Gate 5 on Silver Hill Road. Please call (301) 457-4974 to be placed on the visitors' list. A photo ID is required for security purposes.
- Sponsor: U.S. Bureau Of Census, Statistical Research Division
Abstract:
Logistic regression is a popular statistical tool that is used primarily in health and medical applications, but also in many others, including the modeling of data from complex sample surveys. Because parameter estimation is straightforward in linear regression, it would be easy to assume the same for logistic regression. Unfortunately, parameter estimation in logistic regression is problematic. This is known for maximum likelihood, the usual estimation method, in the case of rare events, but is apparently less well known in the case of near separation of the data. The latter can cause serious problems, as will be illustrated. One alternative is to use exact logistic regression, which is generally preferable, but which also has some shortcomings. What is a user to do? Some insight will be given, and needed research will also be discussed.
(This talk will be based primarily on the paper "A Preliminary Investigation of Maximum Likelihood Logistic Regression versus Exact Logistic Regression" by E. N. King and T. P. Ryan, The American Statistician, August, 2002, 163-170.)
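A minimal illustration of the separation problem with invented data (not the paper's examples): when every failure lies to one side of every success, the likelihood has no maximizer, and each Newton step pushes the slope estimate further out; near-separation behaves almost as badly, yielding enormous coefficients and standard errors.

    import numpy as np

    # Complete separation: every y=0 lies to the left of every y=1.
    x = np.array([-3.0, -2.0, -1.0, -0.5, 0.5, 1.0, 2.0, 3.0])
    y = np.array([0.0, 0.0, 0.0, 0.0, 1.0, 1.0, 1.0, 1.0])
    X = np.column_stack([np.ones_like(x), x])

    beta = np.zeros(2)
    for it in range(1, 16):          # Newton-Raphson for the logistic MLE
        p = 1.0 / (1.0 + np.exp(-X @ beta))
        W = p * (1.0 - p)
        beta += np.linalg.solve((X * W[:, None]).T @ X, X.T @ (y - p))
        if it % 3 == 0:
            print(f"iteration {it:2d}: slope = {beta[1]:10.3f}")  # keeps growing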
This seminar is physically accessible to persons with disabilities. For TTY callers, please use the Federal Relay Service at 1-800-877-8339. This is a free and confidential service. Requests for sign language interpreting services or other auxiliary aids should be directed to Yvonne Moore at (301) 457-2540 text telephone (TTY), 301-763-5113 (voice mail), or by e-mail to Sherry.Y.Moore@census.gov.
Return to top
Topic: Combined Survey Sampling Inference: Compromise or Consummation?
- Speaker: Kenneth R.W. Brewer, Australia National University
- Date: Tuesday, August 20, 2002
- Location: U.S. Bureau of Census, 4700 Silver Hill Road, Suitland, Maryland. Enter at Gate 5 on Silver Hill Road. Please call (301) 457-4974 to be placed on the visitors' list. A photo ID is required for security purposes.
- Sponsor: U.S. Bureau Of Census, Statistical Research Division
Abstract:
Part 1: The Why and the How (10:30 - 12:00 Noon - Morris Hansen Auditorium/FOB3)
Design (or randomization) inference is particularly appropriate for large samples and populations, and model (or prediction) inference for small ones. It is useful to combine them, if only because large populations are usually made up of small domains, but there are certain spinoffs as well. These include (for the design approach) circumventing the need for asymptotics when justifying the use of the Classical Ratio Estimator, and (for the prediction approach) being easily able to avoid unacceptably small case weights. The combination of the two is achieved by equating a design-based (GREG) estimator and a prediction-based (PRED) estimator, and then imposing the resulting condition on the estimator of the relevant regression coefficient. The imposition of that condition involves both approaches in something of a compromise, but it will be shown that this is seldom of any material consequence for either of them.
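One textbook point of contact between the two modes of inference, sketched with simulated data (the model, population, and sample sizes are invented): under a ratio model with variance proportional to x, the model-based prediction of the population total and the classical design-based ratio estimator are the same number, which is the kind of agreement the combined approach exploits.

    import numpy as np

    rng = np.random.default_rng(4)
    N, n = 10_000, 200
    x = rng.gamma(3.0, 2.0, N)                     # auxiliary value known for all units
    y = 4.0 * x + rng.normal(0.0, 1.0, N) * np.sqrt(x)   # Var(y | x) proportional to x
    s = rng.choice(N, n, replace=False)            # simple random sample

    b = y[s].sum() / x[s].sum()                    # best linear unbiased slope under the model
    T_ratio = b * x.sum()                          # classical ratio estimator of the total
    T_pred = y[s].sum() + b * (x.sum() - x[s].sum())   # sample total plus predicted remainder
    print(T_ratio, T_pred, y.sum())                # the first two agree exactly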
Part 2: Some Simple Variance Formulas and Estimators (2:00 - 3:30 p.m. - the Herman Hollerith Room, FOB 3)
The sampling literature has long been heavily sprinkled with theoretical and empirical comparisons of alternative variance estimators, some of which involve rather complex formulas and/or logic that is difficult to follow. Some even require ad hoc adjustments as well. The combination of the two approaches seems at first to make matters even worse, because there are then three types of variance to consider: the design variance, the prediction variance, and the "anticipated variance", the last involving a double expectation (over all possible samples and over all possible realizations of a prediction model). As it turns out, however, the three are so intimately related that transition from one to another is simple and obvious. Some surprising spinoffs include a simplification of the prediction variance (and its estimator) that can only be made when the estimator (of mean or total) is also supported by design inference. These spinoffs resemble so closely the "emergent phenomena" of modern complexity theory that the bringing together of the two approaches can arguably be viewed more appropriately as a fruitful consummation than as a mere compromise.
This seminar is physically accessible to persons with disabilities. For TTY callers, please use the Federal Relay Service at 1-800-877-8339. This is a free and confidential service. Requests for sign language interpreting services or other auxiliary aids should be directed to Yvonne Moore at (301) 457-2540 text telephone (TTY), 301-763-5113 (voice mail), or by e-mail to Sherry.Y.Moore@census.gov.
Return to top
Title: Partial Volume Correction for Neuroimaging using Tensor Based Statistical Algorithms
- Speaker: Dr. John Aston, Bureau of the Census
- Date/Time: 11:00 a.m. - 12:00 noon, September 20, 2002
- Location: Funger Hall 321, 2201 G Street NW. Foggy Bottom metro stop on the blue and orange line.
- Sponsor: The George Washington University, Department of Statistics
Abstract:
The partial volume effect in Positron Emission Tomography (PET) is a problem for quantitative radiotracer studies. Such studies can be used to study many well-known diseases, such as epilepsy, but partial volume effects can cause misinterpretation of the data. The partial volume effect arises from the limited spatial resolution of the imaging device (a few mm) and results in a blurring of the data. Two factors are involved for pre-defined regions: spillover of radioactivity into neighboring regions, and the underlying tissue inhomogeneity (mixed tissue types) of the particular region. Linear modelling methods are currently used to correct for this effect on a regional level, using tissue classification from higher-resolution imaging modalities, e.g. Magnetic Resonance Imaging, and anatomically defined regions which are assumed to contain homogeneous tracer concentrations. We extend these methods to incorporate the underlying noise structure of the PET tomograph measurements, and develop fast tensor-based algorithms to facilitate the computation of true tracer concentration estimates and their associated errors. This allows calculation of linear models in the case of massive data sets with inherent spatial correlation structure. We also investigate the possibility of using the developed noise models to infer whether the defined regions were homogeneous, using Krylov subspace based approximate estimates for the regional errors associated with the fits.
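A toy regional correction in the spirit of the linear modelling the abstract mentions (the mixing matrix, concentrations, and noise covariance are all invented): observed regional means are modeled as a known spillover matrix times the true concentrations, and generalized least squares, which uses the noise structure, recovers the concentrations along with their errors.

    import numpy as np

    G = np.array([[0.80, 0.15, 0.05],    # rows: observed regions;
                  [0.10, 0.75, 0.15],    # columns: true source regions
                  [0.05, 0.20, 0.75]])
    c_true = np.array([10.0, 4.0, 7.0])  # true tracer concentrations
    Sigma = np.diag([0.10, 0.15, 0.12])  # measurement-noise covariance

    rng = np.random.default_rng(5)
    t_obs = G @ c_true + rng.multivariate_normal(np.zeros(3), Sigma)

    W = np.linalg.inv(Sigma)             # generalized least squares
    c_hat = np.linalg.solve(G.T @ W @ G, G.T @ W @ t_obs)
    se = np.sqrt(np.diag(np.linalg.inv(G.T @ W @ G)))
    print(c_hat, se)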
Note: For a complete list of upcoming seminars check the dept's seminar web site: http://www.gwu.edu/~stat/seminars/Fall2002.htm. The campus map is at: http://www.gwu.edu/Map/. The contact person is Reza Modarres at Reza@gwu.edu or 202-994-6359.
Return to top
Topic: Robust Seasonal Adjustment using Heavy-Tailed Distributions
- Speakers:
John Aston, Statistical Research Division, Census Bureau
Siem Jan Koopman, Free University Amsterdam, Netherlands - Discussant: Stuart Scott, Bureau of Labor Statistics
- Chair: David Findley, U.S. Census Bureau
- Date/Time: September 26, 2002, Thursday; 12:30 PM - 2:00 PM
- Location: Bureau of Labor Statistics, Conference Center Rooms 7 and 8, Postal Square Building (PSB), 2 Massachusetts Ave. NE, Washington, D.C. Please use the First St., NE, entrance to the PSB. To gain entrance to BLS, please see "Notice about Seminars at the Bureau of Labor Statistics" at the beginning of this web page.
- Sponsor: Economics Section
Abstract:
Seasonal adjustment is routinely used to eliminate seasonal effects from monthly economic time series. However, these seasonal adjustments are influenced by many factors in the data. Outliers, both additive and level-shift, can result in highly variable seasonal factors when outlier detection is used and outliers drop in and out of the calculation on a month-by-month basis. This is especially true when outliers appear toward the end of the series, as these have a greater effect on up-to-date estimates.
A new method of accounting for outliers is proposed involving the use of heavy-tailed distributions, namely t-distributions. Recent developments in state space modelling techniques (Durbin and Koopman, 2000) have facilitated the incorporation of heavy-tailed distributions into the state equations. This allows error distributions to be extended, and through importance sampling, estimates of parameters from these distributions to be found.
Assessment of these new models and techniques will be presented using both simulated and real data sets. It will be shown that use of the new models can allow for more robust seasonal adjustment than the traditional outlier detection methods.
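The core mechanism, stripped of the state-space machinery (a location-only sketch with invented numbers): under a t error distribution, maximum likelihood automatically downweights outlying observations rather than dropping them in or out discretely.

    import numpy as np

    rng = np.random.default_rng(6)
    y = np.concatenate([rng.normal(10.0, 1.0, 48), [40.0, 35.0]])  # two gross outliers

    nu, mu = 4.0, y.mean()
    for _ in range(50):                         # EM iterations for a t(nu) location model
        w = (nu + 1.0) / (nu + (y - mu) ** 2)   # weights shrink smoothly for large residuals
        mu = (w * y).sum() / w.sum()
    print(f"sample mean {y.mean():.2f}   t-based location {mu:.2f}")

The smooth weights are what buys the robustness: an observation's influence fades continuously as it moves into the tail, instead of flipping between "outlier" and "not outlier" from one month's run to the next.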
Return to top
Title: Bayesian Group Testing
- Speaker: Dr. Curtis Tatsuoka, Department of Statistics, The George Washington University
- Date/Time: 11:00 a.m. - 12:00 noon, October 4, 2002
- Location: Funger Hall 321, 2201 G Street NW. Foggy Bottom metro stop on the blue and orange line.
- Sponsor: The George Washington University, Department of Statistics
Abstract:
A Bayesian formulation of group testing with testing error will be considered, where group testing is viewed as a sequential classification problem on lattices. Various response distribution formulations will be presented, including the case when testing error is a function of pool size. Results include describing experiment selection rules that attain optimal rates of convergence. Non-standard group testing problems also will be discussed.
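For a feel of the ingredients, here is the Bayes update for a single pooled test with testing error (a toy one-pool case with assumed numbers; the talk treats sequential designs on lattices):

    # prior prevalence, pool size, and assay operating characteristics (all assumed)
    p, k = 0.02, 10
    sens, spec = 0.95, 0.98

    prior_pool_neg = (1 - p) ** k            # pool truly free of positives
    prob_test_pos = sens * (1 - prior_pool_neg) + (1 - spec) * prior_pool_neg

    # posterior probability the pool truly contains a positive, given a positive test
    post = sens * (1 - prior_pool_neg) / prob_test_pos
    print(f"P(pool contains a positive | test positive) = {post:.3f}")   # about 0.914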
Note: For a complete list of upcoming seminars check the dept's seminar web site: http://www.gwu.edu/~stat/seminars/Fall2002.htm. The campus map is at: http://www.gwu.edu/Map/. The contact person is Reza Modarres at Reza@gwu.edu or 202-994-6359.
Return to top
Title: Synthetic Tabular Data To Limit Statistical Disclosure Of Sensitive Information
- Speaker: Ramesh A. Dandekar, Energy Information Administration
- Co-Author: Lawrence H. Cox, National Center for Health Statistics
- Chair: Phillip Steel, U.S. Census Bureau
- Discussant: Brian Greenberg, Social Security Administration
- Date/Time: Tuesday, October 15, 2002, 12:30 to 2:00 p.m.
- Location: Bureau of Labor Statistics, Postal Square Building (PSB), Conference Center, Conference Room 1, 2 Massachusetts Ave., NE, Washington DC. Please use the First Street entrance to the PSB. To gain entrance to BLS, please see Notice at the beginning of this announcement.
- Sponsor: WSS Methodology Section
Abstract:
In the scientific community, a synthetic product is developed when the real product is either in short supply or exhibits some undesirable properties. The objective in the latter case is to remove the undesirable properties from the synthetic product. Examples of synthetic products include rubber, wood, sugar, fiber, fuel, and hormones.
We apply this notion to the release of statistical data products in tabular form. Here, the undesirable property is that of revealing confidential information on the entities covered by the data. We explore the possibility of generating synthetic tabular data that exhibit overall statistical characteristics similar to those of the real tabular data, yet offer protection from statistical disclosure. The method applies linear programming to synthesize tabular cells by making controlled adjustments to the original tabular cells. The controlled adjustments are made in such a way that the overall distortion of the original cell values is minimal, based on one of several standard criteria. The resultant synthetic table conveys approximately the same statistical information to the end users as the original table, but at reduced risk of disclosure.
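A small sketch in the spirit of this controlled adjustment, with an invented 3-by-3 table and protection level: perturb the cells as little as possible in the L1 sense while keeping every row and column total exact and forcing the sensitive cell at least delta away from its true value.

    import numpy as np
    from scipy.optimize import linprog

    a = np.array([[2.0, 5.0, 3.0],
                  [6.0, 4.0, 7.0],
                  [1.0, 8.0, 2.0]])
    n = a.size
    sens, delta = 0, 2.0                # flat index of the sensitive cell (0, 0)

    c = np.ones(2 * n)                  # variables: upward, then downward, adjustments
    A_eq, b_eq = [], []
    for i in range(3):                  # row totals unchanged
        r = np.zeros(2 * n); r[3*i:3*i+3] = 1.0; r[n+3*i:n+3*i+3] = -1.0
        A_eq.append(r); b_eq.append(0.0)
    for j in range(3):                  # column totals unchanged
        r = np.zeros(2 * n); r[[j, j+3, j+6]] = 1.0; r[[n+j, n+j+3, n+j+6]] = -1.0
        A_eq.append(r); b_eq.append(0.0)

    A_ub = np.zeros((1, 2 * n))         # force the sensitive cell upward by >= delta
    A_ub[0, sens], A_ub[0, n + sens] = -1.0, 1.0
    bounds = [(0, None)] * n + [(0, v) for v in a.ravel()]  # keep all cells nonnegative

    res = linprog(c, A_ub=A_ub, b_ub=[-delta], A_eq=A_eq, b_eq=b_eq, bounds=bounds)
    synthetic = a + (res.x[:n] - res.x[n:]).reshape(3, 3)
    print(synthetic)                    # same marginals, protected sensitive cell

(Pushing the cell downward instead would be handled by a second run with the inequality reversed, keeping whichever solution distorts the table less.)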
Return to top
Title: Afghan Refugee Camp Surveys: Pakistan, 2002
- Speakers: James Bell, Ruth Citrin, and David Nolle, U.S. Department of State, and Fritz Scheuren, NORC, University of Chicago
- Chair: Mary Batcher, Ernst & Young LLP
- Date/Time: Thursday, October 17, 2002, 12:30 to 2:00 p.m.
- Location: Bureau of Labor Statistics, Postal Square Building (PSB), Conference Center, Conference Room 1, 2 Massachusetts Ave., NE, Washington DC. Please use the First Street entrance to the PSB. To gain entrance to BLS, please see Notice at the beginning of this announcement.
- Sponsor: WSS Methodology Section and AAPOR-DC
Abstract:
For us, both as professionals and as citizens, the events of the past year and more have brought about many changes in our view of the world and our engagement in it. This survey was one response to those changes. Its main goal was to measure the attitudes of Afghan refugees now returning to their homeland from Pakistan on a variety of social, economic, and political issues. Particularly important was learning about their perceptions of current circumstances as well as their expectations for the future. Methodologically, in a setting of great danger, obtaining a good sample of adult males in the refugee camps posed many challenges, and most of the discussion will focus on these.
Return to top
Title: The Value of Standardization - Software and Current Best Methods
- Speaker: Dr. David Morganstein, WESTAT Corporation
- Date/Time: 11:00 a.m. - 12:00 noon, October 18, 2002
- Location: Funger Hall 323, 2201 G Street NW. Foggy Bottom metro stop on the blue and orange line.
- Sponsor: The George Washington University, Department of Statistics
Abstract:
In a private statistical organization, the amount of effort needed to plan and conduct a survey is a critical indicator of success in competing for government contracts. Westat, an employee-owned survey organization, must be concerned about the staff time needed to do its work. It must also be concerned about retaining high-quality staff, so job satisfaction is also a critical measure of success. The statistical group of 55 statisticians is involved in dozens of surveys every year; often a staff member is working on three or more surveys simultaneously. To reduce the effort needed to support this variety of surveys and to increase interest in the work, our statistical group has standardized in two areas: software and current best methods. In this talk, we'll describe why we chose to do this, how we do it, and the benefits we have observed.
Note: For a complete list of upcoming seminars check the dept's seminar web site: http://www.gwu.edu/~stat/seminars/Fall2002.htm. The campus map is at: http://www.gwu.edu/Map/. The contact person is Reza Modarres at Reza@gwu.edu or 202-994-6359.
Return to top
Title: The 2002 Roger Herriot Award For Innovation in Federal Statistics
- Recipient: Daniel H. Weinberg, U.S. Census Bureau
- Speakers:
Katherine K. Wallman, Statistical Policy Office, Office of Management and Budget
William P. Butz, Rand Corporation
Paula J. Schneider, U.S. Census Bureau (retired)
Daniel H. Weinberg, U.S. Census Bureau
- Chair: Edward J. Spar, Council of Professional Associations on Federal Statistics
- Date: Tuesday, November 12, 2002, 12:30 - 2:00 p.m. Reception to follow.
- Location: Bureau of Labor Statistics. Conference Rooms 7 and 8. To gain entrance to BLS, please see Notice at the beginning of this announcement.
- Video Conference to selected sites.
- Co-sponsors of the Herriot Award: Washington Statistical Society, American Statistical Association's Government Statistics Section and Social Statistics Section
Abstract:
On August 12, 2002, Dan Weinberg was awarded the Roger Herriot Award at the annual meeting of the American Statistical Association in New York. Dan is the Chief of the Housing and Household Economic Statistics Division of the U.S. Census Bureau. Dan has been immersed in all three sectors of federal statistics: he taught at Yale and Tufts Universities, worked as a private-sector research contractor, and has spent the last 22 years in the federal government. Those who know Dan know he "thinks outside the box" in the tradition of Roger Herriot. Because of his strong intellectual interest and expertise in poverty measurement, Dan has been an active champion for updating the 40-year-old poverty measure. Dan's successful leadership on issues of strategic importance led to the initiation of the Small Area Income and Poverty Estimates program. The promise to "end welfare as we know it" set the stage for Dan's vision to establish the Survey of Program Dynamics in order to measure the effects of welfare reform over a 10-year period. These activities exemplify the accomplishments the Herriot Award represents.
Kathy Wallman, Bill Butz, and Paula Schneider will first discuss Dan's contributions to federal statistics. Dan will then present a paper, "Better Measures of Income and Poverty."
Roger Herriot was the Associate Commissioner for Statistical Standards and Methodology at the National Center for Education Statistics (NCES) before he died in 1994. Throughout his career at NCES and the Census Bureau, Roger developed unique approaches to the solution of statistical problems in federal data collection programs. Dan truly exemplifies this tradition.
Return to top
Title: Correcting for Omitted-Variables and Measurement-Error Bias in Autoregressive Model Estimation with Panel Data
- Speaker: P.A.V.B. Swamy, BLS
- Discussant: Tom Lutton, OFHEO
- Moderator: Charlie Hallahan, ERS/USDA
- Place: BLS Conference Center, Room 6. To gain entrance to BLS, please see Notice at the beginning of this announcement.
- Date: Tuesday, November 12, 2002, 12:30 - 2:00 p.m.
- Sponsor: Statistical Computing Section
- Talk to be videoconferenced.
Abstract:
The parameter estimates based on an econometric equation are biased and can also be inconsistent when relevant regressors are omitted from the equation or when included regressors are measured with error. This problem gets complicated when the "true" functional form of the equation is unknown. Here, we demonstrate how auxiliary variables, called concomitants, can be used to remove omitted-variable and measurement-error biases from the coefficients of an equation with the unknown "true" functional form. The method is specifically designed for panel data. Numerical algorithms for enacting this procedure are presented and an illustration is given using a practical example of forecasting small-area employment from nonlinear autoregressive models.
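A small simulation of one of the biases at issue (parameters invented; the talk's correction via concomitants is not attempted here): measurement error in the lagged regressor attenuates the estimated autoregressive coefficient.

    import numpy as np

    rng = np.random.default_rng(7)
    T, rho, me_sd = 50_000, 0.8, 0.7
    y = np.zeros(T)
    for t in range(1, T):                    # latent AR(1) process
        y[t] = rho * y[t - 1] + rng.normal()
    x = y + rng.normal(0.0, me_sd, T)        # observed with measurement error

    naive = np.polyfit(x[:-1], x[1:], 1)[0]  # AR(1) slope fit to the noisy series
    print(f"true rho {rho}, naive estimate {naive:.3f}")   # biased toward zero

With these values the naive estimate settles near 0.68 rather than 0.8; omitting a relevant regressor biases the coefficient in a similarly systematic way.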
Return to top
MORRIS HANSEN LECTURE
Title: Privacy and Confidentiality: A New Era?
- Panelists:
Eleanor Singer, Senior Research Scientist, Survey Research Center, Institute for Social Research, University of Michigan
Norman Bradburn, Assistant Director for the Social, Behavioral, and Economic Sciences at the National Science Foundation
Tiffany and Margaret Blake Distinguished Service Professor Emeritus in the Department of Psychology, Graduate School of Business, the College, and the Harris Graduate School of Public Policy Studies, University of Chicago
Katherine Wallman, Chief Statistician of the United States, Office of Management and Budget
- Date/Time: Tuesday, November 19, 2002, 3:30-5:30 p.m.
- Location: The Jefferson Auditorium, USDA South Building, between 12th and 14th Streets on Independence Avenue S.W., Washington DC. The Independence Avenue exit from the Smithsonian METRO stop is at the 12th Street corner of the building, which is also where the handicapped entrance is located. Except for handicapped access, all attendees should enter at the 5th wing, along Independence Avenue. Please bring a photo ID to facilitate gaining access to the building.
- Sponsors: The Washington Statistical Society, Westat, and the National Agricultural Statistics Service.
- Reception: The lecture will be followed by a reception from 5:30 to 6:30 p.m. in the patio of the Jamie L. Whitten Building, across Independence Avenue S.W.
Abstract:
Privacy and confidentiality issues are receiving heightened attention for a variety of reasons. The exponential growth of available data and the increased ease with which large databases can be mined are causing government agencies concern over the release of information collected for the public good. At the same time, Institutional Review Boards (required for a number of government surveys) are taking a more restrictive view of what is permissible. And a seemingly minor provision of the USA Patriot Act, in response to the September 11 national tragedy, permits the Attorney General to petition the court for access to confidential data maintained by the National Center for Education Statistics. The success of the Federal statistical system, and the ability to provide accurate aggregate information for the public good, relies on the confidence that respondents place in government statistical organizations and their willingness to participate in government surveys.
This panel session brings together three key government leaders and researchers to discuss the implications of these important issues. The first speaker, Dr. Eleanor Singer, will review research findings concerning people's concerns about confidentiality of personal information collected by government agencies, focusing on the Census Bureau. In the second presentation, Dr. Norman Bradburn will highlight issues of privacy and confidentiality that are becoming increasingly salient in social and behavioral research. He will argue that information and data are different concepts and that the privacy and confidentiality issues are different for the two concepts. The third speaker, Ms. Katherine Wallman, will discuss some of the issues that have arisen since the events of September 11, the challenges they place on the Federal statistical system, and current initiatives to address these challenges.
Panelists will identify the main issues in the debate, weigh tradeoffs, discuss the role of informed consent procedures, and consider the long-term implications for federal statistics. The discussion will inform decisions on the need for, and nature of, a formal response from the statistics profession with respect to balancing compelling interests in data for national security against promises made to respondents to Federal surveys.
Return to top
Title: Confidentiality for a Mandatory Reporting System: Challenges and Solutions
- Speaker: Rich Allen, National Agricultural Statistics Service
- Discussant: Laura Zayatz, Bureau of the Census
- Chair: Jay Casselberry, Energy Information Administration
- Date/Time: Thursday, November 21, 2002, 12:30 to 2:00 p.m.
- Location: Bureau of Labor Statistics, Postal Square Building (PSB), Conference Center, Conference Room 9, 2 Massachusetts Ave., NE, Washington DC. Please use the First Street entrance to the PSB. To gain entrance to BLS, please see the Notice above.
- Sponsor: WSS Methodology Section
Abstract:
The Agricultural Marketing Service (AMS) of the U.S. Department of Agriculture has implemented provisions of a challenging new law which requires large meat packing plants to report details of all purchases for specific time periods every day and requires AMS to issue summary reports an hour later. This presentation will outline security and analysis procedures developed to meet the publication requirements but will concentrate on the confidentiality issues.
Even with active market operations, conventional confidentiality rules meant many planned reports and data cells could not be issued. Detailed study of actual data demonstrated that market participation (and non-participation) was random and that other market participants would not have been able to identify who had purchased even if an occasional report based on one company had been issued. A new confidentiality approach based on continual analysis of 60-day reporting patterns was developed. This presentation will trace the development and approval of the new approach and present follow-up performance evaluations.
Return to top
Title: The U.S. Census Bureau's Corporate Metadata Repository: An Overview of the Development Process and Current Status
- Speaker: Samuel N. Highsmith, Jr., U.S. Census Bureau
- Chair: Manuel de la Puente, U.S. Census Bureau
- Date/Time: Thursday December 5, 2002, 12:30 - 2:00 p.m.
- Location: Bureau of Labor Statistics, Postal Square Building (PSB), Conference Center, Conference Room 9, 2 Massachusetts Ave., NE, Washington DC. Please use the First Street entrance to the PSB. To gain entrance to BLS, please see the Notice above.
- Sponsor: WSS Social and Demographic Statistics Section
Abstract:
This presentation will provide an overview of the methodology behind the U.S. Census Bureau's Corporate Metadata Repository (CMR). The overview will begin with a discussion of the needs that brought about the development of the CMR, followed by a description of the development of the applications that use the CMR and of the metadata registry components that have been built or are under construction. The presentation will also outline some of the challenges encountered in building a metadata registry and conclude with a report on the current status of the effort.
The construction of a corporate metadata repository is based on two combined models: a business process model of the survey and census process at the Census Bureau, and a data element registry model providing a complete definition of the data elements residing in datasets. The model was first adapted and used in the construction of the American FactFinder Internet application. Areas providing data for dissemination on the American FactFinder site were required to provide metadata files describing the dissemination files, their variables, and allowable values. The second area to adopt the CMR model was the Economic Directorate, in its 2002 Economic Census redesign effort. The Economic Directorate used the CMR model to describe and organize all the questionnaire content for its more than 650 paper forms. One of our more recent internal customers is the Geography Division, for whom we have automated the validation of geographic metadata files sent to the American FactFinder staff.
Return to top