Washington Statistical Society Seminar Archive: 2004
Topic: An Implementation of Component Models for Seasonal Adjustment Using the SsfPack Software Module of Ox
- Speaker: John Aston, Statistical Research Division, U.S. Census Bureau
- Date/Time: January 7, 2004, 10:30 - 11:30 a.m.
- Location: U.S. Bureau of the Census, 4700 Silver Hill Road, Suitland, Maryland - the Morris Hansen Auditorium, Federal Office Building #3. Please call (301) 763-4974 to be placed on the visitors' list. A photo ID is required for security purposes.
- Sponsor: U.S. Bureau of the Census, Statistical Research Division
Abstract:
An alternative to traditional methods of seasonal adjustment is to use component time series models to perform signal extraction, such as the structural models of Andrew Harvey currently implemented in STAMP, or the ARIMA decomposition models of Hillmer and Tiao currently used in SEATS. A flexible implementation allowing easy specification of different models has been developed using the SsfPack software module of the Ox matrix programming language. This allows the incorporation of heavy tailed distributions into certain components within the model. Examples of robust seasonal adjustments for different model types using this method will be shown.
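As a minimal sketch of the kind of component model described (not the speaker's SsfPack/Ox implementation, and without the heavy-tailed extensions discussed in the talk), the following Python code fits a Gaussian basic structural model with a local linear trend and a stochastic seasonal by state-space methods using statsmodels. The simulated series and all settings are illustrative assumptions.

```python
# Illustrative sketch: a basic structural model (local linear trend + seasonal)
# fitted by Kalman filtering/smoothing, analogous in spirit to the component
# models discussed above. Not the speaker's SsfPack/Ox code.
import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
n = 120
t = np.arange(n)
# Simulated monthly series: trend + seasonal + noise (toy data only).
y = 0.05 * t + 2.0 * np.sin(2 * np.pi * t / 12) + rng.normal(scale=0.5, size=n)

# Local linear trend with a stochastic seasonal of period 12.
model = sm.tsa.UnobservedComponents(y, level="local linear trend", seasonal=12)
result = model.fit(disp=False)

# Signal extraction: smoothed trend and seasonal components.
trend = result.level.smoothed
seasonal = result.seasonal.smoothed
adjusted = y - seasonal          # a simple seasonally adjusted series
print(result.summary())
```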
This seminar is physically accessible to persons with disabilities. For TTY callers, please use the Federal Relay Service at 1-800-877-8339. This is a free and confidential service. Requests for sign language interpreting services or other auxiliary aids should be directed to Yvonne Moore at (301) 457-2540 text telephone (TTY), 301-763-5113 (voice mail), or by e-mail to S.Yvonne.Moore@census.gov.
Topic: Clive Granger, Cointegration, and the Nobel Prize in Economics
- Speaker: Neil R. Ericsson, Federal Reserve Board
- Chair: Anna Jan, Ernst & Young
- Date/Time: Wednesday, January 14, 2004; 12:30 - 2:00 p.m.
- Location: Bureau of Labor Statistics, Conference Center Room 9, Postal Square Building (PSB), 2 Massachusetts Ave. NE, Washington, D.C. Please use the First St., NE, entrance to the PSB.
- Sponsor: Economics Section
Abstract:
In 2003, the Nobel Prize in Economics was awarded to Clive Granger "for methods of analyzing economic time series with common trends (cointegration)" and to Rob Engle "for methods of analyzing economic time series with time-varying volatility (ARCH)". This WSS seminar examines Clive's contribution of cointegration; a subsequent WSS seminar will focus on Rob's contribution of ARCH.
Cointegration is a statistical property that characterizes a long-run relationship between two or more integrated time series. After examining the analytics and implications of cointegration, we consider testing procedures due to Engle and Granger (1987) and Johansen (1988). The Johansen procedure establishes a natural framework for testing hypotheses about multiple cointegrating vectors and about the adjustment coefficients. Cointegration is also isomorphic to the existence of an error correction mechanism in a set of dynamic behavioral equations, so we discuss error correction models, including tests for cointegration based on those models. The relationships between the Engle-Granger, Johansen, and error correction procedures for testing cointegration provide the basis for discussing their relative advantages and disadvantages. Empirical applications help illustrate these testing procedures.
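The Engle-Granger step of the testing procedures described above can be illustrated with a small simulation. The sketch below uses the coint() function from statsmodels, which wraps the two-step residual-based test; the simulated series and coefficients are assumptions for illustration only, not the seminar's empirical applications.

```python
# Toy illustration of the Engle-Granger two-step cointegration test.
import numpy as np
from statsmodels.tsa.stattools import coint

rng = np.random.default_rng(1)
n = 500
# x is a random walk (integrated of order one); y shares its stochastic trend.
x = np.cumsum(rng.normal(size=n))
y = 0.8 * x + rng.normal(scale=0.5, size=n)   # cointegrated with x by construction

# Step 1: regress y on x; Step 2: test the residuals for a unit root.
# coint() performs both steps of the Engle-Granger procedure.
t_stat, p_value, crit_values = coint(y, x)
print(f"EG t-statistic = {t_stat:.2f}, p-value = {p_value:.3f}")
# A small p-value rejects "no cointegration" in favor of a long-run relation.
```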
Topic: Pesticide Epidemiology, Human Biomonitoring Data, and Risk Assessment
- Speaker: Ruth H. Allen, PhD, MPH
NHANES Analysis Team Leader and OPPTS Principal Collaborator for the Agricultural Health Study (AHS), Office of Prevention, Pesticides and Toxic Substances (OPPTS)
- Session Chair: Mel Kollander, Director, Washington Office, Institute for Survey Research of Temple University
- Date/Time: Thursday, January 15, 2004, 12:30 pm - 2:00 pm
- Location: Bureau of Labor Statistics Conference Center, Conference Room 9, 2 Massachusetts Ave., NE, Washington, DC. Use the Red Line to Union Station.
- Sponsor: WSS and Bureau of Labor Statistics
Abstract:
A shift toward weight-of-evidence approaches in risk assessment, and more abundant supplies of chemical-specific pesticide epidemiology, exposure measurement data, and population-level human biomonitoring information, have created opportunities to rethink the role and uses of this inherently statistical information in science and policy. This presentation will review some recently published pesticide epidemiology articles as case studies. Emerging data on cancer, birth defects, and infertility are used to illustrate current challenges, gaps, and opportunities for local, State, Regional, and Federal partners, including Community Health and Nutrition Examination Surveys (CHANES). The non-trivial gap in quantitative education for all stakeholders will also be mentioned, along with some ideas on how to increase statistical literacy.
Title: Calibration with Multiple Constraints
- Speaker: Stephen Ash, U.S. Census Bureau
- Chair: Miriam Rosenthal, U.S. Census Bureau
- Date/Time: Tuesday, January 20, 2004, 12:30 - 2:00 p.m.
- Location: Bureau of Labor Statistics, Postal Square Building (PSB), Conference Center, Room 9, 2 Massachusetts Ave., N.W., Washington, D.C. Please use the First Street entrance to the PSB.
- Sponsor: WSS Methodology Section
Note: This is the third in a series of WSS seminars on calibration and related types of estimation.
Abstract:
Calibration in survey sampling can be a powerful tool for using auxiliary information to improve design-based estimates. In this seminar we will discuss the calibration of estimates with two constraints. An important example we will consider is using auxiliary information to improve estimates from a two-phase sample design. Three different scenarios for the two-phase sample design will be considered. We will also consider how generalized raking, i.e., raking with known marginal totals, can be considered within the framework of calibration with multiple constraints.
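As a concrete special case of calibration with two constraints, the sketch below rakes a set of base weights to two known marginal totals by iterative proportional fitting. The toy margins, totals, and the fixed iteration count are illustrative assumptions, not the seminar's examples.

```python
# Minimal raking (iterative proportional fitting) sketch: calibrate design
# weights to two sets of known marginal totals.
import numpy as np

def rake(weights, groups_a, totals_a, groups_b, totals_b, iters=50):
    """Adjust weights so weighted counts match known totals on two margins."""
    w = weights.astype(float).copy()
    for _ in range(iters):
        for groups, totals in ((groups_a, totals_a), (groups_b, totals_b)):
            for g, target in totals.items():
                mask = groups == g
                current = w[mask].sum()
                if current > 0:
                    w[mask] *= target / current
    return w

# Toy sample: base weights plus two categorical margins (e.g. region, size class).
base_w = np.ones(8) * 10.0
region = np.array(["E", "E", "E", "W", "W", "W", "W", "E"])
size   = np.array(["S", "L", "S", "L", "S", "L", "S", "L"])
w = rake(base_w, region, {"E": 50.0, "W": 70.0}, size, {"S": 65.0, "L": 55.0})
print(np.round(w, 2))
```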
Topic: A Nonparametric Bayesian Analysis for Binary Data from a Small Area Under Nonignorable Nonresponse
- Speaker: Jai W. Choi, NCHS/CDC
- Date/Time: January 20, 2004, 10:30 - 11:30 a.m.
- Location: U.S. Bureau of the Census, 4700 Silver Hill Road, Suitland, Maryland - the Morris Hansen Auditorium, FOB 3. Please call (301) 763-4974 to be placed on the visitors' list. A photo ID is required for security purposes.
- Sponsor: U.S. Bureau of the Census, Statistical Research Division
Abstract:
In small area estimation, it is standard practice to assume that the area effects are exchangeable. This is obtained by assuming that the area effects have a common parametric distribution, and a Bayesian approach is attractive. The Dirichlet process prior (DPP) has been used to provide a nonparametric version of this approach. The DPP is useful because it makes the procedure more robust, and the Bayesian approach helps to reduce the effect of nonidentifiability prominent in nonignorable nonresponse models. Using the DPP, we develop a Bayesian methodology for the analysis of nonignorable nonresponse binary data from many small areas, and for each area we estimate the proportion of individuals with a particular characteristic. Our DPP model is centered on a baseline model, a standard parametric model. We use Markov chain Monte Carlo methods to fit the DPP model and the baseline model, and our methodology is illustrated using data on victimization in ten domains from the National Crime Survey. Our comparisons show that it may be preferable to use the nonparametric DPP model over the parametric baseline model for the analysis of these data.
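A schematic of the kind of hierarchy described, in illustrative notation (the nonresponse mechanism and the speaker's exact baseline parameterization are omitted): binary responses within area i follow a Bernoulli law, the area-level proportions are drawn from an unknown distribution G, and G receives a Dirichlet process prior centered on an assumed parametric baseline G_0, here taken to be a Beta distribution.

```latex
\begin{align*}
  y_{ij} \mid p_i &\sim \mathrm{Bernoulli}(p_i), \qquad j = 1,\dots,n_i,\\
  p_i \mid G &\sim G, \qquad i = 1,\dots,\ell,\\
  G &\sim \mathrm{DP}(\alpha, G_0), \qquad G_0 = \mathrm{Beta}\bigl(\mu\tau,\,(1-\mu)\tau\bigr).
\end{align*}
```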
This seminar is physically accessible to persons with disabilities. For TTY callers, please use the Federal Relay Service at 1-800-877-8339. This is a free and confidential service. Requests for sign language interpreting services or other auxiliary aids should be directed to Yvonne Moore at (301) 457-2540 text telephone (TTY), 301-763-5113 (voice mail), or by e-mail to S.Yvonne.Moore@census.gov.
Title: A New Price Index for Air Travel
- Speaker: Janice Lent, Bureau of Transportation Statistics
- Discussant: Marshall Reinsdorf, Bureau of Economic Analysis
- Date/Time: Wednesday, February 11, 2004, 12:30 - 2:00 p.m.
- Location: Bureau of Labor Statistics, Postal Square Building (PSB), Conference Center, Room 9, 2 Massachusetts Ave., N.W., Washington, D.C. Please use the First Street entrance to the PSB.
- Sponsor: WSS Methodology Section
Abstract:
The Bureau of Transportation Statistics (BTS) is preparing to begin scheduled production of a family of price index series for commercial air travel. The new index series will be based on data from the Passenger Origin and Destination (O&D) Survey, through which BTS collects information from the airlines on a 10% sample of air travel itineraries. Since the Survey was not originally designed to collect data for price index estimation, BTS developed new techniques to estimate Fisher indexes from the O&D Survey data. The large sample allows estimation of index series at geographically detailed levels. We will describe the research performed in developing and testing the new estimation techniques and examine some sample index series computed for research purposes. We will also discuss BTS' future plans for index production and continuous improvement of both the source data and the estimation methods.
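For reference, the Fisher index mentioned above is the geometric mean of the Laspeyres and Paasche indexes; with base-period prices and quantities (p_0, q_0) and comparison-period values (p_1, q_1):

```latex
P_L = \frac{\sum p_1 q_0}{\sum p_0 q_0}, \qquad
P_P = \frac{\sum p_1 q_1}{\sum p_0 q_1}, \qquad
P_F = \sqrt{P_L \, P_P}.
```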
Title: Student-t Based Interval Estimation of Complex Statistics Under Calibration Weighting
- Speaker: Reid A. Rottach, U.S. Census Bureau
- Co-author: David W. Hall, U.S. Census Bureau
- Chair: David W. Hall, U.S. Census Bureau
- Date/Time: Tuesday, February 17, 2004, 12:30 - 2:00 p.m.
- Location: Bureau of Labor Statistics, Postal Square Building (PSB), Conference Center, Room 10, 2 Massachusetts Ave., N.W., Washington, D.C. Please use the First Street entrance to the PSB.
- Sponsor: WSS Methodology Section
Note: This is the fourth in a series of WSS seminars on calibration and related types of estimation.
Abstract:
This seminar gives an overview of recent research for the Survey of Income and Program Participation (SIPP) into developing a linearization variance estimator, including an extension to Satterthwaite-like approximations of degrees of freedom. We provide background about weight calibration, particularly the raking ratio, and the residual technique of estimating variances. Some of the topics covered within this context are nonresponse adjustments, restricted weighting, and weight equalization (such as SIPP's constraint that husbands and wives have the same calibrated weight). A general outline of how to implement the method of constructing confidence intervals is given, along with details for several types of statistics, including totals, ratios, quantiles, and yearly changes. Numerical comparisons with a Balanced Repeated Replication estimator using data from the 1996 panel of SIPP show the two methods of estimating variances to be very close in most cases. Furthermore, we will illustrate circumstances where the degrees of freedom approximation, when compared with the nominal value, substantially affected the confidence interval width.
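As background for the "Satterthwaite-like" approximation mentioned above, the classical Satterthwaite formula combines variance components s_h^2 with coefficients a_h and component degrees of freedom d_h; the SIPP-specific details are the subject of the talk.

```latex
\hat{d} \;=\; \frac{\left(\sum_h a_h s_h^2\right)^2}{\sum_h \dfrac{\left(a_h s_h^2\right)^2}{d_h}}.
```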
Topic: Biosurveillance Geoinformatics of Hotspot Detection and Prioritization for Biosecurity
- Speaker: G. P. Patil, Distinguished Professor and Director, Penn State Center for Statistical Ecology and Environmental Statistics
- Chair: Mel Kollander, Director, Washington Office, Institute for Survey Research of Temple University
- Date/Time: Wednesday, February 18, 2004, 12:30 - 2:00 p.m.
- Location: Bureau of Labor Statistics, Conference Center Room 9 and 10, Postal Square Building (PSB), 2 Massachusetts Ave. NE, Washington, D.C. Please use the First St., NE, entrance to the PSB.
Abstract:
Geoinformatic surveillance for spatial and temporal hotspot detection and prioritization is a critical need for the 21st century. A hotspot can mean an unusual phenomenon, anomaly, aberration, outbreak, elevated cluster, or critical area. The declared need may be for monitoring, etiology, management, or early warning. The responsible factors may be natural, accidental or intentional, with relevance to both infrastructure and homeland security.
This presentation describes a multi-disciplinary research project based on novel methods and tools for hotspot detection and prioritization, driven by a wide variety of case studies of potential interest to several agencies. These case studies deal with critical societal issues, such as carbon budgets, water resources, ecosystem health, public health, drinking water distribution system, persistent poverty, environmental justice, crop pathogens, invasive species, biosecurity, biosurveillance, remote sensor networks, early warning and homeland security.
Our methodology involves an innovation of the popular circle-based spatial scan statistic methodology. In particular, it employs the notion of an upper level set and is accordingly called the upper level set scan statistic system, pointing to the next generation of a sophisticated analytical and computational system, effective for the detection of arbitrarily shaped hotspots along spatio-temporal dimensions. We also propose a novel prioritization scheme based on multiple indicator and stakeholder criteria without having to integrate indicators into an index, using Hasse diagrams and partially ordered sets. It is accordingly called the poset prioritization and ranking system.
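A toy sketch of the upper level set idea (a simplification under stated assumptions, not the full scan statistic): for a chosen threshold, keep the cells whose rate exceeds it and treat connected components of the adjacency graph as candidate hotspot zones. The full method scans over thresholds and evaluates a likelihood-ratio statistic on each zone; that part is omitted here, and the cells, rates, and neighbor structure below are hypothetical.

```python
# Candidate hotspot zones as connected components of cells whose rate exceeds
# a threshold -- the "upper level set" at that threshold.
from collections import deque

def uls_zones(rates, adjacency, threshold):
    """Return connected sets of cells with rate > threshold."""
    active = {c for c, r in rates.items() if r > threshold}
    zones, seen = [], set()
    for start in active:
        if start in seen:
            continue
        zone, queue = set(), deque([start])
        while queue:
            c = queue.popleft()
            if c in seen or c not in active:
                continue
            seen.add(c)
            zone.add(c)
            queue.extend(adjacency.get(c, ()))
        zones.append(zone)
    return zones

# Hypothetical cells with rates and a neighbor structure.
rates = {"A": 0.02, "B": 0.09, "C": 0.08, "D": 0.01, "E": 0.07}
adjacency = {"A": ["B"], "B": ["A", "C"], "C": ["B", "D"], "D": ["C", "E"], "E": ["D"]}
print(uls_zones(rates, adjacency, threshold=0.05))   # two zones: {'B','C'} and {'E'}
```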
We propose a cross-disciplinary collaboration to design and build the prototype system for surveillance infrastructure of hotspot detection and prioritization. The methodological toolbox and the software toolkit developed will support and leverage core missions of several agencies as well as their interactive counterparts in the society. The research advances in the allied sciences and technologies necessary to make such a system work are the thrust of this five year project.
The project will have a dual disciplinary and cross-disciplinary thrust. Dialogues and discussions will be particularly welcome, leading potentially to well considered synergistic case studies. The collaborative case studies are expected to be conceptual, structural, methodological, computational, applicational, developmental, refinemental, validational, and/or visualizational in their individual thrust.
A panel discussion will follow the speaker presentation. The panel invitees include the following: (1) Larry Brandt (NSF); (2) Larry Cox (NCHS); (3) Chuck Dull (USDA); (4) Jeff Frithsen (EPA); (5) John Kelmelis (USGS); (6) Martin Kulldorff (Harvard); (7) Rick Linthurst (EPA); (8) Betsy Middleton (NASA); (9) Linda Pickle (NIH); (10) Phil Ross (EPA); (11) Ashbindu Singh (UNEP); and (12) Lance Waller (Emory). Floor discussion will follow.
Topic: Reflections on 50+ Years as a Federal Statistician
- Speaker: Calvin Beale, U.S. Department of Agriculture
- Date/Time: February 25, 2004, 10:30 - 11:30 a.m.
- Location: U.S. Bureau of the Census, 4700 Silver Hill Road, Suitland, Maryland, the Hollerith Conference Room, FOB 3. Please call (301) 763-4974 to be placed on the visitors' list. A photo ID is required for security purposes.
- Sponsor: U.S. Bureau of the Census, Statistical Research Division
Abstract:
Calvin Beale will reflect on such topics as the role of government statistics in a complex society, the rewards (and occasional frustrations) of a career in Federal statistics, and salient American population trends in one person's work life.
Beale is Senior Demographer with the Economic Research Service of the U.S. Department of Agriculture. Prior to joining USDA, he worked for several years in the Census Bureau's Population Division. His work at USDA has focused on the farm, rural, and small town populations. He is a recipient of USDA's Distinguished Service Award.
This seminar is physically accessible to persons with disabilities. For TTY callers, please use the Federal Relay Service at 1-800-877-8339. This is a free and confidential service. To obtain Sign Language Interpreting services/CART (captioning real time) or auxiliary aids, please send your requests via e-mail to EEO Interpreting & CART: eeo.interpreting.&.CART@census.gov and Sherry.Y.Moore@census.gov to make arrangements. If you have any questions, you may contact the EEO office at 301-763-2853 (Voice) and 301-457-2540 (TTY).
Title: Outliers: Identification and Treatment Through the Use of Chebyshev's Theorem
- Speaker: Richard Esposito, U.S. Bureau of Labor Statistics
- Discussant: Hyunshik Lee, Westat
- Chair: Fritz Scheuren, National Opinion Research Center
- Date/Time: Wednesday, March 3, 2004, 12:30 - 2:00 p.m.
- Location: Bureau of Labor Statistics, Postal Square Building (PSB), Conference Center, Room 9, 2 Massachusetts Ave., N.W., Washington, D.C. Please use the First Street entrance to the PSB.
- Sponsor: WSS Methodology Section
Abstract:
Outliers have traditionally been presented as unwanted and troublesome elements whose influence we should protect against through robust methods of data handling and estimation. In this presentation, extreme representative outliers are seen both as inevitable and as providing necessary information about the tail ends of universe distributions. A new method for identifying and treating outliers based on an innovative use of Chebyshev's Theorem is introduced. The method presented avoids the overly wide ranges commonly associated with the use of this theorem, and, being based on Chebyshev's Theorem, does not depend on the normality or specific form of the underlying distribution. The speaker will present data and results demonstrating the importance of outliers, as well as test results that show the ability of the new method to predict corresponding universe values, using universe and drawn samples from national establishment employment data. In addition, a justification and derivation of Winsorizing that follows from considerations of the method will be presented.
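A generic sketch of a Chebyshev-based cutoff (not the speaker's refined method, which avoids the overly wide ranges this naive rule produces): Chebyshev's Theorem guarantees that at most a fraction 1/k^2 of any distribution lies more than k standard deviations from the mean, so choosing k = 1/sqrt(alpha) flags values that would be rare regardless of the distribution's form. The cutoff alpha and the toy data below are assumptions.

```python
# Distribution-free outlier flag based on Chebyshev's inequality: at most
# 1/k^2 of any distribution lies more than k standard deviations from the mean.
import numpy as np

def chebyshev_outliers(x, alpha=0.01):
    """Flag values beyond k = 1/sqrt(alpha) standard deviations of the mean."""
    x = np.asarray(x, dtype=float)
    k = 1.0 / np.sqrt(alpha)          # e.g. alpha = 0.01 gives k = 10
    mu, sigma = x.mean(), x.std(ddof=1)
    return np.abs(x - mu) > k * sigma

data = np.concatenate([np.random.default_rng(2).normal(100, 5, 500), [400.0]])
print(np.where(chebyshev_outliers(data))[0])   # indexes of flagged values
```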
Title: Identifying Problems with Raking Estimators
- Speaker: Jill M. Montaquila and J. Michael Brick, Westat
- Co-author: Shelley Brock Roth, Westat
- Discussant: Michael P. Cohen, Bureau of Transportation Statistics
- Date/Time: Wednesday, March 10, 2004, 12:30 - 2:00 p.m.
- Location: Bureau of Labor Statistics, Postal Square Building (PSB), Conference Center, Room 9, 2 Massachusetts Ave., N.W., Washington, D.C. Please use the First Street entrance to the PSB.
- Sponsor: WSS Methodology Section
Note: This is the fifth in a series of WSS seminars on calibration and related types of estimation.
Abstract:
In sample surveys, raking is often used to calibrate survey weights to external totals and adjust for undercoverage. Raking may be particularly useful when control to several dimensions is desired but sample sizes are too small to use all the dimensions simultaneously as required with poststratification. However, raking may be problematic and these problems are not always easily identified. Problems may arise when two or more raking dimensions are highly correlated, there are many raking dimensions, there is measurement error in the variables used on one or more dimensions, or there are sparse tables. In this presentation, we give several examples illustrating the problems that may occur with raking. We also describe approaches for diagnosing potential problems with raking and discuss methods of addressing these problems when they occur.
Topic: Reconstructing Industrial Production: Conversion to NAICS
- Speakers: Kimberly Bayard, Norman Morin, and John Stevens, Federal Reserve Board
- Chair: William P. Cleveland, Federal Reserve Board
- Discussant: Dennis Fixler, Bureau of Economic Analysis
- Date/Time: Tuesday, March 16, 2004, 12:30 - 2:00 p.m.
- Location: Bureau of Labor Statistics, Postal Square Building (PSB), Conference Center, Room 9, 2 Massachusetts Ave., N.W., Washington, D.C. Please use the First Street entrance to the PSB.
- Sponsor: WSS Economics Section
Abstract:
The Federal Reserve Board's monthly indexes of industrial production, capacity, and capacity utilization are principal indicators of economic activity in the US industrial sector. In December 2002, the Federal Reserve Board issued a historical revision of these indexes going back to 1972. This revision, unprecedented among statistical agencies, is the first significant historical restatement of industry-level economic time series under the new North American Industrial Classification System (NAICS). These presentations review the history and structure of the industrial production indexes, the steps to reclassify the plant-level historical census data from the Standard Industrial Classification System (SIC) to NAICS, the restructuring of major market and stage-of-progress groups based on NAICS series, and the overall effect of the changes on the industrial production and capacity series.
Topic: Clive Granger, Cointegration, and the Nobel Prize in Economics
- Speaker: Neil R. Ericsson, Federal Reserve Board
- Chair: Anna Jan, Ernst & Young
- Date/Time: Wednesday, March 17, 2004, 12:30 - 2:00 p.m.
- Location: Bureau of Labor Statistics, Postal Square Building (PSB), Conference Center, Room 9, 2 Massachusetts Ave., N.W., Washington, D.C. Please use the First Street entrance to the PSB.
- Sponsor: WSS Economics Section
Abstract:
In 2003, the Nobel Prize in Economics was awarded to Clive Granger "for methods of analyzing economic time series with common trends (cointegration)" and to Rob Engle "for methods of analyzing economic time series with time-varying volatility (ARCH)". This WSS seminar examines Clive's contribution of cointegration; a subsequent WSS seminar will focus on Rob's contribution of ARCH.
Cointegration is a statistical property that characterizes a long-run relationship between two or more integrated time series. After examining the analytics and implications of cointegration, we consider testing procedures due to Engle and Granger (1987) and Johansen (1988). The Johansen procedure establishes a natural framework for testing hypotheses about multiple cointegrating vectors and about the adjustment coefficients. Cointegration is also isomorphic to the existence of an error correction mechanism in a set of dynamic behavioral equations, so we discuss error correction models, including tests for cointegration based on those models. The relationships between the Engle-Granger, Johansen, and error correction procedures for testing cointegration provide the basis for discussing their relative advantages and disadvantages. Empirical applications help illustrate these testing procedures.
Topic: General Model-Based Filters for Extracting Cycles and Trends in Economic Time Series
- Speaker: Thomas M. Trimbur, Statistical Research Division
- Date/Time: Thursday, April 15, 2004, 2:00 - 3:00 p.m.
- Location: U.S. Bureau of the Census, 4401 Suitland Road, Suitland, Maryland, Room 3225, Federal Office Building 4. Please call (301) 763-4974 to be placed on the visitors' list. A photo ID is required for security purposes.
- Sponsor: U.S. Bureau of the Census, Statistical Research Division
Abstract:
A class of model-based filters for extracting trends and cycles in economic time series is presented. These lowpass and bandpass filters are derived in a mutually consistent manner as the joint solution to a signal extraction problem in an unobserved components model. The resulting trends and cycles are computed in finite samples using the Kalman filter and associated smoother. The filters form a class which is a generalization of the class of Butterworth filters, widely used in engineering. They are very flexible and have the important property of allowing relatively smooth cycles to be extracted. Perfectly sharp, or ideal, bandpass filters emerge as a limiting case.
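For orientation, the first-order member of this class is the familiar unobserved components model with a stochastic trend and a damped stochastic cycle; the class presented generalizes the trend and cycle to higher orders, which sharpens the implied lowpass and bandpass filters. The notation below is a standard sketch, not the speaker's exact specification.

```latex
\begin{align*}
  y_t &= \mu_t + \psi_t + \varepsilon_t,\\
  \mu_t &= \mu_{t-1} + \beta_{t-1} + \eta_t, \qquad \beta_t = \beta_{t-1} + \zeta_t,\\
  \begin{pmatrix} \psi_t \\ \psi_t^{*} \end{pmatrix}
  &= \rho \begin{pmatrix} \cos\lambda_c & \sin\lambda_c \\ -\sin\lambda_c & \cos\lambda_c \end{pmatrix}
    \begin{pmatrix} \psi_{t-1} \\ \psi_{t-1}^{*} \end{pmatrix}
  + \begin{pmatrix} \kappa_t \\ \kappa_t^{*} \end{pmatrix}.
\end{align*}
```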
Thomas M. Trimbur is currently a PostDoctoral Researcher at the U.S. Census Bureau. He recently completed a PhD in economics at the University of Cambridge, under the supervision of Andrew Harvey. His research activities focus on econometric modeling of time series, and he is interested in both classical and Bayesian approaches to statistical analysis.
This seminar is physically accessible to persons with disabilities. For TTY callers, please use the Federal Relay Service at 1-800-877-8339. This is a free and confidential service. To obtain Sign Language Interpreting services/CART (captioning real time) or auxiliary aids, please send your requests via e-mail to EEO Interpreting & CART: eeo.interpreting.&.CART@census.gov and S.Yvonne.Moore@census.gov to make arrangements. If you have any questions, you may contact the EEO office at 301-763-2853 (Voice) and 301-457-2540 (TTY).
Topic: Robert Engle, ARCH Models, and the 2003 Nobel Prize in Economics
- Speaker: Carla Inclan, Quantitative Economics & Statistics Group, Ernst & Young
- Chair: Linda Atkinson, Economic Research Service, USDA
- Discussant: Keith Ord, Georgetown University
- Date/Time: Wednesday, April 21, 2004; 12:30 - 2:00 p.m.
- Location: Bureau of Labor Statistics, Conference Center Room 10, Postal Square Building (PSB), 2 Massachusetts Ave. NE, Washington, D.C. Please use the First St., NE, entrance to the PSB.
- Sponsor: Economics Section
Abstract:
In 2003, the Nobel Prize in Economics was awarded to Rob Engle "for methods of analyzing economic time series with time-varying volatility (ARCH)" and to Clive Granger "for methods of analyzing economic time series with common trends (cointegration)". This WSS seminar follows after the first in the series, Clive Granger and Cointegration, and will focus on Rob's contribution of ARCH. In this second seminar of the Economics Nobel Prize series, Carla Inclan will examine Engle's career, his ideas, and a few of the people who contributed to the development of ARCH models. Also, we will discuss various parts of the extensive branch of econometrics generated by ARCH models, as well as some of the future lines of research connected to modeling conditional variances.
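For reference, the simplest member of Engle's family, the ARCH(1) model, lets the conditional variance of a series depend on the previous squared shock:

```latex
y_t = \sigma_t \varepsilon_t, \qquad \varepsilon_t \sim \mathrm{iid}(0,1), \qquad
\sigma_t^2 = \omega + \alpha\, y_{t-1}^2, \qquad \omega > 0,\ 0 \le \alpha < 1.
```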
Title: Avoiding Bad Samples in Business Surveys and Getting "Good" Ones
- Speakers:
Mary Batcher, Ernst and Young LLP
Fritz Scheuren, NORC, University of Chicago
Yan Liu, Ernst and Young LLP
- Chair: Chris Moriarity, U.S. General Accounting Office
- Date/Time: Thursday, April 22, 2004, 12:30 - 2:00 p.m.
- Location: Bureau of Labor Statistics, Postal Square Building (PSB), Conference Center, Room 10, 2 Massachusetts Ave., N.W., Washington, D.C. Please use the First Street entrance to the PSB.
- Co-Sponsors: WSS Methodology and Computing Sections
Abstract:
This is the first of four seminars on ways to avoid bad samples and, instead, obtain "good" samples with high probability. In this session the broad framework and claims we expect to prove are stated. The ideas of replicated sampling and deep stratification are introduced in the context of Neyman allocation of stratified element samples. Additionally, we review and complete the earlier work of Tam, Chan, and Brewer, among others, on how to discard a bad sample if drawn -- doing so in a statistically principled way, of course.
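For reference, the Neyman allocation mentioned above assigns the total sample size n across strata in proportion to stratum size times stratum standard deviation:

```latex
n_h \;=\; n \,\frac{N_h S_h}{\sum_{k} N_k S_k}.
```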
2004 WSS SPECIAL AWARD FOR EXCELLENCE IN REPORTING STATISTICS
Title: The Consumer Price Index and the Debate over Inflation
- Recipient & Speaker: John Berry, Bloomberg News (formerly, Washington Post)
- Discussants: William Barron, Princeton University, and Patrick Jackman, Bureau of Labor Statistics
- Chair: David Marker, Westat
- Date/Time: Friday, April 23, 2004, 3:00 - 4:15 p.m.; reception to follow
- Location: Bureau of Labor Statistics, Conference Center, Room G440, Room 1, Postal Square Building (PSB), 2 Massachusetts Avenue, NE, Washington, DC. Please use the First St., NE, entrance to the PSB.
Abstract:
The measurement of inflation is a difficult chore at best. Unfortunately for the statisticians who do it, the difficulty is compounded because the results of their efforts can have significant impacts on government revenues and outlays and even monetary policy. Comments by Federal Reserve Chairman Alan Greenspan recently underscored this fact and provoked renewed political controversy. In Britain, the Bank of England, which formally uses an inflation target in setting monetary policy, changed the inflation index which it uses. Meanwhile, many academics routinely assert that most of the alternative inflation indexes overstate inflation. The European Central Bank, which also targets an inflation rate, concluded that this is not true. On the other hand, numerous analysts on Wall Street, who believe the Federal Reserve is ignoring an imminent inflation threat, argue the indexes understate price changes. In most of this contentious discussion, the statistical details of the indexes and why they vary are usually ignored, both by the discussants and the press.
John Berry has reported on economic statistics in an exemplary manner for over 30 years, mostly for the Washington Post. He regularly reports on data from the Bureau of the Census, the Bureau of Economic Analysis, the Bureau of Labor Statistics, and the Federal Reserve. Moreover, in his Trendlines column and in other articles, he has discussed major methodology issues associated with these statistics. His work has been an important contribution to public understanding of government statistics. Please join the Washington Statistical Society in honoring John Berry as we present this award to him and celebrate in a reception following the seminar.
Title: Deflationary Dynamics in Hong Kong: Evidence from Linear and Neural Network Regime Switching Models
- Speaker: Paul McNelis, Georgetown University
- Moderator: Charlie Hallahan, ERS/USDA
- Date/Time: Tuesday, April 27, 2004, 12:30 - 2:00 p.m.
- Location: Bureau of Labor Statistics, Postal Square Building (PSB), Conference Center, Room 10, 2 Massachusetts Ave., N.W., Washington, D.C. Please use the First Street entrance to the PSB.
- Sponsor: Statistical Computing Section
- Talk to be video-conferenced.
Abstract:
This paper examines deflationary dynamics in Hong Kong with a linear and a nonlinear neural network regime-switching (NNRS) model. The NNRS model is superior to the linear model in terms of in-sample specification tests as well as out-of-sample forecasting accuracy. As befitting a small and highly open economy, the most important variables affecting inflation and deflation turn out to be growth rates of import prices and wealth (captured by the rates of growth of residential property prices). The NNRS model indicates that the likelihood of moving out of deflation has been steadily increasing.
Title: Combining Filter Design with Model-Based Filtering (A Model-Based Perspective)
- Speaker: Agustin Maravall, Bank of Spain
- Discussant: Thomas M. Trimbur, U.S. Census Bureau
- Chair: David F. Findley, U.S. Census Bureau
- Date/Time: May 5, 2004, Wednesday, 12:30 - 2:00 p.m.
- Location: Bureau of Labor Statistics, Postal Square Building (PSB), Conference Center, Room 1, 2 Massachusetts Ave., N.W., Washington, D.C. Please use the First Street entrance to the PSB.
- Sponsor: WSS Methodology Section
Abstract:
Filters used to estimate unobserved components (UC), also called "signals," in economic time series are often designed on a priori grounds, so as to capture the frequencies that should be associated with the signal. We shall refer to them as a priori designed (APD) filters; basically, their design is independent of the particular series at hand. It is well known that a limitation of APD filters is that they may produce spurious results (a trend, for example, could be extracted from white noise).
The spuriousness problem can be, in principle, avoided if the filter is derived following a model-based approach. The series features are captured through an ARIMA model, models for the components are derived, and the Wiener-Kolmogorov filter is used to estimate the components; we shall refer to this approach as ARIMA-model-based (AMB) filtering. AMB filtering also presents some drawbacks. First, it may provide components that display poor band-pass features. Second, parsimony of the ARIMA models typically identified for economic series implies little resolution in terms of UC detection, so that the AMB decomposition cannot go much beyond the standard "trend-cycle + seasonal + irregular" decomposition. Thus, it would be desirable to combine a higher resolution, in order to obtain components with the desired features, with lack of spuriousness and consistency with the structure of the overall observed series. The problem of business-cycle estimation is treated in two steps: application of a filter to produce a trend-cycle or seasonally adjusted series, and application of the Hodrick-Prescott filter to extract a cycle component. Advantages of applying an AMB filter at the first step are seen both theoretically and in an application.
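For the second step described above, the Hodrick-Prescott filter chooses a trend tau_t to balance fit against smoothness, the cycle being taken as the residual y_t - tau_t, with lambda controlling the trade-off:

```latex
\min_{\{\tau_t\}} \; \sum_{t=1}^{T} (y_t - \tau_t)^2
  + \lambda \sum_{t=2}^{T-1} \bigl[(\tau_{t+1} - \tau_t) - (\tau_t - \tau_{t-1})\bigr]^2 .
```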
Topic: Tools for Retrieving and Analyzing Information from Multiple Data Sources at the U.S. Census Bureau
- Speaker: Cavin Capps, Chief, Survey Modernization Branch, Demographic Surveys Division
- Date/Time: May 5, 2004, 10:30 - 11:30 a.m.
- Location: U.S. Bureau of the Census, 4401 Suitland Road, Suitland, Maryland, the Morris Hansen Auditorium, Federal Office Building 4. Please call (301) 763-4974 to be placed on the visitors' list. A photo ID is required for security purposes.
- Sponsor: U.S. Bureau of the Census, Statistical Research Division
Abstract:
Data Ferrett is a tool that enables analysts to easily examine and manipulate data from different sources at the U.S. Census Bureau. The DataWeb servers are sets of server software services that network different databases and datasets together. This can be done potentially throughout the Census Bureau's Intranet to provide quick and simplified analytical access to internal confidential data and public-use data. Such access could give Census analysts simple access to data located in various computers throughout the Census Bureau, potentially representing data from different programs or different stages of survey production processing. The software is being developed to create the kind of specialized, often complex tabulations that are the standard output of Census Bureau processes. The system is designed to quickly tabulate datasets ranging in size from the complete 2000 confidential microdata file to small private spreadsheets and Microsoft Access databases. The tools also provide online business charting, graphing, and a full set of Census geography mapping facilities.
The presentation will demonstrate some of the advanced data manipulation facilities of the toolset and discuss how it is being used inside the Bureau to streamline production processing. We will also show how it is being used by non-profits outside the Bureau to build topical websites that highlight data from the Census and other statistical agencies. Finally, a brief overview of future directions will be discussed.
This seminar is physically accessible to persons with disabilities. For TTY callers, please use the Federal Relay Service at 1-800-877-8339. This is a free and confidential service. To obtain Sign Language Interpreting services/CART (captioning real time) or auxiliary aids, please send your requests via e-mail to EEO Interpreting & CART: eeo.interpreting.&.CART@census.gov and Sherry.Y.Moore@census.gov to make arrangements. If you have any questions, you may contact the EEO office at 301-763-2853 (Voice) and 301-457-2540 (TTY).
Title: Theory of Median Balanced Sampling
- Speakers:
Susan Hinkins, NORC, University of Chicago
Patrick Baier, NORC, University of Chicago
Yan Liu, Ernst and Young LLP
- Chair: Fritz Scheuren, NORC, University of Chicago
- Date/Time: Thursday, May 6, 2004, 12:30 - 2:00 p.m.
- Location: Bureau of Labor Statistics, Postal Square Building (PSB), Conference Center, Room 9, 2 Massachusetts Ave., N.W., Washington, D.C. Please use the First Street entrance to the PSB.
- Co-Sponsors: WSS Methodology and Computing Sections
Abstract:
There are many ways to obtain "good" samples, given that you can avoid really bad ones. We describe several of these, focusing initially on median balancing when engaged in stratified sampling, first for two strata, then for three strata, and then in general. This is done initially when the unit of sampling is a population element, then for other units including IID replicates that collectively make up the total sample. Asymptotic results are provided that prove in some settings that the limiting distribution is normal or even better than normal. By "better than normal," we mean that the variance of the variance is less than would exist if the distribution were normal. As we will cover, this characteristic of the balanced samples we use is an especially attractive property in small samples, like those common to some business applications.
Title: 50 Years Since Brown v. Board of Education - The Role of Statistics in Desegregation
- Speakers:
Stephan Thernstrom, Harvard University
Sean Reardon, Penn State University
- Discussants:
Robert Lerner, Commissioner, National Center for Education Statistics
Emerson Elliott, Former Commissioner, National Center for Education Statistics
- Chair: David Marker, Westat
- Date/Time: Monday, May 17, 2004, 12:30 - 2:30 pm
- Location: Bureau of Labor Statistics Conference Center, 2 Massachusetts Ave. NE, Washington, D.C. Please use the First St., NE, entrance to the PSB.
Abstract:
May 17th is 50 years since the Supreme Court's landmark Brown v. Board of Education ruling outlawing segregated "separate but equal" school systems. Leading up to that decision and since then, there have been a series of studies using statistics to clarify the effects of segregation on education and the success or failure of desegregation efforts. Statistical evidence has also been regularly used by both sides in ongoing court cases regarding busing, magnet schools, and higher education affirmative action. Our two speakers have worked in this area, supporting different solutions based on their research and the work of others in the field. Our discussants are the current and former Commissioners of NCES. We look forward to a very lively discussion. Please attend the session in person and stay afterwards to talk with the presenters over light refreshments (note the extended time for this session).
Title: Technology Enabling Practical Mobile Data Collection
- Speakers:
David Hill, Westat
Richard Huey, Westat
- Organizer: Jonaki Bose, Bureau of Transportation Statistics
- Date: Tuesday, May 18, 2004, 12:30 - 2:00 p.m.
- Location: Bureau of Labor Statistics, conference rooms 7 and 8, 2 Massachusetts Ave., N.W., Washington, D.C. Please use the First Street entrance to the PSB.
- Sponsor: Data Collection Methods Section, WSS
Abstract:
Emergent mobile electronic hardware and software technologies are enabling new methods of field data collection. This includes computerized platform units, such as tablet PCs and better handhelds (PDAs), and more peripherals, such as GPS receivers, digital cameras, and audio devices. This presentation will survey current practical platforms and peripherals, with examples of these devices in use by Westat, including brief demonstrations. The impacts on project operations and general data quality will be discussed. We will highlight useful future trends.
Title: Operational Examples of Balanced Sampling
- Speakers:
Yan Liu, Ernst and Young LLP
Ali Mustag, NORC, University of Chicago
Hongwei Zhang, NORC, University of Chicago
- Date/Time: Thursday, May 20, 2004, 12:30 - 2:00 p.m.
- Location: Bureau of Labor Statistics, Postal Square Building (PSB), Conference Center, Room 9, 2 Massachusetts Ave., N.W., Washington, D.C. Please use the First Street entrance to the PSB.
- Co-Sponsors: WSS Methodology and Computing Sections
Abstract:
The operational details of how to avoid bad samples and get good ones instead are covered in this seminar, with examples drawn from practice. Deep stratification, the current best practice, is compared for two "typical" populations with several variants of median balanced selection where the median balancing is done initially with the actual population elements as the sampling units. The use of balanced sampling of replicates or IID subsamples is also covered. Replicate balancing is shown, for the kinds of populations commonly encountered in business applications, to be quite an advance. The case where one variable is used to stratify and a second variable or even a third to balance on is also taken up, albeit more theory still remains to be developed here.
JPSM 10th Anniversary Symposium
Title: Valid Survey Inference via Imputation Requires Multiple Imputation
- Speaker: Donald Rubin, John L. Loeb Professor of Statistics & Chairman, Department of Statistics, Harvard University
- Discussants:
John Eltinge, Associate Commissioner for Survey Methods Research, Bureau of Labor Statistics
Roderick Little, Richard D. Remington Collegiate Professor of Biostatistics, Professor of Statistics and Senior Research Scientist, Institute for Social Research, University of Michigan, Ann Arbor
Fritz Scheuren, VP Statistics, NORC, University of Chicago, and ASA President-Elect
- Date/Time: Wednesday, May 19, 2004, 3:00 - 5:00 p.m.
- Location: University of Maryland College Park. (Exact location TBA)
Abstract:
Valid survey inference means valid randomization-based frequentist inference, in the sense of Neyman -- that is, estimates of population estimands that are approximately unbiased, tests of true null hypotheses that reject at most at their nominal levels, and confidence intervals that cover the true population estimands at least at their nominal levels. Valid survey inference via imputation means that these properties must hold when analyzing an imputed data set as if it were complete. This implies, in general, that the available tools of analysis must be limited to those tools that have been designed for complete-data analysis, supplemented with completely generic tools to "correct" the results of those complete-data analyses on the imputed data. Simple examples will be used to show that if these implications of "valid survey inference via imputation" are accepted, the imputation must be multiple imputation. This conclusion does not necessarily suggest that multiple imputation must be implemented following the guidelines in Rubin (1987, etc.) nor that imputation must be used to address all problems of missing data in surveys. However, I now believe there is ever increasing evidence that these assertions are essentially accurate, especially given the flexibility of modern computing and the constraints of real world survey practice.
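The "completely generic tools" referred to above are, in Rubin's framework, the multiple-imputation combining rules: with m imputed data sets yielding point estimates Q-hat_l and complete-data variance estimates U_l, the combined estimate and its total variance are

```latex
\bar{Q} = \frac{1}{m}\sum_{\ell=1}^{m} \hat{Q}_\ell, \qquad
\bar{U} = \frac{1}{m}\sum_{\ell=1}^{m} U_\ell, \qquad
B = \frac{1}{m-1}\sum_{\ell=1}^{m} \bigl(\hat{Q}_\ell - \bar{Q}\bigr)^2, \qquad
T = \bar{U} + \Bigl(1 + \tfrac{1}{m}\Bigr) B .
```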
Title: Modification of Chernikova's Algorithm and Error Localization in Linear Editing
- Speaker: Stanley Weng, National Agricultural Statistics Service, U.S. Dept. of Agriculture
- Discussant: William E. Winkler, U.S. Census Bureau
- Chair: Dale Atkinson, National Agricultural Statistics Service, U.S. Dept. of Agriculture
- Date/Time: Thursday, May 27, 2004, 12:30 - 2:00 p.m.
- Location: Bureau of Labor Statistics, Postal Square Building (PSB), Conference Center, Room 9, 2 Massachusetts Ave., N.W., Washington, D.C. Please use the First Street entrance to the PSB.
- Sponsor: WSS Methodology Section
Abstract:
In automatic linear editing by the Fellegi-Holt methodology, error localization is a core issue. Chernikova's algorithm has been used to generate extreme vectors for error localization, as solving a cardinality constrained linear program. However, Chernikova's algorithm is computationally inefficient, largely due to a nonlinear procedure in the algorithm, as a rule to check for extreme vectors. The inefficiency has limited the usefulness of the algorithm in practice.
This paper proposes another rule, called rank rule, for identifying extreme vectors, based on the polyhedral theory. The rule takes explicit form and is computationally straightforward. This modification to Chernikova's algorithm appears promising to considerably improve the efficiency of the algorithm. This paper also discusses the polyhedral projection interpretation of the Fellegi-Holt theory to linear editing, which promotes the view that Chernikova's algorithm represents a proper way for error localization in linear editing.
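In outline (illustrative notation, not the paper's), the error localization problem named above seeks the minimum-weight set of fields whose modification allows the record to satisfy all linear edits: with observed record x^0, linear edits Ax <= b, field weights w_j, and indicators z_j marking fields to be changed,

```latex
\min_{z \in \{0,1\}^p} \; \sum_{j=1}^{p} w_j z_j
\quad \text{subject to} \quad
\exists\, x:\; A x \le b, \qquad x_j = x_j^{0} \ \text{whenever}\ z_j = 0 .
```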
Presidential Invited Address
Title: Some Observations on Response Rate Experiments
- Speaker: Andrew Kohut, Director of the Pew Research Center for The People and The Press
- Chair: David Marker, Westat
- Date/Time: Friday, June 4, 2004, 3:00 - 4:30 p.m. NOTE THE SPECIAL TIME!
- Location: Bureau of Labor Statistics, Postal Square Building (PSB), Conference Center, Room 9, 2 Massachusetts Ave., N.W., Washington, D.C. Please use the First Street entrance to the PSB.
- Co-Sponsors: DC-AAPOR
Abstract:
Surveys have become harder to conduct than just a few years ago. Yet a new survey experiment shows that carefully designed and implemented surveys continue to obtain representative samples of the public and provide accurate data about the opinions and experiences of Americans.
Andrew Kohut has been conducting surveys since the 1960s. He was President of the Gallup Organization from 1979 to 1989; President of the American Association for Public Opinion Research (AAPOR) 1994-95; and currently is the Director of the Pew Research Center. In addition to its many domestic surveys, the Pew Research Center has been conducting surveys in over 40 countries, measuring attitudes about the United States and its policies. He is a regular contributor on NPR's "All Things Considered" and PBS' "The NewsHour with Jim Lehrer."
His presentation will examine the impact of falling response rates on surveys. Are we still including representative samples? What types of studies are more likely to be adversely affected by the lower rates? The effect of lower response rates varies by subject matter and intended use of the data. He will report on recent surveys done by the Pew Research Center to examine these issues.
A reception with a cash bar will follow at Union Station from 4:30 to 5:30. Mr. Kohut will be happy to answer questions concerning this topic or the other surveys conducted by Pew.
Title: Statistics of Tectonic Plate Reconstructions
- Speaker: Ted Chang, Department of Statistics, University of Virginia, and National Agricultural Statistics Service
- Chair: Amrut Champaneri, Bureau of Transportation Statistics
- Date/Time: Tuesday, June 8, 2004, 12:30-2:00 PM
- Location: Bureau of Labor Statistics, Postal Square Building (PSB), Conference Center, Conference Room 10, 2 Massachusetts Ave., N.W., Washington, D.C. Please use the First Street entrance to the PSB.
- Sponsor: WSS Quality Assurance and Physical Sciences Section
Abstract:
In 1960, Hess proposed the theory of sea floor spreading: new ocean crust is formed by magma welling up from the interior of the earth and cooling as it reaches the surface at mid-ocean ridges. This crust is carried across the ocean floor until it is subducted in trenches. Thus the surface of the earth is, to a first approximation, composed of tectonic plates which move rigidly away from the mid-ocean ridges. The molten magma acquires a magnetization whose direction depends upon the Earth's magnetic field at the time it reaches the surface. Periodically in the past, the North magnetic pole has flipped to a position close to the South geographic pole, resulting in the so-called marine magnetic anomaly lineations. These marine magnetic anomaly lineations provide the best information for reconstructing the past positions of tectonic plates. We will discuss the statistical errors in these reconstructions.
The relative position of two tectonic plates at a fixed time in the past is given by a 3-dimensional rotation matrix. Similar statistical issues arise in the estimation of an unknown 3 dimensional coordinate system, a problem which has arisen in other engineering contexts and in image analysis. We will focus on some general statistical principles that would apply in these other problems.
In estimating these reconstructions, the shapes of the lineations become a nuisance parameter, and hence a parsimonious model for their shapes becomes necessary. Previous models assumed a piecewise great circular shape; however, as the data density has increased, these models have become untenable. We will discuss some recent results on the use of an Ornstein-Uhlenbeck process to model these shapes.
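For reference, the Ornstein-Uhlenbeck process mentioned above is the mean-reverting diffusion

```latex
dX_t = -\theta\,(X_t - \mu)\,dt + \sigma\, dW_t, \qquad \theta > 0,
```

whose stationary distribution is normal with mean mu and variance sigma^2/(2 theta); its tendency to revert toward a central path is what makes it a plausible model for lineation shapes.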
No geophysical background will be needed. If time allows, we will show that statisticians can also have fun with some slides of a geophysical data collection cruise in the Indian Ocean.
Committee on National Statistics Seminar
Announcement:
Survey Nonresponse and Related OMB Guidance
Title: Household Survey Nonresponse - What Do We Know? What Can We Do?
- Speakers:
Robert Groves, University of Michigan
Brian Harris-Kojetin, U.S. Office of Management and Budget
- Discussants:
Kenneth Prewitt, Columbia University
William Kalsbeek, University of North Carolina
- Date/Time:
June 18, 2004, Friday, 3:00 - 4:30 p.m.
Coffee, Tea, and Cookies available at 2:30 p.m.
Reception to follow seminar at 4:30 p.m.
- Location: Keck Center of the National Academies, Lecture Room (Room 100), 500 Fifth Street, NW. The Keck Center is located on the block bounded by Fifth, Sixth, E, and F Streets, NW. It is located diagonally opposite the MCI Center and the National Building Museum. The pedestrian entrance is on the Fifth Street side of the building, near the north end. The garage entrance is on the Sixth Street side; visitor parking is on the first level, and the elevator to the lobby level is marked. The building is conveniently located on Metrorail. From the Gallery Place/Chinatown station (Red/Yellow/Green), use the 7th Street/Arena exit and walk two blocks east. From Judiciary Square (Red), use the Law Enforcement Memorial exit and walk one-half block west.
- Sponsor: The Committee on National Statistics, the National Academies
Abstract:
Gaining the public's cooperation with household surveys is becoming more and more difficult, resulting in reduced response rates and higher field costs. Concerns about household survey nonresponse have been the subject of much discussion among federal statistical agencies as well as private and academic survey researchers. The U.S. Office of Management and Budget (OMB) is in the process of revising two statistical policy directives on standards for statistical surveys and publication of statistics that were last issued in 1978. OMB is also preparing guidance for agencies on OMB's review of surveys under the Paperwork Reduction Act. Both of these documents address issues of survey nonresponse. This seminar will first review recent research on survey nonresponse and linkages to error properties of statistics. An overview of the draft OMB documents relevant to the treatment of survey nonresponse will then be presented. Remarks by two discussants will be followed by questions from the floor.
All are welcome to attend the seminar, but you must RSVP by June 15, 2004, for security purposes.
To RSVP, or for further information, please contact Christine Covington Chen at (202) 334-3096 or e-mail cnstat@nas.edu.
Title: Third Annual Seminar of the Funding Opportunity in Survey and Statistical Research
- Organizers: Robert Fay (robert.e.fay.iii@census.gov) and Monroe Sirken, Research Subcommittee of the Federal Committee on Statistical Methodology
- Chair: Katherine Wallman, Chief Statistician, OMB
- Date/Time: Monday, June 21, 2004, 9:00 A.M.- 4:00 P.M. (NOTE SPECIAL TIME)
- Sponsors: Washington Statistical Society, and Washington DC/Baltimore Chapter AAPOR
- Location: Bureau of Labor Statistics, Postal Square Building (PSB), Conference Center, Rooms 1, 2, and 3, 2 Massachusetts Ave., N.W., Washington, D.C. Please use the First Street entrance to the PSB.
Abstract:
Since 1998, 12 Federal statistical agencies, in collaboration with the National Science Foundation and with the support of the Federal Committee on Statistical Methodology, have been funding and administering The Funding Opportunity in Survey and Statistical Research, a problem-oriented research grants program addressed to the needs of the Federal Statistical System. The Third Annual Seminar of the Funding Opportunity features the reports of the principal investigators of four research projects that were funded in 2002.
- "Identifying Causal Mechanisms Underlying Nonignorable Unit Nonresponse Through Refusals to Surveys" by Robert Groves, Mick Couper, Elinore Singer, and Stanley Presser.
- "Testing for Marginal Dependence Between Two or More Multiple-Response Categorical Variables" by Thomas M. Loughin and Christopher R. Bilder.
- "Theory and Methods for Nonparametric Survey Regression Estimation" by Jean D. Opsomer and F. Jay Breidt.
- "A Comparison of RDD and Cellular Telephone Survey" by Charlotte Steeh. Federal agency statisticians and survey methodologists will be discussants of each report.
Agenda
9:00 - Registration and Continental Breakfast
- 9:30 - Welcoming Remarks
- Katherine K. Wallman, OMB
- 9:35 - Session 1. Identifying Causal Mechanisms Underlying Nonignorable Unit Nonresponse Through Refusals to Surveys
- Investigators: Robert Groves, Mick Couper, Eleanor Singer, and Stanley Presser - University of Michigan
- Discussant: to be selected
10:30 - Break
- 10:45 - Session 2. Testing for Marginal Dependence Between Two or More Multiple-Response Categorical Variables
- Investigators: Thomas M. Loughin - Kansas State University, and Christopher R. Bilder - Oklahoma State University
- Discussant: to be selected
11:45 - Lunch on your own
- 1:00 - Session 3. Theory and Methods for Nonparametric Survey Regression Estimation
- Investigators: Jean D. Opsomer - Iowa State University, and F. Jay Breidt - Colorado State University
- Discussant: to be selected
- 2:00 - Session 4. A Comparison of RDD and Cellular Telephone Surveys
- Investigator: Charlotte Steeh - Georgia State University
- Discussant: to be selected
3:00 - Break
3:15 - Session 5. Seminar Discussant: Graham Kalton - Westat
Title: Quantifying What a Representative Sample Is
- Speakers:
Mary Batcher, Ernst and Young LLP
Susan Hinkins, NORC, University of Chicago
Chris Moriarity, U.S. General Accounting Office
- Chair: Fritz Scheuren, NORC, University of Chicago
- Date/Time: Thursday, June 24, 2004, 12:30 - 2:00 p.m.
- Location: Bureau of Labor Statistics, Postal Square Building (PSB), Conference Center, Room 10, 2 Massachusetts Ave., N.W., Washington, D.C. Please use the First Street entrance to the PSB.
- Co-Sponsors: WSS Methodology and Computing Sections
Abstract:
In this last seminar in the series, we return to Royall's original formulation and attempt to describe what it means to have a "representative balanced sample." Intuitively the extent to which a sample may be said to be "representative" is a function of many factors -- including the size of the sample, the sample's design and the nature of the population. The use of mass imputation is employed to focus on where the sample is "representative." Formally we expand Royall's original idea to quantify the degree to which a given sample is representative. The way we approach this is to massively employ nearest-neighbor imputation to connect the balanced sample drawn with the population elements by matching the two together on the frame variables. The degree to which a close match can be said to exist is then taken to be a measure of the sample's representativeness. This formulation focuses the sampler on the portion of the population not being "covered" or not closely matched, and exposes the need in a very explicit way to engage in model-based inference. In our formulation the blend between conventional sampling inference and modeling is being determined by data, not by theoretical arguments. It is conjectured that conventional sampling inference is best employed only for that part of the population that can be "covered" by the matching.
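A minimal sketch of the matching idea under stated assumptions (Euclidean distance on the frame variables and an arbitrary tolerance; the substantive choices are the talk's subject): match each population unit to its nearest sampled unit on the frame variables and report the share matched within the tolerance as a crude coverage measure. The data below are hypothetical.

```python
# Nearest-neighbour coverage sketch: how much of the population is "covered"
# by the sample, judged by closeness on the frame variables.
import numpy as np
from scipy.spatial import cKDTree

def coverage_rate(frame_pop, frame_sample, tolerance):
    """Fraction of population units within `tolerance` of some sampled unit."""
    tree = cKDTree(frame_sample)
    dist, _ = tree.query(frame_pop, k=1)
    return float(np.mean(dist <= tolerance))

rng = np.random.default_rng(3)
population = rng.normal(size=(1000, 2))                      # hypothetical frame variables
sample = population[rng.choice(1000, size=50, replace=False)]
print(f"covered share: {coverage_rate(population, sample, tolerance=0.3):.2f}")
```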
Topic: The Seven Communication Standards of Highly Successful Scientific Disciplines
- Speaker: Robert E. Fay III, Ph.D., Senior Mathematical Statistician, U.S. Census Bureau
- Date/Time: September 14, 2004, 10:30 - 11:30 a.m.
- Location: U.S. Bureau of the Census, The Morris Hansen Auditorium, FOB 3, 4700 Silver Hill Road, Suitland, Maryland. Please call (301) 763-4974 to be placed on the visitors' list. A photo ID is required for security purposes.
- Sponsor: U.S. Bureau of the Census, Statistical Research Division
Abstract:
A core group of scientific disciplines shares recognizable standards of communication. Some standards are distinct from those of non-scientific disciplines. Not simply arbitrary conventions, the standards reflect the very nature of science and contribute to progress in the disciplines that observe them. During the 20th century, some disciplines adopted these standards of communication as they matured. This review proposes how the relationship between the discourse structure of a scientific discipline and its success may be studied.
Almost 10 years ago, I gave an SRD seminar with the provocative title "If I were a real scientist, what would I do next?-draft 1." Although I attempted to sketch some of the differences between mathematics and the empirical sciences as a way to understand the linkage between statistics and science, I never managed to produce a written paper from the talk. But I have continued to investigate this issue.
Outside of my work hours and duties, I am now working on a possible book on scientific communication. Whether I actually ever complete the book is an open question, but I am nearing the completion of a draft first chapter outlining the scope of my argument. The first paragraph above is an abstract for the chapter. I would like to share some of the results as if I were an outside researcher.
This seminar is physically accessible to persons with disabilities. For TTY callers, please use the Federal Relay Service at 1-800-877-8339. This is a free and confidential service. To obtain Sign Language Interpreting services/CART (captioning real time) or auxiliary aids, please send your requests via e-mail to EEO Interpreting &CART: eeo.interpreting.&.CART@census.gov and Sherry.Y.Moore@census.gov to make arrangements. If you have any questions, you may contact the EEO office at 301-763-2853 (Voice) and 301-457-2540 (TTY). Return to top
Title: Inference On Abundance And Analysis Of Spatial Patterns In Ecological Communities
- Speaker: Professor Sujay Datta, Dept. of Mathematics, Statistics and Computer Science, Northern Michigan University, Marquette, Michigan, USA
- Time: September 17, 2004, 11:00 a.m. - 12:00 noon
- Location: Monroe Hall 307, 2115 G St., NW. Foggy Bottom metro stop on the blue and orange line.
- Sponsor: The George Washington University, Department of Statistics
Abstract:
Ecology is the study of animal and plant populations on the face of the earth: their habitats, behavior, resources and mutual interactions (cooperation, competition, etc.). The cornerstone of many (though not all) studies in single-population ecology as well as community ecology is an estimate of the abundance of a particular population. It can be determined in an absolute or a relative sense (the latter being preferred due to its practical convenience) and by means of either complete enumeration or 'fair and representative' sampling (the latter being more common since a census is often not feasible or practical). Deciding how to estimate absolute or relative abundance involves many factors: ecological, economic and statistical. In landscape ecology, on the other hand, a central issue is that of patterns of individuals in space. Considering that organisms are seldom spread across the landscape at random, this is an important question in both behavioral ecology (where zoologists are concerned with the spacing behaviors of animals) and plant ecology (where botanists often study plants as individuals). A variety of statistical methods have been put forward over several decades by ecologists for both of these problems.
This presentation provides a brief overview of some of these methods. Regarding abundance estimation, we touch upon the four broad categories of procedures: capture-mark-recapture techniques, removal and resight methods, methods based on quadrat counts and those based on line-transects and distances. Regarding spatial patterns, we first address the question of what pattern a population exhibits and then move on to the question of developing a set of distance measures that help us make comparisons between patterns in two populations. Methods discussed include various tests for spatial patterns and various indices of dispersion appropriate for different scenarios (e.g., when a complete spatial map is available, when such a map is unavailable and sampling is needed, when sampling units are natural or arbitrary). Time permitting, other sampling-related issues (such as adaptive sampling, multistage/sequential sampling) are discussed.
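As a concrete illustration of the capture-mark-recapture family mentioned above, the classic two-sample Lincoln-Petersen estimator (with Chapman's small-sample correction) can be sketched as follows; the numbers in the example are made up.

    def lincoln_petersen(n1, n2, m2):
        # n1 animals captured and marked in the first sample, n2 captured in the
        # second sample, m2 of which already carry marks.  Returns the naive
        # estimate n1*n2/m2 and Chapman's bias-corrected version.
        naive = n1 * n2 / m2
        chapman = (n1 + 1) * (n2 + 1) / (m2 + 1) - 1
        return naive, chapman

    print(lincoln_petersen(n1=120, n2=100, m2=25))  # naive 480.0, Chapman about 469.0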
Note: For a complete list of upcoming seminars check the department's seminar web site: http://www.gwu.edu/~stat/seminars/Fall2004.htm. The campus map is at: http://www.gwu.edu/Map/. The contact person is Reza Modarres at Reza@gwu.edu or 202-994-6359.
Return to topTitle: Data Quality Methods
- Speaker: William E. Winkler, U.S. Census Bureau (william.e.winkler@census.gov)
- Chair: Jacob Bournazian, U.S. Department of Energy
- Date/Time: Tuesday, September 21, 2004/12:30 - 2:00 p.m.
- Location: BLS Conference Center, Room 3. BLS is located at 2 Massachusetts Avenue, NE. Use the Red Line to Union Station.
- Sponsor: WSS Methodology Section
Abstract:
Statistical agencies collect data from surveys and create data warehouses by combining data from a variety of sources. To be suitable for analytic purposes, the files must be relatively free of error. Record linkage (Fellegi and Sunter, JASA 1969) is used for identifying duplicates within a file or across a set of files. Statistical data editing and imputation (Fellegi and Holt, JASA 1976) are used for locating erroneous values of variables and filling-in for missing data. Although these powerful methods were introduced in the statistical literature, the primary means of implementing the methods have been via computer science and operations research (Winkler, Information Systems 2004). This talk provides an overview of the recent developments.
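A minimal sketch of the Fellegi-Sunter scoring rule cited above, with illustrative m- and u-probabilities (in practice these are estimated, often via EM, and real systems add blocking and approximate string comparators); the field names and values below are hypothetical.

    import math

    def match_score(rec_a, rec_b, m, u):
        # Sum of log2 likelihood-ratio weights over comparison fields: agreement
        # on a field contributes log2(m/u), disagreement log2((1-m)/(1-u)).
        # Pairs above an upper cutoff are declared links, below a lower cutoff
        # non-links, and those in between go to clerical review.
        score = 0.0
        for field in m:
            if rec_a.get(field) == rec_b.get(field):
                score += math.log2(m[field] / u[field])
            else:
                score += math.log2((1 - m[field]) / (1 - u[field]))
        return score

    m = {"last_name": 0.95, "birth_year": 0.90, "zip": 0.85}  # P(agree | true match)
    u = {"last_name": 0.01, "birth_year": 0.05, "zip": 0.10}  # P(agree | non-match)
    a = {"last_name": "SMITH", "birth_year": 1962, "zip": "20233"}
    b = {"last_name": "SMITH", "birth_year": 1963, "zip": "20233"}
    print(match_score(a, b, m, u))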
Return to topTitle: Testing An Automated Refusal Avoidance Training Methodology
- Speaker: Tracey Hagerty-Heller, Westat
- Chair: Jonaki Bose, Bureau of Transportation Statistics
- Date/Time: Thursday, September 23, 2004/12:30 - 2:00 p.m.
- Location: Bureau of Labor Statistics, Conference Center. BLS is located at 2 Massachusetts Avenue, NE. Use the Red Line to Union Station.
- Sponsor: Methodology, WSS and DC-AAPOR
Abstract:
Interviewer training to avoid refusals has traditionally relied on tailoring the messages to the concerns of the individual respondent (Groves and Couper, 1998; Schaeffer, 1991). For many telephone interviews, tailoring is difficult because relatively little information is provided to interviewers before a refusal occurs. Research has shown that practice and drilling interviewers to respond as quickly as possible plays an important role in avoiding refusals (Groves and McGonagle, 2001; Mayer and O'Brien, 2001). Practice may take many different forms, including role playing, "games" and observing others. Westat has developed automated refusal avoidance training for telephone interviewers. The purpose of the training is to provide interviewers with an opportunity to develop and practice their refusal avoidance techniques. Interviewers, seated at a CATI station, call into an Interactive Voice Response (IVR) system to listen and respond to a set of recorded scenarios where survey respondents object, refuse and ask questions. The conversations are recorded and interviewers/supervisors are able to listen and evaluate their responses. This tool potentially offers a relatively inexpensive way for interviewers to get practice and for supervisors to get a realistic assessment of how the interviewer will perform once they are making "live" calls. This presentation will report results from a study that evaluated the effectiveness of this newly developed training tool. The evaluation randomly assigned approximately 100 interviewers to two groups. Both groups received the standard refusal avoidance training (classroom lecture and role-playing) while only the second group participated in a series of practice sessions with the IVR system. Cooperation rates between the two groups, direct observations of interviews (e.g., using behavioral coding), and feedback from interviewers were assessed to measure success.
Return to topTitle: Creating A Market for Reducing Sprawl
- Speaker: Frederic H. Murphy, Temple University
- Chair: Mel Kollander, Temple University
- Date/Time: Wednesday, September 29, 2004/12:30 - 2:00 p.m.
- Location: Bureau of Labor Statistics Conference Room 10. BLS is located at 2 Massachusetts Avenue, NE. Use the Red Line to Union Station.
- Sponsor: WSS Agriculture and Natural Resources Section
Abstract:
Sprawl has well-known environmental consequences. It has caused the increase in oil imports over the past few decades and makes the country more vulnerable to high energy prices. Now it has been implicated in slowing economic growth and harming the health of the population. Sprawl damages the longer-run growth prospects of a region by reducing innovation through lessening social interactions among potential entrepreneurs and technologists. Most recently sprawl has been identified as an important contributor to the obesity problem in the U.S.
Sprawl has become a major problem for metropolitan areas because the regional markets for land and development do not include the cost of sprawl to society at large. The individual decisions to locate residential, commercial and industrial facilities do not incorporate costs that include damage to watersheds, congestion on the highways, and lack of job access for lower-income residents. Nor do they include the consequences for economic growth or for health.
Attempts to legislate limits on growth and use zoning have led to lawsuits and inefficiencies that create a zero-sum mentality that in turn accentuates conflicts within regions. What we propose is to design a market mechanism that adds the cost of sprawl to the costs of development in a way that is equitable to developers. The mechanism is based on the successful approach of the federal government to substantially reduce sulfur pollution throughout the United States.
Simply put, a state or region should place annual limits on the rights to use land for new development within the state and auction off these rights. The buyers of the rights would be the companies and individuals who seek to develop land and anyone else interested in owning the rights. The revenues gained from the auctions could then be used to mitigate the impact of development on the region or reduce tax rates.
Return to topTitle: Microdata Confidentiality Methods
- Speaker: William E. Winkler, U.S. Census Bureau (william.e.winkler@census.gov)
- Discussant: Arthur Kennickell, Federal Reserve Board
- Chair: Barry Johnson, IRS Statistics of Income Division
- Date/Time: Thursday, October 14, 2004/12:30 - 2:00 p.m.
- Location: BLS Conference Center, Room 3. BLS is located at 2 Massachusetts Avenue, NE. Use the Red Line to Union Station.
- Sponsor: WSS Methodology Section
Abstract:
The data in the files of statistical agencies are a valuable resource for the evaluation of processes. The data are often obtained via surveys or by combining files into data warehouses. Because of the increased analytic skills of many potential users of data, individuals want public-use microdata that can be analyzed in a variety of ways. Fuller (1993 JOS) and Lambert (1993 JOS) have indicated that public-use files should be suitable for statistical analyses while still preserving confidentiality of the individual data. This talk provides an overview of current methods (Winkler 2004).
Return to topTitle: From Controlled Experiment to Production Environment: Refusal Aversion Training Adoption and Considerations for Future Use and Research
- Speaker: Eileen M. O'Brien, Center for Survey Methods Research, U.S. Census Bureau
- Chair: Jonaki Bose, Bureau of Transportation Statistics
- Date/Time: Tuesday, October 19, 2004/12:30 - 2:00 p.m.
- Location: Bureau of Labor Statistics, Conference Center Room 2. BLS is located at 2 Massachusetts Avenue, NE. Use the Red Line to Union Station.
- Sponsor: Methodology Section, WSS
Abstract:
Initial interviewer-respondent interactions bear considerable influence on survey participation and response quality. Previous research shows that individual interviewer survey cooperation rates can improve with progressive, realistic practice of discrete steps in addressing respondents' initial concerns. Though this theory-guided training method developed by Groves and McGonagle (2001) has proved valid across survey organizations, topics and modes in experimental settings, evidence is needed on its migration to the production setting. This work reports on the adoption of the refusal aversion training protocol in two large-scale demographic surveys conducted by the United States Census Bureau, modifications made for practical and administrative reasons, and considerations for future use and research.
Return to topTitle: Purchasing Power Parities - Statistics to Describe the World
- Speaker: Fred Vogel, World Bank
- Discussant: Kimberly Zieschang, International Monetary Fund
- Chair: Linda Atkinson, Economic Research Service, USDA
- Date/Time: Wednesday, October 20, 2004 / 12:30 PM - 2:00 PM
- Location: BLS Conference Center Room 7. BLS is located at 2 Massachusetts Avenue, NE. Use the Red Line to Union Station.
- Sponsor: WSS Economics Section
Abstract:
The International Comparison Program is a coordinated global survey of prices used to compute Purchasing Power Parities which remove the distortions caused when using exchange rates to compare the relative sizes of economies and measure poverty levels between countries. The methodology used to determine the basket of items to price, prepare the sampling and survey framework, and estimate the Purchasing Power Parities will be presented. Alternative methods of estimation will be presented along with new methodology being developed. The issues faced when coordinating this effort across countries with differing capabilities will also be discussed.
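A minimal sketch of the simplest building block, a Jevons-type bilateral PPP computed as an unweighted geometric mean of price relatives for commonly priced items; the item names and prices are invented, and the ICP aggregation methods discussed in the talk are considerably more elaborate.

    import math

    def jevons_ppp(prices_a, prices_b):
        # Geometric mean of the price relatives p_a/p_b over items priced in
        # both countries -- the kind of elementary bilateral parity computed
        # within a basic heading before weighted aggregation across headings.
        common = [item for item in prices_a if item in prices_b]
        log_rel = [math.log(prices_a[i] / prices_b[i]) for i in common]
        return math.exp(sum(log_rel) / len(log_rel))

    prices_us = {"rice_kg": 2.0, "bus_fare": 1.5, "haircut": 15.0}
    prices_xy = {"rice_kg": 30.0, "bus_fare": 10.0, "haircut": 120.0}
    print(jevons_ppp(prices_xy, prices_us))  # local currency units per US dollar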
Return to top
Title: Measuring Disclosure Risk for Microdata Using Probabilistic Models
- Speaker: Natalie Shlomo, Hebrew University of Jerusalem, University of Southampton
- Chair: Neil Russell, Bureau of Transportation Statistics
- Date: Wednesday, October 27, 2004/ 12:30 - 2:00 p.m.
- Location: Bureau of Labor Statistics, Conference Center Room 8. BLS is located at 2 Massachusetts Avenue, NE. Use the Red Line to Union Station.
- Sponsor: Methodology Section, WSS
Abstract:
Disclosure risk occurs when there is a high probability that an intruder can reidentify an individual in released microdata and confidential information may be obtained. The talk will focus on measuring the disclosure risk based on probabilistic methods for microdata containing records from samples. In general, the population from which the sample is drawn is considered unknown, although there may be partial information about the population through marginal distributions obtained from censuses and administrative data. These distributions are generally used by Statistical Agencies to benchmark the sample data and reduce biases due to non-response through sophisticated estimation procedures for calculating sampling weights.
The disclosure risk is a function of both the population and the sample, and in particular of the cell counts of the contingency table defined by the combinations of identifying discrete key variables. In order to estimate the necessary parameters for the disclosure risk based on samples, we will use the structure of the table and the relationship among the variables to model the data using a Bayesian approach. By assuming a prior distribution, we can reduce the large number of parameters to a few hyperparameters. We consider global risk measures for the entire file, such as the number of sample uniques that are population uniques and the expected number of correct matches to external files, as well as individual risk measures for subsets of the population or for each cell of the key.
In this talk, we will start with a natural model proposed by Bethlehem, Keller and Pannekoek (1990) based on the Poisson-Gamma distributions. The model is the basis for most of the current research being carried out in this area of statistical disclosure control. In addition, we will connect the model to the individual risk methodology currently being implemented in the European CASC project's ARGUS software for disclosure control (Benedetti, Capobianchi and Franconi (1998), Rinott (2003)) and show how the methodology can be improved and goodness-of-fit tests carried out prior to applying the models. The methods will be demonstrated on both simulated data and real data sets from the Israel Central Bureau of Statistics. In addition, the general framework of the risk measure can be expanded to take into account continuous key variables and the introduction of measurement errors by connecting it to record linkage theory.
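A minimal sketch of one per-cell risk measure in this spirit, assuming a Gamma(alpha, rate beta) prior on the cell rate, Poisson population cell counts, and Bernoulli(pi) sampling; the parameterization, and the measures actually used in the talk and in the ARGUS software, may differ.

    def risk_sample_unique(alpha, beta, pi):
        # Under lambda_k ~ Gamma(alpha, rate=beta), F_k | lambda_k ~ Poisson(lambda_k),
        # and Bernoulli(pi) sampling, the unobserved remainder F_k - f_k given f_k
        # is Negative Binomial with size alpha + f_k and success probability
        # p = (beta + pi) / (beta + 1).  For a sample unique (f_k = 1), the
        # probability that the cell is also a population unique is p**(alpha + 1).
        p = (beta + pi) / (beta + 1.0)
        return p ** (alpha + 1)

    # Example: a diffuse prior and a 2% sampling fraction
    print(risk_sample_unique(alpha=0.5, beta=0.01, pi=0.02))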
Return to topTitle: The Design of Computer Experiments to Determine Optimum and Robust Control Variables
- Speaker: Professor William Notz, Department of Statistics, Ohio State University
- Time: November 5, 2004 - 11:00 am - 12:00 noon
- Location: Monroe Hall, 307, 2115 G St., NW. Foggy Bottom metro stop on the blue and orange line.
- Sponsor: The George Washington University, Department of Statistics
Abstract:
In this talk I will discuss the design of computer experiments when there are two types of inputs: control variables and noise variables. Control variables are determined by a product designer while noise variables are uncontrolled in the field but take on values according to some probability distribution. I will consider two problems. The first is the situation in which there are two outputs (responses), each of which is expensive or time consuming to compute. The objective is to find values of the control variables that optimize the mean (over the distribution of the noise variables) of one response subject to a constraint on the mean of the other response. The second is to find values of the control variables at which the response is insensitive to the value of the noise variables.
For both problems, I will describe a sequential strategy to select the values of the inputs at which to observe the responses. The methodology is Bayesian; the prior takes the responses as draws from a Gaussian stochastic process. At each stage, the strategy determines which response to observe and at what set of inputs so as to maximize a posterior expected "improvement" over the current estimate of the optimum. This is joint work with Jeffrey Lehman, Tom Santner, and Brian Williams.
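A minimal sketch of the basic (unconstrained, single-response) expected-improvement criterion, evaluated at one candidate input from the Gaussian-process posterior mean and standard deviation; the constrained, two-response criterion described in the talk generalizes this form, and the numbers below are illustrative.

    import math

    def expected_improvement(mu, sigma, f_best):
        # Expected improvement for minimization: E[max(f_best - Y, 0)], where
        # Y ~ Normal(mu, sigma**2) is the posterior predictive response at a
        # candidate input and f_best is the best value observed so far.
        if sigma <= 0.0:
            return max(f_best - mu, 0.0)
        z = (f_best - mu) / sigma
        cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
        pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
        return (f_best - mu) * cdf + sigma * pdf

    # A sequential strategy would evaluate this (or its constrained analogue)
    # over candidate inputs and observe the response where it is largest.
    print(expected_improvement(mu=1.2, sigma=0.4, f_best=1.0))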
Note: For a complete list of upcoming seminars check the department's seminar web site: http://www.gwu.edu/~stat/seminars/Fall2004.htm. The campus map is at: http://www.gwu.edu/Map/. The contact person is Reza Modarres at Reza@gwu.edu or 202-994-6359.
Return to topTitle: Performance Measurement: A Technical Perspective
- Speaker: Clyde Tucker, Bureau of Labor Statistics
- Discussant: Fritz Scheuren, NORC, University of Chicago
- Chair: Alan K. Jeeves, Bureau of Transportation Statistics
- Date/Time: Wednesday, November 10, 2004/12:30 - 2:00 p.m.
- Location: Bureau of Labor Statistics, Postal Square Building (PSB) Conference Center, 2 Massachusetts Ave., N.E., Washington, D.C.
- Sponsor: WSS Quality Assurance and Physical Sciences Section
Abstract:
Performance-based evaluation is, in principle, a great step toward government accountability. As usual, however, the devil is in the details. This paper focuses on issues surrounding the implementation of the Government Performance and Results Act (GPRA) in federal statistical agencies. The paper does not address political issues concerning GPRA, but, instead, deals with technical concerns. The topics covered include, but are not limited to, the measurement of future costs and benefits of statistical programs, determining the market value of government products, measurement issues involved in developing indicators of performance, and the resources required to carry out the GPRA mandates.
Return to topTitle: Sequential Classification on Lattices with Experiment-Specific Response Distributions, with Applications
- Speaker: Professor Curtis Tatsuoka, Department of Statistics, George Washington University
- Time: November 12, 2004 - 11:00am-12:00 Noon
- Location: Monroe Hall, 307, 2115 G St., NW. Foggy Bottom metro stop on the blue and orange line.
- Sponsor: The George Washington University, Department of Statistics
Abstract:
A statistical framework will be described for the problem when there exists a "true" state among a collection of states, and observations from sequentially selected experiments are used to identify it. The classification model is assumed to be a lattice, and response distributions will be experiment-specific. This generalizes a framework described by Tatsuoka and Ferguson (2003), which assumes that all experiments share the same response distributions. Applications of this setting include those in group testing, and in neuropsychological and educational assessment. Results relating to optimal rates of convergence will be discussed, and a simple and intuitive class of experiment selection rules will be shown to attain optimal rates under general conditions. An application in neuropsychological assessment will be presented.
Note: For a complete list of upcoming seminars check the department's seminar web site: http://www.gwu.edu/~stat/seminars/Fall2004.htm. The campus map is at: http://www.gwu.edu/Map/. The contact person is Reza Modarres at Reza@gwu.edu or 202-994-6359.
Return to topTitle: Regression Models with Increasing Number of Unknown Parameters
- Speaker: Dr. Asaf Hajiyev, Azerbaijan Academy of Sciences, Department of Probability and Statistics, Baku State University
- Time: November 12, 2004 - 4:00pm-5:00pm
- Location: Monroe Hall, 103, 2115 G St., NW. Foggy Bottom metro stop on the blue and orange line.
- Sponsor: The George Washington University, Department of Statistics
Abstract:
Regression models (linear and nonlinear) with increasing numbers of unknown parameters and unknown error variances are considered. At each point of observation there is only one observable value, which does not allow the variance to be estimated. Such problems are typical in applications, but models with increasing numbers of unknown parameters and unknown variances have not been investigated sufficiently. A method of direct estimation (without estimating the variances) of the elements of the covariance matrix of the deviation vector is suggested. Using this method, a confidence band for the unknown function in regression models is constructed.
Note: For a complete list of upcoming seminars check the department's seminar web site: http://www.gwu.edu/~stat/seminars/Fall2004.htm. The campus map is at: http://www.gwu.edu/Map/. The contact person is Reza Modarres at Reza@gwu.edu or 202-994-6359.
Return to topTitle: Data Quality and Record Linkage: A Broad Survey of Applications and Techniques
- Speaker: Thomas Herzog, U.S. Department of Housing and Urban Development
- Discussant: William Winkler, U.S. Census Bureau
- Chair: Fritz Scheuren, NORC
- Date/Time: Tuesday, November 16, 2004/12:30 - 2:00 p.m.
- Location: BLS Conference Center, Room 1. BLS is located at 2 Massachusetts Avenue, NE. Use the Red Line to Union Station.
- Sponsor: WSS Methodology Section
Abstract:
As the cost of computing continues to drop, and more practitioners have access to large data warehouses, the importance of data quality continues to increase commensurately. With this in mind, the speaker will briefly describe some of the key methodological issues dealing with data quality and record linkage. He will present a wide range of applications including (1) Biomedical and Genetic Studies, (2) Transportation Studies, (3) Sample Surveys and Censuses, (4) Federal Tax Policy, (5) Social Insurance, and (6) Cargo Shipments. A wide variety of references to the literature will be provided for those interested in delving further into these topics.
Return to topMORRIS HANSEN LECTURE
Title: Bridging the Gap: NCHS's Experience Transitioning to the 1997 Standards for Collecting Data on Race and Ethnicity
- Speaker: Jennifer Madans, National Center for Health Statistics
- Discussants: Clyde Tucker, Bureau of Labor Statistics and Robert Hill, Westat, Inc.
- Date/Time: Wednesday, November 17, 2004, 3:30 to 5:30 p.m.
- Location: Jefferson Auditorium in the South Building of the Department of Agriculture
For details on the lecture, please see the November, 2004 Issue of the WSS NEWS.
Title: Nonparametric, Hypothesis-based Analysis of Molecular Heterogeneity for Comparative Phenotype Characterization
- Speaker: Professor Jeanne Kowalski, Division of Biostatistics, Johns Hopkins University
- Time: November 19, 2004 - 11:00am-12:00 Noon
- Location: Monroe Hall, 103, 2115 G St., NW. Foggy Bottom metro stop on the blue and orange line.
- Sponsor: The George Washington University, Department of Statistics
Abstract:
In this talk, I describe two novel, inference-based approaches to analysis of molecular heterogeneity associated with phenotypes. A common theme among them is the construction of testable hypotheses in a very high-dimensional setting, based on developed U-statistic theory, with nonparametric inference. With a modest sample, I discuss a distance-based approach for analysis of sequence heterogeneity. In the extreme case of several single, high-dimensional samples that are to be compared from a microarray experiment, I introduce a class of stochastic linear hypotheses that includes the Mann-Whitney Wilcoxon rank sum test as a special case. In each setting, I discuss the statistical and bioinformatic approaches developed to characterize either genes within a genome or locations within a sequence that depict groups of similar phenotype. As motivation, I examine two separate problems, one for relating sequence heterogeneity in a region of the HIV genome to drug resistance, and a second for relating gene expressions to hypothesized pathways for immunogenetic analysis of T cells.
Note: For a complete list of upcoming seminars check the department's seminar web site: http://www.gwu.edu/~stat/seminars/Fall2004.htm. The campus map is at: http://www.gwu.edu/Map/. The contact person is Reza Modarres at Reza@gwu.edu or 202-994-6359.
Return to top
Topic: Data Quality: Automated Edit/Imputation and Record Linkage (Repeated)
- Speaker: William Winkler, US Census Bureau
- Date/Time: Wednesday, December 1, 2004/10:30 a.m. - 12:00 p.m.
- Location: Auditorium, Building 3, US Census Bureau
Visitors will need to contact barbara.a.palumbo@census.gov at least three days before the talk to be placed on the visitors list. If driving, be sure to specify you also need to be on the parking sticker list.
- Sponsor: U.S. Bureau Of Census, Statistical Research Division
Abstract:
Statistical agencies collect data from surveys and create data warehouses by combining data from a variety of sources. To be suitable for analytic purposes, the files must be relatively free of error. Record linkage (Fellegi and Sunter, JASA 1969) is used for identifying duplicates within a file or across a set of files. Statistical data editing and imputation (Fellegi and Holt, JASA 1976) are used for locating erroneous values of variables and filling-in for missing data. Although these powerful methods were introduced in the statistical literature, the primary means of implementing the methods have been via computer science and operations research (Winkler, Information Systems 2004). This talk provides an overview of the recent developments.
Note: This is a repeat of the September 21st WSS talk at BLS by the same title. Due to technical difficulties, remote sites were unable to participate and the speaker has graciously agreed to repeat the presentation.
Return to top