Washington Statistical Society

September 2008




Congratulations!

The following were elected to the WSS board of directors:

President-elect
John Eltinge

Methodology Program Chair
Brian Meekins

Representative at Large
Jim Knaub
Elizabeth H. Margoshes

Treasurer
Jane Li

We congratulate the winners and express our thanks to the other candidates.


ASA Fellows

The following National Capital Area ASA members, all members of WSS, became ASA Fellows at the Awards Ceremony at the Joint Statistical Meetings, August 5, in Denver:

  • Chet Bowie, National Opinion Research Center at the University of Chicago, Bethesda, MD

  • Nilanjan Chatterjee, National Cancer Institute, National Institutes of Health, Rockville, MD

  • Brian A. Harris-Kojetin, US Office of Management and Budget, Washington, DC

  • Thomas N. Herzog, US Department of Housing and Urban Development, Reston, VA

  • Henry D. Kahn, National Center for Environmental Assessment, US Environmental Protection Agency, Washington, DC

  • Karol P. Krotki, RTI International, Washington, DC

  • Jill M. Montaquila, Westat, Rockville, Maryland

  • Alvan O. Zarate, National Center for Health Statistics, Centers for Disease Control and Prevention, Hyattsville, Maryland

Three of these (Drs. Harris-Kojetin, Krotki, and Montaquila) were elected members of the WSS Board of Directors at the time the award was first announced.

Congratulations to all!


Washington Statistical Society 2007-08 Annual Report

The past program year (from July 2007 to June 2008) for the Washington Statistical Society (WSS) was fruitful and productive.

Among the accomplishments were the following:

The WSS membership exceeds 900. This count includes WSS members who are also members of the American Statistical Association (by far the largest group) and those (associate members) who are not.

The Short Course Committee sponsored a successful and illuminating short course on "The Analysis of Cross-Classified Categorical Data," taught by Professor Stephen E. Fienberg of Carnegie-Mellon University.

Forty-one regular technical sessions were held and most were transmitted via video feed to remote sites.

Ron Wasserstein, the ASA Executive Director, has been regularly attending WSS Board meetings. Steve Pierson, the new ASA Science Policy Director, has also attended.

The Morris Hansen Memorial Lecture was held at the Jefferson Auditorium of the Department of Agriculture and featured Professor Joe Sedransk of Case Western Reserve University, speaking on "Assessing the Value of Bayesian Methods for Inference about Finite Population Quantities." The chair was Donald Malec (Census Bureau) and the discussants were Nathaniel Schenker (National Center for Health Statistics) and David Binder (Statistics Canada). This lecture is made possible by the co-sponsorship of Westat, the National Agricultural Statistics Service, and WSS.

The 2007 Roger Herriot Award, co-sponsored by WSS and the Government and Social Statistics Sections of ASA as an award for innovation in Federal statistics, was presented to Nancy Kirkendall at the 2007 Joint Statistical Meetings. A WSS session in the form of a panel discussion will be held this fall to honor Nancy.

A special President's Invited Seminar by Ron Wasserstein, ASA Executive Director, was held on the topic "What's Up at the ASA?" It was preceded by a reception.

This year's Holiday Party was again held at Gordon Biersch in downtown DC.

The Quantitative Literacy committee continued its important activities at the elementary, middle, and high school levels. Meetings have been held with Arlington Public Schools. WSS is now represented on the DC STEM (Science, Technology, Engineering, and Mathematics) Alliance, the DC Public Schools advisory group. WSS provided judges for the DC elementary schools science fair.

WSS continues its highly successful science fair judging for DC and surrounding counties at the middle and high school levels. The Gallup Organization provides funding for the science fair awards for outstanding projects in terms of their statistical content.

WSS sponsored its first member for the national honorary statistics fraternity, Mu Sigma Rho. If an applicant's college is not affiliated with Mu Sigma Rho, WSS can sponsor.

The WSS Poster Competition had another very successful year.

Under Tom Krenzke's leadership, the Curtis Jacobs Award Competition is being redesigned to better mesh with the school schedule.

The WSS History is being updated by our historian, Tom Mule. See http://www.cos.gmu.edu/~wss/ for the WSS History from 1896 to 2002.

Joe Maisog of Georgetown University, this year's student representative, contributed very innovative columns to the WSS Newsletter. If you missed them, you should take a look on the Web.

The "WSS Board Officers' and Committee Chairs' Handbook," first compiled last year, received its first update.

The Annual Dinner was held at Meiwah Restaurant in Chevy Chase. As is now customary, this year's speaker was the recipient of the Gertrude Cox Award, Professor Thomas Lumley of the University of Washington. His topic was "Open Source Statistical Software: Why? When? Where?" The Gertrude Cox Award is made possible by funding from RTI International.

The Julius Shiskin Award winners were William R. Bell of the U.S. Census Bureau and Robert M. Groves of the University of Michigan Institute for Social Research.

The Wray Jackson Smith Scholarship Award was presented to Ms. Kirsten Lum, then a student at American University. She is now a doctoral student in biostatistics at The George Washington University. The WSS cosponsors this award, the Government Statistics and Social Statistics Sections of ASA being the principal sponsors.

The WSS President's Awards were presented at the Annual Dinner to Yasmin Said, Hiro Hikawa, and Jose M. (Joe) Maisog. These were the first three Student Representatives to the WSS Board of Directors and through their efforts the student representation has become institutionalized as an important element of the Board's operations. Each one individually made unique contributions to WSS.

WSS again presented awards at the Annual Dinner to Outstanding Graduate students attending area universities.

Thanks also to the many Board members, whether appointed or elected, who worked so diligently on behalf of WSS. See any 2007-08 WSS Newsletter (available on the Web) for a listing of these individuals. It has been very gratifying that folks come forward and volunteer to get involved with WSS but, of course, more volunteers are always needed.

A goal has been to strengthen relations with the younger generations, both in terms of quantitative literacy and science fair efforts with school-age people, and increased involvement with WSS from young professionals. We made progress in these areas, but I would like to see more. I know the Board would love to get your ideas.

In talking with people, I have learned that many think WSS is expensive. Dues are less than $10 a year! (The exact amount depends on the category of membership.) Please spread the word. Not just statisticians but others who have a statistical component to their work should consider joining.

The "three Presidents" (President-Elect, President, and Past President) work as a team so I am very grateful to Jill Montaquila (Past President, 2007-08) and Karol P. Krotki (President-Elect, 2007-08) for their wonderful cooperation and advice. I also want to give special mention to WSS Secretary Chris Moriarity for his splendid help to me on numerous matters. The WSS is an extraordinary organization, and it has been an honor and a pleasure to serve as President.

Michael P. Cohen
Past President


Administrative Announcement

WSS needs your help to assure your contact information is up-to-date

WSS wants to do what we can to keep all WSS members informed, primarily by email. Vince Massimini sends out notices and the WSS monthly newsletter to WSS members via the WSS listserv. The WSS newsletter is available at the WSS website, http://scs.gmu.edu/~wss, shortly after newsletter editor Mike Feil finishes creating it.

WSS maintains a list of WSS member email addresses for the WSS listserv, independent of ASA. Many WSS members receive WSS email at a different address than the one on file with ASA. Thus, we do not change a member's email address in the WSS list unless we receive a request from the member, or there is a listserv delivery problem. Recently, there was a delivery problem with some email notices sent for the WSS election, even though we are not seeing delivery problems when WSS listserv messages are sent to most of those addresses. Thus, we can't be certain that an email address is valid just because it appears to be receiving WSS listserv messages.

If you change your email address with ASA, and you want WSS to also make this change, please tell us (Chris and Vince) - we don't do it automatically. If you are a WSS member, and you are not getting WSS listserv messages and you want to, please tell us. If you would like WSS to use a different email address for the WSS listserv than the one currently being used, please tell us, and we'll do an update.

Organizations periodically update spam and other filters that block messages from the WSS listserv. We generally have no way to know if the mail is being blocked. Typically, the WSS listserv sends out 8-10 messages per month (newsletter, meetings, employment opportunities, etc.). If you are not getting mail from the WSS listserv for a few weeks, please contact us to see if there is a difficulty. If there is a spam or other blocker problem at the organization, we will try to work with the member's IT folks to resolve the problem.

We obtain email addresses for new ASA/WSS members from ASA, so it's important that ASA has up-to-date contact information for you. If you're not sure if your contact information is current with ASA, you can check this online at the ASA website, using the "Members Only" function. If you didn't get an email notice recently from ASA about the ASA election, it's likely that ASA does not have a valid email address on file for you, and WSS probably doesn't either.

Thank you!
Chris Moriarity (cdm7@cdc.gov) and Vince Massimini (svm@mitre.org)


Administrative Announcement

Mailing Address Change

The mailing address for the Washington Statistical Society is now P.O. Box 2033, Washington, DC 20013, which is in the same building as BLS. The Suitland P.O. box mailing address will be retained for a limited time during the transition.


Administrative Announcement

Changes in the Board

The listing of the members of the Board of Directors and Committees (pdf) has been updated for the upcoming program year. Please contact the WSS Secretary as well as the editor of the WSS NEWS with any changes.


Federal Committee On Statistical Methodology Statistical Policy Seminar

Beyond 2010: Confronting the Challenges
November 18-19, 2008

The Ninth in a Series of Seminars Hosted by COPAFS
(The Council of Professional Associations on Federal Statistics)

Participants will include statisticians, economists, and managers, as well as other professionals in the broader statistical community who share an interest in keeping current on issues related to federal data.

Support Provided by:

  • Agency for Healthcare Research and Quality
  • Bureau of Economic Analysis
  • Bureau of Justice Statistics
  • Bureau of Labor Statistics
  • Bureau of Transportation Statistics
  • Energy Information Administration
  • Environmental Protection Agency
  • National Agricultural Statistics Service
  • National Center for Education Statistics
  • National Center for Health Statistics
  • Office of Research, Evaluation, and Statistics of the Social Security Administration
  • Statistics of Income Division of the Internal Revenue Service
  • U.S. Census Bureau
  • Science Resources Statistics/National Science Foundation

Topics:

  • Statistical Uses of Administrative Records in Federal Agencies
  • Case Studies in the Statistical Uses of Administrative Records
  • Cell Phones: The New Frontier in RDD surveys
  • New Perspectives and Practices on Non-Response Bias Analyses
  • Current Issues in Privacy and the Safekeeping of Personally Identifiable Information
  • Survey Respondent Incentives
  • Current Trends in Access to Restricted-Use Data
  • Development and Management of Human and Institutional Capital in Statistical Organizations
  • 2010 Census Experiments
  • Issues of Data Capacity and Statistical Quality to Support Modeling and Micro-simulation Efforts
  • Making Survey Processes More Robust in Response to Funding Reductions
  • Using Paradata to Improve the Management of Survey Costs

Keynote Address: Hermann Habermann, Consultant

Location: L'Enfant Plaza Hotel, 480 L'Enfant Plaza, S.W., Washington, D.C. 20024
Cost: $195.00 per person

For Further Information, Contact the COPAFS Office at:
Phone: 703-836-0404
Email: copafs@aol.com
Fax: 703-836-0406

The registration form is available at the COPAFS web site at: www.copafs.org


18th Annual Morris Hansen Lecture
October 28, 2008

Louis Kincannon, former director of the U.S. Bureau of the Census, will give the 18th Annual Morris Hansen Lecture on Tuesday October 28 at 3:30 P.M. in the Jefferson Auditorium of the Department of Agriculture's South Building (Independence Avenue SW, between 12th and 14th Street). The Hansen Lecture Series is sponsored by the Washington Statistical Society, Westat, and the National Agricultural Statistics Service (NASS).

The USDA South Building (Independence Avenue SW) is between 12th and 14th Streets at the Smithsonian Metro Stop (Blue Line). Enter through Wing 5 or Wing 7 from Independence Ave. (The special assistance entrance is at 12th & Independence). A photo ID is required.

Please pre-register for this event to help facilitate access to the building. After September 1, pre-register online at http://www.nass.usda.gov/morrishansen/. Additional information will appear in the October issue.


Students' Corner

I live in a large residential complex in Arlington named River Place. Here is a map of the grounds (2 MB PDF file), kindly provided by local realtor Judith Michaels. Although the four buildings of River Place are superficially very similar in outward appearance (e.g., all four are red-brick high-rises shaped in cross-section like a "plus" sign), they are not identical. For example, apartments in the East and South buildings tend to have scenic views of D.C., the Potomac River, and/or the Iwo Jima Memorial; such scenic views can add as much as $20,000 in value to the unit. The West building is the only one that doesn't have a garage, so residents of that building must park either outside the building or in the other buildings' garages. However, the West building is also the only building to have individual unit heat and air-conditioning controls. The South building has the fitness center shared among the four buildings, making it a good location for winter workouts. The 1,205 units of River Place are not evenly divided among the four buildings: the East is the largest and has twelve floors, the South has eleven, while the West and North each have only ten.

For several years I have been following a monthly listing of asking prices of River Place units that Judith posts in the mailroom and in the laundry room. Each monthly report lists about two dozen units, more or less. In addition to the asking prices, each report also gives the building (North, South, East, or West), the square footage, and the floor for each unit listed. I've been thinking that it would be interesting to perform a multiple linear regression analysis of this data, with asking price as the dependent variable, and square footage, floor, and building as independent variables. Such an analysis could suggest whether square footage and floor each have a significant effect on the asking price. One might expect that a greater square footage or a higher floor would tend to increase the asking price, that is, that these two effects would have positive regression coefficients. Similarly, multiple regression analysis may suggest whether there is a differential effect of Building (North, South, East, or West). Finally, a multiple regression analysis will generate a fitted linear model that might be used to estimate (predict) the asking price of a unit, given the square footage, floor above ground, and building.

Judith has kindly allowed me to present her data. So, here is a link to the data from the July 2007 Report, painstakingly typed into an Excel spreadsheet by your faithful Student Representative. (For ease in interpreting the results, prices are in thousands of dollars. And note that I left the spreadsheet cells unformatted, because formatted cells seem to cause SAS's IMPORT procedure some confusion.) As an exercise, write a SAS script that will read in the data and analyze it using multiple linear regression.

OK, I'll be generous. Here is a link to such a SAS script I wrote myself. And here are the results of running this SAS script to analyze the River Place asking price data:

        Multiple Regression Analysis of River Place Asking Prices from July 2007 Report   3
                                                                   18:55 Tuesday, August 5, 2008

                                       The GLM Procedure

Dependent Variable: Price   Price

                                              Sum of
      Source                      DF         Squares     Mean Square    F Value    Pr > F

      Model                        6     1562581.351      260430.225     505.60    <.0001

      Error                       24       12362.189         515.091

      Uncorrected Total           30     1574943.540

                       R-Square     Coeff Var      Root MSE    Price Mean

                       0.854547      10.18395      22.69562      222.8567


      Source                      DF       Type I SS     Mean Square    F Value    Pr > F

      Building                     4     1492738.310      373184.578     724.50    <.0001
      Floor                        1       11718.826       11718.826      22.75    <.0001
      SquareFootage                1       58124.215       58124.215     112.84    <.0001


      Source                      DF     Type III SS     Mean Square    F Value    Pr > F

      Building                     4       806.25045       201.56261       0.39    0.8128
      Floor                        1      3677.21452      3677.21452       7.14    0.0133
      SquareFootage                1     58124.21518     58124.21518     112.84    <.0001


                                                      Standard
          Parameter                   Estimate           Error    t Value    Pr > |t|

          Building      East       17.05317022     19.64758718       0.87      0.3940
          Building      North       6.46484397     19.69302591       0.33      0.7455
          Building      South       8.16986711     20.50842874       0.40      0.6939
          Building      West        7.39842423     22.00502622       0.34      0.7396
          Floor                     4.39987827      1.64673321       2.67      0.0133
          SquareFootage             0.33092510      0.03115252      10.62      <.0001

As a budding statistician, you should make sure that you understand every quantity shown in the analysis results above. Consult the SAS online documentation if necessary.
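As one way to start on that, here is a small Python sketch (not part of my SAS script) that recomputes several of the summary quantities from the sums of squares and degrees of freedom printed above. The formulas for Mean Square, F Value, Root MSE, and Coeff Var are the standard ones. The R-Square line is an assumption on my part: the printed value appears to be measured against the total corrected for the mean, even though the table shows an uncorrected total, so check the SAS documentation before relying on it.

```python
import math

# Values copied from the PROC GLM output above (prices in thousands of dollars).
model_ss, model_df = 1562581.351, 6
error_ss, error_df = 12362.189, 24
price_mean = 222.8567
n_obs = 30  # the Uncorrected Total row has 30 degrees of freedom

model_ms = model_ss / model_df             # Mean Square, Model row
error_ms = error_ss / error_df             # Mean Square, Error row
f_value = model_ms / error_ms              # F Value, Model row
root_mse = math.sqrt(error_ms)             # Root MSE
coeff_var = 100.0 * root_mse / price_mean  # Coeff Var, as a percentage
uncorrected_total = model_ss + error_ss    # Uncorrected Total

# Assumption: R-Square is computed against the total corrected for the mean.
corrected_total = uncorrected_total - n_obs * price_mean ** 2
r_square = 1.0 - error_ss / corrected_total

print(f"F Value   {f_value:.2f}")
print(f"Root MSE  {root_mse:.5f}")
print(f"Coeff Var {coeff_var:.5f}")
print(f"R-Square  {r_square:.6f}")
```

The recomputed values should agree with the printed output up to rounding of the inputs.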

Questions (some open-ended):

P-Values vs. Parameter Estimates. Among the three effects, Building, Square Footage, and Floor, which seems to have the greatest effect on the asking price? Would you use the p-values to answer this question? Or would the parameter estimates (regression coefficients) be more relevant? Perhaps we need to clarify what is meant by "greatest effect."

Sums of Squares. The SAS results show two sets of p-values, one using "Type I Sums of Squares" and the other using "Type III Sums of Squares." What is the difference between these two types of Sums of Squares? Which do you think would be better to use in this case? (Do a search on "Type III SS" in the SAS online documentation to find out!)

And what happened to the "Type II Sums of Squares"? Why does the SAS results report offer results based on Type I and Type III Sums of Squares but not Type II? Is there a Type IV Sums of Squares?
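To make the Type I versus Type III distinction concrete, here is a minimal Python sketch using only the standard library. The twelve units in it are hypothetical numbers invented for illustration, not Judith's data, and the helper names (solve, residual_ss) are my own. It treats Type I sums of squares as sequential (each term added after the ones listed before it) and Type III as partial (each term adjusted for all the others), which is what they reduce to in a main-effects model like this one with no interactions.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def residual_ss(columns, y):
    """Residual sum of squares from regressing y on the given columns (no intercept)."""
    if not columns:
        return sum(v * v for v in y)
    k = len(columns)
    xtx = [[sum(p * q for p, q in zip(columns[i], columns[j])) for j in range(k)]
           for i in range(k)]
    xty = [sum(p * v for p, v in zip(columns[i], y)) for i in range(k)]
    beta = solve(xtx, xty)
    fitted = [sum(beta[i] * columns[i][r] for i in range(k)) for r in range(len(y))]
    return sum((v - f) ** 2 for v, f in zip(y, fitted))


# Hypothetical units: (building, floor, square footage, price in $1000s).
units = [
    ("East", 10, 900, 320.0), ("East", 3, 520, 190.0), ("East", 7, 700, 262.0),
    ("North", 4, 716, 250.0), ("North", 9, 450, 185.0), ("North", 2, 610, 205.0),
    ("South", 11, 800, 305.0), ("South", 5, 480, 180.0), ("South", 8, 650, 248.0),
    ("West", 6, 560, 210.0), ("West", 1, 400, 140.0), ("West", 10, 750, 285.0),
]
price = [u[3] for u in units]

# Each model term owns one or more design columns (Building has four indicators).
terms = {
    "Building": [[1.0 if u[0] == b else 0.0 for u in units]
                 for b in ("East", "North", "South", "West")],
    "Floor": [[float(u[1]) for u in units]],
    "SquareFootage": [[float(u[2]) for u in units]],
}
order = ["Building", "Floor", "SquareFootage"]
all_columns = [c for name in order for c in terms[name]]
full_rss = residual_ss(all_columns, price)

# Type I: drop in residual SS as each term enters, in the order listed.
type1, so_far = {}, []
for name in order:
    before = residual_ss(so_far, price)
    so_far = so_far + terms[name]
    type1[name] = before - residual_ss(so_far, price)

# Type III (partial): drop in residual SS when the term is added last.
type3 = {}
for name in order:
    others = [c for other in order if other != name for c in terms[other]]
    type3[name] = residual_ss(others, price) - full_rss

for name in order:
    print(f"{name:14s} Type I = {type1[name]:12.3f}   Type III = {type3[name]:12.3f}")
```

Two things to look for in the printout: the Type I values add up to the model sum of squares, and the last term entered has the same Type I and Type III value. Both patterns also show up in the SAS output above.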

Intercept. In my call to PROC GLM, I used the NOINT option. Try running the analysis without this option. How does it change the interpretation of the results? Which way do you prefer?
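As a hint for what to expect, here is a hedged Python sketch of how the four building estimates might be re-expressed once an intercept is in the model. It assumes that the fitted values do not change (both parameterizations describe the same set of fitted models) and that the last building level, West, is treated as the reference level with its effect set to zero. Both are assumptions to verify against your own SAS run. The other parts of the output, such as the overall F test and the total sum of squares, should be expected to change as well.

```python
# NOINT estimates for the four buildings, copied from the output above
# (thousands of dollars).
base = {
    "East": 17.05317022,
    "North": 6.46484397,
    "South": 8.16986711,
    "West": 7.39842423,
}

# Assumption: West (the last level) becomes the reference level.
reference = "West"
intercept = base[reference]
building_effect = {b: value - intercept for b, value in base.items()}

print(f"Intercept: {intercept:.8f}")
for b, effect in building_effect.items():
    # intercept + effect recovers the NOINT estimate for that building
    print(f"Building {b:5s} effect: {effect:12.8f}")
```

Under these assumptions each building effect is a difference from West, so a test of a building estimate asks whether that building differs from West, not whether its "base price" differs from zero.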

Fitted Price. One could use the parameter estimates (regression coefficients) to form the following linear equation for a fitted (predicted) price:

Fitted Price = BaseBuilding + (0.33092510*SquareFootage) + (4.39987827*Floor)

where Building is one of North, South, East, or West, and

BaseNorth = 6.46484397
BaseSouth = 8.16986711
BaseEast = 17.05317022
BaseWest = 7.39842423

Try plugging in some values from the data into this equation, keeping in mind that the prices are in terms of thousands of dollars. How well does our linear model fit the data?
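If you would rather not do the arithmetic by hand, the fitted-price equation can be sketched in a few lines of Python. The coefficients are copied from the parameter estimates table above; the function name fitted_price and the example unit are my own choices for illustration.

```python
# Building-specific "base" terms and slopes, copied from the parameter
# estimates table above. Prices are in thousands of dollars.
BASE = {
    "East": 17.05317022,
    "North": 6.46484397,
    "South": 8.16986711,
    "West": 7.39842423,
}
PER_SQUARE_FOOT = 0.33092510
PER_FLOOR = 4.39987827


def fitted_price(building, square_footage, floor):
    """Fitted asking price, in thousands of dollars, from the linear model."""
    return BASE[building] + PER_SQUARE_FOOT * square_footage + PER_FLOOR * floor


# Example unit: North building, 716 square feet, 4th floor.
example = fitted_price("North", 716, 4)
print(f"Fitted price: about ${example:.1f} thousand")
```

Comparing such fitted values with the listed asking prices, unit by unit, gives you the residuals.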

The quantity I've called the "base price" of the East building is computed to be about 17.05, more than twice that of the other buildings! Each building at River Place is a separate corporate entity and is run by its own management, so perhaps the East management is especially good? Or maybe there's something about the physical building itself that increases its "base price"? Then again, note that all four of the base prices have large standard errors (relative to each parameter estimate), and none are statistically significant. Here is a link to the data from the July 2008 report, again painstakingly typed into an Excel spreadsheet by your faithful Representative. Run the same analysis on the July 2008 data. Is the special status of the East building seen in the 2008 data? How similar are the parameter estimates computed from the 2008 data to those computed from the 2007 data?

In the July 2007 data, the lowest square footage was 367 square feet, and the highest level was the 11th floor. Would it be okay to use the multiple regression equation to estimate the asking price of units below 367 square feet, or above the 11th floor?

Independence. Based on what you might know about the real estate market, is the assumption of independence of observations a good assumption in this analysis? Do you know of any good methods for testing dependence/independence of observations? If the assumption of independence doesn't hold, then the p-values we were examining in the first question (P-Values vs. Parameter Estimates) are in question.

Assume for the moment that the assumption of independence is true. Then we can feel more confident about the p-values generated in the analysis; in particular, the p-value for the overall fit of the model is less than 0.0001, indicating that the model fits the data well. This in turn implies that two similar units (e.g., both in the North building, both on the 4th floor, both 716 square feet in area) would tend to have similar prices, as predicted by the linear equation in the Fitted Price question above. I.e., if one unit has a high asking price, then the other unit will also tend to have a high asking price. Does this contradict the assumption of independence that we just assumed to be true?

Time Effect. Combine the 2008 data with the data from the July 2007 Report into one MS Excel spreadsheet. Then add a new column entitled Year, and in this column type in the year for each observation, '2007' or '2008'. Modify the SAS script to include year as an effect of interest. Caution: you will need to instruct SAS to handle year as a categorical variable, not as a numeric value! Does there seem to be a significant effect of year? Is the 2008 market 'down' compared to 2007? Is it okay to assume that the data from the two years are independent? The July 2007 report had 30 observations, while the July 2008 report has only 18; is the differing number of observations a problem for this analysis?
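To see what "categorical" means for the model, here is a short Python sketch of the kind of indicator (dummy) coding a categorical year implies. The two units are hypothetical placeholders, not rows from either report, and the function name design_row is my own. In SAS, listing a variable in a CLASS statement is the usual way to request this kind of coding.

```python
BUILDINGS = ["East", "North", "South", "West"]


def design_row(building, floor, square_footage, year):
    """One design-matrix row: building indicators, floor, square footage, and a 2008 indicator."""
    building_part = [1.0 if building == b else 0.0 for b in BUILDINGS]
    # Year is a label here, so it enters as a 0/1 indicator, not as the number 2008.
    is_2008 = 1.0 if year == "2008" else 0.0
    return building_part + [float(floor), float(square_footage), is_2008]


# Hypothetical units standing in for the two reports: (building, floor, square footage).
report_2007 = [("North", 4, 716)]
report_2008 = [("East", 9, 540)]

combined = ([(b, f, s, "2007") for b, f, s in report_2007]
            + [(b, f, s, "2008") for b, f, s in report_2008])
rows = [design_row(*unit) for unit in combined]

for row in rows:
    print(row)
```

With this coding, the coefficient on the last column is the estimated shift in asking price (in thousands of dollars) for 2008 relative to 2007, holding building, floor, and square footage fixed.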

Theoretically, one could take several years' worth of consecutive monthly data, and then add Month as another effect to the model; perhaps one would find that, e.g., prices wax in the summer but wane in the winter. Again, do you foresee any problem with independence of observations across time? For example, if a unit was on the market in June, would it be problematic for the assumption of independence if the very same unit were again on the market in July? This would bring us into the domain of repeated measures, in which case one might need to do a mixed effects analysis (PROC MIXED).

Alternative Analyses. Try other sorts of analysis on this data. For example, you might try a Principal Components Analysis (PCA) or Independent Components Analysis (ICA) to obtain several orthogonal (or, in the case of ICA, independent) components. Also, I have recently learned of a relatively new matrix decomposition technique called Non-Negative Matrix Factorization (NMF); perhaps you can look up this method and try it on this data. How might you interpret the components/factors that such decomposition methods generate? With these decomposition methods, is it possible to obtain p-values, to perform a formal statistical inference?

A caveat. While the analysis we have performed here may be good for a student exercise, it's important to keep in mind that an actual buyer would factor in many other considerations, too. In other words, I wouldn't recommend making a buy/sell decision solely on the simple multiple regression analysis presented here!

And finally, please allow me a tip of the hat to Judith, without whom this column would not have been possible. If you're interested in purchasing a unit at River Place, call Judith by phone at 202-352-8200. Or, email her at: myrealtor@judithmichaels.com. Her website is: http://www.judithmichaels.com.

I believe that this is the last installment of the Students' Corner that I'll be writing as Student Representative to the Washington Statistical Society. The next Student Representative will be taking the reins of this column next month. It has been my pleasure to have served you, my fellow students, this past year. And always remember: statistics is fun!

Jose M. Maisog
Georgetown University / Medical Numerics, Inc.


Note From The WSS NEWS Editor

Items for publication in the October issue of the WSS NEWS should be submitted no later than September 12, 2008. E-mail items to Michael Feil at michael.feil@usda.gov.


Click here to see the WSS Board Listing (pdf)