Washington Statistical Society

April 2008


Annual Dinner

This year's WSS Annual Dinner will be held at the MeiWah restaurant in Chevy Chase, Maryland (Friendship Heights Metro), on Wednesday, June 25, 2008. The Gertrude Cox Award winner is Dr. Thomas Lumley of the University of Washington. Dr. Lumley will speak at the dinner; the title of his talk is to be announced. The price of the dinner is $45 per person.


2008 Wray Jackson Smith Scholarship

Applications due by April 15, 2008! The Government Statistics Section (GSS) and Social Statistics Section (SSS) of ASA are pleased to announce the availability of a scholarship in memory of Wray Jackson Smith, a founding member of the GSS and long-time contributor to Federal statistics. The Wray Jackson Smith Scholarship (WJSS), co-sponsored with the Washington Statistical Society, the Caucus for Women in Statistics, Harris-Smith Institutes, Mathematica Policy Research, and Synectics for Management Decisions, Inc., is intended to reward promising young statisticians for their diligence, thereby encouraging them to consider a future in government statistics. Everyone is encouraged to seek out promising candidates and to urge them to apply.

Type of Project

The WJSS Award provides funding of $1,000 for use in exploring any of a broad number of opportunities for furthering the development of a career related to government statistics. Applicants are encouraged to be creative in seeking support for a wide variety of uses, including:

  • Tuition, board, and books for courses or short courses
  • Conference attendance
  • Purchase of books, software, data sets, or other supporting materials for research projects related to government statistics.
Activities may relate to any level of government, including Federal, state, and local governmental units. They must be statistical in nature, focusing on data, methodology, analysis, or data presentation. Recent award winners have used the WJSS to fund attendance at the Joint Statistical Meetings, to support continued public policy research, and to take short courses to better understand and analyze data for current research.

Application

To apply for a WJSS Award, the following information must be sent to the Wray Jackson Smith Scholarship Committee by April 15, 2008:

  • A completed WJSS Application Form
    (see http://www.amstat.org/sections/sgovt/ for the current year's form, and click on the format you want to use)
  • A proposal of activity to be funded
  • Academic transcript (for current/recent students) or job performance reviews for the past 2 years (for non-students) or equivalent proof of superior academic and/or professional performance
  • Two letters of recommendation.

Please send materials to:

Wray Jackson Smith Scholarship Committee
    c/o Michael P. Cohen
    1615 Q Street NW #T-1
    Washington DC 20009-6310 USA

or electronically to: mpcohen@juno.com

Selection Process

The WJSS Committee, consisting of a total of three GSS and SSS members, will review each proposal, based on an established rating scheme, and select the awardee. Each application will be judged based on the following criteria:

  • Stage in Career
  • Past Performance
  • Quality of the Proposed Activity
  • Relevance of Activity to Government Statistics
  • Innovation/Ingenuity of the Proposed Project
  • Feasibility of Completion of Activity
  • Two Letters of Recommendation

The awardees will be announced by June 1, 2008. All applicants will be notified by e-mail.

Eligibility

The WJSS is targeted at students and persons early in their career in government statistics. Applicants must have a Bachelor's degree or equivalent level of education. Membership in the Government Statistics Section, Social Statistics Section, or in the ASA is not required. For more information, contact Mike Cohen by e-mail: mpcohen@juno.com

Wray Jackson Smith Scholarship Committee

The Committee for 2008 consists of Michael P. Cohen (Chair) [mpcohen@juno.com], Robert A. Kominski [Robert.A.Kominski@census.gov], and Stephen Campbell [Stephen.Campbell@nist.gov]. The Committee members thank Juanita Tamayo Lott for her invaluable advice and assistance.


Girl Scout Science Day
Udvar-Hazy Center
National Air and Space Museum
Smithsonian Institution

The Girl Scout Council of the Nation's Capital sponsored a Girl Scout Science Day on March 8 at the Steven F. Udvar-Hazy Center of the National Air and Space Museum. Over 40 groups, representing professional societies, government agencies, science clubs, etc., brought in hands-on educational exhibits designed to introduce principles of general aviation, science, statistics, environmental science, etc., to Girl Scouts ranging in age from Daisies (kindergarten) to Seniors (high school). In addition to the more than 2000 Girl Scouts and family members registered for the event, the Center was also open to the public.

The Washington Statistical Society (WSS) booth, staffed by Todd Blessinger, Stephine Keeton, Jurate Landwehr, Carl Landwehr, and Anna Nevius, introduced the probabilistic concepts of a "fair" versus an "unfair" game. Scouts were invited to roll one of two types of large foam dice. The "fair" dice had all smooth sides, whereas the "loaded" dice had one corner cut out. After rolling a die, each Scout was invited to register her result on a poster board by placing a sticker in a column above the appropriate die face number, creating an evolving bar chart. (Results were also compiled in a computer program.)

Much to the chagrin of the WSS members, at the end of the day, with a total of 276 and 278 rolls recorded for the fair and loaded dice, respectively, the two distributions looked about the same! Various conjectures as to why this might be included: the dice were not manufactured with exact precision, so the fair dice were not really fair; Scouts were dropping the dice rather than rolling them, so the experiment was not consistently conducted; or the sample size was not large enough.

In any case, the WSS group enjoyed interacting with the girls and their leaders and parents. The group also passed out flyers to interested parents and leaders about the ASA poster contest for K-12 and the WSS special project contest for middle school and high school students. In addition, the Girl Scouts provided a lovely free box lunch and cookies to all volunteers. Plus, the Center is just a great place to spend the day. The Girl Scout Council will probably sponsor another Girl Scout Science Day in two years. If you are interested, contact Carolyn Carroll, WSS QL (Quantitative Literacy) Coordinator.
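For readers who would like to go beyond eyeballing the sticker chart, one possible way to compare a die's rolls against a fair die is Pearson's chi-square goodness-of-fit statistic. The sketch below is only an illustration: the function name chi_square_uniform is made up for this example, and since the per-face counts from the booth are not reported here, any counts passed to it are the reader's own. For a six-sided die the statistic would be compared against a chi-square distribution with 5 degrees of freedom, whose 5% critical value is about 11.07.

```cpp
#include <cstddef>
#include <vector>

// Pearson chi-square statistic for observed face counts, tested against
// the hypothesis that every face is equally likely.
// (Hypothetical helper written for this illustration.)
// Returns 0.0 if there are no faces or no rolls.
double chi_square_uniform(const std::vector<int>& counts)
{
    if (counts.empty()) return 0.0;

    long total = 0;
    for (std::size_t i = 0; i < counts.size(); ++i) total += counts[i];
    if (total == 0) return 0.0;

    // Under fairness, each face is expected to appear total / faces times.
    const double expected = static_cast<double>(total) / counts.size();

    double statistic = 0.0;
    for (std::size_t i = 0; i < counts.size(); ++i) {
        const double diff = counts[i] - expected;
        statistic += diff * diff / expected;
    }
    return statistic;
}
```

A six-element vector of sticker counts, one entry per die face, would be passed in for each die; a statistic well below the critical value gives no evidence against fairness at that sample size.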


A Two-Day Workshop on
Bayesian Methods That Frequentists Should Know
The University of Maryland Statistics Consortium
College Park, April 30 - May 1, 2008

Co-sponsors:
The University of Maryland Statistics Consortium
Office of Research and Methodology, National Center for Health Statistics, CDC
Survey Research Methods Section of the American Statistical Association
Washington Statistical Society

The main purpose of the workshop is to assess the current state of usage of Bayesian methodology in different disciplines and to discuss potential issues preventing the application of Bayesian methods. The workshop will highlight methods that have broad interest and appeal, cutting across the Bayesian/Frequentist divide.

The two-day program will consist of six plenary sessions, a pair of general lectures (the Statistics Consortium Distinguished Lectures) in a special afternoon session on Wednesday, April 30, and a Poster Session to be held during a general Reception immediately following the general lecture session. Each plenary session consists of a 45-minute to 1-hour lecture with a formal discussion wherever possible, followed by floor discussion.

The confirmed participants of the plenary sessions and general lectures are: James O. Berger (Duke University), Snigdhansu Chatterjee (University of Minnesota), Malay Ghosh (University of Florida, Gainesville), Stephen Fienberg (Carnegie Mellon University), Roderick J.A. Little (University of Michigan, Ann Arbor), Carl N. Morris (Harvard University), J.N.K. Rao (Carleton University) and Alan M. Zaslavsky (Harvard University).

Posters that are related to the theme of the workshop will be accepted, subject to space constraints. Please visit the workshop web site http://www.jpsm.umd.edu/stat/workshop for detailed information on the workshop, on the Statistics Consortium Distinguished Lectures, and on submission of abstracts for posters. There is no registration fee for attending the workshop, the Statistics Consortium Distinguished Lectures, or the reception. We strongly request that you indicate your interest by completing the registration form, which can be downloaded from the website, and sending it to statcons@math.umd.edu or to: Eric Slud, Statistics Consortium, Mathematics Department, Mathematics Building, University of Maryland, College Park, MD 20742, USA, by March 15, 2008.


Students' Corner

Jill Montaquila, the former president of the Washington Statistical Society, recently pointed me to a website that lists job opportunities for statisticians. If you're looking for a job, consider trying out this website: http://jobboard.casro.org. Thanks, Jill!


A friend of mine, Doug Galbi, suggested that I check out a website named "Many Eyes". This website encourages users to upload data for other users to visualize. (To me, "visualize" suggests exploratory analyses.) I thought that this might be interesting to a student of statistics, especially since there may be interesting data sets available through the website. http://services.alphaworks.ibm.com/manyeyes/home. Thanks, Doug!


A recent issue of Wired magazine had an article about a real-world application of pattern classification (http://www.wired.com/techbiz/media/magazine/16-03/mf_netflix). It is about the NetFlix Prize; see http://www.netflixprize.com/.

This was of particular interest to me because I am currently taking a class in Pattern Recognition. Perhaps you can try your hand at this competition and win a million dollars. You could then use the money to pay your tuition!


A classmate of mine recently mentioned that she is learning to program in C++. This delighted me, since I sometimes use C++ in my work, and I have wondered whether C++ programming would be useful to statisticians in general. After all, one could always program in SAS, MATLAB, or R. But my classmate reports that many job advertisements she has seen do in fact mention C++ programming skills as desirable, if not required.

So, I thought I'd present a tutorial on getting started with scientific programming in C++. The original tutorial that I drafted on this topic was too long, so I decided to split it into two. This month I'll present the first half, in which we'll first install Dev-C++, a full-featured C++ integrated development environment. Next, we'll compile a small "Hello World" program to demonstrate the basics of compiling a program in Dev-C++. Finally, we'll download the source code for Newmat, a C++ implementation of useful linear algebra classes and functions, and then use Dev-C++ to compile it into a statically linked library.

In my earlier tutorial on LaTeX, the intent was not to have you become a LaTeX expert in one day, but simply to introduce you to two free tools to get started, MiKTeX and TeXnicCenter. Similarly, my intent here is not to turn you into a C++ expert in one day (and I myself do not claim to be a C++ expert), but simply to showcase two free tools to get started. As with the earlier LaTeX tutorial, I have chosen to do this tutorial under a Windows environment rather than, say, a Linux environment, simply because I suspect that most of us students still have readier access to Windows machines than to other sorts of computers. And as with the earlier tutorial, you'll need administrator privileges to install Dev-C++ on a computer. If you don't have such privileges, you'll need to ask your system administrator to do the installation for you.

By necessity, I'll assume that you have some basic familiarity with Windows, such as how to navigate through folders, how to click-and-drag an icon, and how to copy and duplicate files. I'll use the convention that, e.g., <Control-F9> means to press the Control key, and while keeping it depressed, press the F9 key; then let go of both keys. Another convention I'll use is that if I say to select something like File --> New --> Project . . . within Dev-C++, it means to go to Dev-C++'s main menu, select File, which makes further options available; from these further options select New, which makes further suboptions available; and from these suboptions select Project. Also, I will use the words folder and directory interchangeably.

  1. Install Dev-C++ (Programming Environment)

    I'll assume here that you don't have an older copy of Dev-C++ or the GCC compiler already installed on your computer. Go to this webpage: http://www.bloodshed.net/dev/devcpp.html. Towards the bottom of the page, under Downloads, you'll see three entries with "SourceForge" hyperlinks. Click on the "SourceForge" hyperlink under the entry labeled "with Mingw/GCC", since we'll need the GCC compiler. This will cause you to download the Dev-C++ installation file, perhaps to your desktop; it will have a name like devcpp-4.9.9.2_setup.exe (the exact name may vary depending on the current version). After the file is downloaded to your computer, double-click on the file and follow the installation wizard's instructions. Again, you'll need to have administrator privileges to perform this software installation. I'd recommend you install Dev-C++ into the default directory, C:\Dev-Cpp.

  2. Create Folders

    To prepare for the rest of this tutorial, go into some folder where you have write privileges, such as your home folder (e.g., C:\Documents and Settings\XXX, where XXX is your login name), and create a folder there named C++. Then go into the newly created C++ folder, and in it create three folders named helloworld, newmat, and matrixdemo.

  3. Create New C++ Project: Hello World!

    OK, we're ready to try programming in Dev-C++. As Kernighan and Ritchie wrote many years ago, the first program in any language is the same: it's a program to write a simple message to the computer screen (Kernighan and Ritchie, 1978).

    1. Invoke Dev-C++ by clicking on the Start button in the lower left corner of the Windows desktop, and then selecting All Programs --> Bloodshed Dev-C++ --> Dev-C++. (You might also be able to invoke Dev-C++ through your Quick Launch bar or your Start Menu, or by browsing to the Dev-C++ installation folder C:\Dev-Cpp and double-clicking on the file named devcpp.exe found therein.) You will be greeted by a "Tip of the Day" window; read the tip, then dismiss the window by clicking on the Close button. The Dev-C++ window will look something like this:
      [Figure: Dev-C++ window]
      Most of the window area is currently occupied by a featureless gray area. When we create or add files to a project, a tabbed editor pane will appear here, one tab per file. (You can think of a "project" as being components necessary to be compiled and put together to create a program or library.) Note the white area to the left, which will enable us to browse through files within the project. Along the bottom is an area, currently empty, for logs and system messages. Along the top is a bank of convenient buttons. And way at the top is the main menu, including familiar options such as File, Edit, and Help, among others.

    2. In Dev-C++, select File --> New --> Project . . . This will pop up the New project window.
      [Figure: Dev-C++ new project window]
    3. In the New project window, select the icon labeled Console Application. Type in helloworld for the Name in the lower left corner, and in the box in the lower right corner select C++ Project. Then click on the Ok button. This pops up a window titled Create new project.
      [Figure: Create new project window]
    4. In the Create new project window, browse into the helloworld folder you created in Step 2 above. After you've browsed to the proper folder, "helloworld" should be displayed in the "Save in" box at the top of the window, as seen in the preceding figure. Accept the default filename, helloworld.dev, and click on the Save button. This dismisses the Create new project window. A tabbed editor pane entitled [*] main.cpp now appears in the Dev-C++ window (see figure below); the [*] indicates that the pane's content has not been saved yet. This tabbed pane is an editor which allows you to edit code in C++.

    5. Note that Dev-C++ has already inserted a skeletal C++ program in the [*] main.cpp tabbed pane, to get us started. This bare-bones program pops up a DOS command-line window with a prompt, but otherwise does nothing. We could compile and run this program right now, but instead let's edit it to make the program print out the traditional "Hello World!" message to the computer screen. Add the line
          cout << "Hello World!" << endl;
      just after the opening curly bracket (those are double quotes around Hello World!), so that the program now looks like this:
      [Figure: Hello World window]
      The << operator stuffs the character string "Hello World!" and an end-of-line (endl; that's a small "L" at the end, not a numeral "1") into a C++ object called cout, which causes "Hello World!" to be printed to the computer screen. Note that the command is terminated with a semi-colon.

    6. Click on the Save button, which looks like an icon of a floppy disk (yes, I believe they are still in use). This pops up the Save File dialog box. Save the file with its default name main.cpp to the helloworld folder.
      [Figure: Save file window]
    7. Then, select Execute --> Compile and Run. Or equivalently, press the F9 key as a shortcut. A black DOS window should appear with the message "Hello World!", followed by a message "Press any key to continue". (I'm not showing a picture of this window because it is otherwise rather boring. Also, it would be wasteful of ink if this tutorial were printed out.) This means that it worked! To dismiss this window, make sure that it is selected, by clicking on it if necessary, and then press any key on the keyboard. Dev-C++ requires you to dismiss this window before you can compile and run again. So, if you find that the Compile and Run option is unavailable, it could be that you forgot to dismiss the DOS window from an earlier run.

      Since we're compiling a very simple program, it should compile and run without problem. But in practice, if there's some error in your code, the compiler will report errors in the log area at the bottom of the Dev-C++ window, pinpointing the exact lines in the file where errors are found. Correct the errors, and then try a Compile and Run again.

    8. The Compile and Run step created a new binary executable file called helloworld.exe in the helloworld folder. In other words, it took the human-readable source code in the file main.cpp and translated it into a binary file, helloworld.exe, that your PC knows how to run. You can re-run this program without recompiling it by pressing within Dev-C++. You can also run it outside of Dev-C++ by bringing up a Windows folder browser, browsing to the helloworld folder, and then double-clicking on the file labeled helloworld.exe. Yet another way to invoke the program is to bring up a DOS command-line window, navigate to the helloworld folder in the DOS window, and then type helloworld at the DOS prompt, followed by a carriage return.
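To make the description of the << operator a little more concrete, here is a small sketch of the same idea written as stand-alone functions. The function names (write_greeting, write_greeting_stepwise, greeting_text) are made up for this illustration and are not part of the program that Dev-C++ generates. The names are spelled std::cout-style so that the sketch does not depend on a using-directive.

```cpp
#include <iostream>
#include <ostream>
#include <sstream>
#include <string>

// Writes the greeting to any output stream. Passing std::cout sends it
// to the console, which is what the one-line edit in the tutorial does.
void write_greeting(std::ostream& out)
{
    // Each << returns the stream, so insertions can be chained.
    out << "Hello World!" << std::endl;
}

// Same output, written as two separate insertions.
void write_greeting_stepwise(std::ostream& out)
{
    out << "Hello World!";
    out << std::endl;
}

// Captures the greeting in a string instead of printing it, which makes
// it easy to check exactly which characters were written.
std::string greeting_text()
{
    std::ostringstream buffer;
    write_greeting(buffer);
    return buffer.str();
}
```

Calling write_greeting(std::cout) from inside main prints the message to the DOS window; the chained and stepwise versions write exactly the same characters.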

  4. Obtain Newmat (Source Code for Linear Algebra)

    The point of this exercise is to provide tools for getting started in scientific programming. Thus far, we've gotten a taste of the "programming" part. We now need to address the "scientific" component. I have heard it said that the "natural language" of statistics is linear algebra. In this step, we'll obtain the source code for a C++ package that provides linear algebra classes, so that you can instantiate objects such as a Matrix or a ColumnVector and perform standard linear algebra operations on them, e.g. matrix multiplication. Here's how to obtain the Newmat source code.

    1. Go to Dr. Robert Davies' download page, http://www.robertnz.net/download.html, and click on the link labeled "newmat11.zip". This will download the Newmat 11 source code in a single .zip file somewhere on your computer, possibly on your desktop.

    2. Double-click on the .zip file. You should see about 96 files, most with names ending in either .cpp or .h.

    3. Select all of the files and copy them into the newmat folder you created in Step 2 above.
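To give a feel for the kind of operation a linear algebra package takes care of for you, here is a minimal sketch of matrix multiplication in plain C++. It deliberately does not use Newmat: the Mat type and the multiply function are hypothetical names introduced only for this illustration, and a real package would offer far more (and would be better tested) than this.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical stand-in for a matrix type: a vector of equal-length rows.
typedef std::vector<std::vector<double> > Mat;

// Returns the product a * b. Throws std::invalid_argument if the number
// of columns in a row of a differs from the number of rows in b.
// Assumes every row of b has the same length.
Mat multiply(const Mat& a, const Mat& b)
{
    const std::size_t n = a.size();              // rows of a
    const std::size_t k = b.size();              // rows of b
    const std::size_t m = k ? b[0].size() : 0;   // columns of b

    Mat c(n, std::vector<double>(m, 0.0));
    for (std::size_t i = 0; i < n; ++i) {
        if (a[i].size() != k)
            throw std::invalid_argument("inner dimensions do not match");
        for (std::size_t j = 0; j < m; ++j)
            for (std::size_t p = 0; p < k; ++p)
                c[i][j] += a[i][p] * b[p][j];
    }
    return c;
}
```

With a package that provides matrix classes, the triple loop is hidden behind the class, which is much of the appeal of not writing this sort of code by hand.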

  5. Create New C++ Project: Statically Linked Library

    If you have a set of useful functions that will be used over and over again in different programs, it might be useful to collect them all into a library. Then you won't have to re-compile these functions over and over again each time you build new programs; you'll need to compile only the new programs, and then link them to the library to access the pre-compiled functions. In this step, I'll demonstrate how to build a library from the Newmat source code we downloaded in Step 4.

    1. Create a new Dev-C++ project by repeating sub-steps 2 and 3 of Step 3, but select the icon labeled Static Library rather than Console Application. Name the project newmat. Save the project into the newmat folder we created in Step 2; accept the default project name of newmat.dev.

    2. In Dev-C++, select Project --> Add to Project. This pops up the Open File window.

    3. Browse to the newmat folder and select all the .h and .cpp files, EXCEPT nm_ex1.cpp, nm_ex2.cpp, example.cpp, nl_ex.cpp, sl_ex.cpp, garch.cpp and test_exc.cpp, and EXCEPT all files whose names begin with "tmt". Here's how to do this in an efficient way.

      1. Select All Files. Left-mouse-click on any of the files listed in the Open File window, so that it is highlighted in blue. Then type <Control-A> to select all the files. All of the filenames will now be highlighted in blue.

      2. Deselect certain files. Press the Control key. While holding the Control key down, click on all files whose names begin with "tmt"; this de-selects them. Also while pressing down the Control key, click on the following files: nm_ex1.cpp, nm_ex2.cpp, example.cpp, nl_ex.cpp, sl_ex.cpp, garch.cpp and test_exc.cpp, to de-select them too. Leave all other files selected. (I know it's a little tedious; please bear with me here.) If you accidentally unselect a file that you shouldn't have, you can re-select it by clicking on it while still pressing the Control key. If you make a mistake and feel like you have to start all over, just return to the Select All Files sub-step above. When you're through, the result will look something like this:
        [Figure: List of files window]
      3. Click on the Open button. All of the selected files will then be added to the newmat project. The files should now be listed in the white pane along the left side of the Dev-C++ window. Note that the files in a project don't have to be located in the same folder as the project; in this case, the files just happened to be kept in the same newmat folder as the project, just for convenience.

      4. Press <Control-F9> to compile the static library. If all goes well, the message "creating newmat_lib.a" should appear in the log at the bottom of the Dev-C++ window, under Message. The Dev-C++ window should now look like this:
        [Figure: List of files window]

And that's all there is to it! This concludes part I of this tutorial.

  • Suggestions for Further Exploration
    1. Learn more about Dev-C++'s functionality. Within Dev-C++, select Help --> Help on Dev-C++ to bring up the online manual, and from there you'll be able to learn more about this feature-rich programming environment. Next month, I'll give you a brief glimpse of the debugger in Dev-C++.

    2. Learn more about C++. There are many sources out there; I am sure that you can find many free tutorials, e-books, and even video courses online. The O'Reilly book I reference below may be a good place to start.

    3. If you have no experience in programming in either C or C++, try the C programming course that comes packaged with Dev-C++. After selecting Help --> Help on Dev-C++ in Dev-C++, look under the Contents tab for the item labeled An Introduction to C Programming, and then double-click on it. C++ is an extension of C, so anything you learn about programming in C will be very relevant to programming in C++.

    4. Do some online research and try to determine the difference between a statically linked library and a dynamically linked library (DLL). What are the advantages and disadvantages of each sort of library?

    That's all for this month. Next month, we'll make a small program that statically links the Newmat library we just created, and that demonstrates some of the possibilities with Newmat and with C++ in general. If you have any feedback on this column or ideas for future topics, please email me at jmm97@georgetown.edu.

    As always, your thoughts will be greatly appreciated.

    Joe Maisog
    Georgetown University / Medical Numerics

    With thanks to Lanlan Yin for test-driving this tutorial; any errors of course remain my fault.

    References

    Brown D and Satir G, C++: The Core Language, Cambridge, MA: O'Reilly, Inc., 1995.

    Dev-C++ website, http://www.bloodshed.net/dev/devcpp.html

    Kernighan BW and Ritchie DM, The C Programming Language, Englewood Cliffs, NJ: Prentice Hall, 1978.


    SIGSTAT Topics for Spring 2008

    April 16, 2008: Survival Models in SAS: PROC PHREG - Part 1
    (http://www.sas.com/apps/pubscat/bookdetails.jsp?pc=55233)

    Continuing the series of talks based on the book "Survival Analysis Using the SAS System: A Practical Guide" by Paul Allison, begun in October 2007, we'll start Chapter 5: Estimating Cox Regression Models with PROC PHREG. Topics discussed are:

    1. The proportional hazards model
    2. Partial likelihood
    3. Tied data

    May 21, 2008: Survival Models in SAS: PROC PHREG - Part 2
    (http://www.sas.com/apps/pubscat/bookdetails.jsp?pc=55233)

    Continuing the series of talks based on the book "Survival Analysis Using the SAS System: A Practical Guide" by Paul Allison, begun in October 2007, we'll continue Chapter 5: Estimating Cox Regression Models with PROC PHREG.

    Topics covered are: Tied data

    June 18, 2008: Survival Models in SAS: PROC PHREG - Part 3
    (http://www.sas.com/apps/pubscat/bookdetails.jsp?pc=55233)

    Continuing the series of talks based on the book "Survival Analysis Using the SAS System: A Practical Guide" by Paul Allison, begun in October 2007, we'll continue Chapter 5: Estimating Cox Regression Models with PROC PHREG.

    Topics covered are: Time-Dependent Covariates

    SIGSTAT is the Special Interest Group in Statistics for the CPCUG, the Capital PC User Group, and WINFORMS, the Washington Institute for Operations Research Service and Management Science.

    All meetings are in Room S3031, 1800 M St, NW from 12:00 to 1:00. Enter the South Tower & take the elevator to the 3rd floor to check in at the guard's desk.

    First-time attendees should contact Charlie Hallahan, 202-694-5051, hallahan@ers.usda.gov, and leave their name. Directions to the building & many links of statistical interest can be found at the SIGSTAT website, http://www.cpcug.org/user/sigstat/.


    Note From The WSS NEWS Editor

    Items for publication in the May issue of the WSS NEWS should be submitted no later than April 15, 2008. E-mail items to Michael Feil at michael.feil@usda.gov.


    Click here to see the WSS Board Listing (pdf)