### June 2008

Contents:

- Annual Dinner
- Book Signing at Reiter's Books
- Seminars, Conferences, Symposia & Call For Papers:
  - Upcoming Seminars
  - NCHS Data Users Conference (August 11-13, 2008)
- Education Announcements:
  - Fall 2008 Graduate Courses (George Washington University)
  - Students' Corner
  - Short Courses (includes JPSM short courses)
  - SIGSTAT Topics for Spring 2008
- Employment Opportunities
- Note From The WSS NEWS Editor
- WSS People
- PDF Versions (requires Adobe Acrobat Reader):
  - Newsletter
  - Area Meetings and Courses

### Annual Dinner

This year's WSS Annual Dinner is at the Meiwah restaurant in Chevy Chase, Maryland (Friendship Heights Metro), on Wednesday, June 25, 2008. The Gertrude Cox Award winner is Dr. Thomas Lumley from the University of Washington. Dr. Lumley will speak at the dinner about open source statistical software. The price for the dinner is $45 per person.

Thomas Lumley: "Open source statistical software: how, why, where?"

Over the current decade the R statistics environment and its package system have gone from being too obscure to be worth citing to being too well-known to need citing. I will talk about some current and historical issues in the design and development of R and of my R 'survey' package. I will also discuss how open-source statistical software fits (or fails to fit) the standard explanations for the success of the open-source model.

Downloads:

flyer (pdf)
dinner program (pdf)

Thomas Lumley's keynote presentation (pdf)

### Book Signing at Reiter's Books

Wednesday, June 11th 12:00 noon-2:00 pm

Reiter's Books

1990 K Street NW

Washington DC

Light refreshments (wine, cheese, and soft drinks) provided.

Wendy Alvey and Fritz Scheuren

Elections and Exit Polls

The newly-released book — Elections and Exit Polling, edited by Fritz Scheuren and Wendy Alvey — is a tribute to Warren Mitofsky, the father of exit polling. The volume, just published by John Wiley & Sons, Inc., consists of excerpts from interviews with Mitofsky shortly before he died and selected readings from recent statistical research related to election polling and exit polling.

Combining wisdom from one of the most notable names in the field along with findings from modern research and insightful recommendations for future practices, Elections and Exit Polling is an excellent supplement for political science and survey research courses at the upper-undergraduate or graduate level. It is also a one-of-a-kind reference for pollsters, survey researchers, statisticians, and anyone with a general interest in the methods behind global elections and exit polling.

All royalties from the book have been donated to the Warren J. Mitofsky Award for Excellence in Survey Research. This award is being co-sponsored by the American Association for Public Opinion Research and the American Statistical Association and is being managed by The Roper Center for Public Opinion Research. The award recognizes outstanding research or reporting of public opinion or survey methodology, especially work based on data from The Roper Center's public opinion archives.

The chapter titles are:

- Introduction, by Fritz Scheuren and Wendy Alvey
- The Infamous 2000 Election
- 2004: Fraudulent Election?
- Midterm Elections: 2006
- Globe-Trotting Consultant
- Looking Ahead: Recommendations for 2008 and Beyond
- Technical Appendix

Sponsor: WSS Human Rights Section and WSS Public Policy Section

### The George Washington University

Fall 2008 Graduate Courses

The Statistics Department at The George Washington University will offer the following Graduate Courses in Fall 2008 (September 2 - December 20, 2008) at the main campus.

Enhance your statistical analysis skills by taking one or more of these courses. Registering as a non-degree student is easy - please visit http://www.gwu.edu/~regweb for relevant information.

For questions or further information please contact Dr. Reza Modarres, e-mail: reza@gwu.edu, ph: 202-994-6888. More information is also available at http://www.gwu.edu/%7Ebulletin/grad/stat.html.

Statistics 201-10. Mathematical Statistics

Thursday, 6:10pm-8:40pm

Instructor: Dr. H. Mahmoud

This is the first part of a two-part graduate level series in Mathematical Statistics. The objective of the course is to introduce students to the concepts of probability that are useful for understanding statistical theory (the course continues on to Stat 202 in Spring, which deals with the theory of statistical inference). Topics to be covered in Stat 201 include basics of probability theory (including conditional probability, Bayes theorem, random variables, density and mass functions), univariate transformations, expected value, moment generating function, common probability distributions (including binomial, normal, uniform), multivariate distributions and transformations, covariance, inequalities and sampling distributions. This is roughly chapters 1 through 5 of the text: Statistical Inference (2nd Ed.) by Casella, G. and Berger, R. L.; Duxbury Press, CA.

This course is required for MS and Ph.D. students in Statistics, and Biostatistics, and Ph.D. students in Epidemiology. Students from other quantitative fields such as Economics, Finance, Engineering etc. would also find the course very useful and are encouraged to join. Prerequisites: Multivariable Calculus (Math 33), and Linear Algebra (Math 124) or equivalent.

Statistics 207. Methods of Statistical Computing I

Tuesday, 06:10pm-08:40pm

Instructor: Dr. Y. Lai

Computing is essential for the practice of statistics. This course will introduce basic computational methods from a statistical point of view. In particular, the following general areas will be covered: (i) Fundamentals of computer science; (ii) Numerical analysis and computer intensive methods; and (iii) Statistical computing and graphics.

Prerequisites include knowledge of a programming language, a course in matrix algebra and mathematical statistics.

Textbooks: Statistical Computing, by W. J. Kennedy and J. E. Gentle and An Introduction to the Bootstrap, by B. Efron and R. J. Tibshirani.

Statistics 215. Applied Multivariate Analysis

Monday, 06:10pm-08:40pm

Instructor: Dr. R. Modarres

This course is intended for students interested in statistical analysis of several variables, most likely dependent, following a joint normal distribution. It covers inferential and descriptive multivariate techniques, including the multivariate normal distribution, assessing the assumption of normality, transformations to near normality, Hotelling test for the mean vector, confidence regions and simultaneous comparisons of component means, missing observations and the EM algorithm, and principal components analysis. In addition to the text, other topics from the literature, including some non-parametric techniques, will be covered. For each technique, the theoretical foundation is developed and applied to observations from behavioral, social, medical, and physical sciences. The computational aspects will include use of matrix algebra tools (SAS/IML). Prerequisites include a course in matrix algebra and mathematical statistics.

Textbook: Applied Multivariate Analysis, 6th Ed., by R.A. Johnson and D.W. Wichern.

Stat 217: Design of Experiments

Tuesday, 6:10pm-8:40pm

Instructor: Dr. Z. Li

This course is a graduate level introduction to Design of Experiments, an area of statistics concerned with the planning of scientific investigation. The main components of an experimental design are the selection of the independent and dependent variables to be studied, determination of sample size, and allocation of experimental units to experimental treatments.

Specific topics which will be covered in detail include Replication, Blocking, Randomization, Factorial and Fractional-Factorial experiments, Repeated Measures designs, and Latin Square designs. Prerequisite: Stat 157-58; Math 124.

Statistics 227. Survival Analysis

Wednesday, 6:10pm-8:40pm

Instructor: Dr. Q. Pan.

This course will discuss parametric and nonparametric methods for the analyses of events observed in time (survival data). Topics include: survival distributions, Kaplan-Meier estimate of survival functions, Greenwood's formula, Mantel-Haenszel test, logrank and generalized logrank tests, Cox proportional hazards model, parametric regression models, and power and sample size calculations for survival analysis.

Prerequisite: Stat 201-2 or permission of instructor.

Statistics 257. Probability

Wednesday, 6:10pm-8:40pm

Instructor: Dr. H. Mahmoud

This course will discuss rigorous modern measure-theoretic probability. No prior knowledge of measure theory is assumed; the necessary concepts will be developed as needed. Topics to be covered include: Sigma fields and Probability measures, Probability Axioms, Lebesgue integration and expectation, Measure-theoretic independence, Borel-Cantelli Lemmas, Modes of probabilistic convergence, Weak and strong laws of large numbers, and Central limit theorems.

Students wishing to move on to the next level of sophistication and mathematical maturity needed for study in fields such as stochastic processes, statistics or advanced applications will find this course useful.

Prerequisite: Stat-201 (MS level course in probability).

Textbooks: Karr, A. (1993). Probability. Springer, New York.

Supplemental Texts: Chung, K. (1974). A Course in Probability Theory. Academic Press, Orlando. Billingsley, P. (1990). Probability and Measure, 2nd Edition. Wiley, New York.

Stat 262. Nonparametric Inference

Thursday, 06:10pm-08:40pm

Instructor: Dr. S. Kundu

This course will discuss inferential methods when the form of the underlying distribution is not specified or is only partially specified. These methods are robust as they do not rely on strong distributional assumptions. Topics to be covered in this course include: U-statistics, rank tests, locally most powerful rank tests, one and two-sample tests, asymptotic distribution theory, asymptotic relative efficiency, nonparametric point estimates and confidence intervals, and goodness of fit tests. If time permits, some advanced topics such as Bootstrap, Nonparametric Density estimation, and Nonparametric Regression will be covered.

Prerequisite: Stat 201-2 or permission of instructor.

Statistics 263. Advanced Statistical Theory I

Thursday, 6:10pm-8:40pm

Instructor: Dr. T. Nayak

This is an advanced course on principles and theory of statistical inference. Topics include: sufficiency, ancillarity, completeness, unbiased estimation, Cramer-Rao inequality, Bayesian estimation, admissibility, hypotheses testing.

Prerequisite: Stat 201-2 or permission of instructor.

Statistics 287. Modern Theory of Survey Sampling

Wednesday, 6:10pm-8:40pm

Instructor: Dr. P. Chandhok

The main objectives of the course are to provide a rigorous treatment of sampling theory and its applications. With this background the student can modify the existing theory, develop new theory, and better understand its applications. Graduate students from quantitative fields such as Statistics, Mathematics, Economics, Finance and Engineering as well as professionals working in government and private-sector companies, with an interest in survey sampling will benefit from this course. The prerequisites for the class are Statistics 91 (Principles of Statistical Methods) or equivalent and Math 32 (Single-Variable Calculus) or equivalent.

This course will introduce the following topics: simple random sampling with and without replacement, systematic sampling, unequal probability sampling with and without replacement, ratio estimation, difference estimation and regression estimation.

Stat 289: Statistical Genetics

Monday, 6:10pm-8:40pm

Instructor: Dr. Z. Li

There are three objectives of this course: 1) to provide an introduction of quantitative genetics for students without any genetics background; 2) to give a rigorous statistical treatment of some genetic problems; 3) to introduce current research topics in the area of statistical methods for genetic analysis.

Topics include: Allele frequency, Hardy-Weinberg equilibrium, and linkage equilibrium; Genetic variance and correlation; Parametric linkage analysis; Non-parametric linkage analysis; Recurrence-risk ratio method; The transmission/disequilibrium test (TDT); Family-based case control vs. unrelated case-control designs; Multiple point linkage analysis; Interval mapping; Haplotype-based association analysis.

### Students' Corner

Hello fellow students! This month I would like to bring to your attention a mathematical/probability curiosity that I find fascinating. Strictly speaking, it might not count as statistics per se, but I think that students of statistics who know something about probability should find it interesting to consider.

Imagine a casino that has two tables at which you can gamble. The first table has a game, let us call it Game A, that has a simple coin flip - heads you win, tails you lose. This coin is weighted on one side, so that you have a probability p of winning, where p is some number between 0 and 1, and p is not necessarily equal to 0.5. (We're going to set p to a number less than 0.5, so the game is unfortunately weighted against you!) Below is a small MATLAB function modeling this coin toss.

    function [ Capital ] = GameA(Capital,Bet,p)
    % Generate random number in [0,1]
    coinFlip = rand(1);
    % Check whether we won or lost this bet.
    if ( coinFlip < p )
        Capital = Capital + Bet;   % Won, increase Capital.
    else
        Capital = Capital - Bet;   % Lost, decrease Capital.
    end

This simple MATLAB function takes as input the following three arguments:

- Capital - current amount of money that you have in your gambling account

- Bet - the amount of money you are betting

- p - the probability of winning the coin toss

It returns the amount of money that you have left in your account after resolving the gamble.

We'll assume that Capital and Bet are always integer values. Save this code in a file named GameA.m, in a folder that is in your MATLABPATH; if the folder isn't in your MATLABPATH, you can use the ADDPATH command to make it so.

If p were less than 0.5 and you played Game A over and over again, you'd be guaranteed to go bankrupt in the long run; Game A would be a "losing game." The following MATLAB code simulates betting $1 on Game A for 100 bets in a row, with p set to 0.5-ε, where ε=0.005. 50,000 trials of 100 sequential bets on Game A are simulated, and then averaged.

    % TestA
    T = 50000;            % Number of trials
    G = 100;              % Number of bets made per trial.
    Bet = 1;              % Bet $1 each time.
    e = 0.005;            % Epsilon
    p = 0.5 - e;          % Game A probability of winning.
    RecA = zeros(G,1);    % Record of Capital fluctuations.

    % Run simulations
    for t=1:T             % Loop over T trials
        Capital = 0;      % Net funds for betting.
        for g=1:G         % Loop over G games per trial
            Capital = GameA(Capital,Bet,p);   % Play Game A, update Capital
            RecA(g) = RecA(g) + Capital;      % Keep cumulative record of Capital
        end
    end

    % Plot results
    plot(RecA/T,'b.','LineWidth',2)
    title(sprintf('Game A Change in Capital, Averaged over %d Trials',T))
    xlabel('Games played')
    ylabel('Capital')

Save the above MATLAB code into the same folder as GameA.m, in a file named TestA.m. Then invoke it in MATLAB by typing

TestA

at the MATLAB prompt. If you run this MATLAB code, you should see a plot similar to that shown to the right (results may vary a little due to the random numbers generated). This demonstrates that in the long run, you'll go bankrupt playing Game A. This is not surprising.
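The size of the average loss can also be checked by hand. As a rough sketch, write $C_n$ for the capital after $n$ bets of size $B$ (these symbols are introduced here for illustration; the code calls them Capital and Bet):

```latex
\begin{align*}
E[C_n] &= C_0 + n\,B\,\bigl(p - (1-p)\bigr) = C_0 + n\,B\,(2p - 1) \\
       &= C_0 - 2\varepsilon\, n\, B \qquad \text{for } p = \tfrac12 - \varepsilon .
\end{align*}
% With C_0 = 0, B = 1, \varepsilon = 0.005, n = 100:  E[C_{100}] = -1.
```

So the averaged curve from TestA should fall roughly along a straight line from 0 to about -1 over the 100 bets.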

Now, the second table in our hypothetical casino has Game B, a rather odd coin toss game involving two coins, B1 and B2. In Game B, if your current capital is an integer multiple of 3, we flip coin B1; otherwise we flip coin B2. With B1, your probability of winning is p1 = 0.1 - ε, while with B2 it is p2 = 3/4 - ε, where ε=0.005. You might expect B2 to be flipped twice as often as B1, but a loss with B1 is usually followed by a win with B2 that puts your capital right back on a multiple of 3, so in the long run B1 is flipped more than one-third of the time. Even though your probability of winning with coin B2 is very good, coin B1's probability of winning, p1, is so bad that it more than offsets p2. So Game B is still rigged against you. As an exercise, compute the overall long-run probability of winning Game B; you should get a number less than 0.5, meaning that if you bet $1 the expected value of your winnings is negative - a "losing game" again. Below is MATLAB code encoding Game B's scenario. Save it into a file named GameB.m, in the same folder where you saved GameA.m.

    function [ Capital ] = GameB(Capital,Bet,M,p1,p2)
    % Choose which of two coins we'll use for Game B.
    if ( mod(Capital,M) == 0 )
        pp = p1;   % Coin B1
    else
        pp = p2;   % Coin B2
    end
    % Generate random number in [0,1]
    coinFlip = rand(1);
    % Check whether we won or lost this bet.
    if ( coinFlip < pp )
        Capital = Capital + Bet;   % Won, increase Capital.
    else
        Capital = Capital - Bet;   % Lost, decrease Capital.
    end

The following MATLAB code simulates playing Game B. Save it into a MATLAB file named TestB.m, and then invoke it in MATLAB. Again, 50,000 trials of 100 sequential bets on Game B are simulated, and then averaged.

    % TestB
    T = 50000;            % Number of trials
    G = 100;              % Number of bets made per trial.
    B = 1;                % Bet $1 each time.
    e = 0.005;            % Epsilon
    M = 3;                % Modulus base.
    p1 = (1/10) - e;      % Game B, Coin B1.
    p2 = (3/4) - e;       % Game B, Coin B2.
    RecB = zeros(G,1);    % Record of Capital fluctuations.

    % Run simulations
    for t=1:T             % Loop over T trials
        Capital = 0;      % Net funds for betting.
        for g=1:G         % Loop over G games per trial
            Capital = GameB(Capital,B,M,p1,p2);   % Play Game B, update Capital
            RecB(g) = RecB(g) + Capital;          % Keep cumulative record of Capital
        end
    end

    % Plot results
    plot(RecB/T,'g-','LineWidth',2)
    title(sprintf('Game B Change in Capital, Averaged over %d Trials',T))
    xlabel('Games played')
    ylabel('Capital')

If you run this MATLAB code, you should see a plot similar to that shown to the right. As with Game A, you'll lose money in the long run with Game B. Again, this is not surprising. (By the way, what do you make of those oscillations at the beginning of the plot, where the number of games played is less than 20?)
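One way to see why Game B loses is to track the remainder of your capital modulo 3 as a three-state Markov chain. The sketch below is not part of the original column's code; the symbols $\pi_0, \pi_1, \pi_2$ (long-run fractions of bets made when the remainder is 0, 1, 2) and $P_{\text{win}}$ are introduced here for illustration.

```latex
\begin{align*}
\pi_0 &= (1-p_2)\,\pi_1 + p_2\,\pi_2, \\
\pi_1 &= p_1\,\pi_0 + (1-p_2)\,\pi_2, \\
\pi_2 &= (1-p_1)\,\pi_0 + p_2\,\pi_1, \qquad \pi_0+\pi_1+\pi_2 = 1, \\[4pt]
(\pi_0,\pi_1,\pi_2) &=
  \frac{\bigl(1-p_2+p_2^2,\;\; 1-p_2+p_1p_2,\;\; 1-p_1+p_1p_2\bigr)}
       {3 - p_1 - 2p_2 + 2p_1p_2 + p_2^2}, \\[4pt]
P_{\text{win}} &= \pi_0\,p_1 + (1-\pi_0)\,p_2 .
\end{align*}
% \varepsilon = 0     (p_1 = 1/10,  p_2 = 3/4):   \pi = (5/13, 2/13, 6/13),  P_win = 1/2
% \varepsilon = 0.005 (p_1 = 0.095, p_2 = 0.745): \pi_0 \approx 0.384,       P_win \approx 0.4957
```

Coin B1 is therefore flipped about 38% of the time rather than one-third, which is enough to pull the overall win probability just below 0.5. With the naive weights 1/3 and 2/3, the same calculation would give about 0.528.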

But a very interesting thing happens if you now alternate between two bets on Game A and two bets on Game B: all of a sudden you start winning money! To demonstrate this, save the MATLAB code shown below in a file named TestAB.m, and then invoke it in MATLAB. This code simulates switching between two gambles on Game A and then two on Game B, for 100 bets in a row. Again, 50,000 simulations are run and then averaged.

    % TestAB
    T = 50000;            % Number of trials
    G = 100;              % Number of bets made per trial.
    B = 1;                % Bet $1 each time.
    e = 0.005;            % Epsilon
    p = 0.5 - e;          % Game A probability of winning.
    M = 3;                % Modulus base.
    p1 = (1/10) - e;      % Game B, Coin B1.
    p2 = (3/4) - e;       % Game B, Coin B2.
    RecAB = zeros(G,1);   % Record of fund fluctuations.
    count = 0;            % Initialize counter.
    game = 0;             % IF game==0, Game A; IF game==1, Game B

    % Run simulations
    for t=1:T             % Loop over T trials
        Capital = 0;      % Net funds for betting.
        for g=1:G         % Loop over G games per trial
            count = count + 1;
            % Check whether we're playing game A.
            if ( game == 0 )
                Capital = GameA(Capital,B,p);         % Play Game A, update Capital
                % Check whether to switch to game B now.
                if ( count == 2 )
                    count = 0;
                    game = 1;
                end
            % Else we're playing game B.
            else
                Capital = GameB(Capital,B,M,p1,p2);   % Play Game B, update Capital
                % Check whether to switch to game A now.
                if ( count == 2 )
                    count = 0;
                    game = 0;
                end
            end   % IF game
            RecAB(g) = RecAB(g) + Capital;            % Keep cumulative record of Capital
        end   % FOR g
    end   % FOR t

    % Plot results
    plot(RecAB/T,'r--','LineWidth',2)
    title(sprintf('Alternating Games Capital Change, Averaged over %d Trials',T))
    xlabel('Games played')
    ylabel('Capital')

This MATLAB code should produce a plot similar to that shown on the right. As you can see, if you alternate between two bets on Game A and two on Game B, you now start winning money and your capital steadily increases! This paradoxical effect is known as Parrondo's Paradox, or Parrondo's Games.

Do you not find this rather surprising? The effect still holds if you randomly switch between the two games, rather than regularly alternate between them. Modify the MATLAB code to demonstrate this! Can you then create a plot that replicates Figure 1b in Harmer and Abbott's 1999 article in Nature? (By the way, this installment of the Students' Corner is heavily based on Harmer and Abbott's article. Please try reading it; it is less than a page long, and highly readable!) Of course, Game B is rather contrived, since it is conditional on whether your current capital is divisible by 3. In a sense, Game B "knows" something about how much capital you currently have, which is kind of strange.

To combine the three previous graphs into one MATLAB plot, try the following MATLAB commands:

    plot(RecA/T,'b.','LineWidth',2)
    hold on
    plot(RecB/T,'g-','LineWidth',2)
    plot(RecAB/T,'r--','LineWidth',2)
    hold off
    axis([0 100 -1.5 1.5])   % Set limits after plotting so plot() does not reset them.
    title(sprintf('Comparison Of Three Trial Types, Averaged over %d Trials',T))
    xlabel('Games played')
    ylabel('Capital')
    legend('Game A','Game B',...
        'Alternating Between Games A and B','Location','SouthOutside')

An example of running this snippet of MATLAB code is shown to the right.
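Returning to the random-switching exercise posed earlier, here is one minimal sketch; it is certainly not the only way to do it. It assumes Game A is chosen with probability 0.5 before every bet. The variable names pickA and RecR and the file name TestRandAB.m are illustrative and do not come from the original column. Both games are inlined so the script runs on its own.

```matlab
% TestRandAB (hypothetical file name): pick Game A or Game B at random
% before every bet. Self-contained; does not call GameA.m or GameB.m.
T = 50000;            % Number of trials
G = 100;              % Number of bets made per trial
B = 1;                % Bet $1 each time
e = 0.005;            % Epsilon
p = 0.5 - e;          % Game A probability of winning
M = 3;                % Modulus base
p1 = (1/10) - e;      % Game B, coin B1
p2 = (3/4) - e;       % Game B, coin B2
pickA = 0.5;          % ASSUMPTION: chance of choosing Game A on any bet
RecR = zeros(G,1);    % Record of Capital fluctuations

for t = 1:T                       % Loop over T trials
    Capital = 0;                  % Net funds for betting
    for g = 1:G                   % Loop over G bets per trial
        if rand(1) < pickA
            pp = p;               % Play Game A
        elseif mod(Capital,M) == 0
            pp = p1;              % Play Game B with coin B1
        else
            pp = p2;              % Play Game B with coin B2
        end
        if rand(1) < pp
            Capital = Capital + B;    % Won, increase Capital
        else
            Capital = Capital - B;    % Lost, decrease Capital
        end
        RecR(g) = RecR(g) + Capital;  % Keep cumulative record of Capital
    end
end

% Plot results
plot(RecR/T,'k-','LineWidth',2)
title(sprintf('Random Switching Capital Change, Averaged over %d Trials',T))
xlabel('Games played')
ylabel('Capital')
```

With pickA = 0.5, each bet is effectively a single Game-B-style game whose coins win with probability 0.295 when the capital is a multiple of 3 and 0.62 otherwise. Changing pickA is an easy way to explore how the mixing proportion affects the result.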

Try varying the values of p, p1, and p2. Does the paradoxical effect hold for all values of p, p1, and p2? Does the value of the modulus base M matter? What happens if you increase ε? Does ε have to have the same value for both Games A and B? Suppose instead of betting exactly $1 each time, you instead bet a constant fraction of your current holdings, e.g. one one-hundredth of your current capital. What happens if you modify TestA.m to model this? What about TestB.m?

Now consider this thought experiment. Suppose a husband and wife team enter our hypothetical casino, and the man plays Game A while the woman plays Game B. Since they're married, they share the same gambling account, and Game B is conditional on whether the amount in that joint account is divisible by 3. Would the husband and wife working as a team make money? Now suppose the couple have a falling out, and they decide to hold separate gambling accounts, but they continue to play only at their respective tables, the husband at Game A and the wife at Game B. Would it make a difference that they now have separate gambling accounts?

So why does Parrondo's Paradox work? See Chapter 11 in Julian Havil's book, Nonplussed! Mathematical Proof of Implausible Ideas, for a highly detailed explanation of how this phenomenon arises. I must admit that I haven't yet had the patience to wade through the algebra (it's a little dense), but I thought I'd alert you to Dr. Havil's explanation, if you wanted to try reading it!

The big question that you must be wondering at this point is, can I use this to make money in the stock market? It has been argued that you can't use Parrondo's Paradox to make money in the stock market (Iyengar and Kohli, 2004). However, a New York Times article published in 2000 reported that Dr. Sergei Maslov, a physicist at Brookhaven National Laboratory, showed that it might indeed be possible. I emailed Dr. Maslov asking him for further information on this finding, and he directed me to the bottom of the 5th page of a 1998 paper he published in the International Journal of Theoretical and Applied Finance. You can find that paper at this URL: http://xxx.lanl.gov/abs/cond-mat/9801240. Perhaps you can figure out how to use Parrondo's Paradox to make some money, and then you can pay your tuition!

Harmer and Abbott conclude their 1999 Nature article with the following intriguing conjecture:

Game theory is linked to various disciplines such as economics and social dynamics, so the development of parrondian-like strategies may be useful, for example for modelling cases in which declining birth and death processes combine in a beneficial way.

That's all for this month. If you have any feedback on this column or ideas for future topics, please email me at jmm97@georgetown.edu. As always, your thoughts will be greatly appreciated.

Joe Maisog

Georgetown University / Medical Numerics

References

Harmer GP and Abbott D, Losing strategies can win by Parrondo's paradox, Nature 402:864, 1999.

Havil J, Nonplussed! Mathematical Proof of Implausible Ideas, Princeton, NJ: Princeton University Press, 2007, Chapter 11 (pp. 115-126).

Iyengar R and Kohli R, Why Parrondo's Paradox Is Irrelevant for Utility Theory, Stock Buying, and the Emergence of Life, Complexity 9(1):23-27, 2004.

Maslov S and Zhang YC, Optimal investment strategy for risky assets, International Journal of Theoretical and Applied Finance 1(3):377-387 (1998).

Paradox in Game Theory: Losing Strategy That Wins, New York Times (Science Times section), January 25, 2000.

See also:

Harmer et al., The Paradox of Parrondo's Games, Proceedings: Mathematical, Physical and Engineering Sciences, Vol. 456, No. 1994. (Feb. 8, 2000), pp. 247-259.

Harmer and Abbott, Parrondo's Paradox, Statistical Science, Vol. 14, No. 2. (May, 1999), pp. 206-213. See especially section 1.1, which makes an interesting connection between a theoretical physical device called a Brownian Ratchet (which seems sort of like a free energy device!) and Parrondo's Paradox.

For an interactive Java simulation, see the web page http://www.cut-the-knot.org/ctk/Parrondo.shtml (A. Bogomolny, Parrondo Paradox, from Interactive Mathematics Miscellany and Puzzles).

From reading the description on Amazon.com, I gather that Richard Armstrong's 2006 novel God Doesn't Shoot Craps: A Divine Comedy is about a fictional gambler who uses Parrondo's Paradox to make money. I think I will buy this book and read it!

### SIGSTAT Topics for Spring 2008

June 18, 2008: Survival Models in SAS: PROC PHREG - Part 3

(http://www.sas.com/apps/pubscat/bookdetails.jsp?pc=55233)

Continuing the series of talks based on the book "Survival Analysis Using the SAS System: A Practical Guide" by Paul Allison begun in October 2007, we'll start Chapter 5: Estimating Cox Regression Models with PROC PHREG.

Topics covered are: Time-Dependent Covariates

SIGSTAT is the Special Interest Group in Statistics for the CPCUG, the Capital PC User Group, and WINFORMS, the Washington Institute for Operations Research Service and Management Science.

All meetings are in Room S3031, 1800 M St, NW from 12:00 to 1:00. Enter the South Tower & take the elevator to the 3rd floor to check in at the guard's desk.

First-time attendees should contact Charlie Hallahan, 202-694-5051, hallahan@ers.usda.gov, and leave their name. Directions to the building & many links of statistical interest can be found at the SIGSTAT website, http://www.cpcug.org/user/sigstat/.

### Note From The WSS NEWS Editor

Items for publication in the Summer issue of the WSS NEWS should be submitted no later than July 7, 2008. E-mail items to Michael Feil at michael.feil@usda.gov.

WSS Board Listing (pdf)