How I would improve my graduate program

At the end of the Ph.D. process, the department I graduated from sent me a survey about my experiences. I wanted to share the feedback I gave them on one specific question. Yes, it’s opinionated, and I welcome comments.

They asked:

How would you improve your graduate program?

I answered:

The single biggest change I would suggest is a reform of the required two-course introductory statistics sequence for graduate students. (I took [multiple linear regression, the optional third course in the sequence] with [redacted], and I thought it was excellent and extraordinarily useful.)

How would I change the course sequence? I’m not positive, but here are my suggestions:

  • The courses should focus on producing researchers who can be informed consumers and critics of quantitative work, even if (and especially if) the students in the course will not go on to be quantitative researchers.
  • The courses need to be about thinking statistically, not following rote procedures for analysis in SPSS.
  • The courses should always use real datasets. Toy datasets are contrived, and they bypass so much of the work and thinking that goes into cleaning data. Worse, analyzing toy data produces toy results, which hardly help anyone.
  • Courses should focus on exploratory data analysis, including (and especially) visualizing data.
  • If the courses are going to incorporate doing statistical analysis, they should move away from SPSS. There’s no reason students should have to pay for statistics software when there are fantastic, FREE, industry-standard software toolkits to do statistics (including the R + RStudio ecosystem and the IPython Notebook). Think about it: not only are we asking students to pay NOW when they take the courses, we’re also teaching them only that system, which means later in their careers they’ll have to spend more money to buy SPSS again, because it’s the only system they were trained on.
  • I really can’t emphasize that last point enough. With the graphical and statistical functionality of R, Python, and other languages, I can’t think of a single reason this course should continue to be taught in SPSS, which costs money, graphically underperforms, and chains students to a for-pay software ecosystem.