Johan A. Elkink

Data Analytics for the Social Sciences

POL 30660 Data Analytics for the Social Sciences

Thursday, 9-11 am, on Zoom.


Install the necessary software on your computer before the first class. All of the software is free. R can be downloaded here.

After installing R, you should also install RStudio.

A handy overview of R regression commands can be found here: reference card for regression. A more general one for R is here: short reference card. The Google R Style Guide provides suggestions for writing clear code.

Week  Date  Topic                                     Materials
1     21/1  Introduction                              slides lab output markdown cheat sheet | videos: markdown | importing Excel | using packages
2     28/1  Distributions and descriptive statistics  slides lab output
3     4/2   Comparing through visualisation           slides lab output Top 50 ggplot examples | data preparation script
      11/2  Group presentations
4     18/2  Linear regression                         slides lab output videos: merging
5     25/2  Logistic regression                       slides lab output
6     4/3   Trees and forests                         slides lab output lecture: trees | forests | project 2
7     25/3  Geography and networks                    slides lab output
8     1/4   Cluster analysis                          slides lab output lecture: intro | kmeans | hierarchical | dissimilarity | speeches
9     8/4   Dimension reduction                       slides lab output lecture: intro | mds | pca | factor analysis
10    15/4  Wordscores                                slides lab output lecture: intro | wordscores | wordfish
11    22/4  Topic models                              slides lab output lecture: topic models | putin | afterthoughts | project 3