Johan A. Elkink

POL 30660 Data Analytics for the Social Sciences

Thursday, 9-11 am, on Zoom.


Install the necessary software on your computer before the first class. All of it is free. First install R, which is available from the Comprehensive R Archive Network (CRAN), and then install RStudio.

Two reference cards are useful: the reference card for regression gives a handy overview of R regression commands, and the short reference card covers R more generally. The Google R Style Guide offers suggestions for writing clear code.

| Week | Date | Topic | Materials | Lectures and other resources |
|---|---|---|---|---|
| 1 | 21/1 | Introduction | slides lab output | lecture: introduction, live session; videos: markdown, importing Excel, using packages; markdown cheat sheet |
| 2 | 28/1 | Distributions and descriptive statistics | slides lab output | lecture: levels of measurement, missing data, graphs, descriptive statistics |
| 3 | 4/2 | Comparing through visualisation | slides lab output | lecture: multiway graphs, control variables, project 1; Top 50 ggplot examples |
| | 11/2 | Group presentations | | |
| 4 | 18/2 | Linear regression | slides lab output | lecture: linear regression, multiple regression, model selection; videos: recoding and merging (live session), merging (demonstration) |
| 5 | 25/2 | Logistic regression | slides lab output | lecture: linear probability model, linear discriminant analysis, logistic regression, model fit |
| 6 | 4/3 | Trees and forests | slides lab output | lecture: trees, forests, project 2 |
| 7 | 25/3 | Geography and networks | slides lab output | lecture will be live in class and uploaded afterwards |
| 8 | 1/4 | Cluster analysis | slides lab output | lecture: intro, kmeans, hierarchical, dissimilarity, speeches |
| 9 | 8/4 | Dimension reduction | slides lab output | lecture: intro, mds, pca, factor analysis |
| 10 | 15/4 | Wordscores | slides lab output | lecture: intro, wordscores, wordfish |
| 11 | 22/4 | Topic models | slides lab output | lecture: topic models, putin, afterthoughts, project 3 |