You will first have to download the replication data (`ReplicationdataRossVoeten.dta`) from the Harvard Dataverse and place it in your working directory.

```
suppressMessages({
  library(rio)
  library(stargazer)
  library(ggplot2)
  library(MASS)
  library(knitr)
  library(tree)
  library(randomForest)
  library(pROC)
})

# Country-year panel (downloaded manually) and ICC signing/ratification years
ross <- import("ReplicationdataRossVoeten.dta")
icc <- import("http://www.joselkink.net/files/data/icc.dta")

# Flag each country-year from the year of signing / ratification onwards
ross$iccSigned <- 0
ross$iccRatified <- 0
for (i in seq_len(nrow(icc))) {
  ross$iccSigned[ross$ccode == icc$ccode[i] & ross$Year >= icc$signed[i]] <- 1
  ross$iccRatified[ross$ccode == icc$ccode[i] & ross$Year >= icc$ratified[i]] <- 1
}
summary(ross$Year)
```

```
##    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
##    1960    1973    1987    1987    2001    2014
```

`length(unique(ross$countryname))`

`## [1] 171`

`ross2001 <- subset(ross, Year == 2001)`

So there are 171 countries over the period from 1960 to 2014, although of course not all variables are available for all countries and all years.
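The loop in the setup code that builds `iccSigned` and `iccRatified` can also be written without a loop. The following is a minimal sketch on made-up data, assuming one row per country in the treaty file; `panel`, `treaty` and `merged` are illustrative names, not objects from the lab.

```
# Toy stand-ins for the real data (values are made up for illustration)
panel <- data.frame(
  ccode = rep(c(1, 2, 3), each = 3),
  Year  = rep(2000:2002, times = 3)
)
treaty <- data.frame(
  ccode    = c(1, 2),
  signed   = c(2000, 2001),
  ratified = c(2001, NA)   # country 2 signed but has not ratified
)

# Left join keeps every country-year, including countries absent from treaty
merged <- merge(panel, treaty, by = "ccode", all.x = TRUE)

# A comparison with NA yields NA, so treat a missing date as "not yet"
merged$iccSigned   <- as.integer(!is.na(merged$signed) &
                                   merged$Year >= merged$signed)
merged$iccRatified <- as.integer(!is.na(merged$ratified) &
                                   merged$Year >= merged$ratified)

merged[, c("ccode", "Year", "iccSigned", "iccRatified")]
```

The explicit `!is.na()` guard makes the handling of countries that never signed or ratified visible, which the loop version leaves implicit.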

The setup code at the start of the lab also downloads a small data file on membership of the International Criminal Court and uses it to construct two binary variables: whether a country had signed the treaty by a given year, and whether it had ratified it. For 2001 the distribution is as follows:

```
tbl <- table(ross2001$iccSigned, ross2001$iccRatified)
colnames(tbl) <- c("Not ratified", "Ratified")
rownames(tbl) <- c("Not signed", "Signed")
kable(addmargins(tbl))
```

| | Not ratified | Ratified | Sum |
|---|---|---|---|
| Not signed | 88 | 1 | 89 |
| Signed | 60 | 45 | 105 |
| Sum | 148 | 46 | 194 |

`kable(addmargins(floor(prop.table(tbl, 1) * 100), 2))`

| | Not ratified | Ratified | Sum |
|---|---|---|---|
| Not signed | 98 | 1 | 99 |
| Signed | 57 | 42 | 99 |

So about 43% of the signed treaties (45 of 105) had also been ratified by 2001; the table shows 42 because `floor()` rounds down.
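The row percentages can be reproduced from the counts alone. The sketch below rebuilds the 2001 table by hand (`counts` and `row_pct` are illustrative names) and compares `floor()` with `round()`:

```
# Counts copied from the 2001 cross-tabulation above
counts <- matrix(c(88, 60, 1, 45), nrow = 2,
                 dimnames = list(c("Not signed", "Signed"),
                                 c("Not ratified", "Ratified")))

# Row percentages: each row is divided by its own total
row_pct <- prop.table(counts, 1) * 100

floor(row_pct)            # truncates, so each row sums to 99
round(row_pct)            # rounds to nearest, so each row sums to 100
rowSums(round(row_pct))
```

With `round()` the share of signed treaties that were ratified is reported as 43 rather than 42.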

To avoid complications that are common in panel data, for the remainder of this lab we just use data from 2001.

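We run the linear regression from last class just to get the model frame, that is, the 2001 observations with complete data on all variables in the model. We store it as `designMatrix`: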

```
m1 <- lm(iccRatified ~ polity2 + logoil + lngdp + tradegdp + lnpop, data = ross2001)
designMatrix <- m1$model
```
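To see what `m1$model` contains, here is a small sketch on made-up data (`toy`, `fit`, `mf` and `X` are illustrative names). By default `lm()` drops rows with missing values, so the model frame holds only the complete cases, while `model.matrix()` returns the design matrix in the strict sense, including the intercept column.

```
# Made-up data with missing values in two rows
toy <- data.frame(
  y  = c(0, 1, 0, 1, 1, 0),
  x1 = c(1.2, NA, 0.7, 2.1, 1.5, 0.3),
  x2 = c(3, 4, NA, 1, 2, 5)
)

fit <- lm(y ~ x1 + x2, data = toy)

mf <- fit$model          # model frame: response and predictors, complete cases only
X  <- model.matrix(fit)  # design matrix: intercept column plus predictors

nrow(toy)     # all rows
nrow(mf)      # rows 2 and 3 have been dropped
colnames(X)
```

Using the model frame as the data for later models keeps every model on the same set of observations.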

We continue with the same classification model as last class, explaining whether or not countries ratified the ICC treaty, but now fit it as a classification tree:

```
# Fit on the complete cases so predictions line up with designMatrix$iccRatified
t <- tree(as.factor(iccRatified) ~ polity2 + logoil + lngdp + tradegdp + lnpop, data = designMatrix)
plot(t)
text(t)
```

`plot(roc(designMatrix$iccRatified, predict(t)[,2]), main = "Tree")`

Compare this to the ROC curves from the previous lab. How does the tree perform?
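A single number makes such comparisons easier: the area under the ROC curve (AUC), which pROC reports via `auc()`. As an illustration of what that number means, here is a base R sketch using the rank formulation of the AUC on made-up scores; `auc_rank`, `y` and `p` are hypothetical names, not part of the lab.

```
# AUC as the probability that a random positive case gets a higher
# score than a random negative case (Mann-Whitney rank formulation)
auc_rank <- function(response, score) {
  r  <- rank(score)              # average ranks are used for ties
  n1 <- sum(response == 1)
  n0 <- sum(response == 0)
  (sum(r[response == 1]) - n1 * (n1 + 1) / 2) / (n1 * n0)
}

# Made-up outcomes and predicted probabilities
y <- c(0, 0, 1, 0, 1, 1)
p <- c(0.1, 0.4, 0.35, 0.2, 0.8, 0.7)

auc_rank(y, p)   # 8 of the 9 positive-negative pairs are ordered correctly
```

An AUC of 0.5 corresponds to random guessing and 1 to perfect separation.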

This tree has a lot of branches, so it is difficult to interpret the output. It would probably be worthwhile to prune the tree to, say, 6 terminal nodes (leaves):

```
t6 <- prune.tree(t, best = 6)
plot(t6)
text(t6)
```

Interpret the plot.
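The choice of 6 leaves is arbitrary. One common alternative is to let cross-validation pick the size. The sketch below uses `cv.tree()` and `prune.misclass()` from the tree package on simulated data; the data and the names `d`, `fit`, `cv`, `best_size` and `pruned` are made up for illustration, and the chosen size will vary with the random seed.

```
library(tree)

set.seed(1)
# Simulated binary outcome driven mostly by x1
d <- data.frame(x1 = rnorm(200), x2 = rnorm(200))
d$y <- factor(as.integer(d$x1 + 0.5 * d$x2 + rnorm(200) > 0))

fit <- tree(y ~ x1 + x2, data = d)

# 10-fold cross-validation over the pruning sequence,
# scored by the number of misclassifications
cv <- cv.tree(fit, FUN = prune.misclass)

# Size (number of leaves) with the lowest cross-validated error
best_size <- cv$size[which.min(cv$dev)]

pruned <- prune.misclass(fit, best = best_size)
best_size
```

The same two calls could be applied to the tree fitted in this lab in place of the fixed `best = 6`.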

Instead of using just one tree, we can use a forest of trees. This usually improves predictive performance, but at the cost of interpretability.

```
f <- randomForest(as.factor(iccRatified) ~ polity2 + logoil + lngdp + tradegdp + lnpop, data = designMatrix)
importance(f)
```

```
##          MeanDecreaseGini
## polity2         10.685924
## logoil           6.678683
## lngdp           17.017069
## tradegdp        11.640419
## lnpop           10.208301
```
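The importance scores are easier to read when ranked and expressed as shares of the total. A minimal sketch using the values printed above, copied by hand into a named vector (`gini` and `ranked` are illustrative names); because a random forest is stochastic, the exact numbers will differ from run to run.

```
# MeanDecreaseGini values copied from the output above
gini <- c(polity2 = 10.685924, logoil = 6.678683, lngdp = 17.017069,
          tradegdp = 11.640419, lnpop = 10.208301)

# Rank variables from most to least important
ranked <- sort(gini, decreasing = TRUE)
names(ranked)

# Share of the total Gini decrease, in percent
round(100 * ranked / sum(ranked), 1)
```

In this run `lngdp` contributes most to reducing node impurity and `logoil` least; the scores are relative measures and say nothing about the direction of an effect.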

`plot(roc(designMatrix$iccRatified, predict(f, type = "prob")[, 2]), main = "Random Forest")`
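When comparing this curve with the one for the single tree, keep in mind that `predict()` on a `randomForest` object without `newdata` returns out-of-bag predictions, whereas the tree ROC is based on in-sample fitted probabilities. The sketch below, on simulated data with made-up names (`d`, `rf`, `oob_acc`, `insample_acc`), illustrates how different the two kinds of prediction can be.

```
library(randomForest)

set.seed(42)
# Simulated noisy binary outcome; x2 is pure noise
d <- data.frame(x1 = rnorm(300), x2 = rnorm(300))
d$y <- factor(as.integer(d$x1 + rnorm(300) > 0))

rf <- randomForest(y ~ x1 + x2, data = d, ntree = 200)

# Out-of-bag: each case is predicted only by trees that did not see it
oob_prob <- predict(rf, type = "prob")[, 2]

# In-sample: each case is predicted by all trees, including those trained on it
insample_prob <- predict(rf, newdata = d, type = "prob")[, 2]

oob_acc      <- mean((oob_prob > 0.5) == (d$y == "1"))
insample_acc <- mean((insample_prob > 0.5) == (d$y == "1"))

c(oob = oob_acc, insample = insample_acc)
```

The out-of-bag figure is the more honest estimate of out-of-sample performance, so the random forest ROC above is, if anything, being judged more strictly than the tree.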