-
Notifications
You must be signed in to change notification settings - Fork 0
/
Copy pathfinches.Rmd
32 lines (25 loc) · 1.77 KB
/
finches.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
---
title: "Archetype analysis of Darwin's finches"
output: html_document
---
```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
```
## Useful libraries (optional)
Remove the following two lines if you want to use other tools for plotting. You might have to install ggplot2 and ggfortify using `install.package(ggplot2)` and`install.package(ggfority)`. You can also use something else to plot. In particular, [autoplot](https://cran.r-project.org/web/packages/ggfortify/vignettes/plot_pca.html) (available through ggplot and ggfortify might be useful to look at the PCA results.)
```{r}
library(ggplot2)
library(ggfortify)
```
## Tasks
Follow the following steps to perform an archetype analysis of the finch data.
* Read the data from `data/finches.csv` (`read.csv` should do the trick).
* Plot the data. Can you already infer archetypes? It is challenging in 5 dimensions, right?
* Perform a principal component analysis (PCA).
* `princomp` is recommended.
* If you need a refresher on PCA, check out this interactive explanation: https://setosa.io/ev/principal-component-analysis/.
* Determine how much of the variance in the data is captured by the individual principal components.
* Plot the first two principle components. How many archetypes can you identify in the plot?
* Look at the feature loadings (`pca_result$loadings`). You can also `autoplot` or `biplot` to generate a [biplot](https://en.wikipedia.org/wiki/Biplot) of the components and loadings. What are the contributions of the features to the first two principal components?
* Compute the convex hull (`chull` is recommended) and highlight it in the PCA plot.
* Bonus points: based on the points in the convex hull, determine the triangle with the largest surface (spanned by three points in the convex hull).