presentation.Rmd

---
title: "Hello, R!"
author: "Yue Hu"
output:
  ioslides_presentation:
    self_contained: yes
    logo: image/logo.gif
    transition: faster
    widescreen: yes
    slidy_presentation:
    incremental: yes
---

# Preface
## What are covered
* A overview of R
* Data manipulation (input/output, row/column selections, etc.)
* Descriptive and binary hypotheses (summary, correlation, t-test, etc.)
* Multiple regression (OLS, GLS, MLM, etc.)
* Presentation (table, graph)
* Version control (if we have time)


# An Overview of R
## Why using R in your research?
You can use R to:

* Do statistics and solve math problems
* Edit codes in Excel, Python, C++, ...
* Scrape data from texts, websites, databases, pdf...
* Create presentation slides in pdf (as LaTex beamer) or html (as Markdown)
* Create webpages
* Write academic articles and save them in html, pdf, or word.
* Write a book (see e.g., [bookdown](https://bookdown.org/home/).)


## Why R rather than the others?? {.columns-2 .build}
* It's free!! 
* It's developing!
    + R is very compatible with new techniques
    + e.g., Network analysis, spatial analysis with GIS, and text analysis with big data. 
* It's multi-lingual!

```{r eval = F}
"Hello 你好 안녕하세요 здравствуйте"
```

* It's popular!


<img src="http://revolution-computing.typepad.com/.a/6a010534b1db25970b01a3fc45e6fc970b-pi" height="400" width="450"/>
[Magoulas & King, 2014, *Data Science Salary Survey*.](http://www.oreilly.com/data/free/stratasurvey.csp)


## A Trade-Off of the Great Power

<div class="centered">
  <img src=https://sites.google.com/a/nyu.edu/statistical-software-guide/_/rsrc/1396388441453/summary/LearningCurve2.png height="400"/>
  </div> Source: NYU Data Services.

## Software and package installations {.columns-2}
### Software installation
[![R](https://www.r-project.org/Rlogo.png)](https://www.r-project.org/)

[![Rstudio](https://www.rstudio.com/wp-content/uploads/2014/03/blue-250.png)](https://www.rstudio.com/products/rstudio/download/preview/)


<span style="color:green">Tip</span>:
<div class="notes">
Install R before Rstudio; so does in updating.
Using the [Rstudio preview](https://www.rstudio.com/products/rstudio/download/preview/).
</div>


### Package installation and loading
Packages are "<span style="color:purple">Apps</span>" for R. 
`install.packages(<package name>)`
`install_github("<repositary/package name>")`


Find instructions of package installation:
[An example](https://github.com/sammo3182/interplot)

Click the apps: Load the package

`library(<package name>)`
`require(<package name>)`

## RStudio{.flexbox .vcenter}
<img src="image/rstudio.png" height="500" width = "900" />


# Math and Basic Statistics with R
## Set where to locate the data and store the results
* Always check or set the <span style="color:purple">working directory</span> first
    + `getwd()`
    + `setwd("E:/R workshop/rworkshop")`
* Or click, click, and click in Rstudio
    + <img src=image/setwd.png />
 

## Terms of R in plain English {.columns-2 .build}
* Object: packing things together and naming it
* Vector: 
    + Mathematics: a one-column matrix
    + Practice: a single variable
* Factor:
    + A special vector
    + Special for ordinal or mulinomial variable
* Matrix vis-a-vis Data frame
    + Matrix is a matrix
    + Data frame is a dataset
    
    
<img src="http://s51.podbean.com/pb/db74b33d7cf889e9ff4096d285b7a075/579e21e7/data2/blogs18/45653/uploads/multidimensions.jpg" height="300" width="450"/>

* Array: a multi-dimension matrix
    + one-dimension array == vector
    + two-dimension array == matrix
* Function: a process to handle the object
    

## Do math with R: Basic Functions

```{r eval=FALSE}
# basic math
x + (1 - 2) * 3 / 4

# advanced math
x^2;sqrt(x);log(x);exp(x)

# matrix algebra
z <- matrix(1:4, ncol = 2)
z + z - z
z %*% z  # inner mul<span style="color:green">Tip</span>lication 
z %o% z  # outter mul<span style="color:green">Tip</span>lication

# logical evaluation
x == z; x != Z
x & z; x | z
x > z; x <= z
```


## Commen Data Type: Vector
(<span style="color:green">Tip</span>)
```{r}
1:10  # numeric (integer/double)
c("R", "workshop") # character
3 == 5  # logical
factor(1:3, levels = 1:3, labels = c("low", "medium", "high"))  # factor 
```

<div class="notes">
The `factor` is a R *function*. Ususally, the first component ("`1:3`") of a R function is the  <span style="color:purple">object</span>, the target this function is going to work on. The rest components ("`levels = 1:3, labels = c("low", "medium", "high")`") are <span style="color:purple">arguments</span>, with which setting the special conditions the object is dealt.

If you are not sure about the utility of certain arguments, ask R for help by `?`, e.g.,

```{r eval=FALSE}
?factor
```
</div>


## Commen Data Type: Dataset {.smaller}

```{r}
matrix(1:4, ncol = 2)  # matrix
data.frame(x = 1:2, y = 3:4)  # data.frame
list(c("one", "two"), c(3, 4)) # 2-D list
```

----
```{r}
array(c(1:8), dim = c(2, 2, 3)) # 3-D or n-D "list""
```


## Save data to an object
```{r}
x <- rep(c(.01, .05, .1), times = 2) # repeat 1:5 for twice
df <- data.frame(x = 1:1, y = 3:4)
list <- list(x, df)

list # == print(list)
```

----

Basic rules for object name:

* Don't start with numbers (WRONG: `1stday`)
* No special signs except for `.` and `-` (WRONG: `M&M`)
* Case sensitivity (`X != x`)


## Attributes of an object {.smaller .build}
* Structure

```{r}
str(df)
```

* Unique values (<span style="color:green">Tip</span>)

```{r}
unique(df$x)
```
<div class="notes">
What is the `$`?  It is used to call specific columns a data.frame. 

To call the components in a vector, we use "[]"

```{r}
x[3]
```

To call the components in a list, we use "[[]]"

```{r}
list[[2]]
```

</div>

* Names

```{r}
names(df)
```

----

Length

```{r}
length(x)
```

Class

```{r}
class(x);typeof(x)  # ; is used to write two commands in one line
```


## Detect the attributes
Using `is.`

```{r}
x <-c(1, 2, NA, 4)
is.numeric(x)
is.na(x) # detect if x includes missing values
```


## Wrap up
* Set the working directory first: `setwd()`
* Four types of data: numeric, character, logical, factor
* Four types of datasets: matrix, data.frame, list, array
* Save the things into an object by `<-`
* Next: Data input <img src=https://cnet1.cbsistatic.com/img/cbDfaPT6Hj22YVzbIXdKHdW7y-k=/270x0/2016/07/08/a82975f5-6adb-4dec-8bec-561ca3d348ea/pokemon-go-gif.gif />


# Data Input and Manipulation
## Input default data types
* Default data types: .Rds, .Rdata(.Rda)

```{r eval=FALSE}
load("<FileName>.RData")

df_txt <- read.table("<FileName>.txt")
df_csv <- read.csv("<FileName>.csv")

```

* Some data are already embedded in R. To call them, use `data()`, e.g.

```{r eval=FALSE}
data(mtcars)
```


## Input data with packages
```{r eval=FALSE}
# SPSS, Stata, SAS
library(haven)
df_spss <- read_spss("<FileName>.sav")
df_stata <- read_dta("<FileName>.dta") 
df_sas <- read_sas("<FileName>.sas7bdat")  

# Excel sheets
library(readxl)
df_excel <- read_excel("<FileName>.xls");read_excel("<FileName>.xlsx") 

# JavaScript Object Notation 
library(rjson)
df_json <- fromJSON(file = "<FileName>.json" )

# XML/Html
df_xml <- xmlTreeParse("<url>")
df_html <- readHTMLTable(url, which=3)

```


## Output data

* Save in a R dataset (`.RData`) 

```{r eval = F}
save(object, file = "./Data/mydata.Rdata")
```

* Save as `.csv`

```{r eval = F}
write.csv(object, file = "mydata.csv")
```

* Save as `.feather` <span style="color:green">Tip</span>

```{r eval=F}
feather::write_feather(mydata, "mydata.feather")
```

<div class="notes">
Feather is a fast, lightweight, and easy-to-use binary file format for storing data frames, which can be read by both R and Python.
See more details in [Feather](https://blog.rstudio.org/2016/03/29/feather/).
</div>


## Manipulate the data
* let's call a dataset first,

```{r}
data(mtcars)
```

* Variable numbers and Observations

```{r}
ncol(mtcars);names(mtcars)
nrow(mtcars)
```


## Have a glimpse
```{r}
dplyr::glimpse(mtcars)
```

----

```{r}
head(mtcars) # show the first six lines of mtcars
```

## Let's zoom in
* locate a specific row, column, or cell of data: `data[row#, col#]` or `data["rowName","colName"]`. 

```{r}
mtcars[1:2,3:4] # show first and the second rows of the third and fourth columns
```

```{r eval=FALSE}
mtcars[ ,"mpg"] # show the column "mpg"
mtcars[ ,"mpg"][3]
```

----

* Select with special conditions

```{r}
mtcars[mtcars$mpg < 20,][1,] # show the first rows which mpg are below 5.
```

* Create new rows/columns

```{r}
mtcars$id <- seq(1:nrow(mtcars))
```


## Let's generalize
Summarise vector in categories

```{r}
unique(mtcars$cyl)
table(mtcars$cyl)
```

----

For a dataset or a numeric vector

```{r}
summary(mtcars$cyl)
```

One can use `mean`, `sd`, `max`, `min`, etc. to extract specific descriptive statistics.

```{r}
mean(mtcars$cyl)
```

## Let's create
* Create a variable into the dataset (<span style="color:green">Tip</span>)

```{r}
mtcars$newvar <- c(1:nrow(mtcars)) # create an "ID" variable
mtcars$newvar
```

<div class="notes">
Obviously, variables can be immediately overwrite without any specific setting. 

It is convenient but also <span style="color:purple">risky</span>.
</div>

* Remove a variable from the dataset

```{r}
mtcars$newvar <- NULL
mtcars$newvar
```

----

Remove variable, result, function, or data from the environment

```{r eval=FALSE}
rm(x)
```

Recode a variable: e.g., numeric to binary, mpg > mean, 1, otherwise 0

```{r eval=FALSE}
# Method I
mtcars$newvar[mtcars$mpg > mean(mtcars$mpg)] <- 1
mtcars$newvar[mtcars$mpg <= mean(mtcars$mpg)] <- 0

# Method II
mtcars$newvar <- ifelse(mtcars$mpg > mean(mtcars$mpg), 1, 0) # overwrite the NAs
```


## Wrap Up 
* Input/output: `load()`/`read.`series and `save()`/`write.`series
* A glimpse of data: `head()` or `dplyr::glimpse`
* Description: `summary()`, `table()`
    + More specific: `mean`, `sd`, `max`, `min`, etc.
* Manipulation: 
    + create: `mtcars$newvar <- c(1:nrow(mtcars))`
    + Remove: `mtcars$newvar <- NULL`; `rm()`
    + Recode: `recodevar[<condition>] <- <new value>`
* There are also [`apply` family](http://www.r-bloggers.com/r-tutorial-on-the-apply-family-of-functions/) functions for with batching management of data.

----

Next: Hypothsis test
<div class="centered">
![core](http://mathsupport.mas.ncl.ac.uk/images/d/d0/95contint.gif)
</div>


# Hypothesis Tests

## package loading
You want the `pacman` package to load multiple packages.

```{r}
pacman::p_load(dplyr)
```

## Data Glimpse
```{r}
data("mtcars")
dplyr::glimpse(mtcars)
```


## Binary Tests: Difference in mean

$H_{0}: \bar{cylinders} = \bar{gears},\ \alpha = .05$

```{r}
t.test(mtcars$cyl, mtcars$gears) 
```

----

`t.test` offers arguments `alternative`, `mu`, `paired`, and `conf.level` for users to change in two-tail/one-tail test, parameter mean, independent/paired comparison, and $\alpha$.

```{r eval=FALSE}
# one side, cyl > gear, alpha = .01
t.test(mtcars$cyl, mtcars$gear,
       alternative = "greater", conf.level = .99)) 

# comparing with the parameter (true value)
t.test(mtcars$cyl, mu = 6)   # the true mean is 6.

```


## Binary Tests: Correlation
$H_{0}: \rho_{(cyl,gear)} = 0,\ \alpha = .05$

```{r}
cor.test(mtcars$cyl, mtcars$gear)
```

----

`cor.test` offers various arguments as in `t.test` for more specific settings. Moreover, users can use the `method` argument to set the method to calculate the correlations, "Pearson", "Kendall", or "Spearman." (<span style="color:green">Tip</span>)

```{r}
cor.test(mtcars$cyl, mtcars$gear, method = "kendall")
```

<div class="notes">
Do I have to type the `mtcars$` every time? 

* No you don't.
    + It offers a potential for cross-dataset operation, though.
    + Use `within()`: e.g., `within(mtcars, cor.test(cyl, gear))`
    + Use `attach()` (not recommonded)
</div>


----

We can get the correlation matrix, too:
```{r}
cor(mtcars[,1:4])
```

## Present the correlations
You want the `corrplot` package.
```{r fig.height=4}
cor(mtcars) %>% corrplot::corrplot()
```

----

Or a mixed format:
```{r}
cor(mtcars) %>% corrplot::corrplot.mixed()
```


## Binary Tests: ANOVA {.smaller}
One way or two way ANOVA: 

```{r}
aov_one <- aov(cyl ~ gear, data = mtcars) #one-way

aov_two <- aov(cyl ~ gear + am, data = mtcars) #two-way

summary(aov_one); summary(aov_two)
```


## Wrap up
* T-test: `t.test(x, y = NULL, alternative = c("two.sided", "less", "greater"), mu = 0, paired = FALSE, conf.level = 0.95, ...)`

* Correlation: `cor.test(x, y, alternative = c("two.sided", "less", "greater"), method = c("pearson", "kendall", "spearman"), conf.level = 0.95, continuity = FALSE, ...) `

* ANOVA: `aov(formula, data = NULL, ...)`

----

Next: Multiple regression

<div class="centered">
![core](http://www.math.yorku.ca/SCS/spida/lm/mreganim3.gif)
</div>


# Multiple Regression
## Ordinary Linear Regression
$Mileage = \beta_0cylinders + \beta_1horsepower + \beta_3weight + \varepsilon$

```{r}
lm_ols <- lm(mpg ~ cyl + hp + wt, data = mtcars)
``` 

* `lm_ols`: Object name
* `mpg`: Dependent variable
* `cyl + hp + wt`: Independent variables
* `data = mtcars`: Where the variables are stored


## Result{.smaller}

```{r} 
summary(lm_ols)
```

## Nonlinear transition
ln, square, exponential, or inverse

```{r}
lm_tran <- lm(log(mpg) ~ I(cyl^2) + exp(hp) + I(1/wt), data = mtcars)
```

* `log(mpg)`: logistic
* `I(cyl^2), I(1/wt)`: square, inverse
* `exp(hp)`: exponential

## The result {.smaller}

```{r}
summary(lm_tran)
```

## Adding binary variables

When the model including binary variables based on a factor 

```{r}
mtcars$gear_f <- factor(mtcars$gear, levels = 3:5, labels = c("3-gear", "4-gear", "5-gear"))
table(mtcars$gear)
table(mtcars$gear_f); class(mtcars$gear_f)
```

## The result {.smaller}

```{r}
lm_f <- lm(mpg ~ cyl + hp + wt + gear_f, data = mtcars)
summary(lm_f)
```


## Interaction
Two-way interaction: horsepower * Weight

```{r}
lm_in <- lm(mpg ~ cyl + hp * wt, data = mtcars)

```

Equivalent to `lm_in2 <- lm(mpg ~ cyl + hp + wt + hp:wt, data = mtcars)`


## The result {.smaller}
```{r}
summary(lm_in)
```


## Post-estimate diagnoses: Residural
  
```{r fig.height=3.5, fig.align="center"}
res <- resid(lm_ols); res[1:4]
plot(lm_ols, which = 1) # residural vs. fitted plot
```

## Post-estimate diagnoses: Outliers
```{r}
car::outlierTest(lm_ols) # Bonferonni p-value for most extreme obs
```

----

```{r}
car::qqPlot(lm_ols)  #qq plot for studentized resid 
```


## Post-estimate diagnoses: CLRM Properties{.build}
* Heteroscedasticity 

```{r}
car::ncvTest(lm_ols) 
```

* Multicollinearity

```{r}
car::vif(lm_ols) 
```

----

Autocorrelation

```{r}
car::durbinWatsonTest(lm_ols)
```


## Logit
$vs = \frac{1}{1 + e^{-(\beta_0 + \beta_1cylinder + \beta_2horsepower + \beta_3weight + \varepsilon)}}$

```{r}
logit <- glm(vs ~ cyl + hp + wt, data = mtcars, family = "binomial")
```

MLE on other distributions: change the value of the argument `family` to `Gamma`, `poisson`, `gaussian`, etc.

## The result{.smaller}

```{r}
summary(logit)
```


## Interpretation: Margin

```{r message=FALSE}
library(mfx)
logit_m <- logitmfx(vs ~ cyl + hp + wt, data = mtcars) 
logit_m
```

## Interpretation: Predicted probability
Predicted Probability when `cyl` changes from 4 to 6.

```{r}
# Step 1: creat an aggregate data 
mtcars_fake <- with(mtcars, data.frame(cyl = 4:6, hp = mean(hp), wt = mean(wt)))
# Step 2: predict based on the new data
logit_pp4 <- cbind(mtcars_fake,predict(logit, newdata = mtcars_fake, type = "link", se = TRUE))
# Step 3: convert to probability 
logit_pp4 <- within(logit_pp4, {pp <- plogis(fit) 
                                lb <- plogis(fit - 1.96 * se.fit)
                                ub <- plogis(fit + 1.96 * se.fit)})
logit_pp4[,7:9]
```


## Wrap Up
* OLS: `lm(Y ~ X, data = data)`
    + Non-linear transformations: `I(X^2)`, `exp(X)`, `log(X)`.
    + Using factor variable: R will handle that for you.
    + Interaction: `lm(Y ~ X * Z, data = data)`.
    + Post-estimate diagnoses: `resid()`, `outlierTest()`, `qqPlot()`, `ncvTest()`, `vif()`, `durbinWatsonTest()`
* Logit: `glm(Y ~ X, data = data, family = "binomial")`
    + Margins: using `mfx::logitmfx`
    + Predict probabilty: 
        + Step 1: create an aggregate data
        + Step 2: predict the log odds
        + Step 3: transfer to probability
        
----

Next: Presenting with R

<div class="centered">

<img src="https://espngrantland.files.wordpress.com/2014/06/9u4jd.gif" height="500" width = "800" />

</div>


# Presentation with R
## Tabling
There are over twenty packages for [table presentation](http://conjugateprior.org/2013/03/r-to-latex-packages-coverage/) in R. My favoriate three are `stargazer`, `xtable`, and `texreg`.

(Sorry, but all of them are for **Latex** output)

* `stargazer`: good for summary table and regular regression results
* `texreg`: when some results can't be presented by `stargazer`, try `texreg` (e.g., MLM results.)
* `xtable`: the most extensively compatible package, but need more settings to get a pretty output, most of which `stargazer` and `texreg` can automatically do for you.

## An example {.smaller .columns-2}

```{r message = F}
lm_ols <- lm(mpg ~ cyl + hp + wt, data = mtcars)
stargazer::stargazer(lm_ols, type = "text", align = T)
```

----

Present in PDF

<div class="centered">
  <img src=image/table.png height="400"/>
  </div> 
  
* For the users of MS Word, click [here](http://www.r-statistics.com/2010/05/exporting-r-output-to-ms-word-with-r2wd-an-example-session/).


## But...why tabulating if you can plot?
Three types of graphic presenting approaches in R:

* Basic plots: `plot()`.
* Lattice plots: e.g., `ggplot()`.
* Interactive plots: `shiny()`. (save for later)
    + <div class="centered">
  <img src="http://i.stack.imgur.com/qZObK.png" height="300"/>
  </div> 

## Basic plot
Pro:

* Embedded in R
* Good tool for <span style="color:purple">data exploration</span>. 
* <span style="color:purple">Spatial</span> analysis and <span style="color:purple">3-D</span> plots.

Con:

* Not very pretty
* Not very flexible

## An example: create a histogram

```{r fig.align="center"}
hist(mtcars$mpg)
```

## Saving the plot{.build}
* Compatible format:`.jpg`, `.png`, `.wmf`, `.pdf`, `.bmp`, and `postscript`.
* Process: 
      1. call the graphic device
      2. plot
      3. close the device

```{r eval = F}
jpeg("histgraph.jpg")
hist
dev.off()
```

<span style="color:green">Tip</span>
<div class="notes">
Sometimes, RStudio may distort the graphic output. In this situation, try to <span style="color:purple">zoom</span> or use `windows()` function. 
</div>

----

The device list:

| Function                    	| Output to        	|
|-----------------------------	|------------------	|
| pdf("mygraph.pdf")          	| pdf file         	|
| win.metafile("mygraph.wmf") 	| windows metafile 	|
| png("mygraph.png")          	| png file         	|
| jpeg("mygraph.jpg")         	| jpeg file        	|
| bmp("mygraph.bmp")          	| bmp file         	|
| postscript("mygraph.ps")    	| postscript file  	|


## `ggplot`: the most popular graphic engine in R {.build}

+ Built by Hadley Wickham based on Leland Wilkinson's *Grammar of Graphics*.
+ It breaks the plot into components as <span style="color:purple">scales</span> and <span style="color:purple">layers</span>---increase the flexibility.
+ To use `ggplot`, one needs to install the package `ggplot2` first.

```{r message=FALSE}
library(ggplot2)
```


## Histogram in `ggplot`
```{r fig.align="center", fig.height=2.7}
ggplot(mtcars, aes(x=mpg)) + 
    geom_histogram(aes(y=..density..), binwidth=2, colour="black") 
```

## Decoration

```{r fig.align="center", fig.height=2.7}
ggplot(mtcars, aes(x=mpg)) + 
    geom_histogram(aes(y=..density..), binwidth=2, colour="black", fill="purple") +
    geom_density(alpha=.2, fill="blue")  + # Overlay with transparent density plot
    theme_bw() + ggtitle("histogram with a Normal Curve") + 
    xlab("Miles Per Gallon") + ylab("Density")
```


## Break in Parts:{.smaller}

```{r eval=FALSE}
ggplot(data = mtcars, aes(x=mpg)) + 
    geom_histogram(aes(y=..density..), binwidth=2, colour="black", fill="purple") +
    geom_density(alpha=.2, fill="blue")  + # Overlay with transparent density plot
    theme_bw() + ggtitle("histogram with a Normal Curve") + 
    xlab("Miles Per Gallon") + ylab("Density")
```
* `data`: The data that you want to visualise

* `aes`: Aesthetic mappings
describing how variables in the data are mapped to aesthetic attributes
    + horizontal position (`x`)
    + vertical position (`y`)
    + colour
    + size
* `geoms`: Geometric objects that represent what you actually see on
the plot
    + points
    + lines
    + polygons
    + bars

----

* `theme`, `ggtitle`, `xlab`, `ylab`: decorations.
* Other parts you may see in some developed template
    + `stats`: Statistics transformations
    + `scales`: relate the data to the aesthetic
    + `coord`: a coordinate system that describes how data coordinates are
mapped to the plane of the graphic.
    + `facet`: a faceting specification describes how to break up the data into sets.


## Save `ggplot`
* `ggsave(<plot project>, "<name + type>")`:
    + When the `<plot project>` is omitted, R will save the last presented plot. 
    + There are additional arguments which users can use to adjust the size, path, scale, etc.


## Plotting with packages: Map

```{r eval=FALSE}
starbucks <- read.csv("https://opendata.socrata.com/api/views/ddym-zvjk/rows.csv?accessType=DOWNLOAD")


library(leaflet)
leaflet() %>% addTiles() %>% 
  setView(-91.535632, 41.660965, zoom = 16) %>% 
  addMarkers(data = starbucks, lat = ~Latitude, lng = ~Longitude, popup = starbucks$Name)
```

----


```{r two-column, echo=FALSE, results = 'asis', out.extra = '', cache=TRUE}
starbucks <- read.csv("https://opendata.socrata.com/api/views/ddym-zvjk/rows.csv?accessType=DOWNLOAD")


library(leaflet)
leaflet() %>% addTiles() %>% 
  setView(-91.535632, 41.660965, zoom = 16) %>% 
  addMarkers(data = starbucks, lat = ~Latitude, lng = ~Longitude, popup = starbucks$Name)
```


## Plotting with packages: `dotwhisker`{.smaller}
Plot the comparable coefficients or other estimates (margins, predicted probabilities, etc.).

```{r message=FALSE}
library(dotwhisker)
library(broom)
m1 <- lm(mpg ~ wt + cyl + disp + gear, data = mtcars)
```

----
```{r}
summary(m1)
```

----

```{r}
dwplot(m1)
```


----

```{r message=F, fig.align="center", fig.height=4}
m2 <- update(m1, . ~ . + hp) # add another predictor
m3 <- update(m2, . ~ . + am) # and another 

dwplot(list(m1, m2, m3))
```

----

```{r eval = F}
dwplot(list(m1, m2, m3)) +
     relabel_y_axis(c("Weight", "Cylinders", "Displacement", 
                     "Gears", "Horsepower", "Manual")) +
     theme_bw() + xlab("Coefficient Estimate") + ylab("") +
     geom_vline(xintercept = 0, colour = "grey60", linetype = 2) +
     ggtitle("Predicting Gas Mileage") +
     theme(plot.title = element_text(face="bold"),
           legend.justification=c(0, 0), legend.position=c(0, 0),
           legend.background = element_rect(colour="grey80"),
           legend.title = element_blank()) 
```

----

```{r echo = F}
dwplot(list(m1, m2, m3)) +
     relabel_y_axis(c("Weight", "Cylinders", "Displacement", 
                     "Gears", "Horsepower", "Manual")) +
     theme_bw() + xlab("Coefficient Estimate") + ylab("") +
     geom_vline(xintercept = 0, colour = "grey60", linetype = 2) +
     ggtitle("Predicting Gas Mileage") +
     theme(plot.title = element_text(face="bold"),
           legend.justification=c(0, 0), legend.position=c(0, 0),
           legend.background = element_rect(colour="grey80"),
           legend.title = element_blank()) 
```


## Plotting with packages: `interplot`{.smaller}


```{r message=FALSE}
library(interplot)
lm_in <- lm(mpg ~ cyl + hp * wt, data = mtcars)
summary(lm_in)

```

----

```{r fig.align="center"}
interplot(m = lm_in, var1 = "hp", var2 = "wt") + 
  xlab("Automobile Weight (thousands lbs)") + 
  ylab("Estimated Coefficient for \nGross horsepower")
```

## Wrap Up
* R has a bunch of packages for creating publishing-like tables, e.g., `stargazer`, `xtable`, and `texreg`

* There are three ways to visualize statistics in R: basic, lattice (`ggplot`), and interactive.
    + basic: e.g., `hist(<vector>)`
    + `ggplot`: /n  e.g., `ggplot(<data>, aes(x=<vector>)) + geom_histogram()`.

* Two special types of plot:
    + Estimate plot with [`dotwhisker`](https://cran.r-project.org/web/packages/interplot/vignettes/interplot-vignette.html).
    + Interaction plot with [`interplot`](https://cran.r-project.org/web/packages/dotwhisker/vignettes/dwplot-vignette.html).


## Almost the end: one topic left

<div class="centered">
[![present](http://conservatives4palin.com/wp-content/uploads/2013/06/snob.gif)]
</div>


# Version Control
## Just a brief introduction{.columns-2 .build}
<div class = "center">
<img src= "http://www.foldertrack.com/images/Personal_Version_Mess.png" width = "400" height = "400" />
</div>


* Tried to recall the deleted codes?
* Tried to figure out what changes?
* Saved a lot of replication files?
* Version control can help you.

---- 

<div class = "center">
<img src="http://cdn.arstechnica.net//wp-content/uploads/2012/05/uncommitted-changes-1.png" />
</div>
 

## Using Git with RStudio

* RStudio has associate with the Git and SVN very well. 
* Process to use git:
    + Register a user account in https://github.com.
    + Connect your account with RStudio following [this instruction](http://www.molecularecologist.com/2013/11/using-github-with-r-and-rstudio/).
    + Create a version-control project in RStudio
        + <img src="https://andreacirilloblog.files.wordpress.com/2014/12/new-project.png" height = "200" />
    + Commit, Pull and Push

## External Sources
* Q&A Blogs: 
    + http://stackoverflow.com/questions/tagged/r
    + https://stat.ethz.ch/mailman/listinfo/r-help

* Blog for new stuffs: http://www.r-bloggers.com/

* Graph Blogs:
    + http://www.cookbook-r.com/Graphs/
    + http://shiny.stat.ubc.ca/r-graph-catalog/

* Workshops: http://ppc.uiowa.edu/node/3608
* Consulting service: http://ppc.uiowa.edu/node/3385/


----

<div class = "center">
[![end](http://rescuethepresent.net/tomandjerry/files/2016/05/16-thanks.gif)]
</div>