-
Notifications
You must be signed in to change notification settings - Fork 2
/
Copy pathGNpre.Rmd
141 lines (80 loc) · 3.02 KB
/
GNpre.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
# Some topics by GN
## R Installation
## IDE: Rstudio
## Data structures
## Subsetting / Indexing
`subset()`
`[]`
Binary search in `data.table`
## Data I/O / Database connection
## tidy data
http://vita.had.co.nz/papers/tidy-data.html
http://blog.rstudio.org/2014/07/22/introducing-tidyr/
## reshape(2)
Reshape data from long to wide format and vice versa.
## data.table
data.table package extends data.frame especially for large data.
It is quite fast due to optimized code.
https://github.com/Rdatatable/data.table/wiki
## dplyr
dplyr is an R-package for data manipulation.
http://www.r-bloggers.com/data-manipulation-with-dplyr/
## Profiling
Make your code better (faster).
## shiny (demonstration)
R goes interactive - accessible via web.
http://shiny.rstudio.com/
## bash / batchjobs (demonstration)
For simulations it's recommended to use batchjobs instead of
interactive sessions. They can run while you do other things.
### Rscript (/usr/bin/Rscript)
Can be included in your *.R file (header) or executed directly.
Example:
```bash
Rscript -e 'rmarkdown::render("GN.md", "html_document")'
```
It can be included in a Makefile.
## loops (implicit/explicit)
Explicit: *for*, *while*
Implicit: *lapply* and friends, which are much faster
## Rcpp
http://www.isid.ac.in/~deepayan/ICP2015/
https://support.rstudio.com/hc/en-us/articles/200486088-Using-Rcpp-with-RStudio
## R packages (structure, why?, Roxygen)
Roxygen is a documentation system for R-packages. It allows you
to document the source code within the *.R file and generate an
*.Rd file afterwards.
## visualization (ggplot2)
Grammar of graphics
## Reproducible research (Rmarkdown, knitr, pandoc, LaTeX)
The integration of R-code in markdown or LaTeX documents is easy
using *knitr* and *rmarkdown* for reproducible research.
*pandoc* can convert many file formats into others.
Using this framework, it's possible to get different output files
(e.g. HTML and PDF) from the same input file.
YAML headers are convenient for that.
## Code sharing / version control: git / github
Git is a revision control system for software development.
https://en.wikipedia.org/wiki/Git_%28software%29
Github is an online platform for projects using git.
https://guides.github.com/activities/hello-world/
If you share your code, others can use and improve it.
Example:
```bash
git clone https://github.com/nachti/ISPS16.git
```
[Using Git with RStudio](https://jennybc.github.io/2014-05-12-ubc/ubc-r/session03_git.html)
[Using Version Control with RStudio](https://support.rstudio.com/hc/en-us/articles/200532077-Version-Control-with-Git-and-SVN)
## Makefile (GNU make)
Makefiles can help you to build output from source files using
dependencies. E.g. it's possible to run several `*.R` files and/or
create a report from an `*.Rmd` file.
https://en.wikipedia.org/wiki/Makefile
Example:
Go to the directory where you cloned the ISPS16 repository:
```bash
cd ISPS16
make
```
Make without any options generates a `*.html`, `*.pdf` and
`*.docx` for you from the source file (GN.Rmd).