Questions:
- What are basic principles for using spreadsheets for good data organization?
Objectives:
- Describe best practices for organizing data so computers can make the best use of data sets.
Keypoints:
- Good data organization is the foundation of any research project.
Learning Objectives
- Describe the purpose of the RStudio Script, Console, Environment, and Plots panes.
- Organize files and directories for a set of analyses as an R project, and understand the purpose of the working directory.
- Use the built-in RStudio help interface to search for more information on R functions.
- Demonstrate how to provide sufficient information for troubleshooting with the R user community.
Learning Objectives
- Describe what a data frame is.
- Load external data from a .csv file into a data frame.
- Summarize the contents of a data frame.
- Describe what a factor is.
- Convert between strings and factors.
- Reorder and rename factors.
- Change how character strings are handled in a data frame.
- Export and save data.
Learning Objectives
- Describe the purpose of the `dplyr` and `tidyr` packages.
- Select certain columns in a data frame with the `dplyr` function `select`.
- Select certain rows in a data frame according to filtering conditions with the `dplyr` function `filter`.
- Link the output of one `dplyr` function to the input of another function with the 'pipe' operator `%>%`.
- Add new columns to a data frame that are functions of existing columns with `mutate`.
- Use the split-apply-combine concept for data analysis.
- Use `summarize`, `group_by`, and `count` to split a data frame into groups of observations, apply summary statistics for each group, and then combine the results.
- Describe the concept of a wide and a long table format and for which purpose those formats are useful.
- Describe what key-value pairs are.
- Reshape a data frame from long to wide format and back with the `spread`/`pivot_wider` and `gather`/`pivot_longer` commands from the `tidyr` package.
Learning Objectives
- Produce scatter plots, boxplots, and time series plots using ggplot.
- Set universal plot settings.
- Describe what faceting is and apply faceting in ggplot.
- Modify the aesthetics of an existing ggplot plot (including axis labels and color).
- Build complex and customized plots from data in a data frame.
Learning Objectives
At the end of this section, students should understand
- the need and concept of table joins,
- the differences between the types of joins,
- the importance of keys in joins,
- circumstances leading to the appearance of missing values,
- the implications of using non-unique keys.
Learning Objectives
- Understand the concept of reproducible research and reproducible documents.
- Understand the process by which a source document is compiled into a final report.
- Generate a reproducible report in HTML or PDF from an R Markdown document using RStudio.
Learning Objectives
- Learning about the wider context of bioinformatics and omics data analysis
- First exposure to the Bioconductor project
- Notions of experimental design
- Omics data containers - theory
- The `SummarizedExperiment` class
Learning Objectives
Learn programming concepts, including
- how to handle conditions
- how to iterate over data structures
- good coding practice
- code re-use through functions