CompositionAmplicondata.Rmd

---
title: "Microbiome composition"
bibliography: 
- bibliography.bib
output:
  BiocStyle::html_document:
    number_sections: no
    toc: yes
    toc_depth: 4
    toc_float: true
    self_contained: true
    thumbnails: true
    lightbox: true
    gallery: true
    use_bookdown: false
    highlight: haddock
---
  <!--
  %\VignetteEngine{knitr::rmarkdown}
  %\VignetteIndexEntry{microbiome tutorial - composition}
  %\usepackage[utf8]{inputenc}
  %\VignetteEncoding{UTF-8}  
-->
  
  
Also see [phyloseq barplot examples](http://joey711.github.io/phyloseq/plot_bar-examples.html).
  
Read example data from a [diet swap study](http://dx.doi.org/10.1038/ncomms7342):
  
```{r composition-example1}
# Example data
library(microbiome)
library(dplyr)
data(dietswap)

# Just use prevalent taxa to speed up examples
# (not absolute counts used in this example)
pseq <- core(dietswap, detection = 8^2, prevalence = 90/100)

# Pick sample subset
library(phyloseq)
pseq2 <- subset_samples(pseq, group == "DI" & nationality == "AFR" & timepoint.within.group == 1)
```

### Composition barplots

Same with compositional (relative) abundances; for each sample (left), or averafged by group (right).
  
```{r composition-example4b, fig.width=12, fig.height=5, out.width="400px", fig.show="hold", warning=FALSE, message=FALSE}
# Try another theme
# from https://github.com/hrbrmstr/hrbrthemes
library(hrbrthemes)
library(gcookbook)
library(tidyverse)

# Limit the analysis on core taxa and specific sample group
p <- plot_composition(pseq2,
		      taxonomic.level = "OTU",
                      sample.sort = "nationality",
                      x.label = "nationality") +
     guides(fill = guide_legend(ncol = 1)) +
     scale_y_percent() +
     labs(x = "Samples", y = "Relative abundance (%)",
                                   title = "Relative abundance data",
                                   subtitle = "Subtitle",
                                   caption = "Caption text.") + 
     theme_ipsum(grid="Y")
print(p)  

# Averaged by group
p <- plot_composition(pseq2,
                      average_by = "bmi_group", transform = "compositional")
print(p)
```


### Composition heatmaps


Heatmap for CLR-transformed abundances, with samples and OTUs sorted
with the neatmap method:
  
```{r composition-example7, fig.width=10, fig.height=4, eval=FALSE}
tmp <- plot_composition(pseq2, plot.type = "heatmap", transform = "compositional", 
            sample.sort = "neatmap", otu.sort = "neatmap", mar = c(6, 13, 1, 1))
```


### Plot taxa prevalence

This function
allows you to have an overview of OTU prevalences alongwith their
taxonomic affiliations. This will aid in checking if you filter OTUs
based on prevalence, then what taxonomic affliations will be lost.

```{r plot_prev, fig.height=6, fig.width=8, dev="CairoPNG"}
data(atlas1006)

# Use sample and taxa subset to speed up example
p0 <- subset_samples(atlas1006, DNA_extraction_method == "r")

# Define detection and prevalence thresholds to filter out rare taxa
p0 <- core(p0, detection = 10, prevalence = 0)

# For the available taxonomic levels
plot_taxa_prevalence(p0, "Phylum", detection = 10)
```

### Amplicon data

See further examples on [community composition for amplicon data](CompositionAmplicondata2.html).