This repository supports the submission of the study "The microbiome associated with the reef builder Neogoniolithon sp. in the eastern Mediterranean". We provide the code and data that were used to generate the results and plots.
Hopefully coming soon ... The publication is currently in its second round of review.
The data folder holds different data files:
- table_ASVs.csv: Raw ASV table including taxonomic assignment, a result of the DADA2 pipeline. Taxonomy is assigned based on SILVA 138 (columns 39 to 44) and also on SILVA 132 (columns 45 to 50). Note that this table still contains chloroplast and mitochondria ASVs, which are removed during data processing.
- rarefied_3000.xlsx: Rarefied ASV table
- rel_abundance.xlsx: Relative abundance ASV table, generated by averaging samples. In addition, the rare biosphere (ASVs with < 0.1% relative abundance) was removed.
- alphaDiv_indices.xlsx: Alpha diversity indices for each sample, generated from rarefied_3000.xlsx (see script alpha_diversity.R, lines 21-27).
- table_sample_stats.xlsx: This table contains the basic statistics of the DADA2 pipeline and includes a sample description.
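To make the description of rel_abundance.xlsx more concrete, here is a minimal base R sketch of averaging samples and dropping the rare biosphere. It is an illustration only and is not taken from process_data.R: the toy counts, the `group` vector, the object names, and the choice to apply the 0.1% threshold to the averaged values are all assumptions.

```r
# Toy count matrix: rows are ASVs, columns are samples (all values invented).
counts <- matrix(
  c(500, 480, 300, 310,
    495, 515, 699, 690,
      5,   5,   0,   0,
      0,   0,   1,   0),
  nrow = 4, byrow = TRUE,
  dimnames = list(paste0("ASV", 1:4), c("A_1", "A_2", "B_1", "B_2"))
)

# Hypothetical assignment of samples to the groups that are averaged.
group <- c("A", "A", "B", "B")

# Convert counts to percent relative abundance within each sample.
rel <- sweep(counts, 2, colSums(counts), "/") * 100

# Average the relative abundances of all samples belonging to each group.
rel_mean <- sapply(unique(group), function(g) {
  rowMeans(rel[, group == g, drop = FALSE])
})

# Keep ASVs reaching the threshold in at least one group (assumed rule);
# everything below it is treated as rare biosphere and dropped.
threshold <- 0.1
rel_filtered <- rel_mean[apply(rel_mean, 1, max) >= threshold, , drop = FALSE]
print(round(rel_filtered, 2))
```

In this toy table ASV4 averages well below 0.1% in both groups and is removed, while ASV3 is kept because it reaches 0.5% in group A.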
Raw data: The raw sequencing data files are stored at ENA https://www.ebi.ac.uk/ena/browser/view/PRJEB38881.
We provide the code for the data processing and for generating many of the main figures. Please note: some plots were created live in R using ggplot2. The code presented here is logically divided into different steps:
- Adapter and quality trimming of the raw reads: adapter_quality_trimming.R
- DADA2: dada2.R
- Data processing: process_data.R
- Alpha diversity: alpha_diversity.R
- NMDS and bubbleplots: nmds_bubbleplot.R
The following subsections briefly explain the different files.
The adapter_quality_trimming.R script uses the raw sequencing files (download here) and performs adapter and quality trimming, including minimum-length filtering. Running it requires the cutadapt tool and the R package seqRFLP.
The dada2.R script uses the DADA2 pipeline, published in Nature Methods, to generate Amplicon Sequence Variants (ASVs). In addition, taxonomic assignment of the ASVs is done with SILVA 138. This results in an ASV table. The script requires the R package dada2.
The process_data.R script removes chloroplast and mitochondria ASVs, generates rarefied data and generates the relative abundance data.
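The first two steps can be sketched in base R as follows. This is a hedged illustration, not the code of process_data.R: the toy table, the column names `Order` and `Family`, and the helper `rarefy_sample` are assumptions, and the actual script may well use a package function such as vegan's `rrarefy` instead. The rarefaction depth of 3000 is inferred from the file name rarefied_3000.xlsx.

```r
set.seed(1)  # rarefaction is random; fix the seed for reproducibility

# Toy ASV table: counts for two samples plus taxonomy columns (all values invented).
asv <- data.frame(
  S1 = c(4000, 2500, 1500, 800),
  S2 = c(3000, 3000, 500, 1200),
  Order  = c("Rhodobacterales", "Chloroplast", "Rickettsiales", "Flavobacteriales"),
  Family = c("Rhodobacteraceae", NA, "Mitochondria", "Flavobacteriaceae"),
  row.names = paste0("ASV", 1:4),
  stringsAsFactors = FALSE
)

# Drop ASVs annotated as chloroplast or mitochondria.
# %in% returns FALSE for NA, so unassigned ranks are kept.
is_organelle <- asv$Order %in% "Chloroplast" | asv$Family %in% "Mitochondria"
counts <- as.matrix(asv[!is_organelle, c("S1", "S2")])

# Rarefy one sample: draw `depth` reads without replacement from its read pool.
rarefy_sample <- function(x, depth) {
  reads <- rep(seq_along(x), times = x)  # one entry per read, labelled by ASV index
  tabulate(sample(reads, depth), nbins = length(x))
}

depth <- 3000  # assumed from the file name rarefied_3000.xlsx
rarefied <- apply(counts, 2, rarefy_sample, depth = depth)
rownames(rarefied) <- rownames(counts)
print(colSums(rarefied))
```

After rarefaction every sample has the same total number of reads, which makes richness and diversity values comparable between samples. Samples with fewer reads than `depth` would have to be excluded before this step.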
The alpha_diversity.R script uses rarefied data to generate alpha diversity plots and gives statistical measures. Lines 21-27 are commented out, as the alpha diversity measures need to be calculated only once. Alpha diversity measures of each sample are available here. The script needs the R packages readxl, writexl, ggplot2, phyloseq and vegan.
The nmds_bubbleplot.R script generates NMDS and bubbleplot figures. Bubbleplot generation uses code from https://github.com/AlexanderBartholomaeus/BubblePlot. The script relies on various R packages (see lines 1-15).
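For readers unfamiliar with alpha diversity indices, the following base R sketch computes two common ones from a rarefied count matrix. Which indices alpha_diversity.R actually calculates is not stated here, so the choice of observed richness and the Shannon index, the toy data, and the function name `shannon` are assumptions; the script itself relies on phyloseq and vegan rather than hand-written code.

```r
# Toy rarefied count matrix: rows are ASVs, columns are samples (values invented).
rarefied <- matrix(
  c(1500, 3000,
    1500,    0,
       0,    0),
  nrow = 3, byrow = TRUE,
  dimnames = list(paste0("ASV", 1:3), c("even", "single"))
)

# Shannon index H' = -sum(p * log(p)), summed over ASVs with non-zero counts.
shannon <- function(x) {
  p <- x[x > 0] / sum(x)
  -sum(p * log(p))
}

# One row per sample: number of observed ASVs and Shannon index.
alpha <- data.frame(
  sample   = colnames(rarefied),
  observed = colSums(rarefied > 0),
  shannon  = apply(rarefied, 2, shannon)
)
print(alpha)
```

The sample with two equally abundant ASVs has a Shannon index of log(2), while the sample dominated by a single ASV has an index of 0.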