Customizable workflows based on snakemake and python for the analysis of NGS data

maxplanck-ie/snakepipes

snakePipes

snakePipes are flexible and powerful workflows built using Snakemake that simplify the analysis of NGS data.

Workflows available

  • DNA-mapping*
  • ChIP-seq*
  • mRNA-seq*
  • noncoding-RNA-seq*
  • ATAC-seq*
  • scRNA-seq
  • Hi-C
  • Whole Genome Bisulfite Seq/WGBS

(*Also available in "allele-specific" mode)

Installation

snakePipes is a set of Snakemake workflows that use conda for installation and dependency resolution, so you will need to install conda first.

Afterward, simply run the following:

conda install mamba -c conda-forge && mamba create -n snakePipes -c mpi-ie -c bioconda -c conda-forge snakePipes

This creates a new conda environment called "snakePipes" with snakePipes installed in it. You then need to create the conda environments required by the individual workflows. The following commands take care of this:

  • conda activate snakePipes to activate the appropriate conda environment.
  • snakePipes createEnvs to create the various environments.
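The installation steps above can be collected into one script. The following is only a sketch: the `run` helper and the `DRY_RUN` variable are hypothetical additions for illustration (they are not part of snakePipes or conda), and by default the script only prints the commands so they can be reviewed before anything is installed.

```shell
#!/usr/bin/env bash
# Sketch of the installation sequence described above.
# DRY_RUN and run() are hypothetical helpers, not part of snakePipes:
# with DRY_RUN=1 (the default here) each command is printed, not executed.
set -eu
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

install_snakepipes() {
    # Install mamba, then create the "snakePipes" environment
    run conda install mamba -c conda-forge
    run mamba create -n snakePipes -c mpi-ie -c bioconda -c conda-forge snakePipes
    # In a non-interactive script, `conda activate` typically needs
    # conda's shell hook to be loaded first
    run conda activate snakePipes
    # Create the per-workflow conda environments
    run snakePipes createEnvs
}

install_snakepipes
```

Setting `DRY_RUN=0` would execute the commands instead of printing them. Running them one at a time in an interactive shell, as listed above, remains the simpler route.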

The indices and annotations needed to run the workflows can be created with a single command:

createIndices --genomeURL <path/url to your genome fasta> --gtfURL <path/url to genes.gtf> -o <output_dir> <name>

where name refers to the name/id of your genome (specify as you wish).
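As an illustration, the placeholders might be filled in as shown below. Every path and the genome name in this sketch are hypothetical and need to be replaced with your own values. The script only assembles and prints the command, which would then presumably be run from within the activated snakePipes environment.

```shell
#!/usr/bin/env bash
# Sketch: assemble a createIndices call from user-supplied values.
# All values below are hypothetical placeholders; replace them with your own.
set -eu

GENOME_FASTA="/data/genomes/my_genome.fa"   # path or URL to the genome FASTA
GENES_GTF="/data/genomes/my_genes.gtf"      # path or URL to the gene annotation
OUTPUT_DIR="/data/snakepipes_indices"       # where indices are to be written
GENOME_NAME="my_genome"                     # free-form name/id for the genome

build_createindices_cmd() {
    printf 'createIndices --genomeURL %s --gtfURL %s -o %s %s\n' \
        "$GENOME_FASTA" "$GENES_GTF" "$OUTPUT_DIR" "$GENOME_NAME"
}

# Print the command for review before running it by hand
build_createindices_cmd
```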

A few additional steps you can then take:

1. Modify, remove, or add organism yaml files as needed: these files contain the locations of the GTF files and genome indices for each organism. Their location after installation can be found with the snakePipes info command.

2. Modify the cluster.yaml file as needed: this file contains the settings for your cluster scheduler (SGE/slurm). Its location can also be found with the snakePipes info command.
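Both steps start from the output of snakePipes info. A minimal sketch of a guarded call follows; `show_config_locations` is a hypothetical helper name used only for this example. It checks that the snakePipes command is available before calling it, since the command exists only inside the activated environment.

```shell
#!/usr/bin/env bash
# Sketch: run `snakePipes info`, which reports where the configuration files live.
# show_config_locations is a hypothetical helper name, not a snakePipes command.

show_config_locations() {
    if command -v snakePipes >/dev/null 2>&1; then
        snakePipes info
    else
        echo "snakePipes not found on PATH; run 'conda activate snakePipes' first" >&2
        return 1
    fi
}

# Do not abort if snakePipes is missing; the hint on stderr is enough
show_config_locations || true
```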

Documentation

For detailed documentation on setup and usage, please visit our Read the Docs page.

Citation

If you use snakePipes in your analysis, please cite it as follows:

Bhardwaj, Vivek, Steffen Heyne, Katarzyna Sikora, Leily Rabbani, Michael Rauer, Fabian Kilpert, Andreas S. Richter, Devon P. Ryan, and Thomas Manke. 2019. “snakePipes: Facilitating Flexible, Scalable and Integrative Epigenomic Analysis.” Bioinformatics, May. doi:10.1093/bioinformatics/btz436

Note

snakePipes is under active development, and we appreciate your help in improving it further. Please use the issues section of the GitHub repository for feature requests and bug reports.