This repository provides scripts for developing a single-cell classification model for the HPA dataset. The pipelines use Snakemake for workflow management and rely on conda and pip for dependency control.
For development purposes, we do not provide a ".yml" file for one-step installation. Please follow the instructions below to install the dependencies.
Snakemake does not have to be installed in the same conda environment as the one used for running the pipeline. It can switch between existing conda environments if you pass the --use-conda option and add the "conda" directive to the rules in the Snakefile. For more information, please refer to the snakemake doc: Using already existing named conda environments
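As a minimal sketch of the mechanism described above, the snippet below writes a one-rule example Snakefile whose "conda" directive names an already existing environment. The file name `Snakefile.example`, the rule name `example`, the output path and the environment name `my-env` are placeholders for illustration and are not part of this repository.

```shell
# Write a minimal example Snakefile (hypothetical rule, file and env names).
cat > Snakefile.example <<'EOF'
# "my-env" stands for the name of an existing conda environment.
rule example:
    output:
        "results/hello.txt"
    conda:
        "my-env"
    shell:
        "python --version > {output}"
EOF

# To run it from the environment that hosts Snakemake (not executed here):
#   snakemake --snakefile Snakefile.example --cores 1 --use-conda
```

Without `--use-conda`, Snakemake ignores the "conda" directive and runs the rule in the current environment.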
- Create conda environment:
conda create --name <my-env> python=3.9
- Activate conda environment:
conda activate <my-env>
- Install dependencies
- Install snakemake (using mamba):
conda install -c conda-forge mamba
mamba install -c conda-forge -c bioconda snakemake
- Install cellpose 2.2.2:
pip install "cellpose[gui]"==2.2.2
- "cyto2" is the model used for cell segmentation. If cellpose fails to download the model automatically, please download it manually and put it in the cellpose model directory. The model can be downloaded from here
- Clone this repository with git
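For the manual "cyto2" download mentioned above, the following sketch shows one way to put the files in place. It assumes that cellpose looks for models in `~/.cellpose/models` unless the `CELLPOSE_LOCAL_MODELS_PATH` environment variable is set, and that the downloaded files have "cyto2" in their names. Please verify both against your cellpose version. `DOWNLOAD_DIR` is a placeholder for wherever you saved the files.

```shell
# Assumed cellpose model directory: ~/.cellpose/models by default,
# overridable through CELLPOSE_LOCAL_MODELS_PATH.
MODEL_DIR="${CELLPOSE_LOCAL_MODELS_PATH:-$HOME/.cellpose/models}"
mkdir -p "$MODEL_DIR"

# Placeholder: the directory holding the manually downloaded model files.
DOWNLOAD_DIR="${DOWNLOAD_DIR:-$HOME/Downloads}"

# Copy every file whose name contains "cyto2" without overwriting
# files that are already in the model directory.
for f in "$DOWNLOAD_DIR"/*cyto2*; do
    if [ -e "$f" ]; then
        cp -n "$f" "$MODEL_DIR"/
    fi
done

# Show what the model directory now contains.
ls "$MODEL_DIR"
```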
In my environment, I run Snakemake directly on our Linux machine (CentOS), so I have not tested submitting jobs to a cluster. Please refer to the snakemake doc: Cluster Execution for more information.
I created three conda environments for running the pipeline.
- "mssm" for hosting Snakemake
- "mscp03" for running cellpose
- "mspytorch" for running basic image processing
The current version of the Snakefile has three rules:
- convert: using conda env "mspytorch"
- merge: using conda env "mspytorch"
- cellpose: using conda env "mscp03"
Note: mscp03 is supposed to be able to handle basic image processing. However, it failed to run the rules "convert" and "merge", so I pointed the Snakefile to mspytorch to save time on debugging.
Please change the conda env names to match your environment.
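Before running the pipeline, it can help to check that the environments named in the Snakefile actually exist. The helper below is a small sketch of such a check. `missing_envs` is a hypothetical function, not part of this repository, and it assumes the usual `conda env list` layout, where the first column of each non-comment line is the environment name.

```shell
# Print each required environment name that is absent from the
# `conda env list` output supplied on standard input.
missing_envs() {
    # Collect the first column of every non-comment, non-empty line.
    existing="$(awk '!/^#/ && NF { print $1 }')"
    for env in "$@"; do
        # -x: whole-line match, -F: treat the name as a fixed string.
        printf '%s\n' "$existing" | grep -qxF -- "$env" || echo "$env"
    done
}

# Usage (requires conda on PATH; not executed here):
#   conda env list | missing_envs mssm mscp03 mspytorch
```

An empty result means all listed environments were found. Any printed name needs to be created or renamed in the Snakefile.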
- Snakemake: a scalable bioinformatics workflow engine. Johannes Köster and Sven Rahmann. Bioinformatics 2012. doi:10.1093/bioinformatics/bts480
- Conda: A package manager for any language. https://conda.io/docs/
- Cellpose: a generalist algorithm for cellular segmentation. Carsen Stringer, Tim Wang, Michalis Michaelos, and Marius Pachitariu. bioRxiv 2020. doi:10.1101/2020.02.02.931238
- snakemake:
- website: snakemake.github.io
- doc: snakemake.readthedocs.io
- conda
- cellpose:
- website: www.cellpose.org
- github: cellpose