This repository contains the code to reproduce all the analyses in our paper introducing the SuperCellCyto R package: https://github.com/phipsonlab/SuperCellCyto.
SuperCellCyto is an adaptation of the SuperCell R package. Initially developed for scRNAseq data, the SuperCell package aggregates cells with similar transcriptomic profiles into "supercells" (also known as "metacells" in the scRNAseq literature).
The preprint of the paper is available on bioRxiv:
Putri, G. H., Howitt, G., Marsh-Wakefield, F., Ashhurst, T. M., & Phipson, B. (2023). SuperCellCyto: enabling efficient analysis of large scale cytometry datasets. bioRxiv; DOI: https://doi.org/10.1101/2023.08.14.553168
To reproduce all the figures in the paper, refer to the Rmd files in the analysis folder:
- explore_supercell_purity_clustering for Supercells Preserve Biological Heterogeneity and Facilitate Efficient Cell Type Identification
- b_cells_identification for Identifying Rare B Cells Subsets by Clustering Supercells
- batch_correction for Mitigating Batch Effects in the Integration of Multi-Batch Cytometry Data at the Supercell Level
- de_test for Recovery of Differentially Expressed Cell State Markers Across Stimulated and Unstimulated Human Peripheral Blood Cells
- da_test for Identification of Differentially Abundant Rare Monocyte Subsets in Melanoma Patients
- label_transfer for Efficient Cell Type Label Transfer Between CITEseq and Cytometry Data
- run_time for measuring the run time of SuperCellCyto and of the clustering process; this applies to the first 3 items above.
The code folder contains the scripts used to generate the results that are processed in the Rmd files in the analysis folder.
Please note that some of these scripts take a long time to run, which is why they are kept as separate R scripts. Otherwise, every rebuild of the workflowr website would take hours.
The data and output folders are meant for storing the raw data and the processed data generated by the scripts in the code folder, respectively.
The contents of these folders are purposely not committed to the repository as they are enormous (over 40GB in total).
If you would like to reproduce our analysis, please download the contents of the data and output folders from Zenodo: .
Instructions after downloading the files:
- Uncompress data_20232308.tar.gz (using tar -zxvf <filename>.tar.gz). You should get one data folder. This is the data folder for the workflowr website.
- Uncompress each of the tar.gz files starting with the word output. Each file should uncompress into one folder.
- Create a new folder called output and place all the folders uncompressed in the previous step into it.
- Run wflow_build().
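The unpacking steps above can be sketched as a small shell function. This is only an illustration under stated assumptions: the function name unpack_supercellcyto_archives and its directory argument are hypothetical, and the glob output*.tar.gz assumes that the output archives carry names beginning with "output" and ending in .tar.gz, as described above. Run it from the repository root so that the data and output folders land where the workflowr website expects them.

```shell
# Hypothetical helper: unpack the downloaded archives into ./data and ./output.
# $1 is the directory holding the downloaded .tar.gz files (defaults to the
# current directory).
unpack_supercellcyto_archives() {
  local src_dir="${1:-.}"

  # Extract the raw data archive; it should yield a single data folder.
  tar -zxvf "$src_dir/data_20232308.tar.gz" || return 1

  # Create the output folder and extract every archive whose name starts
  # with "output" directly into it (assumed naming pattern).
  mkdir -p output || return 1
  for archive in "$src_dir"/output*.tar.gz; do
    [ -e "$archive" ] || continue   # skip if the glob matched nothing
    tar -zxvf "$archive" -C output || return 1
  done
}
```

Extracting with -C output merges the "uncompress" and "move into output" steps into one. Once the function has been called, for example as unpack_supercellcyto_archives ~/Downloads, wflow_build() can be run from an R session in the repository root.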