This repository contains the code for a feature (gene) selection benchmark on both scRNA-seq and spatial transcriptomics data.
After a simple configuration, you can run the whole benchmark (data loading, quality control, feature selection, and cell clustering/domain detection) with a single function call:
```python
from benchmark.run_benchmark import run_bench

# configure the dataset information
data_cfg = {
    'your_data_name': {
        'adata_path': 'path/to/h5ad/file',
        'annot_key': 'annotation_name',
    }
}
# configure feature selection methods and numbers of selected features
fs_cfg = {'feature_selection_method': [1000, 2000]}
# configure clustering methods and numbers of runs
cl_cfg = {'clustering_method': 2}
# run the benchmark in one line of code
run_bench(data_cfg, fs_cfg, cl_cfg, modality='scrna', metrics=['ARI', 'NMI'])
```
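To make the configuration more concrete, the sketch below enumerates the combinations that dictionaries of this shape describe. It is an illustration only: `expand_runs` is a hypothetical helper, not part of the benchmark, and it assumes that every feature selection method is evaluated at each listed number of features and that each clustering method is repeated for its configured number of runs.

```python
from itertools import product


def expand_runs(data_cfg, fs_cfg, cl_cfg):
    """Enumerate (dataset, fs method, n features, clustering method, run) tuples.

    Hypothetical helper for illustration; not part of the benchmark package.
    """
    combos = []
    for data_name in data_cfg:
        for fs_method, n_features_list in fs_cfg.items():
            for cl_method, n_runs in cl_cfg.items():
                # every feature count is paired with every repeated run
                for n_features, run in product(n_features_list, range(n_runs)):
                    combos.append((data_name, fs_method, n_features, cl_method, run))
    return combos


data_cfg = {
    'your_data_name': {
        'adata_path': 'path/to/h5ad/file',
        'annot_key': 'annotation_name',
    }
}
fs_cfg = {'feature_selection_method': [1000, 2000]}
cl_cfg = {'clustering_method': 2}

for combo in expand_runs(data_cfg, fs_cfg, cl_cfg):
    print(combo)
```

Under these assumptions, the example configuration describes four clustering evaluations: two feature counts times two runs.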
The evaluation results are automatically saved as an XLSX file in the working directory, with a timestamped name like this:
2023-02 14_54_32 scrna.xlsx
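Because the file name changes with every run, it can be handy to pick up the newest result file programmatically. The following is a minimal sketch using only the standard library; `latest_result_file` is a hypothetical helper, and the only thing it assumes about the name is that it ends with the modality followed by `.xlsx`, as in the example above.

```python
from pathlib import Path


def latest_result_file(directory=".", modality="scrna"):
    """Return the most recently modified '*<modality>.xlsx' file, or None."""
    candidates = sorted(
        Path(directory).glob(f"*{modality}.xlsx"),
        key=lambda p: p.stat().st_mtime,  # oldest first, newest last
    )
    return candidates[-1] if candidates else None


result = latest_result_file()
if result is None:
    print("no result file found")
else:
    print(f"latest result file: {result}")
    # The workbook can then be opened with, for example,
    # pandas.read_excel(result, sheet_name=None), which needs openpyxl.
```

The layout of the sheets inside the workbook is not described here, so inspect the loaded data before relying on specific sheet or column names.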
Other software features are:
- Automatically save the results of each step (preprocessed data, selected features, and cluster labels)
- Reload the cached genes and cluster labels when you use the same data (specified by the data name)
- Support custom feature selection and cell clustering/domain detection methods
- Present detailed and pretty logging messages based on rich and loguru (see examples in tutorial)
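As a rough idea of what a custom feature selection method computes, the sketch below ranks genes by variance and keeps the top ones. It is plain Python with a hypothetical signature (`top_variance_genes` and its arguments are assumptions for illustration); how a custom method is actually plugged into the benchmark is covered in the tutorial, not here.

```python
from statistics import pvariance


def top_variance_genes(expression, gene_names, n_selected):
    """Return the names of the n_selected genes with the largest variance.

    expression: list of rows (cells), each a list of per-gene values.
    Hypothetical signature for illustration only.
    """
    columns = list(zip(*expression))  # one tuple of values per gene
    variances = [pvariance(col) for col in columns]
    ranked = sorted(range(len(gene_names)), key=lambda i: variances[i], reverse=True)
    return [gene_names[i] for i in ranked[:n_selected]]


cells = [
    [0.0, 5.0, 1.0],
    [0.0, 1.0, 1.5],
    [0.0, 9.0, 0.5],
]
print(top_variance_genes(cells, ["geneA", "geneB", "geneC"], n_selected=2))
# prints ['geneB', 'geneC']
```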

Feature selection methods for scRNA-seq data:

Name | Language | Reference |
---|---|---|
GeneClust | Python | paper |
vst | Python | paper |
mvp | Python | paper |
triku | Python | paper |
GiniClust3 | Python | paper |
SC3 | Python | paper |
scran | R | paper |
FEAST | R | paper |
M3Drop | R | paper |
scmap | R | paper |
deviance | R | paper |
sctransform | R | paper |

Cell clustering methods for scRNA-seq data:

Name | Language | Reference |
---|---|---|
SC3s | Python | paper |
Seurat | R | paper |
SHARP | R | paper |
TSCAN | R | paper |
CIDR | R | paper |

Feature selection methods for spatial transcriptomics data:

Name | Language | Reference |
---|---|---|
SpatialDE | Python | paper |
SPARK-X | R | paper |
Giotto | R | paper |

Domain detection methods for spatial transcriptomics data:

Name | Language | Reference |
---|---|---|
SpaGCN | Python | paper |
stLearn | Python | paper |
STAGATE | Python | paper |

This benchmark is written in Python and calls R functions through rpy2. To use the methods implemented in R, please install the corresponding R packages.
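Before running R-based methods, it may help to check which R packages are already available. The sketch below does this through the `Rscript` executable using only the Python standard library; `r_packages_installed` is a hypothetical helper, and it assumes that the `Rscript` found on `PATH` belongs to the same R installation that rpy2 uses, which is not guaranteed.

```python
import shutil
import subprocess


def r_packages_installed(packages):
    """Map each R package name to True/False; all False if Rscript is missing."""
    rscript = shutil.which("Rscript")
    status = {}
    for pkg in packages:
        if rscript is None:
            status[pkg] = False
            continue
        # requireNamespace() returns TRUE only if the package can be loaded
        expr = f'cat(requireNamespace("{pkg}", quietly = TRUE))'
        proc = subprocess.run([rscript, "-e", expr], capture_output=True, text=True)
        status[pkg] = proc.returncode == 0 and proc.stdout.strip().endswith("TRUE")
    return status


print(r_packages_installed(["scran", "M3Drop", "Seurat"]))
```

Any package reported as `False` needs to be installed in R before the corresponding method can be used.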

The Python dependencies are:

- anndata>=0.8.0
- numpy>=1.21.6
- setuptools>=59.5.0
- anndata2ri>=1.1
- sc3s>=0.1.1
- scanpy>=1.9.1
- loguru>=0.6.0
- rpy2>=3.5.6
- sklearn>=0.0.post2
- scikit-learn>=1.2.0
- SpaGCN>=1.2.5
- torch>=1.13.1
- stlearn>=0.4.11
- pandas>=1.5.2
- opencv-python>=4.6.0
- scipy>=1.9.3
- rich>=13.0.0
- triku>=2.1.4
- statsmodels>=0.13.5
- SpatialDE>=1.1.3
- STAGATE_pyG>=1.0.0
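
To see which of the packages listed above are already present in the current environment, a small check with `importlib.metadata` can be used. This is a sketch rather than a full requirement checker: `installed_versions` is a hypothetical helper that only reports the installed version of each distribution and does not compare it against the minimum versions in the list.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(distributions):
    """Map each distribution name to its installed version, or None if absent."""
    found = {}
    for name in distributions:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


checked = installed_versions(["anndata", "scanpy", "rpy2", "loguru", "rich"])
for name, ver in checked.items():
    print(f"{name}: {ver if ver is not None else 'not installed'}")
```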

To install the benchmark from source:

```shell
git clone https://github.com/ToryDeng/FeatureSelectionBenchmarks
cd FeatureSelectionBenchmarks/
python3 setup.py install --user
```

- Tutorial on running the benchmarks: tutorials/run_benchmarks.ipynb
- Tutorial on reading the records: tutorials/read_records.ipynb