Seurat 4 + seurat convert + mappings tools (#251)
* Seurat convert passing 4/5 tests - no data uploaded yet
* Passing AnnData reading test
* Ready for trying tests locally
* Passing all Seurat tests with planemo locally
* Profile and version history
* Almost all tests passing with Seurat 4
* All seurat 4 tests passing locally
* Seurat UMAP passing test and macro mapper
* Seurat AnnData Scanpy 1.8.2 test data retrieval and test hidden
* Point to nature methods paper (cherry picked from commit 1ea601d)
* Seurat integration and macro WIP
* fix loom and others multi-inputs
* Seurat integration test data, integration passing lints and tests, tags for umap
* Remove repeated options
* Seurat map query (all done last night)
* Select integration features passing planemo tests
* Integration passing tests after adding file option for int. features
* Seurat plot
* Plot linting passes, initial testing, fixes
* Working plots on UI with adequate namings
* Seurat plot passing planemo test locally
* DoHeatmap with tests
* Hover locator
* Sanitise potential injections
* Please linter warning
* History and linter pleasing
* Remove AnnData as input and make sure it is as output
* Documentation
* Fix version in macro
* AnnData is a valid input
* Seurat dimplot test expected size
* Seurat plot test sizes fixes
* Fix plotting labels for linting
* Fix input files for seurat_map
* Try with conditional nesting for linting
* Fix test data downloads
* Change map-query missing input
* Scale data multiple regress out vars
* Map query refdata param changes
* Fix scale data vars to regress
* Use EBI OC query link for classify query
* Size comparison for seurat map query
Showing 15 changed files with 1,986 additions and 29 deletions.
90 changes: 90 additions & 0 deletions
tools/tertiary-analysis/seurat/extra/macro_mapper_seurat.yaml
@@ -0,0 +1,90 @@
---
- option_group:
    - input-object-file
    - input-format
  pre_command_macros:
    - INPUT_OBJ_PREAMBLE
  post_command_macros:
    - INPUT_OBJECT
  input_declaration_macros:
    - input_object_params
- option_group:
    - output-object-file
    - output-format
  post_command_macros:
    - OUTPUT_OBJECT
  input_declaration_macros:
    - output_object_params
  output_declaration_macros:
    - output_files
- option_group:
    - input-object-files
    - input-format
  pre_command_macros:
    - INPUT_OBJS_PREAMBLE
  post_command_macros:
    - INPUT_OBJECTS
  input_declaration_macros:
    - input_object_params:
        multiple: true
- option_group:
    - reference-object-files
    - reference-format
  pre_command_macros:
    - REFERENCE_OBJS_PREAMBLE
  post_command_macros:
    - REFERENCE_OBJECTS
  input_declaration_macros:
    - input_object_params:
        varname: reference
        multiple: true
        optional: true
- option_group:
    - reference-object-file
    - reference-format
  pre_command_macros:
    - REFERENCE_OBJ_PREAMBLE
  post_command_macros:
    - REFERENCE_OBJECT
  input_declaration_macros:
    - input_object_params:
        varname: reference
- option_group:
    - anchors-object-file
    - anchors-format
  pre_command_macros:
    - ANCHORS_OBJ_PREAMBLE
  post_command_macros:
    - ANCHORS_OBJECT
  input_declaration_macros:
    - input_object_params:
        varname: anchors
- option_group:
    - query-object-file
    - query-format
  pre_command_macros:
    - QUERY_OBJ_PREAMBLE
  post_command_macros:
    - QUERY_OBJECT
  input_declaration_macros:
    - input_object_params:
        varname: query
- option_group:
    - plot-out
  post_command_macros:
    - OUTPUT_PLOT
  output_declaration_macros:
    - plot_output_files_format:
        format: png
    - plot_output_files_format:
        format: pdf
    - plot_output_files_format:
        format: eps
    - plot_output_files_format:
        format: jpg
    - plot_output_files_format:
        format: ps
    - plot_output_files_format:
        format: tiff
    - plot_output_files_format:
        format: svg
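
To make the shape of this mapping concrete, here is a minimal sketch in R of how such a file could be consumed. This is an assumption for illustration only: the tool that actually reads macro_mapper_seurat.yaml is not shown in this commit view. Two entries are hard-coded as nested lists (roughly what a YAML parser would return), and the hypothetical helper `macros_for_options()` collects the macros of one kind for every option group whose options all appear in a script's option list.

```r
# Two entries from the mapping, written as nested lists to keep the sketch
# self-contained (no YAML parser is needed to run it).
mapper <- list(
  list(
    option_group = c("input-object-file", "input-format"),
    pre_command_macros = "INPUT_OBJ_PREAMBLE",
    post_command_macros = "INPUT_OBJECT",
    input_declaration_macros = "input_object_params"
  ),
  list(
    option_group = c("output-object-file", "output-format"),
    post_command_macros = "OUTPUT_OBJECT",
    input_declaration_macros = "output_object_params",
    output_declaration_macros = "output_files"
  )
)

# Hypothetical helper (not part of the commit): return the macros of the
# requested kind for every option group fully covered by script_options.
macros_for_options <- function(mapper, script_options, kind) {
  matched <- Filter(
    function(entry) all(entry$option_group %in% script_options),
    mapper
  )
  # Entries that lack the requested kind contribute nothing
  unlist(lapply(matched, function(entry) entry[[kind]]))
}

# Options exposed by a script such as seurat-scale-data.R (subset)
scale_data_options <- c("input-object-file", "input-format",
                        "output-format", "output-object-file", "genes-use")

print(macros_for_options(mapper, scale_data_options, "post_command_macros"))
print(macros_for_options(mapper, scale_data_options, "pre_command_macros"))
```

Requiring every option of a group to be present is a design choice of this sketch; it keeps the single-file and multi-file input groups apart, since they share `input-format` but differ in `input-object-file` versus `input-object-files`.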
165 changes: 165 additions & 0 deletions
tools/tertiary-analysis/seurat/scripts/seurat-scale-data.R
@@ -0,0 +1,165 @@
#!/usr/bin/env Rscript

# Load optparse we need to check inputs

suppressPackageStartupMessages(require(optparse))

# Load common functions

suppressPackageStartupMessages(require(workflowscriptscommon))

# parse options

option_list = list(
  make_option(
    c("-i", "--input-object-file"),
    action = "store",
    default = NA,
    type = 'character',
    help = "File name in which a serialized R matrix object may be found."
  ),
  make_option(
    c("--input-format"),
    action = "store",
    default = "seurat",
    type = 'character',
    help = "Either loom, seurat, anndata or singlecellexperiment for the input format to read."
  ),
  make_option(
    c("--output-format"),
    action = "store",
    default = "seurat",
    type = 'character',
    help = "Either loom, seurat, anndata or singlecellexperiment for the output format."
  ),
  make_option(
    c("-e", "--genes-use"),
    action = "store",
    default = NULL,
    type = 'character',
    help = "File with gene names to scale/center (one gene per line). Default is all genes in object@data."
  ),
  make_option(
    c("-v", "--vars-to-regress"),
    action = "store",
    default = NULL,
    type = 'character',
    help = "Comma-separated list of variables to regress out (previously latent.vars in RegressOut). For example, nUMI, or percent.mito."
  ),
  make_option(
    c("-m", "--model-use"),
    action = "store",
    default = 'linear',
    type = 'character',
    help = "Use a linear model or generalized linear model (poisson, negative binomial) for the regression. Options are 'linear' (default), 'poisson', and 'negbinom'."
  ),
  make_option(
    c("-u", "--use-umi"),
    action = "store",
    default = FALSE,
    type = 'logical',
    help = "Regress on UMI count data. Default is FALSE for linear modeling, but automatically set to TRUE if model.use is 'negbinom' or 'poisson'."
  ),
  make_option(
    c("-s", "--do-not-scale"),
    action = "store_true",
    default = FALSE,
    type = 'logical',
    help = "Skip the data scale."
  ),
  make_option(
    c("-c", "--do-not-center"),
    action = "store_true",
    default = FALSE,
    type = 'logical',
    help = "Skip data centering."
  ),
  make_option(
    c("-x", "--scale-max"),
    action = "store",
    default = 10,
    type = 'double',
    help = "Max value to return for scaled data. The default is 10. Setting this can help reduce the effects of genes that are only expressed in a very small number of cells. If regressing out latent variables and using a non-linear model, the default is 50."
  ),
  make_option(
    c("-b", "--block-size"),
    action = "store",
    default = 1000,
    type = 'integer',
    help = "Default size for number of genes to scale at in a single computation. Increasing block.size may speed up calculations but at an additional memory cost."
  ),
  make_option(
    c("-d", "--min-cells-to-block"),
    action = "store",
    default = 1000,
    type = 'integer',
    help = "If object contains fewer than this number of cells, don't block for scaling calculations."
  ),
  make_option(
    c("-n", "--check-for-norm"),
    action = "store",
    default = TRUE,
    type = 'logical',
    help = "Check to see if data has been normalized, if not, output a warning (TRUE by default)."
  ),
  make_option(
    c("-o", "--output-object-file"),
    action = "store",
    default = NA,
    type = 'character',
    help = "File name in which to store serialized R object of type 'Seurat'."
  )
)

opt <- wsc_parse_args(option_list, mandatory = c('input_object_file', 'output_object_file'))

# Check parameter values

if (!file.exists(opt$input_object_file)) {
  stop(paste('File', opt$input_object_file, 'does not exist'))
}

if (!is.null(opt$genes_use)) {
  if (!file.exists(opt$genes_use)) {
    stop(paste('Supplied genes file', opt$genes_use, 'does not exist'))
  } else {
    genes_use <- readLines(opt$genes_use)
  }
} else {
  genes_use <- NULL
}

# Break up opt$vars_to_regress into a character vector if it has commas.
# Leave it as NULL when the option was not supplied: strsplit() fails on a
# non-character argument.
if (!is.null(opt$vars_to_regress)) {
  opt$vars_to_regress <- unlist(strsplit(opt$vars_to_regress, ","))
}

# Now we're happy with the arguments, load Seurat and do the work

suppressPackageStartupMessages(require(Seurat))
# Check each optional dependency independently, so that combining loom with
# singlecellexperiment (input and output) loads both packages
if (opt$input_format == "loom" || opt$output_format == "loom") {
  suppressPackageStartupMessages(require(SeuratDisk))
}
if (opt$input_format == "singlecellexperiment" || opt$output_format == "singlecellexperiment") {
  suppressPackageStartupMessages(require(scater))
}

# Input from serialized R object

seurat_object <- read_seurat4_object(input_path = opt$input_object_file, format = opt$input_format)
# https://stackoverflow.com/questions/9129673/passing-list-of-named-parameters-to-function
# might be useful
scaled_seurat_object <- ScaleData(seurat_object,
                                  features = genes_use,
                                  vars.to.regress = opt$vars_to_regress,
                                  model.use = opt$model_use,
                                  use.umi = opt$use_umi,
                                  do.scale = !opt$do_not_scale,
                                  do.center = !opt$do_not_center,
                                  scale.max = opt$scale_max,
                                  block.size = opt$block_size,
                                  min.cells.to.block = opt$min_cells_to_block,
                                  verbose = FALSE)


# Output to a serialized R object
write_seurat4_object(seurat_object = scaled_seurat_object,
                     output_path = opt$output_object_file,
                     format = opt$output_format)
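
Two of the argument-handling steps in this script are easy to get subtly wrong, so here is a small base-R sketch of them in isolation. The helper names `split_csv_option()` and `packages_for_formats()` are hypothetical; they are not part of the script or of workflowscriptscommon. The first returns NULL when the option was not supplied, because `strsplit()` raises an error on a non-character argument such as NULL. The second lists the optional packages for a pair of formats, checking loom and singlecellexperiment independently of each other.

```r
# Hypothetical helper: split a comma-separated option value into a character
# vector. Returns NULL when the option was not supplied on the command line.
split_csv_option <- function(value) {
  if (is.null(value)) {
    return(NULL)
  }
  # trimws() drops spaces around each name; this is an addition of the sketch,
  # useful for values written as "nUMI, percent.mito"
  trimws(unlist(strsplit(value, ",")))
}

# Hypothetical helper: names of the optional packages needed for a given
# combination of input and output formats.
packages_for_formats <- function(input_format, output_format) {
  formats <- c(input_format, output_format)
  packages <- character(0)
  if ("loom" %in% formats) {
    packages <- c(packages, "SeuratDisk")
  }
  if ("singlecellexperiment" %in% formats) {
    packages <- c(packages, "scater")
  }
  packages
}

print(split_csv_option("nUMI, percent.mito"))
print(split_csv_option(NULL))
print(packages_for_formats("loom", "singlecellexperiment"))
```

Passing NULL through unchanged matters because `ScaleData()` treats `vars.to.regress = NULL` as "regress nothing", so the caller can forward the helper's result without a further check.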