Syntax of covrowinds and geneticrowinds #34

came203 · 2023-01-19T10:53:14Z

As a follow-up to my last issue, it's quite cumbersome to construct VCF or BED files for each phenotype separately, so the best way to go on would be probably to use the covrowinds and geneticrowinds argument to select only the samples used in the current analysis. As I'm by no means an expert in Julia, how would one go on to implement this, similar to the snpinds argument which works very nicely. Is there any easy way to construct the covrowinds and geneticrowinds using IDs of participants? Or does one have to construct the indices manually? What is the syntax of the indices?`
Any pointers on this would be highly appreciated

kose-y · 2023-01-20T01:01:02Z

My go-to method would be using dictionaries. Julia has an efficient built-in data structure for the dictionaries, just like in Python.

Here is an example using the bgen format (edited from https://github.com/kose-y/TrajGWAS-reproducibility/blob/main/ukbiobank/4_scoretest_script/scoretest.jl):

using BGEN

genetic_iids_subsample = # a list of sample names here
data = Bgen(bgenfilename * ".bgen"; sample_path = samplefilename)
genetic_iids = samples(data) # assuming sample names are `String`s directly mapped to individual IDs. 

order_dict = Dict{String, Int}() # creating an empty dictionary with key type String, value type Int.
for (i, iid) in enumerate(genetic_iids)
    order_dict[iid] = i
end

sample_indicator = falses(length(genetic_iids))
for v in genetic_iids_subsample
    sample_indicator[order_dict[v]] = true
end

# then use sample_indicator as `geneticrowinds`

You can do the similar with the SNP's RefSeq IDs when constructing snpinds.

It would be a nice feature to have in our package, but it is tricky since it depends on the files you have, as you can see in the example that I linked.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Syntax of covrowinds and geneticrowinds #34

Syntax of covrowinds and geneticrowinds #34

came203 commented Jan 19, 2023

kose-y commented Jan 20, 2023

Syntax of covrowinds and geneticrowinds #34

Syntax of covrowinds and geneticrowinds #34

Comments

came203 commented Jan 19, 2023

kose-y commented Jan 20, 2023