Execution error on FASP GWAS submission #110

Open
ianfore opened this issue Aug 21, 2020 · 3 comments

@ianfore
Collaborator

ianfore commented Aug 21, 2020

This was my first attempt to run the GWAS workflow deployed on the DNAStack WES server, using the details posted in the first commit here.

Plan A
Can the VCF file be returned from DRS? Perhaps it's in the BioDataCatalyst Thousand Genomes dataset.

Yes, it is: the file id is dg.4503/dbd55e76-1100-40b3-b420-0eaeee478fbc
The DRS GetObject url for it is
https://gen3.biodatacatalyst.nhlbi.nih.gov/ga4gh/drs/v1/objects/dg.4503/dbd55e76-1100-40b3-b420-0eaeee478fbc

The following gets us a signed URL when called with an access token in the header
https://cgc-ga4gh-api.sbgenomics.com/ga4gh/drs/v1/objects/dg.4503/dbd55e76-1100-40b3-b420-0eaeee478fbc/access/gs
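
As a rough illustration of the call above, here is a minimal Python sketch that builds the DRS access URL and requests a signed URL. It assumes the token is sent as an `Authorization: Bearer` header and that the response follows the GA4GH DRS AccessURL shape (a JSON object with a `url` field). The function names are made up for this example; this is not the FASPclient code.

```python
import json
import urllib.request


def drs_access_url(base_url, object_id, access_id):
    """Build the DRS 'get access URL' endpoint for one object and access method."""
    return f"{base_url.rstrip('/')}/ga4gh/drs/v1/objects/{object_id}/access/{access_id}"


def build_request(url, access_token):
    """Attach the access token; the Bearer scheme is an assumption."""
    return urllib.request.Request(url, headers={"Authorization": f"Bearer {access_token}"})


def fetch_signed_url(base_url, object_id, access_id, access_token):
    """Call the endpoint and return the signed URL from the 'url' field."""
    request = build_request(drs_access_url(base_url, object_id, access_id), access_token)
    with urllib.request.urlopen(request) as response:
        return json.load(response)["url"]
```

Calling `fetch_signed_url("https://cgc-ga4gh-api.sbgenomics.com", "dg.4503/dbd55e76-1100-40b3-b420-0eaeee478fbc", "gs", token)` would hit the URL shown above, given a valid token.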

I was then able to submit a job to DNAStack WES passing that signed URL. Code is here
The run_id was 56d45bee-0022-4a00-ac4d-88e4bb274978
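
For readers unfamiliar with WES, a submission of this kind could look roughly like the sketch below. It is not the linked code. It assumes the standard GA4GH WES `POST /ga4gh/wes/v1/runs` multipart form, a WDL workflow, a Bearer token, and a hypothetical input name `gwas.vcf`; the real input names come from the workflow itself. Posting uses the third-party `requests` library.

```python
import json


def build_wes_run_form(workflow_url, signed_vcf_url, vcf_input_name="gwas.vcf"):
    """Assemble the form fields for a WES POST /runs request.

    vcf_input_name is a hypothetical workflow input name, for illustration only.
    """
    return {
        "workflow_url": workflow_url,
        "workflow_type": "WDL",          # assumption: the GWAS workflow is WDL
        "workflow_type_version": "1.0",  # assumption
        "workflow_params": json.dumps({vcf_input_name: signed_vcf_url}),
    }


def submit_run(wes_base_url, form, access_token):
    """POST the form as multipart/form-data and return the new run_id."""
    import requests  # third-party; imported lazily so the builder works without it

    response = requests.post(
        f"{wes_base_url.rstrip('/')}/ga4gh/wes/v1/runs",
        files={name: (None, value) for name, value in form.items()},
        headers={"Authorization": f"Bearer {access_token}"},
    )
    response.raise_for_status()
    return response.json()["run_id"]
```

Note that the signed URL travels inside the JSON-encoded `workflow_params`, so its query-string characters reach the execution engine unchanged.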

This resulted in an Execution error. See attachment.
DNAStackGWASRun1_result_json.txt

Perhaps the run logs at DNAStack will give a clue as to the problem.
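
One way to look for that clue from the client side is to read the run log returned by `GET /ga4gh/wes/v1/runs/{run_id}`. The sketch below assumes the response follows the GA4GH WES RunLog shape (`state`, plus `task_logs` entries with `name`, `exit_code` and `stderr`); whether the DNAStack server populates those fields has not been checked here, and the function name is made up.

```python
def summarize_failed_run(run_log):
    """Return (state, [(task name, exit code, stderr)]) for tasks that did not exit 0."""
    failed = [
        (task.get("name"), task.get("exit_code"), task.get("stderr"))
        for task in run_log.get("task_logs") or []
        if task.get("exit_code") not in (0, None)
    ]
    return run_log.get("state"), failed
```

In the WES schema, `stderr` is a URL from which the standard error log can be retrieved, so the tuples point at where to look next rather than containing the log text itself.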

ianfore changed the title from "Execution error on GWAS submission" to "Execution error on FASP GWAS submission" on Aug 22, 2020
@ianfore
Collaborator Author

ianfore commented Aug 22, 2020

I also tested the code for submitting the GWAS to the DNAStack WES by running DNAStackWESClient (https://github.com/ianfore/FASPclient/blob/master/DNAStackWESClient.py) directly.

Rather than building the run submission from a URL supplied by DRS, in this case the runGWASWorkflowTest() method submits the run using the files from https://github.com/DNAstack/plenary-resources-2020
The run_id was 2337ac75-212d-4f83-b745-10f940895ec9
This run also gave EXECUTOR_ERROR.

@ianfore
Collaborator Author

ianfore commented Aug 31, 2020

Patrick made some changes to the workflow. I was then able to get the workflow to run using the gs: URIs in inputs.gwas.json from https://github.com/DNAstack/plenary-resources-2020.

I am still unable to get the workflow to run with a signed URL returned by DRS.
I ran the MD5 workflow on the GWAS VCF and it ran to completion. This suggests that whether signed URLs can be handled depends on the workflow, or on the specific program being called.

I also encountered a problem with signed URLs on the Seven Bridges platform; see #109. The fix there was to build a new Docker container with a version of samtools that accepts a URL.

Patrick has opened an issue with the Cromwell team.

@patmagee

@ianfore In general, the issue seems to be that a lot of the signed URLs use illegal characters in their path, which most execution engines do not handle properly. For Cromwell, the team has mentioned that this is likely just an improper quoting issue when files are being localized to a VM, so it should be an easy fix.
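
To illustrate the kind of quoting problem described above (not the actual Cromwell code or fix, which is not shown in this thread), here is a small Python sketch. The signed URL and the `localization_command` helper are made up; the point is only that characters such as `?` and `&` need quoting before a URL is placed in a shell command.

```python
import shlex

# Made-up signed URL; real ones carry similar query-string characters.
signed_url = "https://storage.example.org/bucket/sample.vcf?X-Signature=abc%2Fdef&X-Expires=3600"


def localization_command(url, destination):
    """Build a download command with both arguments quoted for a POSIX shell."""
    return f"curl -sS -o {shlex.quote(destination)} {shlex.quote(url)}"


# Unquoted: a POSIX shell would treat '&' as a control operator and cut the URL short.
unquoted = f"curl -sS -o input.vcf {signed_url}"

# Quoted: the URL is wrapped in single quotes and reaches curl as one argument.
quoted = localization_command(signed_url, "input.vcf")
```

Splitting `quoted` with `shlex.split` yields the original URL as the final argument, which is what a localization step needs.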
