Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactor bwa (resolves #320, resolves #297, resolves #322) #324

Merged
merged 1 commit into from
Jul 11, 2016

Conversation

jvivian
Copy link
Collaborator

@jvivian jvivian commented Jun 27, 2016

No description provided.

@jvivian jvivian added this to the 2.1 Release milestone Jun 27, 2016
@fnothaft
Copy link
Contributor

I will give this a review...

@jvivian
Copy link
Collaborator Author

jvivian commented Jun 28, 2016

@fnothaft I can ping you once I remove the needs work label and tests pass.

@fnothaft
Copy link
Contributor

@fnothaft I can ping you once I remove the needs work label and tests pass.

That would be superb!

@jvivian jvivian force-pushed the issues/297-refactor-bwa branch 2 times, most recently from d75e51c to b812b86 Compare June 28, 2016 04:58
@jvivian
Copy link
Collaborator Author

jvivian commented Jun 28, 2016

@fnothaft — I spread out the history in an attempt to make it more readable, but its not perfect, so if something looks odd check the finished product.

@fnothaft
Copy link
Contributor

Thanks @jvivian; I'll make a pass shortly.

@jvivian
Copy link
Collaborator Author

jvivian commented Jun 28, 2016

Thanks! — I'm happy to squash commits or make changes to make the review easier.

@fnothaft
Copy link
Contributor

fnothaft commented Jun 28, 2016

@jvivian that would be great actually. I think it might make sense to squash down:

Thoughts?

@jvivian
Copy link
Collaborator Author

jvivian commented Jun 28, 2016

Thanks @fnothaft I'll ping you when I've squashed and rebased.

@jvivian jvivian force-pushed the issues/297-refactor-bwa branch from b812b86 to c14ac3f Compare June 28, 2016 20:00
@jvivian
Copy link
Collaborator Author

jvivian commented Jun 28, 2016

@fnothaft —Almost as asked. I had a demon commit (53f) that gave me merge conflicts with missing code anytime I moved commits around it or tried to squash it.

@fnothaft
Copy link
Contributor

Demon commit; nice! I'll take a look.

inputs,
[uuid,
's3://{s3_bucket}/{sequence_dir}/{uuid}_1.fastq.gz'.format(**args),
's3://{s3_bucket}/{sequence_dir}/{uuid}_2.fastq.gz'.format(**args)],
's3://{s3_bucket}/alignment{dir_suffix}'.format(**args),
rg_line).encapsulate()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What happens to rg_line now?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Er. Nevermind. I see that rg_line is now being passed through inputs.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why isn't rg_line just in inputs to start with? If we're refactoring this, mightn't we factor out passing rg_line as a parameter?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why isn't rg_line just in inputs to start with?

Looking at the code, I'm not sure if it's even defined anywhere in align_and_call.

rg_line gets set from uuid_items[1], which is taken from uuid_rg from uuid_list, but uuid_list is parsed from the manifest which has no entry for read group... (confused yet?).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ace!

# If all inputs provided, download from S3
download_ref = job.wrapJobFn(download_url_job, inputs.ref, disk='3G')
faidx = job.wrapJobFn(run_samtools_faidx, download_ref.rv())
# If all the bwa index files are provided
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Semantics, but I would note here that the FAI is considered separate from the BWA indices, so we'll check for that later.

@fnothaft
Copy link
Contributor

Just made another pass @jvivian! Let me know your thoughts!

@jvivian
Copy link
Collaborator Author

jvivian commented Jun 30, 2016

Great feedback @fnothaft , thanks! I'll ping you once tests pass.

I also pinged Hannes about the toil-scripts google group.

@@ -2,8 +2,9 @@
### Guide: Running the BWA Pipeline using Toil

This guide attempts to walk the user through running this pipeline from start to finish. If there are any questions
please contact John Vivian ([email protected]). If you find any errors or corrections please feel free to make a
pull request. Feedback of any kind is appreciated.
please contact the Toil team at: [email protected]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Move to project level readme, no? We should have it there, and not duplicate it across all pipeline readmes.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oops, I'll do that now.

@fnothaft
Copy link
Contributor

Dropped two comments; otherwise branch is good to squash down and rebase from my side.

@jvivian jvivian force-pushed the issues/297-refactor-bwa branch 2 times, most recently from c783e8c to 48ffd14 Compare July 6, 2016 19:19
@jvivian
Copy link
Collaborator Author

jvivian commented Jul 6, 2016

@hannes-ucsc — This has been reviewed, squashed and rebased, and is ready to merge.

@hannes-ucsc
Copy link
Contributor

Hmm, don't like how the commits are organized. I should be able to revert a commit and that should undo exactly one change. I bet you that 48ffd14 touches on all of the above commits.

In that sense, I'd rather have just one giant commit. But now that I have to share my Merge button superpowers, diluted beyond recognition to a mere plus or minus one, I don't want to hold this up. Your thoughts, @fnothaft?

@jvivian
Copy link
Collaborator Author

jvivian commented Jul 7, 2016

I squashed the commits because I had to rebase and didn't want to
potentially deal with 33 commits worth of conflicts. I guess I should've
pinged you for review before that, but frank had already looked at them at
that point so I thought I was clear to, sorry about that.

On Wed, Jul 6, 2016, 7:00 PM Hannes Schmidt [email protected]
wrote:

Hmm, don't like how the commits are organized. I should be able to revert
a commit and that should undo exactly one change. I bet you that 48ffd14
48ffd14
touches on all of the above commits.

In that sense, I'd rather have just one giant commit. But now that I have
to share my Merge button superpowers, diluted beyond recognition to a mere
plus or minus one, I don't want to hold this up. Your thoughts, @fnothaft
https://github.com/fnothaft?


You are receiving this because you were mentioned.

Reply to this email directly, view it on GitHub
#324 (comment),
or mute the thread
https://github.com/notifications/unsubscribe/AHOawI5fUpEcRLLxWFsXQtXH-VfQFb8bks5qTF22gaJpZM4I_kue
.

@fnothaft
Copy link
Contributor

fnothaft commented Jul 7, 2016

I should be able to revert a commit and that should undo exactly one change.

In that sense, I'd rather have just one giant commit. But now that I have to share my Merge button superpowers, diluted beyond recognition to a mere plus or minus one, I don't want to hold this up. Your thoughts, @fnothaft?

I strongly +1 the 1 commit = 1 change approach. That's how we run all of the bdgenomics repositories, and I'd love to have this be an official policy over in BD2KGenomics as well.

@hannes-ucsc
Copy link
Contributor

@jvivian, no need to apologize. Organizing changes into commits is an art, especially when it comes to large refactorings. Given what @fnothaft said, I think squashing all commits down to one would be best. I do like your detailed commit messages, so be sure to merge them all together so as to not lose that information.

@jvivian jvivian force-pushed the issues/297-refactor-bwa branch from 48ffd14 to 7b316cc Compare July 7, 2016 21:11
jvivian added a commit that referenced this pull request Jul 9, 2016
Remove email as point of contact. #324 will add the Toil google group as point of contact in parent README.
jvivian added a commit that referenced this pull request Jul 9, 2016
Remove email as point of contact. #324 will add the Toil google group as point of contact in parent README.
@jvivian jvivian removed the needs work label Jul 9, 2016
Add missing uuid to config
Fix bad unpacking
Add sample URLs to logger
Add suffix option to test
Remove unused import
Move google group line to parent README
Add whitespace before map_job
Add comment explaining map_job
Fix missing trim attribute in test
Clean up logic for reference indices
Change point of contact to google group
Improve legibility with comments on their own lines
Unindent FollowOn job
Add logging for index creation
Add inline explaining alt file can't be generated
Clarify that FAI is created separately from BWA indices
Add docstring to move_file_job
Add inline comment to explain disk space
Reverse use of csv
Replace run_bwakit arguments with config
Fix imports and spacing in bwa_alignment
Add /data/ prefix
Add missing import
Move run_bwa_index and run_samtools_faidx to indexing.py
Remove unnecessary renaming
Colocate parameter logic
Add inline comments for conditionals and rg
Provide clarity for rg_line
Move required_length to common lib
Remove bad print statement
Add comments explaining conditional
Clarify samples type
Clarify file_size option
Remove attempt at humor (+5 squashed commits)
Squashed commits:
Move bwa_kit to tool library (resolves #297)
Add bwa_index and samtools_faidx to tool library
Indent Job.Runner, add sanity checks
options -> args (convention)
Print docstring help if no arguments provided
Remove config options
Change config formatting
Add tab-friendly config and manifest names
Replace old bwa_kit job with download_sample_and_align
Clean imports
Move top docstring to main()
Fix test call
Fix adam_gatk_pipeline's call to download_reference_files
Add job version of `move_files`
Update README.md
Remove deprecated launch scripts
Require only reference (resolves #320)
Change download_shared_files -> download_reference_files
Remove old reference requirements
Add single end support for BWA (resolves #322)
Edit manifest docs to include single-end
Add nargs range for single-end and paired-end
Replace parse_config job with parse_manifest function
@jvivian jvivian force-pushed the issues/297-refactor-bwa branch from 7b316cc to eadc6b5 Compare July 9, 2016 22:04
@jvivian
Copy link
Collaborator Author

jvivian commented Jul 11, 2016

@hannes-ucsc — Ready for review/merge

@hannes-ucsc hannes-ucsc merged commit d4bc20f into master Jul 11, 2016
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants