[cudamapper] Accuracy improvements through chaining #565

Open: wants to merge 72 commits into base: dev

72 commits
0e525a9
[cudamapper] Add an overlapper_anchmer class which implements a basic…
edawson Aug 11, 2020
5726a8a
[cudamapper] Basic anchmer chaining implementation.
edawson Aug 11, 2020
3da7957
[cudamapper] Implements overlap identification with anchmers.
edawson Aug 12, 2020
934fcfb
[cudamapper] Implement overlap filtering and start RLE-encoding of ov…
edawson Aug 14, 2020
99e6bee
[cudamapper] Implements overlap merging on GPU.
edawson Aug 14, 2020
6945786
[cudamapper] Implement final overlap filtering on GPU.
edawson Aug 14, 2020
9c4fe45
[cudamapper] Disable CPU fusion by placing an early return in postpro…
edawson Aug 14, 2020
57769e5
[cudamapper] Parameter tweaking to try to generate longer overlaps
edawson Aug 14, 2020
6087dab
[cudamapper] Implement mask for self-self mappings and reduce anchor …
edawson Aug 14, 2020
b787915
[cudamapper] Disable initial filtering stage in overlapper_anchmer.
edawson Aug 17, 2020
41c5ebb
[cudamapper] Disable on-CPU fusing and reimplement anchmer chaining
edawson Aug 18, 2020
863e144
[cudamapper] Turns debugging on for overlapper_anchmer and produces a
edawson Aug 19, 2020
0c689a0
[cudamapper] Alternative implementation of anchmers, with scoring.
edawson Aug 21, 2020
4e1adf3
[cudamapper] Finish basic anchmer implementation.
edawson Aug 21, 2020
b0bde26
[cudamapper] Change the definition of identical anchors.
edawson Aug 24, 2020
b373daf
[cudamapper] Fix bug in anchmer chaining and turn off repeat masking.
edawson Aug 25, 2020
b13fe6f
[cudamapper] Implement chaining algorithm based on anchmers
edawson Aug 28, 2020
8733d4b
[cudamapper] Remove debugging output from overlapper_anchmer.cu.
edawson Aug 28, 2020
8c6f465
[cudamapper] Remove all debugging in overlapper_anchmer.
edawson Aug 29, 2020
cd8bf64
[cudamapper] Add an approximate fast gap scoring function to overlapp…
edawson Sep 1, 2020
ea9dc79
[cudamapper] Implement scoring algorithm from minimap2.
edawson Sep 1, 2020
e1fd5ed
[cudamapper] Clean up overlapper_anchmer and add a simple primary cha…
edawson Sep 3, 2020
5b4cb5d
[cudamapper] Changes the termination condition for predecessor findin…
edawson Sep 3, 2020
ec1c2ab
[cudamapper] Change the definition of minimum length in overlapper_an…
edawson Sep 8, 2020
b0122d3
[cudamapper] Begin refactor of minimap2-inspired chaining to overlapp…
edawson Sep 9, 2020
d7b0d3c
[cudamapper] Implement filtering of self mappings and relevant comman…
edawson Sep 9, 2020
ec8350e
[cudamapper] Implement overlapper_minimap.
edawson Sep 11, 2020
04000d3
[cudamapper] Implement filters for overlaps that are reciprocal match…
edawson Sep 11, 2020
f703710
[cudamapper] Clean up overlapper_minimap.cu and reduce the number of
edawson Sep 11, 2020
7a649b8
[cudamapper] Remove status report message in overlapper_minimapper.cu.
edawson Sep 14, 2020
2e6e4b3
[cudamapper] Try to fix strand issues by comparing first/last anchors…
edawson Sep 14, 2020
a6bf5a4
Merge branch 'dev-v0.6.0' into anchmer-fast-score
edawson Sep 14, 2020
3833a7c
Merge branch 'dev-v0.6.0' of https://github.com/clara-genomics/ClaraG…
edawson Sep 14, 2020
1f48924
[cudamapper] Fix a bug where sequences that fully overlapped would be…
edawson Sep 16, 2020
460c579
[cudamapper] Refactor word_size to be an argument to scoring function
edawson Sep 16, 2020
d26f289
[cudamapper] Fix bugs in overlapper_minimap.
edawson Sep 17, 2020
6e206ea
[cudamapper] Add a basic test for overlapper_minimap's scoring function.
edawson Sep 17, 2020
6003979
[cudamapper] Turn off CPU-based fusion of short overlaps.
edawson Sep 17, 2020
7b63014
[cudampper] Add a definition for the number of predecessor search ite…
edawson Sep 17, 2020
99a0e88
[cudamapper] Add (broken) tiled implementation of chaining.
edawson Sep 21, 2020
991e7c2
[cudamapper] Implements a working but oversubscribed version of tiled
edawson Sep 23, 2020
ffb83ac
[overlapper_minimap] Fix global indexes using chain_anchors_in_block.
edawson Sep 25, 2020
c497f74
Add chainer_utils.cuh
edawson Sep 25, 2020
39487ec
[cudamapper] Add a function for finding query read ID runs in chainer…
edawson Sep 28, 2020
1fd0723
[cudamapper] Add call to encode_anchor_query_locations in overlapper_…
edawson Sep 28, 2020
8d882b0
[chainer_utils] Add functions for calculating chain starts per read. …
edawson Sep 29, 2020
21e83ea
[cudamapper] Reconfigure overlapper_minimap to launch one block per r…
edawson Sep 29, 2020
4f2f1e3
[cudamapper] Fix overlapper_minimap's score computation.
edawson Sep 29, 2020
1924059
[overlapper_minimap] Fix select_mask indexing for masking non-max cha…
edawson Sep 29, 2020
4f896c3
Merge pull request #1 from edawson/qtpairs-chainer
edawson Sep 29, 2020
7d7bad8
Merge branch 'dev-v0.6.0' into anchmer-fast-score
edawson Oct 8, 2020
ee01ab0
squash changes
nvvishanthi Oct 9, 2020
18119aa
small fix: bitwise op -> boolean
nvvishanthi Oct 10, 2020
c2dc3e1
Merge pull request #3 from nvvishanthi/nvvishanthi/chaining-scoring-f…
edawson Oct 10, 2020
1dffb46
[cudamapper] Remove OverlapperMinimap test file, refactor to use back…
edawson Oct 13, 2020
9413bf8
Merge pull request #5 from edawson/chainer-utils-backtrace
edawson Oct 13, 2020
40f5550
1. precision/recall improvements 2. removed extra syncthreads 3. othe…
nvvishanthi Oct 13, 2020
1d8e6e0
1. fix case where anchors in last tile are not tile-aligned 2. kind o…
nvvishanthi Oct 13, 2020
333ef13
1. fix backtrace issue where potentially some anchors did not go thro…
nvvishanthi Oct 16, 2020
6279bd6
Merge pull request #6 from nvvishanthi/nvvishanthi/chaining-precision…
edawson Oct 20, 2020
7dee298
[cudamapper] Implement a single-read-per-block style chaining impleme…
edawson Oct 21, 2020
28e6e92
[cudamapper] Reenable writing the score, predecessor, and anchor at t…
edawson Oct 21, 2020
6b97d54
[cudamapper] Toggle build options for pygenomeworks. Make offset tile…
edawson Oct 21, 2020
ae6d597
[cudamapper] Comment out all filtering for testing.
edawson Oct 22, 2020
6f5f711
[cudamapper] Attempt to fix index errors related to chain starts.
edawson Oct 23, 2020
fb8fb1b
[chainerutils] Fix multiple bugs in chainer_utils.cu introduced by no…
edawson Oct 26, 2020
be04ef6
[overlapper_minimap] Modify overlapper minimap to only chain tiles in…
edawson Oct 26, 2020
d14c835
[overlapper] Move reciprocal-overlap testing and duplicate removal to…
edawson Oct 26, 2020
8f4168b
[chainerutils] Fix wrong index use in backtrace_anchors_to_overlaps.
edawson Oct 26, 2020
4e686d4
[overlapper_minimap] Set default for reciprocal overlap check to 50 o…
edawson Oct 26, 2020
b58f705
Merge pull request #9 from edawson/tiler-chainer
edawson Oct 27, 2020
b5a7c70
[chainer_utils] Revert attempted changes to backtrace anchors, and de…
edawson Oct 29, 2020
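
Several commits above refer to scoring "from minimap2", a bounded predecessor search, and a backtrace. The block below is a minimal CPU sketch of that style of chaining, using the gap cost published in the minimap2 paper. It is not the code from this PR: SketchAnchor, SketchChains, sketch_gap_cost, sketch_chain_anchors, and the max_lookback parameter are hypothetical names chosen for illustration, and the GPU tiling described in the commits is left out.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstdint>
#include <vector>

// Hypothetical, simplified anchor: one seed match between a query and a target read.
struct SketchAnchor
{
    std::int32_t query_pos;
    std::int32_t target_pos;
};

struct SketchChains
{
    std::vector<double> score;             // best chain score ending at each anchor
    std::vector<std::int32_t> predecessor; // previous anchor in that chain, -1 if none
};

// Gap cost in the style of the minimap2 paper:
// 0.01 * word_size * |gap| + 0.5 * log2(|gap|), and 0 when the gap is 0.
double sketch_gap_cost(std::int32_t gap, std::int32_t word_size)
{
    const double g = std::abs(static_cast<double>(gap));
    return g == 0.0 ? 0.0 : 0.01 * word_size * g + 0.5 * std::log2(g);
}

// Anchors are assumed to be sorted by target position, then by query position.
// max_lookback (an assumption of this sketch) bounds how many earlier anchors are tried.
SketchChains sketch_chain_anchors(const std::vector<SketchAnchor>& anchors,
                                  std::int32_t word_size,
                                  std::int32_t max_lookback)
{
    assert(word_size > 0 && max_lookback > 0);
    const std::int32_t n = static_cast<std::int32_t>(anchors.size());
    SketchChains chains;
    chains.score.assign(anchors.size(), static_cast<double>(word_size));
    chains.predecessor.assign(anchors.size(), -1);
    for (std::int32_t i = 0; i < n; ++i)
    {
        const std::int32_t first = std::max<std::int32_t>(0, i - max_lookback);
        for (std::int32_t j = i - 1; j >= first; --j)
        {
            const std::int32_t dq = anchors[i].query_pos - anchors[j].query_pos;
            const std::int32_t dt = anchors[i].target_pos - anchors[j].target_pos;
            if (dq <= 0 || dt <= 0)
                continue; // a chain must advance on both reads
            // Newly covered bases, capped at the seed length.
            const double gain      = std::min(std::min(dq, dt), word_size);
            const double candidate = chains.score[j] + gain - sketch_gap_cost(dt - dq, word_size);
            if (candidate > chains.score[i])
            {
                chains.score[i]       = candidate;
                chains.predecessor[i] = j;
            }
        }
    }
    return chains;
}
```

Following predecessor from the highest-scoring anchor back to -1 recovers one chain, which is the role a backtrace step plays before chains are reported as overlaps.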
3 changes: 3 additions & 0 deletions cudamapper/CMakeLists.txt
@@ -46,6 +46,7 @@ endif()

cuda_add_library(${MODULE_NAME}
src/application_parameters.cpp
+ src/chainer_utils.cu
src/cudamapper.cpp
src/index_batcher.cu
src/index_descriptor.cpp
@@ -59,6 +60,8 @@ cuda_add_library(${MODULE_NAME}
src/cudamapper_utils.cpp
src/overlapper.cpp
src/overlapper_triggered.cu
+ src/overlapper_anchmer.cu
+ src/overlapper_minimap.cu
src/utils.cpp
${CMAKE_CURRENT_BINARY_DIR}/version.cpp)

@@ -38,6 +38,16 @@ namespace details
namespace overlapper
{

+ // void filter_self_mappings(std::vector<Overlap>& overlaps,
+ // const io::FastaParser& query_parser,
+ // const io::FastaParser& target_parser,
+ // const double max_percent_similarity);
+
+ void filter_self_mappings(std::vector<Overlap>& overlaps,
+ const io::FastaParser& query_parser,
+ const io::FastaParser& target_parser,
+ const double max_percent_overlap);
+
/// \brief Extends a single overlap at its ends if the similarity of the query and target sequences is above a specified threshold.
/// \param overlap An Overlap which is modified in place. Any of the query_start_position_in_read, query_end_position_in_read,
/// target_start_position_in_read, and target_end_position_in_read fields may be modified.
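
The filter_self_mappings declaration added in this hunk carries no documentation, so its exact behaviour is not visible here. As a rough sketch of what such a filter could look like, the block below drops overlaps whose query and target share a name and whose query span exceeds a percentage of the read length. Everything in it is an assumption: SketchOverlap is a hypothetical stand-in for the library's Overlap, names are stored directly rather than looked up through a FastaParser, and max_percent_overlap is taken to be on a 0 to 100 scale.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical stand-in for the library's Overlap; only the fields this sketch needs.
struct SketchOverlap
{
    std::string query_name;
    std::string target_name;
    std::int32_t query_start;
    std::int32_t query_end;
    std::int32_t query_length;
};

// Removes overlaps of a read with itself when the overlapping span covers more than
// max_percent_overlap percent of the query read (assumed semantics, 0 to 100 scale).
void filter_self_mappings_sketch(std::vector<SketchOverlap>& overlaps,
                                 const double max_percent_overlap)
{
    assert(max_percent_overlap >= 0.0 && max_percent_overlap <= 100.0);
    auto is_self_mapping = [max_percent_overlap](const SketchOverlap& o) {
        if (o.query_name != o.target_name || o.query_length <= 0)
            return false;
        const double percent = 100.0 * (o.query_end - o.query_start) / o.query_length;
        return percent > max_percent_overlap;
    };
    // Erase-remove idiom: compact the kept overlaps, then trim the tail.
    overlaps.erase(std::remove_if(overlaps.begin(), overlaps.end(), is_self_mapping),
                   overlaps.end());
}
```

Filtering in place matches the non-const reference in the declared signature, and it keeps the relative order of the overlaps that survive.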

6 changes: 5 additions & 1 deletion cudamapper/src/application_parameters.cpp
@@ -51,6 +51,7 @@ ApplicationParameters::ApplicationParameters(int argc, char* argv[])
{"min-overlap-fraction", required_argument, 0, 'z'},
{"rescue-overlap-ends", no_argument, 0, 'R'},
{"drop-fused-overlaps", no_argument, 0, 'D'},
+ {"drop-self-mappings", no_argument, 0, 'X'},
{"query-indices-in-host-memory", required_argument, 0, 'Q'},
{"query-indices-in-device-memory", required_argument, 0, 'q'},
{"target-indices-in-host-memory", required_argument, 0, 'C'},
@@ -59,7 +60,7 @@ ApplicationParameters::ApplicationParameters(int argc, char* argv[])
{"help", no_argument, 0, 'h'},
};

- std::string optstring = "k:w:d:m:i:t:F:a:r:l:b:z:RDQ:q:C:c:vh";
+ std::string optstring = "k:w:d:m:i:t:F:a:r:l:b:z:RDXQ:q:C:c:vh";

bool target_indices_in_host_memory_set = false;
bool target_indices_in_device_memory_set = false;
@@ -117,6 +118,9 @@ ApplicationParameters::ApplicationParameters(int argc, char* argv[])
case 'D':
drop_fused_overlaps = true;
break;
+ case 'X':
+ drop_self_mappings = true;
+ break;
case 'Q':
query_indices_in_host_memory = std::stoi(optarg);
break;
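
The three hunks above follow the usual getopt_long pattern for a boolean flag: a long-option table entry with no_argument, a matching letter in the option string without a trailing colon, and a case that sets the member. A reduced sketch of that pattern is shown below. SketchParameters and parse_flags are hypothetical names, only the two drop flags are handled, and the table is closed with the all-zero entry that getopt_long expects.

```cpp
#include <getopt.h>
#include <string>

// Hypothetical reduced parameter set; the real class holds many more options.
struct SketchParameters
{
    bool drop_fused_overlaps = false; // D
    bool drop_self_mappings  = false; // X
};

SketchParameters parse_flags(int argc, char* argv[])
{
    static const struct option long_options[] = {
        {"drop-fused-overlaps", no_argument, nullptr, 'D'},
        {"drop-self-mappings", no_argument, nullptr, 'X'},
        {nullptr, 0, nullptr, 0}, // terminating entry required by getopt_long
    };
    // No colon after D or X: neither flag takes a value.
    const std::string optstring = "DX";

    SketchParameters parameters;
    optind = 1; // restart scanning so the function can be called more than once
    int argument = 0;
    while ((argument = getopt_long(argc, argv, optstring.c_str(), long_options, nullptr)) != -1)
    {
        switch (argument)
        {
        case 'D':
            parameters.drop_fused_overlaps = true;
            break;
        case 'X':
            parameters.drop_self_mappings = true;
            break;
        default:
            break; // unknown options are ignored in this sketch
        }
    }
    return parameters;
}
```

With this layout, both --drop-self-mappings and -X reach the same case, which is why the diff touches the table, the option string, and the switch together.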

1 change: 1 addition & 0 deletions cudamapper/src/application_parameters.hpp
@@ -55,6 +55,7 @@ class ApplicationParameters
float min_overlap_fraction = 0.8; // z
bool perform_overlap_end_rescue = false; // R
bool drop_fused_overlaps = false; // D
+ bool drop_self_mappings = false; // X
int32_t query_indices_in_host_memory = 10; // Q
int32_t query_indices_in_device_memory = 5; // q
int32_t target_indices_in_host_memory = 10; // C