Skip to content

Commit

Permalink
fix: Avoid re-calculating md5sum on clone and conversion to KmerMinHa…
Browse files Browse the repository at this point in the history
…shBTree (#3385)

While debugging
sourmash-bio/sourmash_plugin_branchwater#503
the flamegraph showed ~26% of the time was spent on calculating MD5.

WHY????

Turns out cloning and converting to `KmerMinHash` to `KmerMinHashBTree`
triggered recalculation of the MD5 sum, even if it was already present
(or... not needed). Oops!
  • Loading branch information
luizirber authored Nov 11, 2024
1 parent e86c8a8 commit e75f306
Showing 1 changed file with 7 additions and 1 deletion.
8 changes: 7 additions & 1 deletion src/core/src/sketch/minhash.rs
Original file line number Diff line number Diff line change
Expand Up @@ -1006,7 +1006,7 @@ impl Clone for KmerMinHashBTree {
mins: self.mins.clone(),
abunds: self.abunds.clone(),
current_max: self.current_max,
md5sum: Mutex::new(Some(self.md5sum())),
md5sum: Mutex::new(self.md5sum.lock().unwrap().clone()),
}
}
}
Expand Down Expand Up @@ -1667,6 +1667,8 @@ impl From<KmerMinHashBTree> for KmerMinHash {
new_mh.mins = mins;
new_mh.abunds = abunds;

new_mh.md5sum = other.md5sum;

new_mh
}
}
Expand All @@ -1691,6 +1693,8 @@ impl From<&KmerMinHashBTree> for KmerMinHash {
new_mh.mins = mins;
new_mh.abunds = abunds;

new_mh.md5sum = Mutex::new(other.md5sum.lock().unwrap().clone());

new_mh
}
}
Expand All @@ -1714,6 +1718,8 @@ impl From<KmerMinHash> for KmerMinHashBTree {
new_mh.mins = mins;
new_mh.abunds = abunds;

new_mh.md5sum = other.md5sum;

new_mh
}
}
Expand Down

0 comments on commit e75f306

Please sign in to comment.