-
Notifications
You must be signed in to change notification settings - Fork 72
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix TANE-based algorithms #396
base: main
Are you sure you want to change the base?
Conversation
6aadb62
to
c396254
Compare
e9b9c16
to
e319415
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can't say that I fully remember the essence of the algorithm and can fully verify the changes.
Can you provide useful links? I would like to study in more detail what is happening in these methods.
Also, is it known how much these changes affect performance?
In general, I think the changes are adequate and most likely after I study in more detail what is happening here, it will be possible to merge.
@@ -258,4 +249,4 @@ unsigned long long TaneCommon::ExecuteInternal() { | |||
|
|||
} // namespace tane | |||
|
|||
} // namespace algos | |||
} // namespace algos |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
no newline
for (std::size_t rhs_index = vertex->GetRhsCandidates().find_first(); | ||
rhs_index != boost::dynamic_bitset<>::npos; | ||
rhs_index = vertex->GetRhsCandidates().find_next(rhs_index)) { | ||
Vertical rhs = static_cast<Vertical>(*schema->GetColumn((int)rhs_index)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: remove (int)
cast, since we should pass std::size_t
// if we seek for exact FDs then SetInvalid | ||
if (max_fd_error_ == 0 && max_ucc_error_ == 0) { | ||
for (auto key_vertex : key_vertices) { | ||
key_vertex->GetRhsCandidates() &= key_vertex->GetVertical().GetColumnIndices(); | ||
key_vertex->SetInvalid(true); | ||
} | ||
for (auto key_vertex : key_vertices) { | ||
key_vertex->GetRhsCandidates() &= key_vertex->GetVertical().GetColumnIndices(); | ||
key_vertex->SetInvalid(true); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you explain why we can remove this additional max_fd_error_
check?
Remove AUCC discovery artifacts, perform additional key pruning.
e319415
to
b2465bd
Compare
TANE algorithm implementation missed key pruning (supposedly because of AUCC discovery, which required much larger search space). Since we no longer perform AUCC discovery in this algorithm, original key pruning procedure is now restored.