Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix TANE-based algorithms #396

Open
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

iliya-b
Copy link
Contributor

@iliya-b iliya-b commented Apr 21, 2024

TANE algorithm implementation missed key pruning (supposedly because of AUCC discovery, which required much larger search space). Since we no longer perform AUCC discovery in this algorithm, original key pruning procedure is now restored.

@egshnov egshnov mentioned this pull request Sep 21, 2024
@iliya-b iliya-b force-pushed the tane-refactoring branch 3 times, most recently from 6aadb62 to c396254 Compare October 21, 2024 14:30
@iliya-b iliya-b marked this pull request as ready for review October 21, 2024 15:00
@iliya-b iliya-b force-pushed the tane-refactoring branch 3 times, most recently from e9b9c16 to e319415 Compare November 5, 2024 19:09
Copy link
Collaborator

@vs9h vs9h left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can't say that I fully remember the essence of the algorithm and can fully verify the changes.
Can you provide useful links? I would like to study in more detail what is happening in these methods.

Also, is it known how much these changes affect performance?

In general, I think the changes are adequate and most likely after I study in more detail what is happening here, it will be possible to merge.

@@ -258,4 +249,4 @@ unsigned long long TaneCommon::ExecuteInternal() {

} // namespace tane

} // namespace algos
} // namespace algos
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

no newline

for (std::size_t rhs_index = vertex->GetRhsCandidates().find_first();
rhs_index != boost::dynamic_bitset<>::npos;
rhs_index = vertex->GetRhsCandidates().find_next(rhs_index)) {
Vertical rhs = static_cast<Vertical>(*schema->GetColumn((int)rhs_index));
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: remove (int) cast, since we should pass std::size_t

Comment on lines -80 to +79
// if we seek for exact FDs then SetInvalid
if (max_fd_error_ == 0 && max_ucc_error_ == 0) {
for (auto key_vertex : key_vertices) {
key_vertex->GetRhsCandidates() &= key_vertex->GetVertical().GetColumnIndices();
key_vertex->SetInvalid(true);
}
for (auto key_vertex : key_vertices) {
key_vertex->GetRhsCandidates() &= key_vertex->GetVertical().GetColumnIndices();
key_vertex->SetInvalid(true);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you explain why we can remove this additional max_fd_error_ check?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants