Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

serious issue about nth_element behavior #25

Open
DingFeng9313 opened this issue Jun 13, 2022 · 15 comments
Open

serious issue about nth_element behavior #25

DingFeng9313 opened this issue Jun 13, 2022 · 15 comments

Comments

@DingFeng9313
Copy link

I am doing some rewrite of this ikdtree to adapt my situation.
And I find you used nth_element while building a tree. According to my experiments there might be a bug.

Suppose at some point division-axis is 0, which is x-axis, let's use 1-D data to explain:

Suppose we have a vector [0,1,2,2,2,2,2,2,3,4] which is points' x-axis value after nth_element() function call, y-axis value is totally different, say [0,1,2,3,4,5,6,7,8,9]。nth_element() divide these points into different group but not determinately which point is on left or right node. point (2,2) might be on the left or right.

Thus while searching, or delete or add by point. it not guaranteed to find this point.

@DingFeng9313
Copy link
Author

    // split in the middle
    DistanceType split_val = (bbox[cutfeat].low + bbox[cutfeat].high) / 2;
    ElementType min_elem, max_elem;
    computeMinMax(obj, ind, count, cutfeat, min_elem, max_elem);

    if (split_val < min_elem)
      cutval = min_elem;
    else if (split_val > max_elem)
      cutval = max_elem;
    else
      cutval = split_val;

    IndexType lim1, lim2;
    planeSplit(obj, ind, count, cutfeat, cutval, lim1, lim2);

I check the code in nanoflann library. Their splitting method is directly using mid value (not middle point) as dividing pivot. I guess they can't guarantee the tree is balanced. But still their searching is quite efficient and accurate.

@russellsch
Copy link

I was able to address this in a re-implementation I've been working on by continuing to use std::nth_element as-is, but afterwards:

  • Retrieving the median point at index of "mid"
  • Using std::partition on points after the middle element, with a predicate that returns true if their element on the split axis equal the median element
  • Using the iterator returned from std::partition minus 1 to determine the data assigned to the node (and left and right son assignment)

I believe this forces all items matching the partition element to be moved into the left son, and doesn't add too much additional overhead.

@engcang
Copy link

engcang commented Mar 14, 2023

@russellsch Hey Russellsch, I am also suffering inaccurate kdtree search result and think this is because of nth_element as DingFeng9313 mentioned above.
Can you attach the modified code block here?
I looked into your forked repository and found it is not committed yet.

@russellsch
Copy link

russellsch commented Mar 17, 2023

@engcang yes, sorry my fork doesn't have any changes - the changes I mentioned were on a from-scratch rewrite to try and get some additional speed gains via SIMD (using Eigen), templating (for uint types instead of float), and add variable number of dimensions - not based on this repo.

I wasn't seeing the performance gains I was hoping for so didn't complete it - ended up using nanoflan with a wrapper.

That said I CAN paste you my equivalent of this repos's BuildTree function (consider it CC0 licensed, use it however you like). Hope that can help you and others! (I added some additional comments and two "ADDED FIX" comments to show where my 3 suggestions above are implemented)

//Note: using PointType = Eigen::Matrix<TScalar, dims, 1>;
//Note: using PointVecType       = std::vector<PointType>;    // Note: Vector is used so points can be sorted more easily w/ std::sort
//Note: using PointEigenMatType1 = Eigen::Map<Eigen::Matrix<TScalar, dims, Eigen::Dynamic>>;

/// @brief Builds a perfectly balanced tree (or sub-tree) based on a vector of Points that are able to be sorted in-place
/// @param points Reference to vector of points that will be sorted as the tree is built
/// @param left_point_index Index of the left-most point in the vector to build sub-tree for
/// @param right_point_index Index of the right-most point in the vector to build sub-tree for
template<typename TScalar, uint8_t dims>
inline std::unique_ptr<typename IKDTree<TScalar, dims>::IKDTreeNode> IKDTree<TScalar, dims>::buildSubtree_(
        PointVecType& points, size_t left_point_index, size_t right_point_index) {
    assert(left_point_index <= right_point_index && "left_point_index > right_point_index");
    assert(!points.empty() && "points should never be empty");
    if(points.empty()) {    // TODO: Move this guarantee into build to get out of a hot loop?
        return nullptr;
    }

    auto node_ptr = std::make_unique<IKDTreeNode>();

    // Create single element nodes and bail out quickly
    if(left_point_index == right_point_index) {
        node_ptr->point = points[left_point_index];
        pullUpUpdate_(*(node_ptr.get()));
        return node_ptr;
    }

    //////////// ADDED FIX (Retrieving the median point at index of "mid")
    const size_t mid_point_index = (left_point_index + right_point_index) / 2UL;
    /// Find distance between right and left indices - size of points of interest
    const size_t index_distance = right_point_index - left_point_index + 1UL;

    /// Map for treating vector of points as an Eigen map
    const PointEigenMatType1 points_mat(&points[0][0], dims, points.size());

    const size_t mid_point_index = (left_point_index + right_point_index) / 2UL;

    // Find point whose each value is min/max of all points within range of left and right indices
    // This represents the min and max corners of a box that contains the points
    const PointType min_point = points_mat.middleCols(left_point_index, index_distance).rowwise().minCoeff();
    const PointType max_point = points_mat.middleCols(left_point_index, index_distance).rowwise().maxCoeff();

    // Find the largest axis of the box that contains the points - this is axis with maximal covariance and is used as the division axis of this node
    typename PointType::Index largest_axis;
    (max_point - min_point).maxCoeff(&largest_axis);
    assert(largest_axis >= 0);
    node_ptr->axis = static_cast<uint8_t>(largest_axis);

    // Find median and partially sort such that points are on correct side of the median within the range of left and right indices
    // Note: that elements that are equal to the split may be left or right of the nth element!
    std::nth_element(points.begin() + left_point_index, points.begin() + mid_point_index, points.begin() + right_point_index + 1UL,
            [largest_axis](const PointType& p1, const PointType& p2) { return p1[largest_axis] < p2[largest_axis]; });
    const PointType median_point = *(points.begin() + mid_point_index);

    //////////// ADDED FIX (std::partition and using returned iterator)
    // To address elements equal to the median being on both sides of the mid-point, partitioning is used on the second half to push any
    // elements matching the nth element to the left of the iterator (which points to first element to be assigned to right son)
    auto split_plus_one_it            = std::partition(points.begin() + mid_point_index + 1UL, points.begin() + right_point_index + 1UL,
                       [largest_axis, median_point](const PointType& p1) { return p1(largest_axis) == median_point(largest_axis); });
    const size_t split_plus_one_index = split_plus_one_it - points.begin();
    const size_t split_index          = split_plus_one_index - 1UL;

    node_ptr->point = points[split_index];

    if(left_point_index != split_index) {    // If mid-point is the same as left index, no need to build a left son
                                             // Check is needed so mid_point_index of 0 doesn't wrap around
        size_t left_son_rightmost_index = 0UL;
        if(split_index == 0UL) {
            left_son_rightmost_index = 0UL;
        } else {
            left_son_rightmost_index = split_index - 1UL;
        }
        node_ptr->left_son_ptr             = buildSubtree_(points, left_point_index, left_son_rightmost_index);
        node_ptr->left_son_ptr->father_ptr = node_ptr.get();
    }

    if(right_point_index != split_index) {    // If mid-point is the same as right index, no need to build a right son
        node_ptr->right_son_ptr             = buildSubtree_(points, split_index + 1UL, right_point_index);
        node_ptr->right_son_ptr->father_ptr = node_ptr.get();
    }

    // Lazy label init - by default all new nodes are initialized with pointDeleted, treeDeleted, and pushDown as false
    // Pullup - build out tree_size, invalid_num, node_min_max (range), bounding radius squared (used for searching)
    pullUpUpdate_(*(node_ptr.get()));

    return node_ptr;
}

@engcang
Copy link

engcang commented Mar 23, 2023

@russellsch Hey Russellsch, Thanks for sharing your code block. I tried it and inaccurate results of kdtree search didn't get solved, unfortunately.
But super thank you for your kind reply :)

(I am trying to set a specific property of points within radius as here: https://github.com/engcang/ikd-Tree/blob/coverage_check/ikd_Tree.cpp#L662-L671)

@chenxiaocongAI
Copy link

chenxiaocongAI commented Aug 14, 2024

hello @engcang ,I have reviewed your code, why did you use the center point of each grid for your first insertion.Will inserting the midpoint every time yield good results for lio

@engcang
Copy link

engcang commented Aug 14, 2024

@chenxiaocongAI Hi. My repository has two branches, one is for lio which does not use the center point of each grid for the first insertion, and the another is for collision checking only in path planning.

@chenxiaocongAI
Copy link

@engcang The first author mentioned that when inserting, the position of point (2,2) is not unique. I theoretically analyzed that for the first partitioning along the y-axis, the tree nodes are (4,2) and the left subtree is (1,1). The right subtree of point (1,1) is (2,3), and the left subtree of point (2,3) is (2,2). Isn't this location confirmed? If this sorting is not unique, have you addressed this issue in your branch?thank you

@engcang
Copy link

engcang commented Aug 14, 2024

@chenxiaocongAI No I could not solve the problem. I also tried @russellsch 's code without any luck.
If you solve the issue, please share the solution :)

@chenxiaocongAI
Copy link

@engcang I found a flaw in your code. Before using std:: partition, you should sort the data by the maximum axis. need change to if (((*root)->division_axis == 0 && point.x <= (*root)->point.x) || ((*root)->division_axis == 1 && point.y <= (*root)->point.y) || ((*root)->division_axis == 2 && point.z <= (*root)->point.z))
image
I'm curious about the reason for your imprecise search. Could you please provide me with your test cases so that I can further investigate the cause. thank you

@engcang
Copy link

engcang commented Aug 16, 2024

@russellsch BuildTree() function in my code is just same with the original repo's code. What and where do you suggest me to fix? and for what? I actually do not understand what you want to say :)
can you explain again?

@chenxiaocongAI
Copy link

chenxiaocongAI commented Aug 20, 2024 via email

@engcang
Copy link

engcang commented Aug 20, 2024

@chenxiaocongAI
Refer to the Set_Covered_Points function in my code - https://github.com/engcang/ikd-Tree/blob/coverage_check/ikd_Tree.cpp#L770-L779
It is now traversing the whole tree to set the point.covered as ture or false, not just searching the tree with axes - https://github.com/engcang/ikd-Tree/blob/coverage_check/ikd_Tree.cpp#L752-L765

Here are the results videos of axes tree searching and whole tree searching.
While I am moving the robot, I obtained the poitns within radius with Radius_Search, then I set the covered as true with Set_Covered_Points function.

First video: searching the tree with axes kd-tree search method to set the covered as true.
https://github.com/user-attachments/assets/ae77188a-e04f-4898-8af9-816d0303d3ef

Second video: searching the tree with whole-tree search method to set the covered as true.
https://github.com/user-attachments/assets/bfdc9d40-4ec8-40c6-aea4-f290bbf1a57f

As you can see, even I input the same points from Radius_Search to set the covered = true, kd-tree axes search cannot query the exact same points because of inaccurate partitioning of BuildTree().

@chenxiaocongAI
Copy link

chenxiaocongAI commented Aug 27, 2024

@engcang Can you give me the demo cpp file you tested? I will test if my modifications can solve your problem. Because I found that not only is there a problem with BuildTree, but when inserting the last point into BuildTree, the axis of that point defaults to the x-axis. This will result in Add-by_point inserting leaf nodes along the x-axis, which is different from the actual maximum distance along the axis, and may also introduce search or deletion errors
@engcang : int split_index = partition(begin(Storage) + mid + 1, begin(Storage) + r + 1, [&](PointType p)
{return (Storage[mid].x==p.x && Storage[mid].y==p.y && Storage[mid].z==p.z);}) - begin(Storage) - 1;
@russellsch: auto split_plus_one_it = std::partition(points.begin() + mid_point_index + 1UL, points.begin() + right_point_index + 1UL,[largest_axis, median_point](const PointType& p1) { return p1(largest_axis) == median_point(largest_axis); });
this is different.engcang is find same point,but russellsch find largest axis same point

@engcang Can you provide me with your point cloud data for testing?I tested my code on my own 8 data sets and there were no problems

@engcang
Copy link

engcang commented Aug 29, 2024

@chenxiaocongAI I do not have the data anymore,,, I think I deleted it long time ago.
If you can give me your fixed code, I will test it with new data.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants