Add filtering for CAGRA to C API #452
base: branch-25.02
thanks for the PR! It'd be great to get this functionality into the cagra c-api
Co-authored-by: Micka <[email protected]>
lgtm!
/merge
/ok to test
    cuvs::neighbors::cagra::search(
      *res_ptr, search_params, *index_ptr, queries_mds, neighbors_mds, distances_mds);
  } else if (filter.type == BITSET) {
    using filter_mdspan_type = raft::device_vector_view<std::uint32_t, int64_t, raft::row_major>;
I think the build fails because of the std::uint32_t as the first type argument. The type arguments should be <index_t, index_t>; see raft bitset.hpp:
template <typename bitset_t = uint32_t, typename index_t = uint32_t>
struct bitset {
  static constexpr index_t bitset_element_size = sizeof(bitset_t) * 8;
  /**
   * @brief Construct a new bitset object with a list of indices to unset.
   *
   * @param res RAFT resources
   * @param mask_index List of indices to unset in the bitset
   * @param bitset_len Length of the bitset
   * @param default_value Default value to set the bits to. Default is true.
   */
  bitset(const raft::resources& res,
         raft::device_vector_view<const index_t, index_t> mask_index,
         index_t bitset_len,
         bool default_value = true);
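To make the semantics of that constructor concrete, here is a minimal host-only sketch of "start with every bit set, then unset the listed indices". It is not the RAFT implementation: `words_needed` and `make_blocklist_bitset` are hypothetical names, the storage lives in a `std::vector` instead of device memory, and the LSB-first bit order within each 32-bit word is an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Number of bits stored per word, mirroring sizeof(bitset_t) * 8 for uint32_t.
constexpr std::int64_t bits_per_word = sizeof(std::uint32_t) * 8;

// Hypothetical helper: how many 32-bit words are needed to hold bitset_len bits.
inline std::int64_t words_needed(std::int64_t bitset_len) {
  return (bitset_len + bits_per_word - 1) / bits_per_word;
}

// Hypothetical helper: build a bitset with all bits set (default_value = true),
// then clear the bit of every index listed in mask_index.
// Assumes bit i lives in word i / 32 at position i % 32 (LSB first).
inline std::vector<std::uint32_t> make_blocklist_bitset(
    const std::vector<std::int64_t>& mask_index, std::int64_t bitset_len) {
  std::vector<std::uint32_t> words(static_cast<std::size_t>(words_needed(bitset_len)),
                                   0xFFFFFFFFu);
  for (std::int64_t idx : mask_index) {
    assert(idx >= 0 && idx < bitset_len);
    words[idx / bits_per_word] &= ~(std::uint32_t{1} << (idx % bits_per_word));
  }
  return words;
}
```

With this layout, a 40-row dataset needs two words, and unsetting rows 1 and 35 clears bit 1 of the first word and bit 3 of the second.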
Hmm, actually I missed this. The function you are using here creates a bitset from a list of indices, and I don't think that is the workflow we expect.
The C++ function accepts a bitset_view, not a bitset, so at this point the memory for the bitset should already be allocated and we just need to pass along the pointer and the length of the bitset. The C function should also assume that the filter given as input is a bitset that is already allocated and filled, rather than a list of neighbors to filter. So the filter taken as a parameter in this function should be handled as a bitset_view object.
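As an illustration of the "pointer plus length" idea described above, here is a minimal host-only sketch of a non-owning view over storage the caller has already allocated and filled. `host_bitset_view` and `make_view` are hypothetical names and not the RAFT `bitset_view` type; the LSB-first bit order is an assumption, as is reading a set bit as "row allowed" and a cleared bit as "row filtered out".

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical host-side stand-in for a non-owning bitset view.
// It only stores a pointer and a length; it never allocates or copies.
struct host_bitset_view {
  const std::uint32_t* words;  // caller-owned storage, already filled
  std::int64_t n_bits;         // number of valid bits (one per dataset row)

  // Returns true when bit i is set.
  // Assumes bit i lives in word i / 32 at position i % 32 (LSB first).
  bool test(std::int64_t i) const {
    assert(i >= 0 && i < n_bits);
    return (words[i / 32] >> (i % 32)) & 1u;
  }
};

// Hypothetical helper: wrap existing storage without copying it.
inline host_bitset_view make_view(const std::vector<std::uint32_t>& storage,
                                  std::int64_t n_bits) {
  assert(static_cast<std::int64_t>(storage.size()) * 32 >= n_bits);
  return host_bitset_view{storage.data(), n_bits};
}
```

The point of the design is that the view can be built at the API boundary at essentially no cost, while ownership and filling of the bits stay with the caller.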
LGTM
/ok to test
/ok to test
  */
 cuvsError_t cuvsCagraSearch(cuvsResources_t res,
                             cuvsCagraSearchParams_t params,
                             cuvsCagraIndex_t index,
                             DLManagedTensor* queries,
                             DLManagedTensor* neighbors,
-                            DLManagedTensor* distances);
+                            DLManagedTensor* distances,
+                            cuvsFilter filter);
This API change needs to be propagated to:
- the python package
- the example C project (cuvs/example/c)
- probably the rust package
done👍
@@ -480,7 +482,8 @@ def search(SearchParams search_params,
            k,
            neighbors=None,
            distances=None,
-           resources=None):
+           resources=None,
+           filter=None):
Add this parameter to the Python documentation.
Parameters
----------
bitmap : numpy.ndarray
Update this docstring for bitset instead of bitmap
/ok to test
Adds the CAGRA filtering feature to the C API, using a DLPack tensor as the blocklist.