Add the deterministic version of AIR Top-K ( the radix-based topk in … #2057

ChristinaZ · 2023-12-12T09:12:37Z

Hi,

This PR is about the radix-based top-k algorithm in RAFT, we call it AIR Top-K in our just published paper (third_party/raft/cpp/include/raft/matrix/detail/select_radix.cuh)
We have recieved several feedbacks about the deterministic of AIR Top-K.
In detail, AIR Top-K will return the smallest or largest K elements.
One thing to notice is that there might be more than one "Kth smallest/largest element" for the given dataset. For example, assuming K=100, the value of the Kth element is 58, while there are three element's value are 58 and there already 99 element whose value are larger than 58. In this case, we might not output all the equaling element as we ensure the output number is K. In this example, for all three elements, we only choose one element to store it in the results.

Previously, we choose the element euqaling to the Kth value randomly. In the deterministic version, we always ensure that the ones with smaller indices will be the output.

In this PR, we only added the code in kernel and add a template parameter stable_last_filter with default value false.
It means our previous code don't need to change anything.
I think it's better to open this PR first and discuss about current implement. Then we can discuss about adding one more API to expose this API to customer.

copy-pr-bot · 2023-12-12T09:12:40Z

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.

cjnolet · 2024-01-10T03:19:16Z

/ok to test

review-notebook-app · 2024-01-17T01:59:54Z

Check out this pull request on

See visual diffs & provide feedback on Jupyter Notebooks.

Powered by ReviewNB

ChristinaZ · 2024-01-17T02:01:50Z

/ok to test

Forward-merge branch-24.02 to branch-24.04

### Brief Add another workspace memory resource that does not have the explicit memory limit. That is, after the change we have the following: 1. `rmm::mr::get_current_device_resource()` is default for all allocations, as before. It is used for the allocations with unlimited lifetime, e.g. returned to the user. 2. `raft::get_workspace_resource()` is for temporary allocations and forced to have fixed size, as before. However, it becomes smaller and should be used only for allocations, which do not scale with problem size. It defaults to a thin layer on top of the `current_device_resource`. 3. `raft::get_large_workspace_resource()` _(new)_ is for temporary allocations, which can scale with the problem size. Unlike `workspace_resource`, its size is not fixed. By default, it points to the `current_device_resource`, but the user can set it to something backed by the host memory (e.g. managed memory) to avoid OOM exceptions when there's not enough device memory left. ## Problem We have a list of issues/preference/requirements, some of which contradict others 1. We rely on RMM to handle all allocations and we often use [`rmm::mr::pool_memory_resource`](https://github.com/rapidsai/raft/blob/9fb05a2ab3d72760a09f1b7051e711d773682ef1/cpp/bench/ann/src/raft/raft_ann_bench_utils.h#L73) for performance reasons (to avoid lots of cudaMalloc calls in the loops) 2. Historically, we've used managed memory allocators as a workaround to [avoid OOM errors](https://github.com/rapidsai/raft/blob/5e80c1d2159e00a204ab5db0f5ca3f9ec43187c7/cpp/include/raft/neighbors/detail/ivf_pq_build.cuh#L1788-L1795) or [improve speed (by increasing batch sizes)](https://github.com/rapidsai/raft/blob/5e80c1d2159e00a204ab5db0f5ca3f9ec43187c7/cpp/include/raft/neighbors/detail/ivf_pq_build.cuh#L1596-L1603). 3. However, the design goal is to avoid setting allocators on our own and to give the full control to the user (hence the workaround in 2 [was removed](rapidsai@addb059#diff-f7f070424d71da5321d470416d1a4ca3605c4290c34c4a1c1d8b2240747000d2)). 4. We introduced the [workspace resource](rapidsai#1356) earlier to allow querying the available memory reliably and maximize the batch sizes accordingly (see also issue [rapidsai#1310](rapidsai#1310)). Without this, some of our batched algorithms either fail with OOM or severely underperform due to small batch sizes. 5. However, we cannot just put all of RAFT temporary allocations into the limited `workspace_resource`, because some of them scale with the problem size and would inevitably fail with OOM at some point. 6. Setting the workspace resource to the managed memory is not advisable as well for performance reasons: we have lots of small allocations in performance critical sections, so we need a pool, but a pool in the managed memory inevitably outgrows the device memory and makes the whole program slow. ## Solution I propose to split the workspace memory into two: 1. small, fixed-size workspace for small, frequent allocations 2. large workspace for the allocations that scale with the problem size Notes: - We still leave the full control over the allocator types to the user. - Neither of the workspace resource should have unlimited lifetime / returned to the user. As a result, if the user sets the managed memory as the large workspace resource, the memory is guaranteed to be released after the function call. - We have the option to use the slow managed memory without a pool for large allocations, while still using a fast pool for small allocations. - We have more flexible control over which allocations are "large" and which are "small", so hopefully using the managed memory is not so bad for performance. Authors: - Artem M. Chirkin (https://github.com/achirkin) Approvers: - Corey J. Nolet (https://github.com/cjnolet) URL: rapidsai#2322

Forward-merge branch-24.06 into branch-24.08

cjnolet · 2024-05-21T15:49:25Z

@ChristinaZ it would be great to have this feature in RAFT. I'm going to push the release to 24.08 since we're approaching code freeze for 24.06. 24.08 is in August, do you think we might be able to get this merged by then?

Store operations are void. Authors: - Aaron Siddhartha Mondal (https://github.com/aaronmondal) - Corey J. Nolet (https://github.com/cjnolet) Approvers: - Corey J. Nolet (https://github.com/cjnolet) URL: rapidsai#2292

Forward-merge branch-24.06 into branch-24.08

Too long index file name would lead to a crash while calling the index serialization routines. Such long filenames can occur if we try to specialize many parameters for CAGRA ann index. This PR fixes the issue by replacing the long index file name with a hash. Drawback is the filename will not be descriptive. Authors: - Tamas Bela Feher (https://github.com/tfeher) - Corey J. Nolet (https://github.com/cjnolet) Approvers: - Corey J. Nolet (https://github.com/cjnolet) URL: rapidsai#2280

Forward-merge branch-24.06 into branch-24.08

jupyer -> jupyter Authors: - Ikko Eltociear Ashimine (https://github.com/eltociear) - Corey J. Nolet (https://github.com/cjnolet) Approvers: - Corey J. Nolet (https://github.com/cjnolet) URL: rapidsai#2308

Forward-merge branch-24.06 into branch-24.08

ChristinaZ · 2024-05-23T01:18:15Z

@ChristinaZ it would be great to have this feature in RAFT. I'm going to push the release to 24.08 since we're approaching code freeze for 24.06. 24.08 is in August, do you think we might be able to get this merged by then?

Got it. I will work on the integration as soon as possible.

Authors: - Divye Gala (https://github.com/divyegala) - Corey J. Nolet (https://github.com/cjnolet) Approvers: - Corey J. Nolet (https://github.com/cjnolet) URL: rapidsai#2297

Forward-merge branch-24.06 into branch-24.08

Replace hyphens with underscores in `raft-ann-bench` to make it a valid Python identifier. Also add a Python 3.11 tag to `raft-ann-bench`, and use the `VERSION` file instead of an attribute. Authors: - Kyle Edwards (https://github.com/KyleFromNVIDIA) Approvers: - Divye Gala (https://github.com/divyegala) - Mike Sarahan (https://github.com/msarahan) URL: rapidsai#2333

Forward-merge branch-24.06 into branch-24.08

Change the imported package name to reflect the new name as of rapidsai#2333. Authors: - Kyle Edwards (https://github.com/KyleFromNVIDIA) Approvers: - Bradley Dice (https://github.com/bdice) - Divye Gala (https://github.com/divyegala) URL: rapidsai#2338

Forward-merge branch-24.06 into branch-24.08

- This PR is one part of the feature of rapidsai#1969 - Add the API of 'search_with_filtering' for brute force. Authors: - James Rong (https://github.com/rhdong) ```shell ***WARNING*** CPU scaling is enabled, the benchmark real time measurements may be noisy and will incur extra overhead. ----------------------------------------------------------------------------------------------------- Benchmark Time CPU Iterations ----------------------------------------------------------------------------------------------------- KNN/float/int64_t/brute_force_filter_knn/0/0/0/manual_time 33.1 ms 69.9 ms 21 1000000#128#1000#255#0#InnerProduct#NO_COPY#SEARCH KNN/float/int64_t/brute_force_filter_knn/1/0/0/manual_time 38.0 ms 74.8 ms 18 1000000#128#1000#255#0#L2Expanded#NO_COPY#SEARCH KNN/float/int64_t/brute_force_filter_knn/2/0/0/manual_time 41.7 ms 78.5 ms 17 1000000#128#1000#255#0.8#InnerProduct#NO_COPY#SEARCH KNN/float/int64_t/brute_force_filter_knn/3/0/0/manual_time 57.5 ms 94.3 ms 12 1000000#128#1000#255#0.8#L2Expanded#NO_COPY#SEARCH KNN/float/int64_t/brute_force_filter_knn/4/0/0/manual_time 19.7 ms 56.4 ms 35 1000000#128#1000#255#0.9#InnerProduct#NO_COPY#SEARCH KNN/float/int64_t/brute_force_filter_knn/5/0/0/manual_time 26.1 ms 62.8 ms 27 1000000#128#1000#255#0.9#L2Expanded#NO_COPY#SEARCH``` Authors: - rhdong (https://github.com/rhdong) - Artem M. Chirkin (https://github.com/achirkin) - Corey J. Nolet (https://github.com/cjnolet) Approvers: - Robert Maynard (https://github.com/robertmaynard) - Corey J. Nolet (https://github.com/cjnolet) - Divye Gala (https://github.com/divyegala) URL: rapidsai#2294

Forward-merge branch-24.06 into branch-24.08

Contributes to rapidsai/build-planning#31 Authors: - Kyle Edwards (https://github.com/KyleFromNVIDIA) Approvers: - Vyas Ramasubramani (https://github.com/vyasr) - Corey J. Nolet (https://github.com/cjnolet) URL: rapidsai#2331

Contributes to rapidsai/build-planning#62. It looks like this some of this project's `conda` recipes have unnecessary dependencies on `setuptools`. I suspect those are left over from before the project was cut over to `scikit-build-core`. ## Notes for Reviewers How I confirmed there were no direct uses of `setuptools` in `pylibraft` and `raft-dask`: ```shell git grep -i setuptools ``` Authors: - James Lamb (https://github.com/jameslamb) Approvers: - Jake Awe (https://github.com/AyodeAwe) URL: rapidsai#2343

… run, fix wheel dependencies (rapidsai#2349) Fixes rapidsai#2348 rapidsai#2331 introduced `rapids-build-backend` (https://github.com/rapidsai/rapids-build-backend) as the build backend for `pylibraft`, `raft-dask`, and `raft-ann-bench`. That library handles automatically modifying a wheel's dependencies based on the target CUDA version. Unfortunately, we missed a few cases in rapidsai#2331, and as a result the last few days of nightly `raft-dask` wheels had the following issues: * depending on `pylibraft` - *(does not exist, it's called `pylibraft-cu12`)* * depending on `ucx-py==0.39.*` - *(does not exist, it's called `ucx-py-cu12`)* * depending on `distributed-ucxx-cu11==0.39.*` instead of `distributed-ucxx-cu11==0.39.*,>=0.0.0a0` - *(without that alpha specifier, `pip install --pre` is required to install pre-release versions)* This wasn't caught in `raft`'s CI, but in downstream CI like `cuml` and `cugraph`, with errors like this: ```text ERROR: ResolutionImpossible: The conflict is caused by: raft-dask-cu12 24.8.0a20 depends on pylibraft==24.8.* and >=0.0.0a0 raft-dask-cu12 24.8.0a19 depends on pylibraft==24.8.* and >=0.0.0a0 ``` ([example cugraph build](https://github.com/rapidsai/cugraph/actions/runs/9315062495/job/25656684762?pr=4454#step:7:1811)) This PR: * fixes those dependency issues * modifies `raft`'s CI so that similar issues would be caught here in the future, before publishing wheels ## Notes for Reviewers ### What was the root cause of CI missing this, and how does this PR fix it? The `raft-dask` test CI jobs use this pattern to install the `raft-dask` wheel built earlier in the CI pipeline. ```shell pip install "raft_dask-cu12[test]>=0.0.0a0" --find-links dist/ ``` As described in the `pip` docs ([link](https://pip.pypa.io/en/stable/cli/pip_install/#finding-packages)), `--find-links` just adds a directory to the list of other places `pip` searches for packages. Because the wheel there had unsatisfiable constraints (e.g. `pylibraft==24.8.*` does not exist anywhere), `pip install` silently disregarded that locally-downloaded `raft_dask` wheel and backtracked (i.e. downloaded older and older wheels from https://pypi.anaconda.org/rapidsai-wheels-nightly/simple/) until it found one that wasn't problematic. This PR ensures that won't happen by telling `pip` to install **exactly that locally-downloaded file**, like this ```shell pip install "$(echo ./dist/raft_dask_cu12*.whl)[test]" ``` If that file is uninstallable, `pip install` fails and you find out via a CI failure. ### How I tested this Initially pushed a commit with just the changes to the test script. Saw the `wheel-tests-raft-dask` CI jobs fail in the expected way, instead of silently falling back to an older wheel and passing 🎉 . ```text ERROR: Could not find a version that satisfies the requirement ucx-py-cu12==0.39.* (from raft-dask-cu12[test]) (from versions: 0.32.0, 0.33.0, 0.34.0, 0.35.0, 0.36.0, 0.37.0, 0.38.0a4, 0.38.0a5, 0.38.0a6, 0.39.0a0) ERROR: No matching distribution found for ucx-py-cu12==0.39.* ``` ([build link](https://github.com/rapidsai/raft/actions/runs/9323598882/job/25668146747?pr=2349)) Authors: - James Lamb (https://github.com/jameslamb) Approvers: - Bradley Dice (https://github.com/bdice) - Dante Gama Dessavre (https://github.com/dantegd) URL: rapidsai#2349

…onstants (rapidsai#2344) Contributes to rapidsai/build-planning#31 Follow-up to rapidsai#2331 * ensures that `update-version.sh` does not remove alpha specs like `,>=0.0.0a0` in `pyproject.toml` and conda environment files * consolidates `rapids-build-backend` versions in `dependencies.yaml` - *since I was pushing a new commit here anyway, figured I'd take the opportunity to include that simplification recommended in rapidsai/cudf#15245 (comment) * adds tests that `__git_commit__` and `__version__` constants are present and that `__version__` is populated Authors: - James Lamb (https://github.com/jameslamb) Approvers: - Bradley Dice (https://github.com/bdice) - Corey J. Nolet (https://github.com/cjnolet) URL: rapidsai#2344

pip install uses --config-settings as its argument. Authors: - Kyle Edwards (https://github.com/KyleFromNVIDIA) - James Lamb (https://github.com/jameslamb) Approvers: - James Lamb (https://github.com/jameslamb) URL: rapidsai#2342

During reduction in device code (reduction.cuh), the value assigned in the residual threads during last stage are zero initilized. However, if we want to reduce some custom type, it might not have the appropriate constructor. Thus, this PR makes the change so that we call the default constructor for the residual values. Authors: - Akif ÇÖRDÜK (https://github.com/akifcorduk) Approvers: - Dante Gama Dessavre (https://github.com/dantegd) - Tamas Bela Feher (https://github.com/tfeher) URL: rapidsai#2351

Forward-merge branch-24.06 into branch-24.08

This PR removes text builds of the documentation, which we do not currently use for anything. Contributes to rapidsai/build-planning#71. Authors: - Vyas Ramasubramani (https://github.com/vyasr) Approvers: - Ben Frederickson (https://github.com/benfred) - James Lamb (https://github.com/jameslamb) URL: rapidsai#2354

Recently devcontainer names were updated to include the current user's name. However, in GitHub Codespaces, the username is not defined. As a result, the container name starts with a dash. This is not allowed by GitHub Codespaces, so it fails to launch. This PR adds a default value of `anon` to the devcontainer username. See rapidsai/cudf#15784 for more information. Authors: - Bradley Dice (https://github.com/bdice) Approvers: - Paul Taylor (https://github.com/trxcllnt) - James Lamb (https://github.com/jameslamb) URL: rapidsai#2355

github-actions bot added the cpp label Dec 12, 2023

ChristinaZ marked this pull request as ready for review December 12, 2023 09:29

ChristinaZ requested a review from a team as a code owner December 12, 2023 09:29

ChristinaZ marked this pull request as draft December 12, 2023 09:40

cjnolet added documentation Improvements or additions to documentation non-breaking Non-breaking change labels Dec 13, 2023

cjnolet assigned ChristinaZ Dec 13, 2023

ChristinaZ marked this pull request as ready for review January 8, 2024 10:59

cjnolet added doc Documentation and removed documentation Improvements or additions to documentation labels Jan 10, 2024

ChristinaZ force-pushed the determinictic_version_of_AIR_TopK branch from 0a9a38a to 7501e0a Compare January 17, 2024 01:59

ChristinaZ requested review from a team as code owners January 17, 2024 01:59

github-actions bot added CMake python ci labels Jan 17, 2024

raydouglass and others added 9 commits January 18, 2024 14:51

DOC v24.04 Updates [skip ci]

a03f0af

Merge pull request rapidsai#2100 from rapidsai/branch-24.02

9165d89

Forward-merge branch-24.02 to branch-24.04

Merge pull request rapidsai#2101 from rapidsai/branch-24.02

c3b30f3

Forward-merge branch-24.02 to branch-24.04

Merge pull request rapidsai#2103 from rapidsai/branch-24.02

3790c2a

Forward-merge branch-24.02 to branch-24.04

Merge pull request rapidsai#2112 from rapidsai/branch-24.02

5464ae7

Forward-merge branch-24.02 to branch-24.04

Merge pull request rapidsai#2114 from rapidsai/branch-24.02

e9ba740

Forward-merge branch-24.02 to branch-24.04

Merge pull request rapidsai#2115 from rapidsai/branch-24.02

38d154e

Forward-merge branch-24.02 to branch-24.04

Merge pull request rapidsai#2116 from rapidsai/branch-24.02

c90cdfa

Forward-merge branch-24.02 to branch-24.04

Merge pull request rapidsai#2120 from rapidsai/branch-24.02

c9886d7

Forward-merge branch-24.02 to branch-24.04

achirkin and others added 2 commits May 21, 2024 15:05

Merge pull request rapidsai#2327 from rapidsai/branch-24.06

d684231

Forward-merge branch-24.06 into branch-24.08

aaronmondal and others added 6 commits May 21, 2024 19:37

Fix clang intrinsic warning (rapidsai#2292)

ac6be9e

Store operations are void. Authors: - Aaron Siddhartha Mondal (https://github.com/aaronmondal) - Corey J. Nolet (https://github.com/cjnolet) Approvers: - Corey J. Nolet (https://github.com/cjnolet) URL: rapidsai#2292

Merge pull request rapidsai#2328 from rapidsai/branch-24.06

e4417e0

Forward-merge branch-24.06 into branch-24.08

Merge pull request rapidsai#2329 from rapidsai/branch-24.06

9e0d73f

Forward-merge branch-24.06 into branch-24.08

docs: update README.md (rapidsai#2308)

0b6f542

jupyer -> jupyter Authors: - Ikko Eltociear Ashimine (https://github.com/eltociear) - Corey J. Nolet (https://github.com/cjnolet) Approvers: - Corey J. Nolet (https://github.com/cjnolet) URL: rapidsai#2308

Merge pull request rapidsai#2330 from rapidsai/branch-24.06

56c4de9

Forward-merge branch-24.06 into branch-24.08

divyegala and others added 18 commits May 23, 2024 18:31

InnerProduct testing for CAGRA+HNSW (rapidsai#2297)

64827fc

Authors: - Divye Gala (https://github.com/divyegala) - Corey J. Nolet (https://github.com/cjnolet) Approvers: - Corey J. Nolet (https://github.com/cjnolet) URL: rapidsai#2297

Merge pull request rapidsai#2334 from rapidsai/branch-24.06

a584448

Forward-merge branch-24.06 into branch-24.08

Merge pull request rapidsai#2337 from rapidsai/branch-24.06

66252a3

Forward-merge branch-24.06 into branch-24.08

Merge pull request rapidsai#2339 from rapidsai/branch-24.06

403e440

Forward-merge branch-24.06 into branch-24.08

Merge pull request rapidsai#2340 from rapidsai/branch-24.06

642f475

Forward-merge branch-24.06 into branch-24.08

Use rapids-build-backend (rapidsai#2331)

b63caf8

Contributes to rapidsai/build-planning#31 Authors: - Kyle Edwards (https://github.com/KyleFromNVIDIA) Approvers: - Vyas Ramasubramani (https://github.com/vyasr) - Corey J. Nolet (https://github.com/cjnolet) URL: rapidsai#2331

Update Changelog [skip ci]

02e4504

Merge pull request rapidsai#2352 from rapidsai/branch-24.06

8fbce93

Forward-merge branch-24.06 into branch-24.08

ChristinaZ force-pushed the determinictic_version_of_AIR_TopK branch from 7501e0a to 0e68cfa Compare June 13, 2024 06:59

ChristinaZ closed this Jun 13, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add the deterministic version of AIR Top-K ( the radix-based topk in … #2057

Add the deterministic version of AIR Top-K ( the radix-based topk in … #2057

ChristinaZ commented Dec 12, 2023

copy-pr-bot bot commented Dec 12, 2023

cjnolet commented Jan 10, 2024

review-notebook-app bot commented Jan 17, 2024

ChristinaZ commented Jan 17, 2024

cjnolet commented May 21, 2024

ChristinaZ commented May 23, 2024

Add the deterministic version of AIR Top-K ( the radix-based topk in … #2057

Add the deterministic version of AIR Top-K ( the radix-based topk in … #2057

Conversation

ChristinaZ commented Dec 12, 2023

copy-pr-bot bot commented Dec 12, 2023

cjnolet commented Jan 10, 2024

review-notebook-app bot commented Jan 17, 2024

ChristinaZ commented Jan 17, 2024

cjnolet commented May 21, 2024

ChristinaZ commented May 23, 2024