Forward-merge branch-25.02 into branch-25.04 #614

Merged: 18 commits, Jan 31, 2025

@rapids-bot rapids-bot bot commented Jan 24, 2025

Forward-merge triggered by push to branch-25.02 that creates a PR to keep branch-25.04 up-to-date. If this PR is unable to be immediately merged due to conflicts, it will remain open for the team to manually merge. See forward-merger docs for more info.

…604)

Contributes to rapidsai/build-planning#138

Updates the pip devcontainers in this repository to use UCX 1.18.

Also fixes some small `update-version.sh` issues, and updates references that were outdated as a result of those issues.

Authors:
  - James Lamb (https://github.com/jameslamb)

Approvers:
  - Gil Forsyth (https://github.com/gforsyth)
  - https://github.com/jakirkham
  - Corey J. Nolet (https://github.com/cjnolet)

URL: #604
@rapids-bot rapids-bot bot requested review from a team as code owners January 24, 2025 19:13
@rapids-bot rapids-bot bot requested a review from bdice January 24, 2025 19:13
rapids-bot bot commented Jan 24, 2025

FAILURE - Unable to forward-merge due to an error; manual merge is necessary. Do not use the Resolve conflicts option in this PR. Instead, follow these instructions: https://docs.rapids.ai/maintainers/forward-merger/

IMPORTANT: When merging this PR, do not use the auto-merger (i.e. the /merge comment). Instead, an admin must manually merge by changing the merging strategy to Create a Merge Commit. Otherwise, history will be lost and the branches become incompatible.

jameslamb and others added 2 commits January 24, 2025 23:31
`cuvs-cu{11,12}` wheels don't currently have a runtime dependency on `libcuvs-cu{11,12}`. They need one, for library-loading:

https://github.com/rapidsai/cuvs/blob/e9983e17408e6bec6f2558f9df49be97a7255417/python/cuvs/cuvs/__init__.py#L19-L25

This was missed in #594. This PR adds it.

## Notes for Reviewers

Adding for searchability... this bug can result in issues like this at runtime when using `cuvs` installed from wheels:

> ImportError: libcuvs_c.so: cannot open shared object file: No such file or directory

Authors:
  - James Lamb (https://github.com/jameslamb)

Approvers:
  - Vyas Ramasubramani (https://github.com/vyasr)

URL: #615
Adds the CAGRA filtering feature to the C API, using a DLPack tensor as the blocklist.
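
As a rough illustration of blocklist filtering in general, the sketch below packs blocked row indices into 32-bit words and skips those rows during a brute-force top-k selection. It does not use the cuVS C API or DLPack. The function names, the word size, and the convention that a set bit means "blocked" are all assumptions made for this example:

```python
def pack_blocklist(blocked_rows, n_rows):
    """Pack row indices into 32-bit words; a set bit marks a blocked row."""
    words = [0] * ((n_rows + 31) // 32)
    for row in blocked_rows:
        words[row // 32] |= 1 << (row % 32)
    return words


def is_blocked(words, row):
    """Check the bit that corresponds to `row`."""
    return bool((words[row // 32] >> (row % 32)) & 1)


def filtered_top_k(distances, words, k):
    """Return the k closest row indices that are not blocked."""
    allowed = [i for i in range(len(distances)) if not is_blocked(words, i)]
    return sorted(allowed, key=lambda i: distances[i])[:k]


distances = [0.9, 0.1, 0.4, 0.2, 0.7]
blocklist = pack_blocklist([1, 3], n_rows=len(distances))
result = filtered_top_k(distances, blocklist, k=2)
```

Rows 1 and 3 are the two nearest candidates here, but both are blocked, so the selection falls through to the next closest rows.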

Authors:
  - Ajit Mistry (https://github.com/ajit283)
  - Ben Frederickson (https://github.com/benfred)
  - Micka (https://github.com/lowener)
  - Corey J. Nolet (https://github.com/cjnolet)

Approvers:
  - Micka (https://github.com/lowener)
  - Ben Frederickson (https://github.com/benfred)
  - Corey J. Nolet (https://github.com/cjnolet)

URL: #452
@rapids-bot rapids-bot bot requested review from a team as code owners January 25, 2025 01:43
@github-actions github-actions bot added the cpp label Jan 25, 2025
This PR applies `pre-commit` hooks to normalize whitespace (trimming trailing whitespace and enforcing consistent end-of-file newlines).

These rules are already applied to most other RAPIDS repos, so this PR aligns with the norm in RAPIDS.
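
For readers unfamiliar with these rules, the following is a small sketch of the normalization they describe, written as a plain function rather than as the hooks themselves. The function name is hypothetical, and the real hooks may differ in details such as which line endings they handle:

```python
def normalize_whitespace(text: str) -> str:
    """Trim trailing whitespace on each line and end with exactly one newline."""
    lines = [line.rstrip() for line in text.splitlines()]
    # Drop blank lines at the end so the file ends with a single newline.
    while lines and lines[-1] == "":
        lines.pop()
    return "\n".join(lines) + "\n" if lines else ""


messy = "a  \nb\t\n\n\n"
clean = normalize_whitespace(messy)
```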

Authors:
  - Bradley Dice (https://github.com/bdice)
  - Corey J. Nolet (https://github.com/cjnolet)

Approvers:
  - James Lamb (https://github.com/jameslamb)
  - Corey J. Nolet (https://github.com/cjnolet)

URL: #593
@rapids-bot rapids-bot bot requested review from a team as code owners January 25, 2025 04:27
@github-actions github-actions bot added the CMake label Jan 25, 2025
benfred and others added 13 commits January 25, 2025 08:22
Currently, running `NEIGHBORS_ANN_CAGRA_TEST` takes:
- [0.96 hours on CUDA 11.8, V100 (x86)](https://github.com/rapidsai/cuvs/actions/runs/12913409417/job/36012418022?pr=596#step:8:1718)
- [1.59 hours on CUDA 12.5, V100 (x86)](https://github.com/rapidsai/cuvs/actions/runs/12913409417/job/36012418329?pr=596#step:8:492)
- [0.28 hours on CUDA 12.0, A100 (ARM)](https://github.com/rapidsai/cuvs/actions/runs/12913409417/job/36012418741?pr=596#step:8:1729)

Individual tests should be able to complete in less than an hour. Ideally, less than 10 minutes.

This PR proposes some changes to CAGRA tests:
- Each CAGRA type is now its own test executable (e.g. `NEIGHBORS_ANN_CAGRA_FLOAT_UINT32_TEST`)
- Some parameter combinations were trimmed by ~50%

Authors:
  - Bradley Dice (https://github.com/bdice)
  - Tamas Bela Feher (https://github.com/tfeher)
  - Artem M. Chirkin (https://github.com/achirkin)

Approvers:
  - Artem M. Chirkin (https://github.com/achirkin)
  - Divye Gala (https://github.com/divyegala)

URL: #602
Renames `test` directories to `tests` for alignment with the rest of RAPIDS.

Closes rapidsai/build-planning#140.

Authors:
  - Bradley Dice (https://github.com/bdice)

Approvers:
  - Ray Douglass (https://github.com/raydouglass)
  - Divye Gala (https://github.com/divyegala)
  - Corey J. Nolet (https://github.com/cjnolet)

URL: #590
Mostly adapted from rapidsai/raft#2026

Authors:
  - Tarang Jain (https://github.com/tarang-jain)

Approvers:
  - Corey J. Nolet (https://github.com/cjnolet)
  - Artem M. Chirkin (https://github.com/achirkin)

URL: #561
This PR uses CUDA 12.8.0 to build and test.

xref: rapidsai/build-planning#139

Authors:
  - Bradley Dice (https://github.com/bdice)
  - Vyas Ramasubramani (https://github.com/vyasr)

Approvers:
  - James Lamb (https://github.com/jameslamb)
  - Ben Frederickson (https://github.com/benfred)
  - Corey J. Nolet (https://github.com/cjnolet)
  - Vyas Ramasubramani (https://github.com/vyasr)

URL: #621
It has been reported that when the number of search results is large, for example 100, using the multi-CTA algorithm can cause a decrease in recall. This PR is intended to alleviate this low recall issue.

close #208

Authors:
  - Akira Naruse (https://github.com/anaruse)
  - Tamas Bela Feher (https://github.com/tfeher)
  - Artem M. Chirkin (https://github.com/achirkin)
  - Corey J. Nolet (https://github.com/cjnolet)

Approvers:
  - Tamas Bela Feher (https://github.com/tfeher)
  - tsuki (https://github.com/enp1s0)
  - Artem M. Chirkin (https://github.com/achirkin)

URL: #492
After calling `build()`, the CAGRA index ideally contains both the dataset and the graph. When there is not enough device memory, however, only the graph is returned. In that case the dataset must be passed explicitly to the serialization routines.

When serializing in HNSW format with a flat hierarchy, the dataset was not passed. This PR fixes the problem by adding an optional `dataset` argument to `cagra::serialize_to_hnswlib`.

Furthermore, to improve execution time, we now write a whole row of the graph and dataset at a time instead of a single element.

Additionally, debug messages for tracking data saving time are added.
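
To illustrate the row-at-a-time change, here is a small sketch that compares element-wise and row-wise writes to an in-memory stream. It is not the cuVS serialization code. The helper names, the `CountingStream` class, and the little-endian `float32` layout are assumptions chosen for the example:

```python
import io
import struct


class CountingStream(io.BytesIO):
    """In-memory stream that counts how many write calls it receives."""

    def __init__(self):
        super().__init__()
        self.write_calls = 0

    def write(self, data):
        self.write_calls += 1
        return super().write(data)


def write_elementwise(stream, rows):
    # One write call per value.
    for row in rows:
        for value in row:
            stream.write(struct.pack("<f", value))


def write_rowwise(stream, rows):
    # One write call per row: the whole row is packed in a single buffer.
    for row in rows:
        stream.write(struct.pack(f"<{len(row)}f", *row))


dataset = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
slow, fast = CountingStream(), CountingStream()
write_elementwise(slow, dataset)
write_rowwise(fast, dataset)
```

Both streams end up with identical bytes, but the row-wise variant issues one write per row rather than one per element, which is where the time saving would come from.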

Authors:
  - Tamas Bela Feher (https://github.com/tfeher)
  - Corey J. Nolet (https://github.com/cjnolet)

Approvers:
  - Divye Gala (https://github.com/divyegala)

URL: #591
A Java API for cuVS for easy integration into Apache Lucene or other Java based projects.

Try:
```
./build.sh libcuvs
./build.sh java
```

To generate docs, run `mvn javadoc:javadoc`.

Prerequisites:
* JDK 22
* Maven 3.9.6+

Todo:
* Generate Project Panama classes using jextract on every build
* Algorithms other than CAGRA
* Prefiltering in CAGRA

Authors:
  - Ishan Chattopadhyaya (https://github.com/chatman)
  - Vivek Narang (https://github.com/narangvivek10)
  - Chris Hegarty (https://github.com/ChrisHegarty)

Approvers:
  - Corey J. Nolet (https://github.com/cjnolet)
  - Mike Sarahan (https://github.com/msarahan)

URL: #450
#620)

hnswlib uses an internal indexing system that atomically assigns an ID to each point, in the order in which points are added to the index. When points are added in parallel, that order is not deterministic, so a point's internal ID may differ from its "label" (for us, the label is just the index of the row in the dataset).

The bug was that I was using the label itself to write out the CPU hierarchy, when I should have been using hnswlib's internal ID for the point associated with that label.
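
The mismatch can be modeled in a few lines of plain Python. This sketch does not call hnswlib. The insertion order, the names, and the string payloads are made up to show how a lookup by label differs from a lookup by internal ID:

```python
# Labels in the order a parallel build happened to add them (assumed order).
insert_order = [2, 0, 1]

# Internal IDs are handed out in insertion order: internal 0 holds label 2, etc.
label_to_internal = {label: internal for internal, label in enumerate(insert_order)}

# Per-point payload stored by internal ID; each payload names its own label.
payload_by_internal = [f"links-of-label-{label}" for label in insert_order]


def lookup_buggy(label):
    # Treats the label as if it were the internal ID.
    return payload_by_internal[label]


def lookup_fixed(label):
    # Translates the label to the internal ID first.
    return payload_by_internal[label_to_internal[label]]


buggy = [lookup_buggy(label) for label in range(3)]
fixed = [lookup_fixed(label) for label in range(3)]
```

With sequential insertion the two lookups agree, which is why this kind of bug only appears once points are added in parallel.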

Authors:
  - Divye Gala (https://github.com/divyegala)
  - Corey J. Nolet (https://github.com/cjnolet)

Approvers:
  - Corey J. Nolet (https://github.com/cjnolet)

URL: #620
Includes several fixes and improvements to Vamana, primarily:
- Edge case and bug fixes for Vamana index build (details below)
- Documentation added for Vamana
- experimental namespace removed
- Reduce device memory usage by splitting reverse edge work into batches

The edge case fix adds padding to all shared memory size and offset calculations so any dataset dimension is supported (tests added that verify this). A bug was also fixed with the L2 distance metric causing incorrect results in some rare cases. 
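
The padding and batching ideas can be sketched generically as follows. This is not the Vamana kernel code. The 16-byte alignment, the buffer sizes, and the function names are assumptions for illustration only:

```python
def align_up(n_bytes, alignment):
    """Round n_bytes up to the next multiple of alignment."""
    return (n_bytes + alignment - 1) // alignment * alignment


def padded_offsets(sizes, alignment=16):
    """Lay buffers out back to back, padding each one to the alignment."""
    offsets, cursor = [], 0
    for size in sizes:
        offsets.append(cursor)
        cursor += align_up(size, alignment)
    return offsets, cursor


def batches(n_items, batch_size):
    """Yield [start, end) ranges so work is processed in bounded chunks."""
    for start in range(0, n_items, batch_size):
        yield start, min(start + batch_size, n_items)


# A 3-dimensional float32 vector needs 12 bytes, which is not 16-byte aligned.
offsets, total = padded_offsets([12, 20, 7])
chunks = list(batches(10, 4))
```

Padding every size before computing the next offset keeps each buffer aligned regardless of the dataset dimension, and batching bounds the scratch memory needed at any one time.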

This PR addresses the most pressing items in #393 and stabilizes the index construction sufficiently to remove the experimental namespace.

Authors:
  - Ben Karsin (https://github.com/bkarsin)
  - Corey J. Nolet (https://github.com/cjnolet)

Approvers:
  - Tamas Bela Feher (https://github.com/tfeher)

URL: #558
This PR adds a dedicated documentation page for filtering in the `Getting started` tab, and adds the `cuvs::neighbors::filtering` namespace to the C++ documentation.

Authors:
  - Micka (https://github.com/lowener)

Approvers:
  - Corey J. Nolet (https://github.com/cjnolet)

URL: #568
Adds functionality to the C API for adding additional vectors after build.

Authors:
  - Ajit Mistry (https://github.com/ajit283)
  - Corey J. Nolet (https://github.com/cjnolet)
  - Ben Frederickson (https://github.com/benfred)
  - Micka (https://github.com/lowener)

Approvers:
  - Ben Frederickson (https://github.com/benfred)

URL: #276
This PR points the shared workflow branches back to the default 25.02 branches.

xref: rapidsai/build-planning#139
@AyodeAwe AyodeAwe merged commit bb2dd17 into branch-25.04 Jan 31, 2025
45 checks passed