-
Notifications
You must be signed in to change notification settings - Fork 129
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add MSCCLPP user buffer registration APIs and integrate with RCCL #1477
Conversation
ext-src/mem-reg.patch
Outdated
+} | ||
+ | ||
+NCCL_API ncclResult_t ncclCommDeregister(ncclComm_t comm, void* handle) { | ||
+ if (comm && handle) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Shouldn't we free any resources associated with handle
here?
Example:
auto it = comm->channelScratchInfos.find(buffKey);
if (it != comm->channelScratchInfos.end()) {
comm->channelScratchInfos.erase(it);
}
Same thing can be done for channelOut and channelIn
ext-src/mem-reg.patch
Outdated
+NCCL_API ncclResult_t ncclCommDeregister(ncclComm_t comm, void* handle) { | ||
+ if (comm && handle) { | ||
+ handle = nullptr; | ||
+ } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The implementation nullifies the handle
inside the function, but since it's passed as void*, the change doesn't reflect in the caller's context. This can lead to dangling pointers. I suggest passing handle as a void** to resolve this issue, if API change is possible
+NCCL_API ncclResult_t ncclCommDeregister(ncclComm_t comm, void* handle) { | |
+ if (comm && handle) { | |
+ handle = nullptr; | |
+ } | |
+NCCL_API ncclResult_t ncclCommDeregister(ncclComm_t comm, void** handle) { | |
+ if (comm && handle && *handle) { | |
+ *handle = nullptr; | |
+ } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a NCCL API and cannot be changed. I am still working on some of these fixes.
+ std::vector<mscclpp::RegisteredMemory> remoteMemories; | ||
+ | ||
+ // Creating the channels | ||
+ auto buffIt = comm->channelScratchInfos.find(buffKey); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If there's no guarantee this will be called from a single thread, we probably need to lock access to channelScratchInfos and the rest of the maps. I suggest using one lock if this function doesn't fall on the critical path.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This map is supposed to be checked by the host process and MSCCLPP is invoked only when using one GPU per process case.
ext-src/mem-reg.patch
Outdated
/* | ||
* Send | ||
* | ||
diff --git a/apps/nccl/src/nccl.cu b/apps/nccl/src/nccl.cu | ||
index a697be2..d8497e7 100644 | ||
--- a/apps/nccl/src/nccl.cu | ||
+++ b/apps/nccl/src/nccl.cu | ||
@@ -577,6 +577,58 @@ NCCL_API ncclResult_t ncclAllGather(const void* sendbuff, void* recvbuff, size_t | ||
@@ -577,6 +577,67 @@ NCCL_API ncclResult_t ncclAllGather(const void* sendbuff, void* recvbuff, size_t |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This change does not seem to be necessary
src/register.cc
Outdated
cache->population -= 1; | ||
|
||
#ifdef ENABLE_MSCCLPP | ||
if (comm->mscclCompatible){ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Here you should do the same checks that you did before registering the buffer.
if (comm->mscclCompatible && size > 0 && (size & 31) == 0 && size <= comm->mscclpp_threshold){
ext-src/mem-reg.patch
Outdated
+ncclResult_t ncclCommRegister(ncclComm_t comm, void* buff, size_t size, void** handle); | ||
+ncclResult_t ncclCommDeregister(ncclComm_t comm, void* handle); | ||
+ncclResult_t ncclBuffIsRegistered(ncclComm_t comm, const void* buff, size_t count, bool* registered); | ||
+size_t ncclBufferSize(ncclComm_t comm, void* handle); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since you're inserting functions into mscclpp, you can just give them names such as mscclppCommRegister
, and then there's no need to rename via mscclpp_nccl_syms.txt.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ncclCommRegister()
and ncclCommDeRegister()
are NCCL APIs. We need to keep these names so that a workload can also be run with the mscclpp backend.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am curious why ncclCommRegister
and ncclCommDeRegister
are not implement in mscclpp? The stub functions already exist: https://github.com/microsoft/mscclpp/blob/6d26b92665383b60ffae7e4dd9ab4cc047d34721/apps/nccl/src/nccl.cu#L809C1-L817C2
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@nusislam Should these changes be submitted as a mscclpp PR?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Eventually, yes. But at the moment, will use this patch.
@corey-derochie-amd - Any more reviews? |
There is a typo I need to fix. Will push commit shortly |
src/register.cc
Outdated
else{ | ||
WARN("MSCCL++: Cannot register user-buffers on managed memory"); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This will do nothing, issue a warning, then fall through and return ncclSuccess
. Is that the right behaviour? Should it not return an error?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the expectation is that if no registration happens in MSCCL++ due to the buffer being in managed memory, no MSCCL++ or RCCL registration will happen and execution will fall back to RCCL kernel without UBR.
I actually need to do a small adjustment to not cause a RCCL deregistration error when MSCCL++ detects no registered buffer because of this. I just tested a simple fix for this.
But since we always fallback to RCCL kernel when MSCCL++ is not enabled, I think a warning without return an error is OK. Also, I haven't seen any errors returned in any UBR related routines, except when RCCL cannot find the registration handle during deregistration. Of course, this definitely makes it harder to see if anything goes wrong.
… back to rccl deRegister which will return an error
@isaki001 - I think the helper routines do not have to be named as NCCL APIs; the return type can also be changed. Other than this, the PR looks good to me. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Left some suggestions.
Thanks @corey-derochie-amd I'll add your suggestions and I'll change the names as @nusislam indicated. |
Whitespace suggestions and reducing diffs to avoid future merge conflicts Co-authored-by: corey-derochie-amd <[email protected]>
…stration with a buffer in managed memory
@@ -165,11 +165,11 @@ ncclResult_t ncclCommRegister_impl(const ncclComm_t comm, void* buff, size_t siz | |||
if(!isManagedBuffer){ | |||
INFO(NCCL_INIT, "MSCCL++: ncclCommRegister"); | |||
NCCLCHECK(mscclpp_ncclCommRegister(comm->mscclpp_comm, buff, size, handle)); | |||
return ncclSuccess; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm debating whether this is the right behaviour... Are these buffers reused in ways that might call MSCCL++ sometimes and RCCL other times? Are buffers always matched to message size, or could they use a buffer that's larger than the given message size? If so, the safest thing to do might be to register for BOTH RCCL and MSCCL++. Thoughts, @nusislam ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The cases when mscclppRegister
should be used vs. rcclRegister
should be used are mutually exclusive.
Details
This PR adds user buffer registration APIs in MSCCLPP and integrates them with the corresponding RCCL APIs.
Work item: "Internal", or link to GitHub issue (if applicable).
What were the changes?
A patch has been added in ext-src. This patch will be applied to MSCCLPP during RCCL build. Also MSCCLPP related changes have been added to the RCCL user buffer registration code.
Why were the changes made?
These changes are required to use MSCCLPP in non-HIP-graph mode.
How was the outcome achieved?
Technical details behind the work. Explain any publicly-available hardware peculiarities.
Additional Documentation:
What else should the reviewer know?
Approval Checklist
Do not approve until these items are satisfied.