Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(qp): introduce ModifyQueuePairError and PostSendError #34

Merged
merged 1 commit into from
Oct 23, 2024

Conversation

dragonJACson
Copy link
Contributor

@dragonJACson dragonJACson commented Oct 3, 2024

RDMA C functions basically use return value (when it could be used) for returning an errno, or when sometimes the function should return a pointer but return a NULL, user should read errno themselves.

Most of the time, the errno actually suggests an error from kernel, users really need a lot of experience on RDMA to understand what's going wrong by just read a simple 8bit error number.

RDMAmojo and rdma-core man pages provides some hints on errnos, while that need users to search and grab. Rust provides a powerful error handling system, why not just integrate the hints with errno together, plus some context to build our error types? I provide a trial implementation in this commit.

Sometimes the complicated error types introduce performance penalty, so I make it optional.

Related issue: #3

Copy link

codecov bot commented Oct 3, 2024

Codecov Report

Attention: Patch coverage is 7.35294% with 63 lines in your changes missing coverage. Please review.

Files with missing lines Patch % Lines
src/verbs/queue_pair.rs 7.35% 63 Missing ⚠️
Files with missing lines Coverage Δ
src/verbs/queue_pair.rs 71.49% <7.35%> (-6.31%) ⬇️

@dragonJACson dragonJACson requested a review from FujiZ October 3, 2024 13:53
@dragonJACson
Copy link
Contributor Author

Some examples:

$ ./target/debug/examples/rc_pingpong -d mlx5_0 -n 10 -s 2400 -m 256 -g 1 127.0.0.1
 local address: QPN 0x0326, PSN 0x8088ba, GID fe80:0000:0000:0000:e46d:faff:fefa:6bcb
remote address: QPN 0x0325, PSN 0xf49e1a, GID fe80:0000:0000:0000:e46d:faff:fefa:6bcb
Error: network unreachable, source gid index: 1, destination gid: fe80:0000:0000:0000:e46d:faff:fefa:6bcb

Caused by:
    No route to host (os error 113)
$ ./target/debug/examples/rc_pingpong -n 10 -s 2400 -m 256 -d mlx5_0 127.0.0.1 -g 1
Error: invalid transition from Reset to Init, possible invalid masks QueuePairAttributeMask[], possible needed masks QueuePairAttributeMask[AccessFlags, Port]

Caused by:
    Invalid argument (os error 22)

cc @FujiZ

@dragonJACson dragonJACson marked this pull request as draft October 3, 2024 13:55
@dragonJACson dragonJACson self-assigned this Oct 3, 2024
@dragonJACson dragonJACson force-pushed the dev/rc_pingpong branch 2 times, most recently from 82b75ca to 7363f1f Compare October 6, 2024 03:34
@dragonJACson dragonJACson force-pushed the dev/rc_pingpong branch 7 times, most recently from 214d994 to 23df7d8 Compare October 12, 2024 15:23
Base automatically changed from dev/rc_pingpong to main October 13, 2024 09:37
@dragonJACson dragonJACson force-pushed the dev/error-handling branch 2 times, most recently from 55beae3 to cbdd83e Compare October 13, 2024 16:29
@dragonJACson dragonJACson marked this pull request as ready for review October 22, 2024 14:34
@dragonJACson dragonJACson force-pushed the dev/error-handling branch 2 times, most recently from 7c9c8da to 1d93c3e Compare October 22, 2024 16:34
src/verbs/queue_pair.rs Outdated Show resolved Hide resolved
RDMA C functions basically use return value (when it could be used)
for returning an errno, or when sometimes the function should return
a pointer but return a NULL, user should read errno themselves.

Most of the time, the errno actually suggests an error from kernel,
users really need a lot of experience on RDMA to understand what's
going wrong by just read a simple 8bit error number.

RDMAmojo and rdma-core man pages provides some hints on errnos, while
that need users to search and grab. Rust provides a powerful error
handling system, why not just integrate the hints with errno together,
plus some context to build our error types?

I provide a trial implementation in this commit. Sometimes the complicated
error types introduce performance penalty, so I make it optional.

Signed-off-by: Luke Yue <[email protected]>
@dragonJACson dragonJACson merged commit f90a5ba into main Oct 23, 2024
4 of 5 checks passed
@dragonJACson dragonJACson deleted the dev/error-handling branch October 23, 2024 16:41
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants