-
Notifications
You must be signed in to change notification settings - Fork 38
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix overflows up to at least tco1279 #162
Conversation
dd1e3bb
to
8622da1
Compare
@samhatfield This is the second last PR from the original GPU branch that is missing and I propose for integration. This should be much more straight forward than the previous one. I am using this for a while now without issues, and it is very useful because the overflows are very hard to find, and it is nice to be able experimenting on a single node/GPU. |
Noted, thanks Lukas. I'll try to find time to look at this in the next week. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good, just some minor comments.
A common pattern in this PR is something like
2_JPIB*D%NLENGT0B*KF_LEG*C_SIZEOF(PFBUF(1))
which involves a mixture of integer types. I don't usually do this in Fortran so I'm not sure what the rules are for casting in these cases. Can someone reassure me that it's all fine? Here we have JPIB * JPIM * C_SIZE_T
.
This is one more change that Olivier ported from my branch to his branch, and now I am taking it to develop.
The upper end of the sizes that this allows is certainly more "theoretical" because we can't run simulations with it, but even for real use-cases, we might see overflows without this, and those can be very difficult to spot. With one rank, we can grow to very large domains, with multiple ranks, at some point we run into overflows of the MPI buffers, those are properly diagnosed.
This will conflict with #161 , but should be trivial to resolve.