Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fixed remaining issues for Windows CUDA support. #15565

Closed

Conversation

eaplatanios
Copy link
Contributor

These are changes I included late yesterday in #15444 but I believe an earlier copy of that branch was picked up when merging so they were left out. cc @ddunl

@@ -22,331 +22,961 @@ namespace gpu {
// The data below is obtained with
// xla/service/gpu/model:hlo_op_profiler_run

constexpr char kDeviceHloOpProfiles[] = R"pb(
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

They are meant to address this error in MSVC:

error C2026: string too big, trailing characters truncated

clang-format made the diff massive but it's just breaking it down to a few concatenated strings as opposed to a single very long one.

@@ -77,8 +77,7 @@ class EraseDeadFunctionsPass

} // namespace

std::unique_ptr<mlir::OperationPass<mlir::ModuleOp>>
CreateEraseDeadFunctionsPass() {
std::unique_ptr<mlir::Pass> CreateEraseDeadFunctionsPass() {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The header uses mlir::Pass and MSVC appears to be sensitive to that resulting in mismatching symbol names that fail to link.

@@ -809,10 +809,6 @@ absl::StatusOr<std::unique_ptr<GpuEvent>> GpuExecutor::CreateGpuEvent(
return std::move(gpu_event);
}

absl::StatusOr<std::unique_ptr<Event>> GpuExecutor::CreateEvent() {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My understanding of what was going on is that GPUExecutor is not an abstract class as it's sometimes instantiated in the codebase directly. This means that it needs to implement CreateEvent which is a pure virtual function. Since both the CUDA and the ROCm executors used the same implementation, I moved it to GPUExecutor this resolved the linking error. Without this change, cuda_executor.obj was ending up with a symbol named CreateEventA and the linker was failing for places in the code that needed to link to CreateEvent. I'm not quite sure how that A ends up in there and what the name mangling scheme is since I'm not familiar with C++, but that's my best attempt at an explanation of what was going on.

@ddunl ddunl added the kokoro:force-run Forces CI to rerun label Jul 31, 2024
@kokoro-team kokoro-team removed the kokoro:force-run Forces CI to rerun label Jul 31, 2024
@ddunl
Copy link
Member

ddunl commented Jul 31, 2024

LGTM, just going to double check that CI agrees with these changes then I'll try to merge. Thanks!!

copybara-service bot pushed a commit to tensorflow/tensorflow that referenced this pull request Aug 1, 2024
Imported from GitHub PR openxla/xla#15565

These are changes I included late yesterday in #15444 but I believe an earlier copy of that branch was picked up when merging so they were left out. cc @ddunl
Copybara import of the project:

--
eacee95f41abc49a21516ee389861d84a40eca85 by eaplatanios <[email protected]>:

Fixed some issues around compiling on Windows.

--
b12e4cf0d23c2690111125a651e486ec6a112e54 by eaplatanios <[email protected]>:

.

--
e23ef176de72cf04555242174a19a407884f3f0e by eaplatanios <[email protected]>:

.

--
bdae19b9e15c396985703bb7e88a4db6fcddc7f6 by eaplatanios <[email protected]>:

.

--
2f90e6ba564f92fafa564b104ed0ce82b7642563 by eaplatanios <[email protected]>:

.

--
57009793b74c4d7d51fb39547a70a3ec142dadab by eaplatanios <[email protected]>:

.

--
a978b1f7f70d49f1426fe46b107fdcc3618e3085 by eaplatanios <[email protected]>:

.

--
d7fe81dc9cf909a6a8d70e2be8cfffca4063493e by eaplatanios <[email protected]>:

.

--
fc40d919619330bce596555613e425cb6267eea4 by eaplatanios <[email protected]>:

.

--
326aec3fd73a67ca3c667cfeb5c88a8ffa52eb3d by eaplatanios <[email protected]>:

.

--
a7603b7e1be990ff012440c74bd2c2ecbc2b1e2f by eaplatanios <[email protected]>:

.

--
edcc97a67016584c285d84ac732952c572283119 by eaplatanios <[email protected]>:

.

--
cec244808a8df163f9a803db450ca2bebdda9315 by eaplatanios <[email protected]>:

.

--
df3eb2215eea9076cb352378c5745e113df7cc7d by eaplatanios <[email protected]>:

.

--
8997345fd1e1aa6f55e445615460124c6e14417c by eaplatanios <[email protected]>:

.

--
219a9f1bff7fb12c3407ab2e47512560001900fe by eaplatanios <[email protected]>:

.

--
73f3cd7e0135ec05c97595f795ec318fb635bd32 by eaplatanios <[email protected]>:

.

--
d266259d4f467011bcd754bbdea14cc10723a01d by eaplatanios <[email protected]>:

.

--
d5c3cce7dec3d80e43462af8e6a5de5804341526 by eaplatanios <[email protected]>:

.

--
d5840041014526901ff36efc7bd61051bf92989c by eaplatanios <[email protected]>:

.

--
091793fc2e8b05d06ee5272a1287b285356921b6 by eaplatanios <[email protected]>:

.

--
60371493299b680b147798198fdf384eb86ee952 by eaplatanios <[email protected]>:

.

--
866e013f3c2417cd6f03a935cff92fe3866dfe74 by eaplatanios <[email protected]>:

.

--
e289dfffdbe7641c94360b566e16a4c66c6401c1 by eaplatanios <[email protected]>:

.

--
57c98d8f3a8dffc85f31e2c94e6d75a93d2e2b3b by eaplatanios <[email protected]>:

.

Merging this change closes #15565

FUTURE_COPYBARA_INTEGRATE_REVIEW=openxla/xla#15565 from eaplatanios:u/eaplatanios/cpp-17-fixes 7aad2e3ffbc94d46a29a3cf5e99e364a6de082ce
PiperOrigin-RevId: 658165354
@ddunl
Copy link
Member

ddunl commented Aug 5, 2024

Apologies for the delay in merging, can you rebase? Our internal CI was happy with everything but saw some files were out of date. Apologies again for the delay.

@NaiyerRizz NaiyerRizz self-assigned this Aug 6, 2024
copybara-service bot pushed a commit to tensorflow/tensorflow that referenced this pull request Aug 6, 2024
Imported from GitHub PR openxla/xla#15565

These are changes I included late yesterday in #15444 but I believe an earlier copy of that branch was picked up when merging so they were left out. cc @ddunl
Copybara import of the project:

--
eacee95f41abc49a21516ee389861d84a40eca85 by eaplatanios <[email protected]>:

Fixed some issues around compiling on Windows.

--
b12e4cf0d23c2690111125a651e486ec6a112e54 by eaplatanios <[email protected]>:

.

--
e23ef176de72cf04555242174a19a407884f3f0e by eaplatanios <[email protected]>:

.

--
bdae19b9e15c396985703bb7e88a4db6fcddc7f6 by eaplatanios <[email protected]>:

.

--
2f90e6ba564f92fafa564b104ed0ce82b7642563 by eaplatanios <[email protected]>:

.

--
57009793b74c4d7d51fb39547a70a3ec142dadab by eaplatanios <[email protected]>:

.

--
a978b1f7f70d49f1426fe46b107fdcc3618e3085 by eaplatanios <[email protected]>:

.

--
d7fe81dc9cf909a6a8d70e2be8cfffca4063493e by eaplatanios <[email protected]>:

.

--
fc40d919619330bce596555613e425cb6267eea4 by eaplatanios <[email protected]>:

.

--
326aec3fd73a67ca3c667cfeb5c88a8ffa52eb3d by eaplatanios <[email protected]>:

.

--
a7603b7e1be990ff012440c74bd2c2ecbc2b1e2f by eaplatanios <[email protected]>:

.

--
edcc97a67016584c285d84ac732952c572283119 by eaplatanios <[email protected]>:

.

--
cec244808a8df163f9a803db450ca2bebdda9315 by eaplatanios <[email protected]>:

.

--
df3eb2215eea9076cb352378c5745e113df7cc7d by eaplatanios <[email protected]>:

.

--
8997345fd1e1aa6f55e445615460124c6e14417c by eaplatanios <[email protected]>:

.

--
219a9f1bff7fb12c3407ab2e47512560001900fe by eaplatanios <[email protected]>:

.

--
73f3cd7e0135ec05c97595f795ec318fb635bd32 by eaplatanios <[email protected]>:

.

--
d266259d4f467011bcd754bbdea14cc10723a01d by eaplatanios <[email protected]>:

.

--
d5c3cce7dec3d80e43462af8e6a5de5804341526 by eaplatanios <[email protected]>:

.

--
d5840041014526901ff36efc7bd61051bf92989c by eaplatanios <[email protected]>:

.

--
091793fc2e8b05d06ee5272a1287b285356921b6 by eaplatanios <[email protected]>:

.

--
60371493299b680b147798198fdf384eb86ee952 by eaplatanios <[email protected]>:

.

--
866e013f3c2417cd6f03a935cff92fe3866dfe74 by eaplatanios <[email protected]>:

.

--
e289dfffdbe7641c94360b566e16a4c66c6401c1 by eaplatanios <[email protected]>:

.

--
57c98d8f3a8dffc85f31e2c94e6d75a93d2e2b3b by eaplatanios <[email protected]>:

.

Merging this change closes #15565

FUTURE_COPYBARA_INTEGRATE_REVIEW=openxla/xla#15565 from eaplatanios:u/eaplatanios/cpp-17-fixes 7aad2e3ffbc94d46a29a3cf5e99e364a6de082ce
PiperOrigin-RevId: 658165354
@eaplatanios
Copy link
Contributor Author

@ddunl unfortunately, as described here I decided to give up on this because it's much more work than I was anticipating and the workaround of using WSL is likely good enough for my current use case.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants