GCC 13.2.0 compilation of simple C code that includes offloading to NVIDIA GPU fails #184

Open
DemetriosG opened this issue Feb 28, 2024 · 2 comments

@DemetriosG

Binary Version: GCC 13.2.0 (with POSIX threads) + LLVM/Clang/LLD/LLDB 17.0.6 + MinGW-w64 11.0.1 (UCRT) - release 5 (LATEST) for win64

Compilation command:
gcc -lm -g -Wall -fopenmp -foffload=nvptx-none sum.c -o sum.exe

Compilation error:
lto-wrapper.exe: fatal error: could not find accel/nvptx-none/mkoffload in C:/tmp/mingw64/bin/../libexec/gcc/x86_64-w64-mingw32/13.2.0/;C:/tmp/mingw64/bin/../libexec/gcc/;C:/tmp/mingw64/bin/../lib/gcc/x86_64-w64-mingw32/13.2.0/../../../../x86_64-w64-mingw32/bin/ (consider using '-B')
compilation terminated.
C:/tmp/mingw64/bin/../lib/gcc/x86_64-w64-mingw32/13.2.0/../../../../x86_64-w64-mingw32/bin/ld.exe: error: lto-wrapper failed
collect2.exe: error: ld returned 1 exit status

C code:

#include <stdio.h>
#include <stdlib.h>
#include <time.h>
#include <omp.h>

int main(int argc, char *argv[]) {
	long long int j, largeN;
	double sum = 0.0;
	clock_t tstart, tend;
	int num_dev;

	num_dev = omp_get_num_devices();

	printf("%d\n", num_dev);

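	/* Guard against a missing argument before reading argv[1]. */
	if (argc < 2) {
		fprintf(stderr, "Usage: %s <N>\n", argv[0]);
		return 1;
	}
	largeN = atoll(argv[1]);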

	printf("Large Number = %lld\n", largeN);

	tstart = clock();
#pragma omp parallel for private(j) reduction(+:sum)
	for(j=1; j<largeN; j++) {
		sum += (double) (1.0/j);
	}
	tend = clock();
	printf("Sum [1, ... , %lld] = %.6f\n", largeN, sum);
	printf("\nCompleted in %.2f seconds%45s\n\n", ( (double) (tend - tstart) ) / CLOCKS_PER_SEC, "");

	sum = 0.0;
	tstart = clock();
#pragma omp target map(to:largeN) map(tofrom:sum)
#pragma omp parallel for simd private(j) reduction(+:sum)
	for (j=1; j<largeN; j++) {
		sum += (double) (1.0/j);
	}
	tend = clock();
	printf("Sum [1, ... , %lld] = %.6f\n", largeN, sum);
	printf("\nCompleted in %.2f seconds%45s\n\n", ( (double) (tend - tstart) ) / CLOCKS_PER_SEC, "");

	return(0);
}

A remedy would be greatly appreciated.
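
Separately from the build failure, the offloaded loop above combines `target` with `parallel for simd`, which runs within a single team on the device. A minimal sketch of the same sum using the combined `target teams distribute parallel for simd` construct follows; the function name `harmonic_sum_target` is hypothetical and not part of the original code. It assumes OpenMP 4.5 or later; without `-fopenmp` the pragma is ignored and the loop runs serially.

```c
/*
 * Sums 1/j for j = 1 .. n-1, the same series as the loops above.
 * With an offload-capable compiler the loop may run on a device;
 * otherwise OpenMP falls back to the host.
 * harmonic_sum_target is a hypothetical name used for illustration.
 */
double harmonic_sum_target(long long n)
{
	double sum = 0.0;

	/* "teams distribute" spreads the iterations over several teams,
	 * whereas "target" followed by "parallel for" uses one team only. */
#pragma omp target teams distribute parallel for simd map(tofrom:sum) reduction(+:sum)
	for (long long j = 1; j < n; j++) {
		sum += 1.0 / (double) j;
	}
	return sum;
}
```

Calling `harmonic_sum_target(largeN)` in place of the second loop should give the same result up to floating-point rounding, since the order of additions in a parallel reduction is not fixed.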

@brechtsanders
Owner

I have not been able to build GCC with nvptx offloading support since 10.3.0, which is why mkoffload for nvptx-none is missing from this release.

Unfortunately, I never found enough time to dig into why exactly it won't build.
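
Until offloading builds again, dropping `-foffload=nvptx-none` from the command line (GCC also documents `-foffload=disable`) should let the program link, with `target` regions falling back to the host. The sketch below shows one way a program could check at run time whether a `target` region really executes on a device. The helper name `offload_available` is hypothetical; `omp_get_num_devices` and `omp_is_initial_device` are standard OpenMP runtime routines. This is an illustration under those assumptions, not a fix for the build itself.

```c
#ifdef _OPENMP
#include <omp.h>
#endif

/*
 * Returns 1 if a target region actually runs on an offload device,
 * 0 if execution falls back to the host (or OpenMP is not enabled).
 * offload_available is a hypothetical helper name.
 */
int offload_available(void)
{
#ifdef _OPENMP
	int on_host = 1;

	/* No devices reported: every target region runs on the host. */
	if (omp_get_num_devices() < 1)
		return 0;

	/* Ask from inside a target region where the code is executing. */
#pragma omp target map(from:on_host)
	{
		on_host = omp_is_initial_device();
	}
	return !on_host;
#else
	/* Built without -fopenmp: pragmas are ignored, so host only. */
	return 0;
#endif
}
```

With a build like this one, which has no nvptx offload compiler, the function is expected to return 0, matching the device count printed by `omp_get_num_devices()` at the start of sum.c.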

@DemetriosG
Author

DemetriosG commented Mar 4, 2024 via email
