Compatibility with Intel GPUs #3

Open
haesleinhuepf opened this issue Apr 18, 2020 · 4 comments

@haesleinhuepf
Contributor

haesleinhuepf commented Apr 18, 2020

Hi @maarzt ,

I'm just testing on Windows 10 / Intel UHD 620 and found some issues in this benchmark:
https://github.com/maarzt/imglib2-trainable-segmentation/blob/clij-benchmark/src/test/java/net/imglib2/trainable_segmention/performance/GpuCpuComparisonBenchmark.java

It crashes on my GPU with this error:

2:570:35: error: access qualifier can only be used for pipe and image type
__kernel void separable_operation(OUTPUT_IMAGE_PARAMETER, INPUT_IMAGE_PARAMETER, __constant float* kernelValues)
                                  ^
2:518:33: note: expanded from macro 'OUTPUT_IMAGE_PARAMETER'
#define OUTPUT_IMAGE_PARAMETER  __write_only __global float* output
                                ^
2:570:59: error: access qualifier can only be used for pipe and image type
__kernel void separable_operation(OUTPUT_IMAGE_PARAMETER, INPUT_IMAGE_PARAMETER, __constant float* kernelValues)
                                                          ^
2:555:32: note: expanded from macro 'INPUT_IMAGE_PARAMETER'
#define INPUT_IMAGE_PARAMETER   __read_only __global float* input
                                ^
Error when trying to create kernel separable_operation
net.haesleinhuepf.clij.clearcl.exceptions.OpenCLException: OpenCL error: -45 -> CL_INVALID_PROGRAM_EXECUTABLE
	at net.haesleinhuepf.clij.clearcl.backend.BackendUtils.checkOpenCLErrorCode(BackendUtils.java:352)
	at net.haesleinhuepf.clij.clearcl.backend.jocl.ClearCLBackendJOCL.lambda$getKernelPeerPointer$19(ClearCLBackendJOCL.java:601)
	at net.haesleinhuepf.clij.clearcl.backend.BackendUtils.checkExceptions(BackendUtils.java:156)
	at net.haesleinhuepf.clij.clearcl.backend.jocl.ClearCLBackendJOCL.getKernelPeerPointer(ClearCLBackendJOCL.java:593)
	at net.haesleinhuepf.clij.clearcl.ClearCLCompiledProgram.createKernel(ClearCLCompiledProgram.java:137)
	at net.haesleinhuepf.clij.clearcl.ClearCLProgram.createKernel(ClearCLProgram.java:685)
	at net.haesleinhuepf.clij.clearcl.util.CLKernelExecutor.getKernel(CLKernelExecutor.java:353)
	at net.haesleinhuepf.clij.clearcl.util.CLKernelExecutor.enqueue(CLKernelExecutor.java:229)
	at net.haesleinhuepf.clij2.CLIJ2.lambda$executeSubsequently$0(CLIJ2.java:236)
	at net.haesleinhuepf.clij.clearcl.util.ElapsedTime.measure(ElapsedTime.java:97)
	at net.haesleinhuepf.clij.clearcl.util.ElapsedTime.measure(ElapsedTime.java:28)
	at net.haesleinhuepf.clij2.CLIJ2.executeSubsequently(CLIJ2.java:227)
	at net.imglib2.trainable_segmention.gpu.api.DefaultGpuApi.lambda$execute$1(DefaultGpuApi.java:57)
	at net.imglib2.trainable_segmention.gpu.api.DefaultGpuApi.handleOutOfMemoryException(DefaultGpuApi.java:65)
	at net.imglib2.trainable_segmention.gpu.api.DefaultGpuApi.execute(DefaultGpuApi.java:56)
	at net.imglib2.trainable_segmention.gpu.api.GpuScope.execute(GpuScope.java:42)
	at net.imglib2.trainable_segmention.gpu.algorithms.GpuSeparableOperation.run(GpuSeparableOperation.java:76)
	at net.imglib2.trainable_segmention.gpu.algorithms.GpuKernelConvolution.convolve(GpuKernelConvolution.java:61)
	at net.imglib2.trainable_segmention.gpu.algorithms.GpuKernelConvolution.apply(GpuKernelConvolution.java:47)
	at net.imglib2.trainable_segmention.gpu.algorithms.GpuConcatenatedNeighborhoodOperation.apply(GpuConcatenatedNeighborhoodOperation.java:49)
	at net.imglib2.trainable_segmention.gpu.compute_cache.GpuGaussContent.load(GpuGaussContent.java:54)
	at net.imglib2.trainable_segmention.gpu.compute_cache.GpuComputeCache$CacheEntry.get(GpuComputeCache.java:90)
	at net.imglib2.trainable_segmention.gpu.compute_cache.GpuComputeCache.get(GpuComputeCache.java:51)
	at net.imglib2.trainable_segmention.gpu.GpuFeatureInput.gauss(GpuFeatureInput.java:41)
	at net.imglib2.trainable_segmention.pixel_feature.filter.gauss.SingleGaussianBlurFeature.apply(SingleGaussianBlurFeature.java:51)
	at net.imglib2.trainable_segmention.pixel_feature.filter.FeatureJoiner.lambda$apply$2(FeatureJoiner.java:62)
	at net.imglib2.trainable_segmention.pixel_feature.filter.FeatureJoiner.genericApply(FeatureJoiner.java:71)
	at net.imglib2.trainable_segmention.pixel_feature.filter.FeatureJoiner.apply(FeatureJoiner.java:62)
	at net.imglib2.trainable_segmention.pixel_feature.filter.AbstractGroupFeatureOp.apply(AbstractGroupFeatureOp.java:57)
	at net.imglib2.trainable_segmention.pixel_feature.filter.FeatureJoiner.lambda$apply$2(FeatureJoiner.java:62)
	at net.imglib2.trainable_segmention.pixel_feature.filter.FeatureJoiner.genericApply(FeatureJoiner.java:71)
	at net.imglib2.trainable_segmention.pixel_feature.filter.FeatureJoiner.apply(FeatureJoiner.java:62)
	at net.imglib2.trainable_segmention.pixel_feature.calculator.FeatureCalculator.applyUseGpu(FeatureCalculator.java:152)
	at net.imglib2.trainable_segmention.classification.Segmenter.segmentGpu(Segmenter.java:129)
	at net.imglib2.trainable_segmention.classification.Segmenter.segment(Segmenter.java:109)
	at net.imglib2.trainable_segmention.performance.ParallelSegmentationTask.lambda$run$0(ParallelSegmentationTask.java:44)
	at net.imglib2.cache.img.LoadedCellCacheLoader.get(LoadedCellCacheLoader.java:91)
	at net.imglib2.cache.img.LoadedCellCacheLoader.get(LoadedCellCacheLoader.java:51)
	at net.imglib2.cache.ref.SoftRefLoaderCache.get(SoftRefLoaderCache.java:101)
	at net.imglib2.cache.util.LoaderCacheAsCacheAdapter.get(LoaderCacheAsCacheAdapter.java:30)
	at net.imglib2.cache.util.CacheAsUncheckedCacheAdapter.get(CacheAsUncheckedCacheAdapter.java:28)
	at net.imglib2.img.cell.LazyCellImg$LazyCells.get(LazyCellImg.java:104)
	at net.imglib2.img.list.AbstractLongListImg$LongListRandomAccess.get(AbstractLongListImg.java:274)
	at net.imglib2.img.cell.CellRandomAccess.getCell(CellRandomAccess.java:136)
	at net.imglib2.img.cell.CellRandomAccess.updatePosition(CellRandomAccess.java:474)
	at net.imglib2.img.cell.CellRandomAccess.<init>(CellRandomAccess.java:130)
	at net.imglib2.img.cell.AbstractCellImg.randomAccess(AbstractCellImg.java:104)
	at net.imglib2.trainable_segmention.Utils.lambda$populateCellImg$7(Utils.java:355)
	at preview.net.imglib2.parallel.DefaultTaskExecutor.lambda$forEach$0(DefaultTaskExecutor.java:101)
	at java.util.concurrent.ForkJoinTask$AdaptedCallable.exec(ForkJoinTask.java:1424)
	at java.util.concurrent.ForkJoinTask.doExec(ForkJoinTask.java:289)
	at java.util.concurrent.ForkJoinPool$WorkQueue.runTask(ForkJoinPool.java:1056)
	at java.util.concurrent.ForkJoinPool.runWorker(ForkJoinPool.java:1692)
	at java.util.concurrent.ForkJoinWorkerThread.run(ForkJoinWorkerThread.java:157)
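
A note on the compiler error above: in OpenCL C the access qualifiers `__read_only` and `__write_only` apply to image and pipe types, not to `__global` buffer pointers, which is why the two macros are rejected. Editing the macros in the kernel file is the direct fix. As a rough sketch only, the same clean-up could be applied to kernel source in Java before it is compiled; the class and method names below are hypothetical and are not part of CLIJ or this project.

```java
import java.util.regex.Pattern;

// Hypothetical helper, not part of CLIJ or imglib2-trainable-segmentation.
final class AccessQualifierStripper {

    // Matches an access qualifier only when it directly precedes "__global",
    // i.e. when it is applied to a buffer pointer rather than an image type.
    private static final Pattern BUFFER_ACCESS_QUALIFIER =
        Pattern.compile("\\b__(?:read_only|write_only|read_write)\\s+(?=__global\\b)");

    // Returns the kernel source with those qualifiers removed;
    // qualifiers on image parameters are left untouched.
    static String stripBufferAccessQualifiers(String kernelSource) {
        return BUFFER_ACCESS_QUALIFIER.matcher(kernelSource).replaceAll("");
    }
}
```

Applied to the macro from the log, `#define OUTPUT_IMAGE_PARAMETER  __write_only __global float* output` becomes `#define OUTPUT_IMAGE_PARAMETER  __global float* output`, while a parameter such as `__read_only image2d_t src` is unchanged.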

When I remove the __write_only and __read_only qualifiers from the macro definitions, the benchmark appears to hang at this point for quite some time (minutes):

net.imglib2.trainable_segmention.performance.GpuCpuComparisonBenchmark
[WARNING] 1 exceptions occurred during plugin discovery.
log4j:WARN No appenders could be found for logger (org.bushe.swing.event.EventService).
log4j:WARN Please initialize the log4j system properly.
log4j:WARN See http://logging.apache.org/log4j/1.2/faq.html#noconfig for more info.
GPU: Intel(R) UHD Graphics 620
Time copying: 107.304 ms

Edit: It's not "hanging" - it just takes a minute or so before it continues...

Note, I had to make the OPENCL_DEVICE_NAME configurable in order to test this. It might make sense to make this available to end users as well:
https://github.com/haesleinhuepf/imglib2-trainable-segmentation/commit/d32ab1af07d0bd48ab9bcac28736f55231b3f91a
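
One possible way to expose the device name, sketched here under assumptions: read it from a JVM system property and fall back to a default when the property is unset. The property key and the class name are hypothetical and are not taken from the linked commit.

```java
// Hypothetical sketch; the property key is an assumption, not an existing setting.
final class GpuDeviceConfig {

    // Assumed key, e.g. passed as -Dtrainable_segmentation.opencl_device="Intel(R) UHD Graphics 620"
    static final String DEVICE_PROPERTY = "trainable_segmentation.opencl_device";

    // Returns the configured device name, or defaultName if the
    // property is missing or contains only whitespace.
    static String deviceName(String defaultName) {
        String configured = System.getProperty(DEVICE_PROPERTY);
        if (configured == null || configured.trim().isEmpty()) {
            return defaultName;
        }
        return configured.trim();
    }
}
```

The resolved name would then be handed to whatever call selects the OpenCL device; a settings dialog could write the same value later without changing this lookup.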

Let me know if I can help with further information.

Cheers,
Robert

@maarzt
Collaborator

maarzt commented Apr 20, 2020

THANK YOU for testing this!

  • I removed the __read_only and __write_only qualifiers as you suggested.
  • The "hanging" sometimes occurs for me too; not for a minute, but still noticeably. So I will try to fix that first, and then I will ask you whether the problem still occurs.
  • Making the device name configurable would be great. Ideally it should be configurable in the UI. I'm undecided: What's the best place to do such Settings in ImageJ / Labkit?

@haesleinhuepf
Contributor Author

  • Making the device name configurable would be great. Ideally it should be configurable in the UI. I'm undecided: What's the best place to do such Settings in ImageJ / Labkit?

Does Labkit have its own config dialog? It would fit there, right? Next to the "Use GPU" checkbox ;-)

@maarzt
Collaborator

maarzt commented Apr 20, 2020

No, Labkit doesn't have a config dialog yet. Is there an "ImageJ / scijava way" for config dialogs?

@imagejan

@maarzt wrote:

Is there an "ImageJ / scijava way" for config dialogs?

Yes, there are OptionsPlugins. See this one in TrackMate for a simple example.
