Refactor interfaces #187

Merged · 50 commits · Mar 30, 2023

Commits

1cb9cc0
Backends interface: move the backends to a special directory.
isazi Oct 28, 2022
82ef365
Updated documentation.
isazi Oct 28, 2022
26a60be
Draft of backend base class. Now in use.
isazi Oct 28, 2022
902a676
Merge branch 'master' into refactor_interface
isazi Oct 31, 2022
27aeaf7
Merge remote-tracking branch 'upstream/master' into refactor_interface
stijnh Nov 4, 2022
d45ca43
Remove `kernel_options` argument from `strategies.common._cost_func`
stijnh Nov 4, 2022
0a83f88
Remove `kernel_options` and `device_options` arguments from `tune` fu…
stijnh Nov 4, 2022
5c82c2f
Make runners no longer return their environment after each call to…
stijnh Nov 4, 2022
e1c2e6c
Add `sorted_list` method to `Searchspace` and remove `sort` kwarg fro…
stijnh Nov 4, 2022
fa2ad18
Create `Searchspace` object in `tune_kernel` and pass it explicitly t…
stijnh Nov 4, 2022
6170250
Change `Searchspace` such that it explicitly takes `tune_param` and `…
stijnh Nov 4, 2022
161c5ee
Fetch `tune_params` from `searchspace` instead of `tuning_options` wh…
stijnh Nov 4, 2022
745d33b
Remove @abstractmethod from several methods on Backend
stijnh Nov 4, 2022
aae70f9
Add CostFunc class to replace _cost_func function
stijnh Nov 4, 2022
a20cbd5
Fix most of the tests again
stijnh Nov 4, 2022
771a9c0
Refactoring the abstract backends.
isazi Nov 8, 2022
3528cad
Merge branch 'refactor_interface' of github.com:benvanwerkhoven/kerne…
isazi Nov 8, 2022
6c5b2ca
Skip test if no backend.
isazi Nov 8, 2022
babd049
Added comments.
isazi Nov 8, 2022
8dccb8e
Moving observers in their own space.
isazi Nov 8, 2022
582489b
Merge branch 'master' into refactor_interface
isazi Nov 10, 2022
1148813
Merge branch 'master' into refactor_interface
isazi Dec 1, 2022
84a3563
Merge branch 'master' into refactor_interface
isazi Jan 12, 2023
d845b33
Merge branch 'master' into refactor_interface
isazi Jan 13, 2023
d2951a7
Merge branch 'master' into refactor_interface
isazi Feb 1, 2023
0c7ec04
Fix bug.
isazi Feb 1, 2023
d034aff
Merge branch 'master' into refactor_interface
isazi Mar 17, 2023
58f9faa
Remove import of unused function from diff_evo
stijnh Mar 20, 2023
53d2ee2
Fix constructor of `Searchspace` that was lost due to bad merge
stijnh Mar 20, 2023
e24c4cb
Import `BenchmarkObserver` and others in `observers/__init__.py`
stijnh Mar 20, 2023
0613c01
Fix tests to work with new constructor for `Searchspace`
stijnh Mar 20, 2023
4801627
Fix incorrect import due to refactoring of observers
stijnh Mar 20, 2023
a8beb0c
Fix mock test for nvml after moving NVML module
stijnh Mar 20, 2023
20f7937
Have a base class for runners.
isazi Mar 20, 2023
4cf1a68
Remove code smells in comments.
isazi Mar 20, 2023
c531162
Remove code smells in comments.
isazi Mar 20, 2023
6c4d859
Require GPU backends to implement copy to shared/texture the same way…
isazi Mar 23, 2023
15f1881
Added comments to the base class.
isazi Mar 23, 2023
59d4006
Formatted using black.
isazi Mar 23, 2023
d6acee3
Split backends and runtime observers.
isazi Mar 24, 2023
ce96ec5
Fix bug in searchspace.py causing it to be incompatible with numpy 1.24
stijnh Mar 27, 2023
8318a2c
Format `searchspace.py` with black
stijnh Mar 27, 2023
955460a
Add all packages to install.
isazi Mar 28, 2023
8002d29
Formatted with black.
isazi Mar 28, 2023
ca2bbdf
Merge branch 'refactor_interface' of github.com:KernelTuner/kernel_tu…
isazi Mar 28, 2023
aa01d8a
Backends reformatted with black.
isazi Mar 28, 2023
8481684
Observers formatted with black.
isazi Mar 28, 2023
3fe33f3
Formatted with black.
isazi Mar 28, 2023
3fa8ef2
Typo.
isazi Mar 28, 2023
881042a
Formatted with black.
isazi Mar 28, 2023
20 changes: 10 additions & 10 deletions doc/source/design.rst
@@ -98,33 +98,33 @@ kernel_tuner.core.DeviceInterface
    :special-members: __init__
    :members:

-kernel_tuner.pycuda.PyCudaFunctions
+kernel_tuner.backends.pycuda.PyCudaFunctions
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-.. autoclass:: kernel_tuner.pycuda.PyCudaFunctions
+.. autoclass:: kernel_tuner.backends.pycuda.PyCudaFunctions
    :special-members: __init__
    :members:

-kernel_tuner.cupy.CupyFunctions
+kernel_tuner.backends.cupy.CupyFunctions
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-.. autoclass:: kernel_tuner.cupy.CupyFunctions
+.. autoclass:: kernel_tuner.backends.cupy.CupyFunctions
    :special-members: __init__
    :members:

-kernel_tuner.nvcuda.CudaFunctions
+kernel_tuner.backends.nvcuda.CudaFunctions
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-.. autoclass:: kernel_tuner.nvcuda.CudaFunctions
+.. autoclass:: kernel_tuner.backends.nvcuda.CudaFunctions
    :special-members: __init__
    :members:

-kernel_tuner.opencl.OpenCLFunctions
+kernel_tuner.backends.opencl.OpenCLFunctions
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-.. autoclass:: kernel_tuner.opencl.OpenCLFunctions
+.. autoclass:: kernel_tuner.backends.opencl.OpenCLFunctions
    :special-members: __init__
    :members:

-kernel_tuner.c.CFunctions
+kernel_tuner.backends.c.CFunctions
 ~~~~~~~~~~~~~~~~~~~~~~~~~
-.. autoclass:: kernel_tuner.c.CFunctions
+.. autoclass:: kernel_tuner.backends.c.CFunctions
    :special-members: __init__
    :members:
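
The module moves above change the public import paths for the backends. A minimal
sketch of the new-style imports; the constructor arguments follow the abstract
GPUBackend.__init__ added later in this PR, the exact defaults shown are an
assumption, and direct instantiation requires a working PyCUDA install:

# Assumed usage after this PR: backend classes now live in kernel_tuner.backends.
# Most users never touch these directly and instead pass lang="CUDA"/"cupy"/...
# to kernel_tuner.tune_kernel, which selects the backend internally.
from kernel_tuner.backends.pycuda import PyCudaFunctions

# Signature per GPUBackend.__init__(device, iterations, compiler_options, observers)
dev = PyCudaFunctions(device=0, iterations=7, compiler_options=None, observers=None)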
76 changes: 49 additions & 27 deletions examples/cuda/convolution_correct.py
@@ -26,66 +26,88 @@
 import kernel_tuner
 from collections import OrderedDict

+
 def tune():
-    with open('convolution.cu', 'r') as f:
+    with open("convolution.cu", "r") as f:
         kernel_string = f.read()

     filter_size = (17, 17)
     problem_size = (4096, 4096)
     size = numpy.prod(problem_size)
-    border_size = (filter_size[0]//2*2, filter_size[1]//2*2)
-    input_size = ((problem_size[0]+border_size[0]) * (problem_size[1]+border_size[1]))
+    border_size = (filter_size[0] // 2 * 2, filter_size[1] // 2 * 2)
+    input_size = (problem_size[0] + border_size[0]) * (problem_size[1] + border_size[1])

     output = numpy.zeros(size).astype(numpy.float32)
     input = numpy.random.randn(input_size).astype(numpy.float32)

-    filter = numpy.random.randn(filter_size[0]*filter_size[1]).astype(numpy.float32)
-    cmem_args= {'d_filter': filter }
+    filter = numpy.random.randn(filter_size[0] * filter_size[1]).astype(numpy.float32)
+    cmem_args = {"d_filter": filter}

     args = [output, input, filter]
     tune_params = OrderedDict()
     tune_params["filter_width"] = [filter_size[0]]
     tune_params["filter_height"] = [filter_size[1]]

-    #tune_params["block_size_x"] = [16*i for i in range(1,3)]
-    tune_params["block_size_x"] = [16*i for i in range(1,9)]
-    #tune_params["block_size_y"] = [2**i for i in range(1,5)]
-    tune_params["block_size_y"] = [2**i for i in range(1,6)]
+    # tune_params["block_size_x"] = [16*i for i in range(1,3)]
+    tune_params["block_size_x"] = [16 * i for i in range(1, 9)]
+    # tune_params["block_size_y"] = [2**i for i in range(1,5)]
+    tune_params["block_size_y"] = [2**i for i in range(1, 6)]

     tune_params["tile_size_x"] = [2**i for i in range(3)]
     tune_params["tile_size_y"] = [2**i for i in range(3)]

-    tune_params["use_padding"] = [0,1] #toggle the insertion of padding in shared memory
-    tune_params["read_only"] = [0,1] #toggle using the read-only cache
+    tune_params["use_padding"] = [
+        0,
+        1,
+    ]  # toggle the insertion of padding in shared memory
+    tune_params["read_only"] = [0, 1]  # toggle using the read-only cache

     grid_div_x = ["block_size_x", "tile_size_x"]
     grid_div_y = ["block_size_y", "tile_size_y"]

-    #compute the answer using a naive kernel
-    params = { "block_size_x": 16, "block_size_y": 16}
+    # compute the answer using a naive kernel
+    params = {"block_size_x": 16, "block_size_y": 16}
     tune_params["filter_width"] = [filter_size[0]]
     tune_params["filter_height"] = [filter_size[1]]
-    results = kernel_tuner.run_kernel("convolution_naive", kernel_string,
-        problem_size, args, params,
-        grid_div_y=["block_size_y"], grid_div_x=["block_size_x"], lang='cupy')
-
-    #set non-output fields to None
+    results = kernel_tuner.run_kernel(
+        "convolution_naive",
+        kernel_string,
+        problem_size,
+        args,
+        params,
+        grid_div_y=["block_size_y"],
+        grid_div_x=["block_size_x"],
+        lang="cupy",
+    )
+
+    # set non-output fields to None
     answer = [results[0], None, None]

-    #start kernel tuning with correctness verification
-    return kernel_tuner.tune_kernel("convolution_kernel", kernel_string,
-        problem_size, args, tune_params,
-        grid_div_y=grid_div_y, grid_div_x=grid_div_x, verbose=True, cmem_args=cmem_args, answer=answer, lang='cupy')
+    # start kernel tuning with correctness verification
+    return kernel_tuner.tune_kernel(
+        "convolution_kernel",
+        kernel_string,
+        problem_size,
+        args,
+        tune_params,
+        grid_div_y=grid_div_y,
+        grid_div_x=grid_div_x,
+        verbose=True,
+        cmem_args=cmem_args,
+        answer=answer,
+        lang="cupy",
+    )

+
 if __name__ == "__main__":
     import time
-    s1 = time.time()*1000

+    s1 = time.time() * 1000
     results = tune()

-    e1 = time.time()*1000
-    print("\n Actualy time used:", e1-s1)
+    e1 = time.time() * 1000
+    print("\n Actual time used:", e1 - s1)
     import json
-    with open("convolution_RTX_2070.json", 'w') as fp:
-        json.dump(results, fp)
+
+    with open("convolution_RTX_2070.json", "w") as fp:
+        json.dump(results, fp)
2 changes: 1 addition & 1 deletion examples/cuda/vector_add_observers.py
@@ -6,7 +6,7 @@

 import numpy
 from kernel_tuner import tune_kernel
-from kernel_tuner.nvml import NVMLObserver
+from kernel_tuner.observers.nvml import NVMLObserver

 def tune():

2 changes: 1 addition & 1 deletion examples/opencl/vector_add_observers.py
@@ -6,7 +6,7 @@

 import numpy
 from kernel_tuner import tune_kernel
-from kernel_tuner.nvml import NVMLObserver
+from kernel_tuner.observers.nvml import NVMLObserver

 def tune():

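
Both observer examples now import NVMLObserver from its new home in the
kernel_tuner.observers package. A minimal sketch of the observer in use, assuming
an NVIDIA GPU with NVML available; the observed quantity names and the trivial
kernel are illustrative assumptions, not taken from this PR:

import numpy
from kernel_tuner import tune_kernel
from kernel_tuner.observers.nvml import NVMLObserver

kernel_string = """
__global__ void vector_add(float *c, float *a, float *b, int n) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        c[i] = a[i] + b[i];
    }
}
"""

size = 1000000
a = numpy.random.randn(size).astype(numpy.float32)
b = numpy.random.randn(size).astype(numpy.float32)
c = numpy.zeros_like(a)
n = numpy.int32(size)

# The observer samples GPU state through NVML while each configuration is
# benchmarked and adds the readings to the tuning results.
nvml_observer = NVMLObserver(["core_freq", "temperature"])

tune_params = {"block_size_x": [64, 128, 256, 512]}
results, env = tune_kernel("vector_add", kernel_string, size,
                           [c, a, b, n], tune_params,
                           observers=[nvml_observer])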
Empty file.
89 changes: 89 additions & 0 deletions kernel_tuner/backends/backend.py
@@ -0,0 +1,89 @@
+"""This module contains the interface of all kernel_tuner backends"""
+from __future__ import print_function
+
+from abc import ABC, abstractmethod
+
+
+class Backend(ABC):
+    """Base class for kernel_tuner backends"""
+
+    @abstractmethod
+    def ready_argument_list(self, arguments):
+        """This method must implement the allocation of the arguments on device memory."""
+        pass
+
+    @abstractmethod
+    def compile(self, kernel_instance):
+        """This method must implement the compilation of a kernel into a callable function."""
+        pass
+
+    @abstractmethod
+    def start_event(self):
+        """This method must implement the recording of the start of a measurement."""
+        pass
+
+    @abstractmethod
+    def stop_event(self):
+        """This method must implement the recording of the end of a measurement."""
+        pass
+
+    @abstractmethod
+    def kernel_finished(self):
+        """This method must implement a check that returns True if the kernel has finished, False otherwise."""
+        pass
+
+    @abstractmethod
+    def synchronize(self):
+        """This method must implement a barrier that halts execution until device has finished its tasks."""
+        pass
+
+    @abstractmethod
+    def run_kernel(self, func, gpu_args, threads, grid, stream):
+        """This method must implement the execution of the kernel on the device."""
+        pass
+
+    @abstractmethod
+    def memset(self, allocation, value, size):
+        """This method must implement setting the memory to a value on the device."""
+        pass
+
+    @abstractmethod
+    def memcpy_dtoh(self, dest, src):
+        """This method must implement a device to host copy."""
+        pass
+
+    @abstractmethod
+    def memcpy_htod(self, dest, src):
+        """This method must implement a host to device copy."""
+        pass
+
+
+class GPUBackend(Backend):
+    """Base class for GPU backends"""
+
+    @abstractmethod
+    def __init__(self, device, iterations, compiler_options, observers):
+        pass
+
+    @abstractmethod
+    def copy_constant_memory_args(self, cmem_args):
+        """This method must implement the allocation and copy of constant memory to the GPU."""
+        pass
+
+    @abstractmethod
+    def copy_shared_memory_args(self, smem_args):
+        """This method must implement the dynamic allocation of shared memory on the GPU."""
+        pass
+
+    @abstractmethod
+    def copy_texture_memory_args(self, texmem_args):
+        """This method must implement the allocation and copy of texture memory to the GPU."""
+        pass
+
+
+class CompilerBackend(Backend):
+    """Base class for compiler backends"""
+
+    @abstractmethod
+    def __init__(self, iterations, compiler_options, compiler):
+        pass
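
To make the contract above concrete, here is a hypothetical minimal backend
written against this interface. The class and its host-side "device" are
invented for illustration only; real backends such as PyCudaFunctions implement
the same hooks on top of an actual GPU API:

import time

from kernel_tuner.backends.backend import Backend


class DummyBackend(Backend):
    """Toy backend that pretends the host is the device, for illustration."""

    def ready_argument_list(self, arguments):
        # A real backend allocates device buffers and copies inputs here.
        return [arg.copy() if hasattr(arg, "copy") else arg for arg in arguments]

    def compile(self, kernel_instance):
        # A real backend compiles source code and returns a callable kernel.
        return lambda *args: None

    def start_event(self):
        self._start = time.perf_counter()

    def stop_event(self):
        self._stop = time.perf_counter()

    def kernel_finished(self):
        return True  # host "kernels" finish synchronously

    def synchronize(self):
        pass  # nothing to wait for on the host

    def run_kernel(self, func, gpu_args, threads, grid, stream):
        func(*gpu_args)

    def memset(self, allocation, value, size):
        allocation[:size] = value

    def memcpy_dtoh(self, dest, src):
        dest[:] = src

    def memcpy_htod(self, dest, src):
        dest[:] = src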