Using Cupy + custom Numba kernels for GPU #42

Status: Open. Wants to merge 5 commits into base: main.

mlazzarin
Contributor

This PR explores an alternative to the approach in #38.
To sum up, we have three options for implementing the GPU backend:

  • Cupy with custom kernels written in C++ (in the main branch)
  • Numba with custom Numba kernels (explored in #38, "Add Numba GPU backend")
  • Cupy with custom Numba kernels (in this PR)
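
For readers unfamiliar with what a "custom kernel" does in this context, here is a minimal CPU-only sketch in plain Python of the index arithmetic a one-qubit gate kernel typically performs, with a loop standing in for the GPU threads. It is not the code of this PR or of the main branch: the function names and the bit-ordering convention (target bit `m` counted from the least significant bit of the state index) are assumptions made for illustration.

```python
# Illustrative only: a CPU stand-in for a one-qubit gate kernel.
# apply_gate_thread and apply_gate are hypothetical names, not from this PR.

def apply_gate_thread(state, gate, g, m):
    """Work done by one "thread" g: update a single pair of amplitudes."""
    tk = 1 << m
    # Insert a 0 bit at position m of g: index whose target bit is 0.
    i0 = ((g >> m) << (m + 1)) + (g & (tk - 1))
    # Same index with the target bit set to 1.
    i1 = i0 + tk
    a0, a1 = state[i0], state[i1]
    state[i0] = gate[0][0] * a0 + gate[0][1] * a1
    state[i1] = gate[1][0] * a0 + gate[1][1] * a1


def apply_gate(state, gate, nqubits, m):
    # On a GPU, each iteration would run as its own thread;
    # there are 2**(nqubits - 1) independent amplitude pairs.
    for g in range(1 << (nqubits - 1)):
        apply_gate_thread(state, gate, g, m)
    return state


# Pauli-X on the qubit stored at bit position 1 of a 3-qubit state.
X = [[0, 1], [1, 0]]
state = [0j] * 8
state[0] = 1 + 0j
apply_gate(state, X, nqubits=3, m=1)
```

Each pair of amplitudes is independent of the others, which is why the loop body maps naturally onto one GPU thread per pair, whichever of the three options provides the kernel.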

The following table compares the current branch with the main one.
Regarding the dry run overhead of the main branch, remember that we moved kernel compilation to import time; otherwise it would be about 0.9 s.

| nqubits | Simulation time cupy (s) | Simulation time cupy+numba (s) | Dry run overhead cupy (s) | Dry run overhead cupy+numba (s) |
|---|---|---|---|---|
| 3 | 0.00043 | 0.00467 | 0.09297702312469483 | 0.30436996221542356 |
| 4 | 0.00083 | 0.00888 | 0.09518848657608033 | 0.2708741545677185 |
| 5 | 0.00081 | 0.00916 | 0.09455618858337403 | 0.2714032411575317 |
| 6 | 0.00119 | 0.01318 | 0.09532219171524048 | 0.2728468656539917 |
| 7 | 0.00121 | 0.01318 | 0.09505038261413574 | 0.27390332221984864 |
| 8 | 0.00155 | 0.01749 | 0.09760881662368774 | 0.2756802082061768 |
| 9 | 0.00156 | 0.01757 | 0.09740451574325562 | 0.2755767583847046 |
| 10 | 0.00203 | 0.02177 | 0.09837921857833862 | 0.27367912530899047 |
| 11 | 0.00225 | 0.02120 | 0.09772248268127441 | 0.27559599876403806 |
| 12 | 0.00271 | 0.02601 | 0.09852179288864135 | 0.2762377142906189 |
| 13 | 0.00264 | 0.02627 | 0.09880340099334717 | 0.27665033340454104 |
| 14 | 0.00313 | 0.03065 | 0.10003931522369384 | 0.2785204529762268 |
| 15 | 0.00306 | 0.03111 | 0.09940198659896851 | 0.2762040734291077 |
| 16 | 0.00357 | 0.03532 | 0.10092591047286988 | 0.28022428750991824 |
| 17 | 0.00354 | 0.03561 | 0.10018761157989502 | 0.2839407444000244 |
| 18 | 0.00438 | 0.03952 | 0.10117172002792359 | 0.2824350595474243 |
| 19 | 0.00572 | 0.04057 | 0.1013898491859436 | 0.2807410478591919 |
| 20 | 0.00892 | 0.04637 | 0.10407603979110717 | 0.28025256395339965 |
| 21 | 0.01354 | 0.05051 | 0.10352401733398438 | 0.28388042449951173 |
| 22 | 0.02545 | 0.06747 | 0.1031303882598877 | 0.28077268600463867 |
| 23 | 0.04632 | 0.08786 | 0.10362286567687988 | 0.2811563968658447 |
| 24 | 0.09514 | 0.14064 | 0.10486226081848145 | 0.2819861888885498 |
| 25 | 0.18772 | 0.23358 | 0.10353651046752929 | 0.2827317714691162 |
| 26 | 0.39646 | 0.44629 | 0.10386247634887696 | 0.28243045806884765 |
| 27 | 0.79687 | 0.84750 | 0.10521702766418461 | 0.28218193054199214 |
| 28 | 1.70808 | 1.76715 | 0.10237464904785165 | 0.2835762977600098 |
| 29 | 3.36999 | 3.43840 | 0.09933676719665518 | 0.2804431915283203 |
| 30 | 7.28894 | 7.37794 | 0.09502677917480451 | 0.2725296497344969 |

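The results are worse than on the main branch: simulation time is roughly 10x higher up to about 17 qubits, with the gap narrowing to about 1% at 30 qubits, and the dry run overhead is about 0.28 s versus about 0.10 s. Note that the implementation in this PR is fully working apart from the multi-qubit gate kernels, which don't work well without the `__launch_bounds__` directive.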
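
To put the gap in perspective, the short script below computes the slowdown ratio and the absolute difference for a few rows copied from the table. Only the simulation times are used, and the row selection is arbitrary.

```python
# nqubits -> (cupy, cupy+numba) simulation times, copied from the table
rows = {
    3: (0.00043, 0.00467),
    10: (0.00203, 0.02177),
    20: (0.00892, 0.04637),
    24: (0.09514, 0.14064),
    30: (7.28894, 7.37794),
}

ratios = {}
gaps = {}
for nqubits, (cupy_time, numba_time) in rows.items():
    ratios[nqubits] = numba_time / cupy_time  # relative slowdown
    gaps[nqubits] = numba_time - cupy_time    # absolute difference
    print(f"{nqubits:>2} qubits: {ratios[nqubits]:.2f}x slower, "
          f"gap {gaps[nqubits]:.5f}")
```

The ratio shrinks steadily as the circuit grows, so the penalty matters mostly for small circuits, where per-call overhead rather than the arithmetic itself appears to dominate the runtime.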

Given these results, our implementation in main still seems to be the best option. That said, the implementation in this PR is very simple and does not require the programmer to know C++ or CUDA, so it may be a good option for projects with different goals or constraints.

@scarrazza
Member

Thanks for this implementation and the checks. I think we should add a new label, "experimental", to this PR and #38, and keep this open until there is more progress on the Numba side.

@mlazzarin added the "experimental" and "do not merge" labels on Nov 14, 2021.