Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update tn #74

Open
wants to merge 16 commits into
base: main
Choose a base branch
from
Open

Update tn #74

wants to merge 16 commits into from

Conversation

Tankya2
Copy link
Contributor

@Tankya2 Tankya2 commented Oct 4, 2024

Hi,

This PR is to address issue #36.

Added memory management code and set the Pathfinding model used to CUTENSOR.

Copy link
Member

@alecandido alecandido left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The eval.py contains highly duplicated code, and this is manifest in this PR, since the same update has been repeated over and over.

Before merging this PR, increasing even more the maintenance burden, it would be worth to refactor the eval.py file (in this, or even in another PR), to avoid incurring is such a redundant diff.

src/qibotn/backends/cutensornet.py Outdated Show resolved Hide resolved
src/qibotn/eval.py Outdated Show resolved Hide resolved
operands = myconvertor.state_vector_operands()
operands = myconvertor.state_vector_operands()
else:
operands = None
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the actual purpose of this?

If rank != 0, qibo_circ is fully ignored...

Even if it is somehow meaningful (I'm not seeing how, but that may be my limitation), the result could only be trivial, so you could even return immediately, without executing all the other operations...

Copy link
Contributor Author

@Tankya2 Tankya2 Oct 30, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Each rank needs the same initial set of operands for computation. Here, the operands are created in Rank 0, for all other rank the operands are just set to None. In line 86, the operands created in Rank 0 is then broadcasted to all other ranks.

@@ -62,6 +62,7 @@ def dense_vector_tn_MPI(qibo_circ, datatype, n_samples=8):
Dense vector of quantum circuit.
"""

import cuquantum.cutensornet as cutn
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any reason to keep the imports within the functions? (instead of top-level)

I know it was like this even before this PR...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The reason was that not all functions require the import, specifically dense_vector_tn(), expectation_pauli_tn(), dense_vector_mps(), pauli_string_gen(). Do you think it is better to bring them to the top-level?

Comment on lines -80 to -81
# Assign the device for each process.
device_id = rank % getDeviceCount()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you remember why it was repeated before?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The comment may be still useful, and you could lift to the line above.

@@ -136,6 +150,7 @@ def dense_vector_tn_nccl(qibo_circ, datatype, n_samples=8):
Returns:
Dense vector of quantum circuit.
"""
import cuquantum.cutensornet as cutn
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as above

@@ -200,6 +230,9 @@ def dense_vector_tn_nccl(qibo_circ, datatype, n_samples=8):
stream_ptr,
)

del network
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as above

src/qibotn/eval.py Outdated Show resolved Hide resolved
@scarrazza
Copy link
Member

@alecandido @Tankya2 what is the plan for this PR?

@Tankya2
Copy link
Contributor Author

Tankya2 commented Jan 16, 2025

@alecandido @Tankya2 what is the plan for this PR?

Hi @scarrazza, @alecandido , @liweintu , I believe most issues in this PR have been addressed. The remaining concern is the presence of duplicated code across various computation modes. I propose to close this PR first if there are no other further issues and to tackle these duplicated codes in another PR.

@liweintu
Copy link
Contributor

I tested this PR branch on ASPIRE 2A+, using test_cuquantum_cutensor_backend.py. Most cases passed, but there's 1 assertion failure below.

        # Test Cuquantum
        cutn_time, result_tn = time(
            lambda: qibotn.eval.dense_vector_tn(qibo_circ, dtype).flatten()
        )

>       assert 1e-2 * qibo_time < cutn_time < 1e2 * qibo_time
E       assert 0.38179754093289375 < (100.0 * 0.00212192814797163)

Seems the time taken is too long to pass the assertion? @Tankya2

@alecandido
Copy link
Member

My main concern was that the code was extended, introducing even further duplication, without reducing the existing one.

Tests and performance benchmarks are not automated, so little can be said.

However, if you manually test the code, and this is needed soon, of course feel free to proceed. Just take note of the project needs, and schedule them for later.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants