
How to determine the matrix size of padded Hamiltonian (Padding Hamiltonian Matrix) #223

tsuyoshi38 opened this issue Jul 20, 2023 · 5 comments


tsuyoshi38 commented Jul 20, 2023

Once the code can handle a padded Hamiltonian matrix, we should choose good values for block_size_r (and block_size_c).

First:

  1. Manually given or default setting:
  • A default setting for block_size_r and block_size_c should be provided.
  • block_size_r ~ 20 (suggested by our test calculations, though this depends on the hardware).
  • Is it okay to assume that block_size_r = block_size_c?
  • How should proc_rows and proc_cols be set?
  • - We may be able to follow the present method.
  • - But see the constraints for ELPA shown below.

  2. Constraints in ScaLAPACK
  3. Constraints in ELPA:
    i. block_size_r = block_size_c
    ii. proc_cols = M (integer) x proc_rows
    iii. ?? needs more than 1 block in the corresponding column or row ??
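These constraints can be collected into a single check. The sketch below is illustrative Python only, not CONQUEST code; the function name and arguments are hypothetical.

```python
def check_elpa_constraints(block_size_r, block_size_c,
                           proc_rows, proc_cols, matrix_size_padH):
    # i) ELPA exposes a single block size ("nblk"), so rows == cols.
    if block_size_r != block_size_c:
        return False
    # ii) proc_cols should be an integer multiple M of proc_rows.
    if proc_cols % proc_rows != 0:
        return False
    # iii) each process row/column should own at least one block
    #      of the (square) padded matrix.
    blocks = matrix_size_padH // block_size_r
    return blocks >= proc_rows and blocks >= proc_cols
```

For example, a 1020 x 1020 padded matrix with block size 20 (51 blocks per side) passes on a 2 x 4 process grid, but a 20 x 20 matrix does not, since it has only one block per side.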
@tsuyoshi38 tsuyoshi38 changed the title How to determine the matrix size of padded Hamiltonian How to determine the matrix size of padded Hamiltonian (Padding Hamiltonian Matrix) Jul 20, 2023
@tsuyoshi38 tsuyoshi38 self-assigned this Jul 20, 2023

tsuyoshi38 commented Jul 31, 2023

Following up on the ELPA constraints from my last comment:

For constraint i) (block_size_r = block_size_c): judging from the following example shown on the ELPA page,

  call elpa%set("nblk", nblk, success) ! size of the BLACS block cyclic distribution

I guess ELPA has only one parameter for the block size.

On the other hand, constraints ii) and iii) come only from our benchmark tests: I heard that ELPA was very inefficient when these two were not satisfied. Since I could not find any documentation or other examples supporting them, it is probably better to ignore these constraints for a while.

Then we can simply set a default block size without considering the number of MPI processes or the dimension of the Hamiltonian matrix.

@tsuyoshi38

I think I have finished introducing "padding of the H and S matrices", which makes the matrix dimension a multiple of the block size.
Note that if the block size is small (1-5?), ScaLAPACK is very inefficient.
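The padding arithmetic itself is simple rounding-up; a sketch in illustrative Python (not the actual Fortran implementation, and the function name is hypothetical):

```python
def padded_size(matrix_size, block_size):
    # Round the matrix dimension up to the next multiple of the block
    # size; the extra rows/columns of H and S form the padding.
    return ((matrix_size + block_size - 1) // block_size) * block_size

# e.g. a 1003 x 1003 Hamiltonian with block size 20 pads to 1020 x 1020
```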

I think the code is already useful for many users, but the test calculations I have done so far may not be enough.
Later, I will explain more about the relationship between the number of MPI processes, the dimension of the matrices (H and S), and the block size. We basically want to set a good default value for the block size (Diag.BlockSizeR and Diag.BlockSizeC).
But setting an appropriate block size for a given number of MPI processes and matrix dimension is not as simple as I first thought. In addition, the appropriate block size may depend strongly on the hardware.

Considering this, I wonder whether it is better to introduce the changes in the following two stages.
Stage 1: we collect information from the users.
Default: use the present CQ setting (without padding).
Option: users can set Diag.BlockSizeR to pad H and S.

Stage 2: we provide a default value for the block size.
Default: use a default setting of Diag.BlockSizeR with padding.
If the user sets an inappropriate number of processes, CQ warns -> change BlockSize?
Option: if the user sets Diag.BlockSizeR, use the given value and
only warn about inappropriate settings.
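For Stage 1, the user-facing option might look like the fragment below. This is a hedged sketch only: the key names Diag.BlockSizeR and Diag.BlockSizeC come from the discussion above, but the exact input syntax should be checked against the CONQUEST manual.

```
# Hypothetical Conquest_input fragment: request padding with a 20x20 block
Diag.BlockSizeR 20
Diag.BlockSizeC 20
```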

If anyone has a comment or suggestion, please let me know.


tsuyoshi38 commented Aug 2, 2023

(( no. of processes, block size, dimension of the H and S matrices ))

  1. First, let me remind you that we have two sizes for the dimension of the Hamiltonian and overlap matrices:
  • matrix_size = actual dimension of the Hamiltonian
  • matrix_size_padH = dimension of the padded H or S matrix, which is a multiple of the block size
  2. Usually, proc_rows and proc_cols are determined from the number of MPI processes.
    It is also possible for users to set these parameters via Diag.ProcRows and Diag.ProcCols.
  • no. of processes => proc_rows, proc_cols
  • When (no. of processes) < 9 => proc_rows = 1, proc_cols = (no. of processes) / (no. of parallelisation for k-points)
  3. On the other hand, we want to set a default block_size_r (and _c) in the future, but the values can also be given via Diag.BlockSizeR and Diag.BlockSizeC. As mentioned above, CQ is very slow if block_size_r (and _c) is set to less than 5. Once block_size_r (and _c) is given by CQ or by a user, the number of blocks along each row or column is calculated:
  • block_size_r, block_size_c => blocks_r, blocks_c (no. of blocks along row or column)
  4. Here we have a restriction:
  • blocks_r must be equal to or larger than proc_rows
  • blocks_c must be equal to or larger than proc_cols
  5. Of course, users should not set a large number of processes when the matrix size is not large. Then we may be able to introduce a new rule or restriction:
  • proc_cols is equal to or larger than proc_rows
  • proc_cols must be smaller than blocks_c = (matrix_size_padH / block size)
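The rules in items 2-5 can be sketched as follows. This is illustrative Python only: the helper names are hypothetical, and the division by the number of k-point parallelisation groups is omitted for simplicity.

```python
import math

def choose_grid(n_proc):
    # Fewer than 9 processes -> a single process row (rule in item 2).
    if n_proc < 9:
        return 1, n_proc
    # Otherwise take proc_rows as the largest divisor of n_proc not
    # exceeding sqrt(n_proc), so that proc_cols >= proc_rows (item 5).
    proc_rows = math.isqrt(n_proc)
    while n_proc % proc_rows != 0:
        proc_rows -= 1
    return proc_rows, n_proc // proc_rows

def layout_ok(matrix_size_padH, block_size, proc_rows, proc_cols):
    # Restriction in item 4: blocks_r >= proc_rows and blocks_c >= proc_cols.
    # blocks_r == blocks_c here, since the padded matrix is square and
    # block_size_r == block_size_c.
    blocks = matrix_size_padH // block_size
    return blocks >= proc_rows and blocks >= proc_cols
```

For example, 16 processes give a 4 x 4 grid, and a 1020 x 1020 padded matrix with block size 20 (51 blocks per side) satisfies the restriction on that grid.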

For large systems, this should be okay. We usually use a large number of MPI processes, so proc_rows is proportional to the square root of the number of processes, while matrix_size is proportional to the number of atoms and the block size should be almost constant.

On the other hand, it may cause a problem for small systems: the number of processes can be smaller than 9, and matrix_size and the block size may be comparable.
But if we simply ignore efficiency for small systems, it becomes much easier to set the block size.


tsuyoshi38 commented Aug 3, 2023

I have made a branch f-proj_PHM_BlockSize.
In it, a subroutine checking the condition mentioned above (condition 4) is added just after matrix_size_padH is calculated.

At present, this check is in the subroutine allocate_arrays in ScalapackFormat.f90, but it could also go in readDiagInfo in initial_read_module.f90. I thought it better to keep initial_read_module smaller, for readability.

I think the code is now ready for Stage 1, and I would like to put it into the develop version.
It is probably better to release v1.2 first and then merge this version into develop.
(But I forgot how to merge the present version of f-proj_PadHamiltonianMatrix, which was made from an old version of develop, into the latest develop.)

And...
I think I can finish my project (implementing padding) for now. We will restart it after we collect data on appropriate block sizes.

@davidbowler

I agree that we should release version 1.2 first, so for now please don't try to merge this into develop.
