gw_iter rescf=True error #15

mbarbry opened this issue Nov 30, 2019 · 25 comments

@mbarbry
Collaborator

mbarbry commented Nov 30, 2019

Dear @masmansouri and @kovalp

Here is the error I get when setting rescf=True in gw_iter. So far it has occurred only with this specific system; small molecules ran without problems.

AttributeError: 'gw_iter' object has no attribute '_add_suffix'
Exception ignored in: <function VHFOpt.__del__ at 0x7f7a23aaba60>
Traceback (most recent call last):
File "/home/marc/programs/pyscf/pyscf/scf/_vhf.py", line 92, in __del__
libcvhf.CVHFdel_optimizer(ctypes.byref(self._this))
AttributeError: 'VHFOpt' object has no attribute '_this'
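
The second traceback is most likely a side effect rather than a separate bug: when an `__init__` raises before an attribute has been assigned, Python still calls `__del__` on the half-built object, and any attribute access there fails in turn. A minimal sketch of that mechanism, using a hypothetical `Handle` class (an illustration, not pyscf's `VHFOpt`) with a guarded `__del__` that avoids the secondary message:

```python
class Handle:
    """Hypothetical stand-in for an object that wraps a native resource."""

    released = 0  # counts how many resources were actually released

    def __init__(self, fail=False):
        if fail:
            # Simulates a constructor that raises before `_this` is assigned.
            raise AttributeError("simulated failure during construction")
        self._this = object()

    def __del__(self):
        # `__del__` also runs on partially constructed objects, so check
        # that the attribute exists before touching it.
        if hasattr(self, "_this"):
            Handle.released += 1
            del self._this


try:
    Handle(fail=True)
except AttributeError as exc:
    # Only the primary error is reported; the guarded __del__ stays silent.
    primary_error = str(exc)
```

Under that reading, the error to chase is the first one (`'gw_iter' object has no attribute '_add_suffix'`); the `VHFOpt.__del__` message should disappear once it is fixed.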

gw_iter_error.zip

@kovalp
Collaborator

kovalp commented Nov 30, 2019 via email

@mbarbry
Collaborator Author

mbarbry commented Nov 30, 2019

It is probably caused by changes that came in with the rebase onto upstream. I get this error on the nao2 branch.

@kovalp
Collaborator

kovalp commented Nov 30, 2019 via email

@kovalp
Collaborator

kovalp commented Nov 30, 2019

@mbarbry

@mbarbry
Collaborator Author

mbarbry commented Nov 30, 2019

To compile, I usually use the following (Ubuntu 18.04):

export CC=gcc && export FC=gfortran && export CXX=g++
cd pyscf && git fetch && git checkout nao2 && cd pyscf/lib
cp cmake_user_inc_examples/cmake.user.inc-singularity.anaconda.gnu.mkl cmake.arch.inc
mkdir build && cd build && cmake .. && make VERBOSE=1

Then I export

export PYTHONPATH=/opt/pyscf:$PYTHONPATH
export LD_LIBRARY_PATH=/opt/pyscf/pyscf/lib/deps/lib:${LD_LIBRARY_PATH}

@masmansouri
Collaborator

Hi,
I just installed pyscf successfully by following the procedure above, which is the standard way to compile it. I used to compile with another arch file, cmake.user.inc-nao-gnu, but that no longer works because the link used to clone libxc from the tddft.org website is no longer accessible. After installing with the MKL root as you described, importing pyscf.nao modules (for example in IPython) gives the following error:

Python 3.7.3 (default, Mar 27 2019, 22:11:17)
Type 'copyright', 'credits' or 'license' for more information
IPython 7.6.1 -- An enhanced Interactive Python. Type '?' for help.

In [1]: from pyscf.nao import gw

OSError Traceback (most recent call last)
in
----> 1 from pyscf.nao import gw

~/software/pyscf/pyscf/nao/__init__.py in
23 from .m_ls_part_centers import ls_part_centers
24 from .m_coulomb_am import coulomb_am
---> 25 from .m_ao_matelem import ao_matelem_c
26 from .prod_basis import prod_basis
27 from .m_comp_coulomb_den import comp_coulomb_den

~/software/pyscf/pyscf/nao/m_ao_matelem.py in
21 from pyscf.nao.m_ao_log_hartree import ao_log_hartree
22 from timeit import default_timer as timer
---> 23 from pyscf.nao.m_gaunt import gaunt_c
24
25 #

~/software/pyscf/pyscf/nao/m_gaunt.py in
15 import numpy as np
16 import scipy as sp
---> 17 from pyscf.nao.m_thrj import thrj
18 #from sympy.physics.quantum.cg import Wigner3j as w3j
19 #from sympy.physics.wigner import gaunt as gau

~/software/pyscf/pyscf/nao/m_thrj.py in
39
40 if uselibnao :
---> 41 from pyscf.nao.m_libnao import libnao
42
43 libnao.init_thrj.argtypes = (

~/software/pyscf/pyscf/nao/m_libnao.py in
14
15 from pyscf.lib import misc
---> 16 libnao = misc.load_library("libnao")

~/software/pyscf/pyscf/lib/misc.py in load_library(libname)
62 else:
63 _loaderpath = os.path.dirname(__file__)
---> 64 return numpy.ctypeslib.load_library(libname, _loaderpath)
65
66 #Fixme, the standard resouce module gives wrong number when objects are released

~/anaconda3/lib/python3.7/site-packages/numpy/ctypeslib.py in load_library(libname, loader_path)
155 raise
156 ## if no successful return in the libname_ext loop:
--> 157 raise OSError("no file with expected extension")
158
159 ctypes_load_library = deprecate(load_library, 'ctypes_load_library',

OSError: no file with expected extension

Apparently it cannot import most of the nao modules. Do you know what the problem is?
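
The final frame shows `numpy.ctypeslib.load_library` finding no file named `libnao` with a shared-library extension in the directory it was given, which usually means the library was never built or was built somewhere else. A quick check is to list what is actually in that directory. The sketch below is an illustration only; the helper name `find_shared_libs` and the example path are assumptions, not part of pyscf:

```python
import glob
import os


def find_shared_libs(lib_dir, libname):
    """Return files in lib_dir that start with libname and look like shared libraries."""
    extensions = (".so", ".dylib", ".dll")
    pattern = os.path.join(os.path.expanduser(lib_dir), libname + "*")
    return sorted(
        path for path in glob.glob(pattern)
        if path.endswith(extensions) or ".so." in os.path.basename(path)
    )


if __name__ == "__main__":
    # Assumed location, taken from the paths shown in the traceback.
    matches = find_shared_libs("~/software/pyscf/pyscf/lib", "libnao")
    print(matches or "libnao was not found; the nao library was probably not built")
```

An empty result would point at the build step (cmake.arch.inc, MKL detection) rather than at the Python side.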

@mbarbry
Collaborator Author

mbarbry commented Nov 30, 2019

It looks like it cannot find the library for some reason.
I will give it a try in VirtualBox. Could you send me the cmake.arch file that you use?

@masmansouri
Collaborator

I downgraded gcc to an older version, g++ (Ubuntu 7.4.0-1ubuntu1~18.04.1) 7.4.0. I also tried different MKL roots in the arch file, but the modules still failed to import.

cmake.arch.inc.txt

@mbarbry
Collaborator Author

mbarbry commented Dec 1, 2019

Dear @masmansouri and @kovalp

I managed to install and run pyscf-nao on a clean system (linux mint 19.2) following the instructions below

Instructions to install pyscf-nao

apt-get update
apt-get upgrade
apt-get -y install git wget curl htop gcc gfortran vim build-essential liblapack-dev libfftw3-dev make cmake zlib1g-dev
wget https://repo.anaconda.com/archive/Anaconda3-2019.10-Linux-x86_64.sh
bash Anaconda3-2019.10-Linux-x86_64.sh
bash
mkdir -p anaconda3/lib/mkl/lib
ln -s ~/anaconda3/lib/libmkl_* ~/anaconda3/lib/mkl/lib/

export CC=gcc && export FC=gfortran && export CXX=g++
git clone https://github.com/cfm-mpc/pyscf.git
cd pyscf
git fetch
git checkout nao2
cd pyscf/lib
cp cmake_user_inc_examples/cmake.user.inc-singularity.anaconda.gnu.mkl cmake.arch.inc

Edit cmake.arch.inc so that it finds MKL properly.
Use a full path for MKLROOT and do not use ~ for your home folder; with ~ it did not compile in my case.
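
`~` is a shell shorthand, and the build failure is presumably because it is not expanded inside the MKLROOT value (an assumption; the thread does not show the failing output). The absolute path can be printed once and pasted into cmake.arch.inc. A small sketch, assuming the `~/anaconda3/lib/mkl` layout created by the commands above:

```python
import os


def absolute_path(path):
    """Expand a leading ~ and return an absolute path."""
    return os.path.abspath(os.path.expanduser(path))


# Prints the value to paste after MKLROOT in cmake.arch.inc.
print(absolute_path("~/anaconda3/lib/mkl"))
```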

export LD_LIBRARY_PATH=~/anaconda3/lib/mkl/lib:${LD_LIBRARY_PATH}
mkdir build && cd build
cmake .. && make

Once pyscf is built, add the following variables to your bash environment:

export PYTHONPATH=~/pyscf:${PYTHONPATH}
export LD_LIBRARY_PATH=~/pyscf/pyscf/lib/deps/lib:${LD_LIBRARY_PATH}

Then you can go to the nao test folder, and you should be able to run the tests.

@kovalp
Collaborator

kovalp commented Dec 1, 2019 via email

@masmansouri
Collaborator

Thanks @mbarbry. One clarification: which version of gcc did you use to compile?

@masmansouri
Collaborator

@mbarbry, by following your recipe with gcc 7.4.0, I installed it successfully and the tests passed.

@masmansouri
Collaborator

masmansouri commented Dec 1, 2019

I received a lot of
"NumbaPerformanceWarning: np.dot() is faster on contiguous arrays, called on (array(float64, 2d, C), array(float64, 2d, A))"
warnings in most of the tests of the gw class, e.g. tests 0055-0058, which call rf0_den. The tests ran much more slowly than with the former algorithm! Also, a few tests, mostly in tddft, failed.
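
In Numba's array type notation the last field is the memory layout: `C` is row-major contiguous and `A` means "any", i.e. Numba cannot assume the array is contiguous. Such arrays typically come from slicing with a step or from other strided views. A minimal NumPy-only sketch (the arrays are made up, not taken from `rf0_den`) showing how a view loses contiguity and how `np.ascontiguousarray` restores it:

```python
import numpy as np

# A freshly created 2D array is C-contiguous.
full = np.arange(12.0).reshape(3, 4)

# Taking every second column gives a strided view that is no longer contiguous;
# Numba would type such an array with layout "A".
strided = full[:, ::2]

# Copying into a packed buffer restores the C layout.
packed = np.ascontiguousarray(strided)

print(strided.flags["C_CONTIGUOUS"], packed.flags["C_CONTIGUOUS"])
```

The copy costs memory and time, so whether it pays off depends on how often the array is reused in `np.dot`.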

@mbarbry
Collaborator Author

mbarbry commented Dec 1, 2019

Hi @kovalp, I agree a pip package would be nice, but difficult to achieve with the number of dependencies.

I think docker is easier for such cases with many dependencies. You should try the singularity packages I wrote.
https://gitlab.com/mbarbry/AbInitioToolkit

The recipe buildPySCF-NAO-Siesta.mkl.gpu is the most up to date.

@masmansouri I will check whether I also get the warnings in the tests. I did not see any in the tests I ran, though.

@masmansouri
Collaborator

=============================================================
Starting TEST at 01/12/2019 --- 14:25:41, please wait...IT TAKES A FEW MINS

=============> Report: 142 succeeded and 14 failed.<=================
FAILED TEST are:
test_0044_h2_scf_gto_vs_nao_nao.py
test_0046_tddft_gto_h2.py
test_0049_high_telec.py
test_0053_gw_si_ref.py
test_0068_gw_f_atom.py
test_0077_chlocal_water.py
test_0079_recomp_hamilt_water.py
test_0081_bse_nonin_gto_be.py
test_0082_bse_gw_gto_be.py
test_0083_gw_bse_h2_ae.py
test_0085_mn_scf.py
test_0095_h2_ae_fp_rescf.py
test_0100_fireball.py
test_0160_bse_h2b_uhf_gw.py
=======================FINISHED! AT 15:10:24 ===================

Most of the failed tests have problems with the new si_correlation algorithm in m_rf0, and a few have problems in the original pyscf kernel, like test 44, for which scf no longer has its former attributes:
"AttributeError: 'scf' object has no attribute 'incore_anyway'".

@kovalp
Collaborator

kovalp commented Dec 2, 2019

test_0087_o2_gw.py

Branch nao: Ran 1 test in 15.622s

Branch nao2: Ran 1 test in 76.021s

A lot of warnings from Numba

/home/kovalp/programs/pyscf/pyscf/nao/m_rf0_den.py:122: NumbaPerformanceWarning: np.dot() is faster on contiguous arrays, called on (array(float64, 2d, A), array(float64, 2d, C))

Can we solve the problems one by one, please? The improvements from Numba are proving equivocal, so why not merge from upstream first and then start improving things?

@kovalp
Collaborator

kovalp commented Dec 2, 2019

Could I have nao2 without the performance degrading changes?

@mbarbry
Collaborator Author

mbarbry commented Dec 2, 2019

I will have a look at what goes wrong.
In the tests I did, the numba changes sped up the calculations considerably.

To deactivate it you can set use_numba=False in gw.py

@kovalp
Collaborator

kovalp commented Dec 2, 2019

There is a problem with use_numba=False:

pyscf.nao.gw  dtype  <class 'numpy.float64'>
pyscf.nao.gw 108
pyscf.nao.gw 113
pyscf.nao.gw 125
pyscf.nao.gw 138
pyscf.nao.m_kmat_den sm0_sum
E
======================================================================
ERROR: test_o2_gw_0087 (__main__.KnowValues)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "test_0087_o2_gw.py", line 21, in test_o2_gw_0087
    gw.kernel_gw()
  File "/home/kovalp/programs/pyscf/pyscf/nao/gw.py", line 427, in make_mo_g0w0
    if not hasattr(self,'sn2eval_gw'): self.sn2eval_gw=self.g0w0_eigvals() # Comp. GW-corrections
  File "/home/kovalp/programs/pyscf/pyscf/nao/gw.py", line 393, in g0w0_eigvals
    sn2i = self.gw_corr_int(sn2eval_gw)
  File "/home/kovalp/programs/pyscf/pyscf/nao/gw.py", line 279, in gw_corr_int
    self.snmw2sf = self.get_snmw2sf()
  File "/home/kovalp/programs/pyscf/pyscf/nao/gw.py", line 236, in get_snmw2sf
    wpq2si0 = self.si_c(ww = 1j*self.ww_ia).real
  File "/home/kovalp/programs/pyscf/pyscf/nao/gw.py", line 207, in si_c
    si_correlation(rf0(self, ww), si0, ww, self.kernel_sq, self.nprod)
NameError: name 'rf0' is not defined

----------------------------------------------------------------------
Ran 1 test in 7.461s

FAILED (errors=1)
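
A `NameError` at call time means `rf0` was never bound in the module's namespace on this code path, for instance because it is imported or defined only when the Numba path is taken (an assumption; the thread does not show the relevant lines of gw.py). One way to keep a reference path alive next to an optimized one is to bind both names unconditionally and pick one by flag. A hypothetical sketch with placeholder functions, not the real `rf0`:

```python
def rf0_reference(ww):
    """Placeholder for the plain reference implementation."""
    return [w * w for w in ww]


def rf0_optimized(ww):
    """Placeholder for an accelerated implementation with identical results."""
    return [w ** 2 for w in ww]


def select_rf0(use_numba):
    # Both functions are defined above regardless of the flag, so neither
    # branch can raise NameError.
    return rf0_optimized if use_numba else rf0_reference


# The flag only selects the implementation; the call site stays the same.
values = select_rf0(use_numba=False)([1.0, 2.0, 3.0])
```

With this layout the reference path can also serve as the oracle in a test that compares both implementations.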

Moreover, this part of the code does not need to be optimized. It is to produce a reference result, not necessarily the fastest one.

@kovalp
Collaborator

kovalp commented Dec 2, 2019

@masmansouri which version do you normally use for production runs?

@mbarbry
Collaborator Author

mbarbry commented Dec 2, 2019

OK, in the example I have, this function was always used and is by far the bottleneck.
gw_iter does not call it, but it is somehow much slower than the non-iterative version I used.

The non-iterative code (not calling kernel_gw_iter):

from __future__ import division
import os
import numpy as np
from pyscf.nao import gw_iter

dname = os.getcwd()
gw = gw_iter(label='siesta', cd=dname, verbosity=3, rescf=True, niter_max_ev=50,
                     write_w=True, gw_iter_tol=1e-4, tol_ev=1.0e-3)
gw.kernel_gw()
gw.report()

The iterative code (calling kernel_gw_iter, as pointed out by Massoud):

from __future__ import division
import os
import numpy as np
from pyscf.nao import gw_iter

dname = os.getcwd()
gw = gw_iter(label='siesta', cd=dname, verbosity=3, rescf=True, niter_max_ev=50,
                     write_w=True, gw_iter_tol=1e-4, tol_ev=1.0e-3)
gw.kernel_gw_iter()
gw.report()

The first code needed 4.95e3 s, while the second needed 8.65e4 s.
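
For scale, those two wall times can be converted to hours and a ratio; the only inputs are the numbers quoted above:

```python
non_iterative_s = 4.95e3  # wall time of the kernel_gw run, in seconds
iterative_s = 8.65e4      # wall time of the kernel_gw_iter run, in seconds

slowdown = iterative_s / non_iterative_s

print(f"non-iterative: {non_iterative_s / 3600:.2f} h")
print(f"iterative:     {iterative_s / 3600:.2f} h")
print(f"slowdown:      {slowdown:.1f}x")
```

That is roughly 1.4 hours against about 24 hours, a factor of about 17.5.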

Any idea why the iterative version is so much slower in this case?
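
One way to narrow this down is to profile both runs with the standard library's `cProfile` and compare the functions with the largest cumulative time. The helper below is a generic sketch; `profile_call` is a made-up name, and the commented usage assumes the `gw` object from the scripts above:

```python
import cProfile
import io
import pstats


def profile_call(func, *args, top=10, **kwargs):
    """Run func under cProfile and return (result, text report of the top entries)."""
    profiler = cProfile.Profile()
    profiler.enable()
    try:
        result = func(*args, **kwargs)
    finally:
        profiler.disable()
    stream = io.StringIO()
    stats = pstats.Stats(profiler, stream=stream)
    stats.sort_stats("cumulative").print_stats(top)
    return result, stream.getvalue()


# Hypothetical usage with the scripts above:
#   _, report = profile_call(gw.kernel_gw_iter)
#   print(report)
result, report = profile_call(sum, range(1000))
```

Running it on a smaller system first keeps the profiling overhead manageable.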

test.zip

@mbarbry
Collaborator Author

mbarbry commented Dec 2, 2019

I agree with you @kovalp that it would be smarter to get the code upstream before playing with it further. However, several issues must be solved before we can merge with upstream:

  • The issue that I got with scf must be resolved.
  • The tests that do not pass must be corrected.
  • I have to remove some code with the GPU implementation of TDDFT (licensing issues; I don't expect the pyscf maintainers to accept that code).

@masmansouri
Collaborator

@kovalp I still run my calculations using my own repository,
which was pushed to the nao branch at this commit SHA:
555e061c347c8d2e49ffbd00d9f1c54eaddc9e99

@masmansouri
Collaborator

Over the weekend I tried nao2 on my laptop; the errors and Numba warnings are related to this branch.

@kovalp
Collaborator

kovalp commented Dec 3, 2019 via email


3 participants