Python software on Archer

Latest revision as of 11:24, 15 August 2018

To run Python on the compute nodes on Archer, two modules are available: python-compute (native) and anaconda-compute. Users are discouraged from using anaconda-compute because it is not optimised for running on Archer. Some packages are preinstalled for python-compute; their module names begin with pc-. For additional packages, virtual environments are encouraged.

For more info see: http://www.archer.ac.uk/documentation/user-guide/python.php

When compiling Python software (like PySCF), Archer by default builds all libraries as static libraries. This leads to errors like:

 ImportError Cannot import name ...

or

 File "/work/y07/y07/cse/numpy/1.9.2-libsci/lib/python2.7/site-packages/numpy/ctypeslib.py", line 128, in load_library
   raise OSError("no file with expected extension")
 OSError: no file with expected extension

and other errors.

To build dynamic libraries instead of static ones, you must set:

 export CRAYPE_LINK_TYPE=dynamic

before the compilation. For more information see: http://www.archer.ac.uk/documentation/user-guide/development.php#sec-4.6
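
As a minimal sketch, the variable can be exported and then checked in the same shell session before any build step is started. The `require_dynamic_link` helper is hypothetical and introduced here only for illustration; it is not part of Archer or the Cray tools.

```shell
# Request dynamic linking for everything built in this shell session.
export CRAYPE_LINK_TYPE=dynamic

# Hypothetical helper (an assumption, not an Archer command): refuse to
# continue with a build if the variable is not set as expected.
require_dynamic_link() {
    if [ "${CRAYPE_LINK_TYPE:-}" != "dynamic" ]; then
        echo "CRAYPE_LINK_TYPE is '${CRAYPE_LINK_TYPE:-unset}', expected 'dynamic'" >&2
        return 1
    fi
}

require_dynamic_link && echo "Dynamic linking requested; safe to start the build."
```

The export only affects the current shell, so it has to be repeated in every new login session in which something is compiled or installed with pip.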


When installing pyscf and in the submission script:

1) Load python and all modules supplied centrally (and load them in your submission script)

 module load python-compute
 module load pc-numpy
 module load pc-scipy
 module load pc-ase

2) Enter virtual environment containing all additional modules (AFTER loading python and the central modules)

 source /work/e507/e507/ap837/code/venv_pyscf/bin/activate
 [for AJWT this was in /fs2/e507/e507/...]

3) Now use pip install for the rest (only needed once, when setting up):

 export CRAYPE_LINK_TYPE=dynamic
 pip install pyscf
 pip install unittest2

4) Create a tmp folder in the work directory and, in the submission script, set the TMPDIR environment variable to point to it. The compute nodes do not have access to the regular tmp folders.

 export TMPDIR=/work/e507/e507/ap837/tmp

5) Use aprun to run the job; otherwise it will run only on the shared job launcher node and not on the compute nodes.

6) If the code uses Intel libraries and fails with errors like:

 OSError: libmkl_intel_lp64.so: cannot open shared object file: No such file or directory
 OSError: libiomp5.so: cannot open shared object file: No such file or directory

then do:

 source /opt/intel/bin/compilervars.sh intel64

7) When using numpy, the following error may occur:

 Intel MKL FATAL ERROR: Cannot load libmkl_avx.so or libmkl_def.so

The solution is to do:

 export LD_PRELOAD=/opt/intel/mkl/lib/intel64/libmkl_core.so:/opt/intel/mkl/lib/intel64/libmkl_sequential.so

as found here http://debugjournal.tumblr.com/post/98401758462/intel-mkl-dynamic-link-library-error.
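
The steps above can be collected into a single submission script. The following is only a sketch under stated assumptions: the file name job.pbs, the job name, the walltime, the budget code placeholder, the `#!/bin/bash --login` line, the `aprun -n 1` process count and the script name run_pyscf.py are all hypothetical and need adapting, and the virtual environment path assumes the pyscfEnv location used elsewhere on this page. The commands write the job script to a file so that it can be inspected before submission.

```shell
# Write a hypothetical job script (job.pbs) that follows the numbered steps.
cat > job.pbs <<'EOF'
#!/bin/bash --login
# Assumed PBS settings; replace the name, resources and budget code.
#PBS -N pyscf_job
#PBS -l select=1
#PBS -l walltime=00:20:00
#PBS -A your_budget_code

# Run from the directory the job was submitted from.
cd "$PBS_O_WORKDIR"

# 1) Load python and the centrally supplied modules.
module load python-compute
module load pc-numpy
module load pc-scipy
module load pc-ase

# 2) Enter the virtual environment AFTER loading the central modules.
source /work/e507/e507/$USER/pyscfEnv/bin/activate

# 4) Point TMPDIR at a folder on the work filesystem.
export TMPDIR=/work/e507/e507/$USER/tmp
mkdir -p "$TMPDIR"

# 6) and 7) Uncomment only if the Intel or MKL errors appear.
# source /opt/intel/bin/compilervars.sh intel64
# export LD_PRELOAD=/opt/intel/mkl/lib/intel64/libmkl_core.so:/opt/intel/mkl/lib/intel64/libmkl_sequential.so

# 5) Launch on the compute nodes (run_pyscf.py is a hypothetical script).
aprun -n 1 python run_pyscf.py
EOF

echo "Wrote job.pbs"
```

Step 3 (pip install) is deliberately left out because it is only needed once, when setting up. The resulting file would then be submitted with qsub job.pbs.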

Setting up virtual environment for PySCF

In this order:

1) Load modules

  module load python-compute

2) Create virtual environment

  cd ~/work
  virtualenv --system-site-packages pyscfEnv

This will print out the directory of the virtualenv installation. The activation script should then be at a path like the following (with $USER replaced by your username): /work/e507/e507/$USER/pyscfEnv/bin/activate

3) Enter virtual environment

  source /work/e507/e507/$USER/pyscfEnv/bin/activate
  export CRAYPE_LINK_TYPE=dynamic
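
Before installing anything with pip, it can help to confirm that the virtual environment really is active. The sketch below relies on the VIRTUAL_ENV variable, which the activate script sets; the `in_virtualenv` helper is hypothetical and only for illustration.

```shell
# Hypothetical helper: succeed only if a virtual environment is active.
# The activate script sets VIRTUAL_ENV to the environment's directory.
in_virtualenv() {
    [ -n "${VIRTUAL_ENV:-}" ]
}

if in_virtualenv; then
    echo "Active virtual environment: $VIRTUAL_ENV"
else
    echo "No virtual environment is active; source the activate script first."
fi
```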