[VASP] Add handler for PSMAXN warnings and associated failures #133

Open
rkingsbury opened this issue Nov 14, 2019 · 7 comments
@rkingsbury
Contributor

rkingsbury commented Nov 14, 2019

Summary

  • A VASP failure associated with the PSMAXN warning is not recognized as an error by Custodian, which triggers a misleading ValidationError

Details

With certain combinations of ENCUT, LREAL, and pseudopotentials, VASP issues the warning

WARNING: PSMAXN for non-local potential too small

In some cases VASP still runs successfully; in others it fails, e.g.

WARNING: PSMAXN for non-local potential too small
 LDA part: xc-table for Pade appr. of Perdew
 POSCAR, INCAR and KPOINTS ok, starting setup
 REAL_OPT: internal ERROR:         -32         -32         -32           0
 VASP aborting ...
 REAL_OPT: internal ERROR:         -32         -32         -32           0
 VASP aborting ...
 REAL_OPT: internal ERROR:         -32         -32         -32           0
 VASP aborting ...
 REAL_OPT: internal ERROR:         -32         -32         -32           0
...

It appears that Custodian does not recognize this type of failure as an error. As a result, _run_job() attempts to validate the output of the calculation (which never ran and therefore never generated output) and raises a ValidationError.

Traceback (most recent call last):
  File "/global/u2/r/rsking84/.conda/envs/cms/code/fireworks/fireworks/core/rocket.py", line 262, in run
    m_action = t.run_task(my_spec)
  File "/global/u2/r/rsking84/.conda/envs/cms/code/atomate/atomate/vasp/firetasks/run_calc.py", line 211, in run_task
    c.run()
  File "/global/u2/r/rsking84/.conda/envs/cms/code/custodian/custodian/custodian.py", line 378, in run
    self._run_job(job_n, job)
  File "/global/u2/r/rsking84/.conda/envs/cms/code/custodian/custodian/custodian.py", line 502, in _run_job
    raise ValidationError(s, True, v)
custodian.custodian.ValidationError: Validation failed: VasprunXMLValidator

The ValidationError is very difficult to troubleshoot without running VASP manually. In this situation, vasp.out is empty and std_err.txt contains only

srun: fatal: Can not execute vasp_std

Suggested solution

Information about the PSMAXN warning and associated failures is scarce, but there appear to be several possible fixes:

  1. Set LREAL=FALSE (expand the basis set in reciprocal space instead of real space)
  2. Sort the pseudopotentials such that the one with the highest ENMAX appears first in the POTCAR
  3. Lower the ENCUT value

I have had the most success with Option 1.

Option 2 has not solved the issue for me and is only applicable if the user does not specify ENCUT in the INCAR file (I think; see docs). It is also not clear whether this fix is still relevant to the latest versions of VASP.

I'm not yet familiar enough with the architecture of Custodian to know the best way to address this, but it seems to me that, at a minimum, an error handler to catch this type of failure would be valuable. Even better would be to modify LREAL to FALSE on the fly.
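
To make the "modify LREAL on the fly" idea concrete, here is a minimal sketch of the INCAR edit itself. It is not Custodian code: `set_incar_tag` is a hypothetical helper that works on raw text, and it assumes one `TAG = value` entry per line. A real handler would presumably go through Custodian's own file-modification machinery instead.

```python
import re


def set_incar_tag(incar_text: str, tag: str, value: str) -> str:
    """Return INCAR text with `tag` set to `value`.

    Hypothetical helper for illustration only. Assumes one 'TAG = value'
    entry per line (no ';'-separated entries, no inline comments).
    """
    # Match lines that start with the tag name, ignoring case and indentation.
    pattern = re.compile(rf"^\s*{re.escape(tag)}\s*=", re.IGNORECASE)
    new_line = f"{tag} = {value}"
    lines = incar_text.splitlines()
    replaced = False
    for i, line in enumerate(lines):
        if pattern.match(line):
            lines[i] = new_line
            replaced = True
    if not replaced:
        # Tag was absent, so append it at the end of the file.
        lines.append(new_line)
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    original = "ENCUT = 520\nLREAL = Auto\n"
    print(set_incar_tag(original, "LREAL", ".FALSE."))
```

Working on a copy of the text and returning a new string keeps the original INCAR available in case the change needs to be reverted.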

Further reading on troubleshooting the VASP PSMAXN warnings:

https://cms.mpi.univie.ac.at/vasp-forum/viewtopic.php?f=3&t=8370

the reason most probably is that you join 2 potentials with very different cutoff, with the POTCAR with the SMALL cutoff (U) being the first in the list. This potentials is used to determine PSMAXN.
please
1) switch the 2 atoms in POSCAR and POTCAR (ie give the atoms such that those with the hardest potentials are first
2) OR use O_s (soft O, low cutoff)

https://cms.mpi.univie.ac.at/vasp-forum/viewtopic.php?t=14811

The warning means that PSMAXN is too small for the required cutoff energy (ENMAX) the first of the atoms given in POTCAR.
Either use a harder potential or decrease ENMAX.

Solved it by setting LREAL=FALSE

https://www.researchgate.net/post/Relaxation_in_metal_using_vasp2

"PSMAXN for non-local potential too small"
Try lowering your ENCUT parameter (how large is it, and what are the defaults in your POTCAR?), this error indicates that you go out of bounds for an array related to the potential, which is related to the cutoff energy.

http://materials.duke.edu/AFLOW/README_AFLOW.TXT

PSMAXN
PSMAXN errors. By default aflow tries to go around PSMAXN warnings by restarting VASP with reducingly
lower ENMAX until everything is set. This can be done by tuning the INCAR schemes.

@mkhorton
Member

The order of the POTCARs matters? That's crazy. Is this true even though we set e.g. ENCUT manually?

@rkingsbury
Contributor Author

rkingsbury commented Nov 14, 2019

After further reading, I think those forum posts are out of date. According to the current docs:

  • If ENCUT is specified in the INCAR it overrides everything else
  • If ENCUT is not specified it is determined automatically as the maximum ENMAX of ALL the pseudopotentials in the POTCAR. It seems like older versions of VASP just used the ENMAX of the first pseudopotential.
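
The two rules above can be written out as a small function. This is only a sketch of the logic as I read it from the docs, not anything VASP or pymatgen exposes; `effective_encut` and its arguments are hypothetical names, and the ENMAX values in the example are made up.

```python
from typing import Optional, Sequence


def effective_encut(incar_encut: Optional[float], enmax_values: Sequence[float]) -> float:
    """Cutoff energy that would be used under the two rules described above.

    incar_encut  -- ENCUT from the INCAR, or None if the tag is not set
    enmax_values -- ENMAX of every pseudopotential in the POTCAR, in file order
    """
    if incar_encut is not None:
        # Rule 1: an explicit ENCUT overrides everything else.
        return incar_encut
    # Rule 2: otherwise take the largest ENMAX, regardless of POTCAR order.
    return max(enmax_values)


if __name__ == "__main__":
    # Illustrative ENMAX values only.
    print(effective_encut(None, [250.0, 400.0]))
    print(effective_encut(520.0, [250.0, 400.0]))
```

Under these rules the order of the entries in `enmax_values` has no effect on the result, which is why the POTCAR-sorting advice looks out of date.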

@rkingsbury
Contributor Author

As an additional note, @mkhorton and I noticed that when this failure occurs, somehow the output from stdout is not written to vasp.out or std_err.txt. Full terminal output for an example failing calculation is:


 OOO  PPPP  EEEEE N   N M   M PPPP
O   O P   P E     NN  N MM MM P   P
O   O PPPP  EEEEE N N N M M M PPPP   -- VERSION
O   O P     E     N  NN M   M P
 OOO  P     EEEEE N   N M   M P

running 16 mpi-ranks, with 4 threads/rank
distrk: each k-point on 16 cores, 1 groups
distr: one band on 1 cores, 16 groups
using from now: INCAR
vasp.6.0.8 29Jun18 (build Jun 13 2019 12:54:44) complex

POSCAR found type information on POSCAR O Ba Be Si
POSCAR found : 4 types and 7 ions
scaLAPACK will be used

 -----------------------------------------------------------------------------
|                                                                             |
|  ADVICE TO THIS USER RUNNING 'VASP/VAMP'   (HEAR YOUR MASTER'S VOICE ...):  |
|                                                                             |
|      You have a (more or less) 'small supercell' and for smaller cells      |
|      it is recommended  to use the reciprocal-space projection scheme!      |
|      The real space optimization is not  efficient for small cells and it   |
|      is also less accurate ...                                              |
|      Therefore set LREAL=.FALSE. in the  INCAR file                         |
|                                                                             |
 -----------------------------------------------------------------------------

 WARNING: PSMAXN for non-local potential too small
 LDA part: xc-table for Pade appr. of Perdew
 found WAVECAR, reading the header
 POSCAR, INCAR and KPOINTS ok, starting setup
 REAL_OPT: internal ERROR:         -32         -32         -32           0
 VASP aborting ...
 REAL_OPT: internal ERROR:         -32         -32         -32           0
 VASP aborting ...
 REAL_OPT: internal ERROR:         -32         -32         -32           0
 VASP aborting ...
 REAL_OPT: internal ERROR:         -32         -32         -32           0
 VASP aborting ...
 REAL_OPT: internal ERROR:         -32         -32         -32           0
 VASP aborting ...
 REAL_OPT: internal ERROR:         -32         -32         -32           0
 VASP aborting ...
 REAL_OPT: internal ERROR:         -32         -32         -32           0
 REAL_OPT: internal ERROR:         -32         -32         -32           0
 VASP aborting ...
 REAL_OPT: internal ERROR:         -32         -32         -32           0
 VASP aborting ...
 REAL_OPT: internal ERROR:         -32         -32         -32           0
 VASP aborting ...
 REAL_OPT: internal ERROR:         -32         -32         -32           0
 VASP aborting ...
 REAL_OPT: internal ERROR:         -32         -32         -32           0
 VASP aborting ...
 REAL_OPT: internal ERROR:         -32         -32         -32           0
 VASP aborting ...
 VASP aborting ...
 REAL_OPT: internal ERROR:         -32         -32         -32           0
 VASP aborting ...
 REAL_OPT: internal ERROR:         -32         -32         -32           0
 VASP aborting ...
 REAL_OPT: internal ERROR:         -32         -32         -32           0
 VASP aborting ...

@mkhorton
Member

Yes, vasp.out was empty when running from the workflow, even though the standard output was present when running interactively. Very strange.

@rkingsbury
Contributor Author

rkingsbury commented Nov 19, 2019

After further testing, it seems that LREAL=False is not always a reliable fix for this. The other challenge is that the calculation can often complete successfully with a PSMAXN warning, so having a Custodian handler for the warning is problematic: we don't want to keep restarting the calculation just because the warning is present, only when the calculation actually fails. I've updated the Handler to respond only to the REAL_OPT error, not the PSMAXN warning.
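
For anyone following along, the distinction between the warning and the actual failure boils down to something like the sketch below. This is not the handler code itself; `classify_vasp_output` is a hypothetical standalone function, and the two marker strings are copied from the terminal output earlier in this thread.

```python
# Marker strings taken from the VASP terminal output quoted in this issue.
PSMAXN_WARNING = "PSMAXN for non-local potential too small"
REAL_OPT_ERROR = "REAL_OPT: internal ERROR"


def classify_vasp_output(text: str) -> str:
    """Classify captured VASP stdout as 'failed', 'warning_only', or 'clean'.

    Hypothetical helper for illustration. Only the REAL_OPT error counts as
    a failure; the PSMAXN warning alone does not, since such runs can still
    finish successfully.
    """
    if REAL_OPT_ERROR in text:
        return "failed"
    if PSMAXN_WARNING in text:
        return "warning_only"
    return "clean"


if __name__ == "__main__":
    sample = (
        " WARNING: PSMAXN for non-local potential too small\n"
        " REAL_OPT: internal ERROR:         -32         -32         -32           0\n"
        " VASP aborting ...\n"
    )
    print(classify_vasp_output(sample))
```

Note that this only works if the output actually reaches the file being scanned, which is exactly what is not happening with vasp.out here.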

The real question is - how can we make sure that vasp.out gets populated correctly when these failures occur?

@mkhorton
Member

mkhorton commented Nov 19, 2019 via email

@shyuep
Member

shyuep commented Nov 19, 2019

I would say we want to know what is a reliable way to resolve the problem first. If there is a reliable way, then fixing even runs that theoretically could finish is fine. You can also set a counter, e.g., if you have tried to fix PSMAXN a few times, the job will be flagged as unrecoverable.
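
The counter idea could look roughly like this. It is a minimal standalone sketch, not Custodian's actual mechanism; `RetryBudget`, `next_action`, and the default of three attempts are all assumptions made for illustration.

```python
class RetryBudget:
    """Track how many times a fix has been attempted for one job.

    Hypothetical class for illustration; the default limit of 3 is arbitrary.
    """

    def __init__(self, max_attempts: int = 3):
        self.max_attempts = max_attempts
        self.attempts = 0

    def next_action(self) -> str:
        """Return 'retry' while attempts remain, then 'unrecoverable'."""
        if self.attempts >= self.max_attempts:
            return "unrecoverable"
        self.attempts += 1
        return "retry"


if __name__ == "__main__":
    budget = RetryBudget(max_attempts=2)
    for _ in range(3):
        print(budget.next_action())
```

Keeping the count on the handler instance means it resets for each new job, so one stubborn calculation does not affect the budget of the next.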
