Chemically Directed Atom Swap Hopping -- Crystal structure prediction by swapping atoms in unfavourable chemical environments with variable composition functionality


Please note that this is not the DASH software for structure solution from powder diffraction data, which is developed by the Cambridge Crystallographic Data Centre and is available at:


ChemDASH is a crystal structure prediction code originally written by Paul Sharp and further developed, with magnetism and variable compositions, by Robert Dickson at the University of Liverpool. ChemDASH as currently implemented is written in Python 3.8+, and depends on the Atomic Simulation Environment (ASE), spglib, and their dependencies.

ChemDASH implements the basin hopping method to explore the potential energy surface, with atom swaps used to generate new structures. Atoms can be swapped at random, or the method of directed swapping can be used to rank each atom according to its chemical environment, with atoms in the least favourable environments prioritised for swapping. Structures in ChemDASH can be initialised by populating cation and anion sites on initialisation grids, or from a CIF file. Structural optimisation can be done using either the GULP or VASP packages. Variable compositions are handled by referencing energies to two end members: the solid solution energy of a new structure is calculated as the energy difference between that structure and the end members.
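As a rough illustration of the basin hopping loop described above, the following sketch swaps pairs of atoms on a toy one-dimensional chain and accepts or rejects each trial with a Metropolis test. It is not ChemDASH's implementation: the names `metropolis_accept`, `toy_energy` and `basin_hop` are hypothetical, the Metropolis rule is the conventional choice for basin hopping and is assumed here, and the local relaxation that GULP or VASP would perform is omitted.

```python
import math
import random


def metropolis_accept(e_new, e_old, kt, rng):
    """Accept downhill moves always, uphill moves with probability exp(-dE/kT)."""
    delta = e_new - e_old
    if delta <= 0.0:
        return True
    if kt <= 0.0:
        return False  # zero temperature: never hop uphill
    return rng.random() < math.exp(-delta / kt)


def toy_energy(chain):
    """Hypothetical energy: count like-like nearest neighbours in a chain."""
    return float(sum(a == b for a, b in zip(chain, chain[1:])))


def basin_hop(species, energy, n_steps, kt, seed=0):
    """Explore arrangements of `species` by swapping pairs of atoms."""
    rng = random.Random(seed)
    current = list(species)
    e_current = energy(current)
    best, e_best = list(current), e_current
    for _ in range(n_steps):
        trial = list(current)
        i, j = rng.sample(range(len(trial)), 2)
        trial[i], trial[j] = trial[j], trial[i]
        # A real run would relax `trial` with GULP or VASP before scoring it.
        e_trial = energy(trial)
        if metropolis_accept(e_trial, e_current, kt, rng):
            current, e_current = trial, e_trial
            if e_current < e_best:
                best, e_best = list(current), e_current
    return best, e_best


best, e_best = basin_hop("AAABBB", toy_energy, n_steps=200, kt=0.1, seed=1)
print("".join(best), e_best)
```

The composition is conserved by construction, because each move only exchanges two entries of the list.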


To run a ChemDASH calculation, two input files are required: a “.atoms” file and a “.input” file. By default they must both have the same basename. With valid files and a copy of ChemDASH in the current working directory, ChemDASH is run by typing:

  python chemdash <basename>

where <basename> is the basename of both the “.atoms” and “.input” files.

ChemDASH can also be run directly in Python by instantiating the ChemDASH class and calling its run_chemdash() method:

  from chemdash.master_code import ChemDASH

  # Reads "test.atoms" and "test.input" from the working directory
  test_chemdash_run = ChemDASH(calc_name="test")
  test_chemdash_run.run_chemdash()

Running ChemDASH in a Python script allows for more complex pre-processing and post-processing of a ChemDASH run.

The output of the calculation is written to the file “<basename>.chemdash”. If there are errors in either the “.atoms” or the “.input” files, then the calculation is stopped, with errors listed in the file “<basename>.error”. To restart a ChemDASH run, with a restart file present, run with “restart=True” in the input file.

ChemDASH does not have to be installed in the working directory. If it is installed elsewhere, run ChemDASH with:

  python <filepath_to_ChemDASH_directory> <basename>

for example:

  python /home/software/ChemDASH/chemdash <basename>


A number of command-line flags enable ChemDASH options. These are listed by typing:

  python chemdash -h

These options are:

  -h, --help            show this help message and exit

  -i, --input           Print all options for the ".input" file with a
                        description of each option. (default: False)

  -p <input file> [<input file> ...], --parse <input file> [<input file> ...] 
                        Parse the given input file, report any errors and
                        exit. (default: None)

  -s <cif file>, --symm <cif file>, --symmetry <cif_file>
                        Use spglib to look for higher symmetry in the supplied
                        cif file, and write to a new file "<cif_file>_symm.cif".
                        (default: None)

  -w [<input file>], --write [<input file>]
                        Write an input file that includes all keywords with
                        their default values to the given file and exit.
                        (default: None)

  -v, --version         show ChemDASH version number and exit

Python Libraries

ChemDASH requires python version 3.5+, and the following python libraries:

  ase (Atomic Simulation Environment)

Atoms File

The atoms file contains a list of all of the atoms to be used in the simulation. Each line gives the atomic symbol for a particular element, the number of atoms of that element, and the ionic charge (oxidation state) of these atoms. For example, the atoms file for a single formula unit (i.e., five atoms) of strontium titanate (SrTiO3) reads:

  O  3 -2
  Sr 1 +2
  Ti 1 +4

An atoms file is required even when the initial structure is to be read from a CIF file. In that case, the order of atoms listed must match the order in which they are listed in the CIF, and vacancies can be specified using the chemical symbol “X”, e.g.,

  X 5 0
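An atoms file can also be generated from a script before a run. The sketch below writes lines in the "symbol, count, charge" layout shown above and sums the charges as a sanity check. The helper names `format_atoms_file` and `net_charge` are hypothetical, and it is an assumption that ChemDASH accepts any whitespace between the three fields.

```python
def format_atoms_file(composition):
    """Build atoms-file text from (symbol, count, charge) tuples."""
    lines = []
    for symbol, count, charge in composition:
        # Signed charges are written as "+2" or "-2"; vacancies carry "0".
        charge_text = f"{charge:+d}" if charge else "0"
        lines.append(f"{symbol:<2} {count} {charge_text}")
    return "\n".join(lines) + "\n"


def net_charge(composition):
    """Total charge of the listed atoms (zero for a neutral cell)."""
    return sum(count * charge for _, count, charge in composition)


srtio3 = [("O", 3, -2), ("Sr", 1, 2), ("Ti", 1, 4)]
print(format_atoms_file(srtio3), end="")
print("net charge:", net_charge(srtio3))
```

The text returned by `format_atoms_file` can then be written to “<basename>.atoms” with an ordinary file write.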

Input file

The input file lists the values of all of the options for a ChemDASH calculation in the format:

  <option>=<value>  # <comment>

where “#” is a comment character. A minimal working example of an input file is given below:

  # General inputs
  # GULP inputs
  gulp_files=conj, bfgs
  # GULP Keywords and Options
  gulp_keywords=opti, c6, pot, conp
  gulp_options=time 5 minutes
  gulp_calc_1_options=stepmx 0.1
  gulp_calc_2_options=stepmx 0.5, lbfgs_order 5000, maxcyc 1000

The “temp” option states the value of kT in eV for the Monte–Carlo temperature that determines whether or not we hop to higher–energy basins during the run. The total number of structures explored in the run is given by “max_structures”. The other options listed in this example are explained in the following sections, and a full list of input options, with default and supported values, is given in the "Full List of Input Options" section.
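For pre-processing in a script, an input file of this kind can be read with a few lines of Python. This is only a sketch of the "option=value" layout with “#” comments; it is not ChemDASH's own parser, and the names `parse_input` and `as_list` are hypothetical.

```python
def parse_input(text):
    """Read option=value lines, ignoring blank lines and '#' comments."""
    options = {}
    for raw_line in text.splitlines():
        line = raw_line.split("#", 1)[0].strip()
        if not line:
            continue
        key, sep, value = line.partition("=")
        if not sep:
            raise ValueError(f"Expected option=value, got: {raw_line!r}")
        options[key.strip()] = value.strip()
    return options


def as_list(value):
    """Split a comma-separated option value into its items."""
    return [item.strip() for item in value.split(",") if item.strip()]


EXAMPLE = """\
# GULP inputs
gulp_files=conj, bfgs
gulp_keywords=opti, c6, pot, conp
gulp_options=time 5 minutes
"""

options = parse_input(EXAMPLE)
print(as_list(options["gulp_keywords"]))
```

To check a real input file for errors, use the “-p” flag of ChemDASH itself rather than a stand-in like this.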

Test Suite

ChemDASH has a test suite written in pytest, contained in the directory “tests”. If pytest is installed, the test suite can be run by typing:

 py.test tests

If any tests fail, please contact the developers.


Initialisation Grids

There are three possible initialisation grids in ChemDASH: “orthorhombic”, “rocksalt”, and “close_packed”. These are specified in the “grid_type” input option. Two more input options need to be considered. Firstly, “grid_points” specifies the number of grid points on the ANION sublattice. This can be input as a single number for an $a \times a \times a$ grid, two comma–separated numbers to give an $a \times a \times c$ grid, or three comma–separated numbers to give an $a \times b \times c$ grid. Secondly, “cell_spacing” specifies the distance between anion points on the initialisation grid, and is given in the same format as “grid_points”.
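The one, two and three number forms described above expand as in the following sketch. The function name `expand_grid` is hypothetical; only the expansion rule comes from the text.

```python
def expand_grid(value, cast=int):
    """Expand 'a', 'a, c' or 'a, b, c' into a three-component tuple."""
    parts = [cast(item) for item in value.split(",")]
    if len(parts) == 1:
        a = parts[0]
        return (a, a, a)      # a x a x a
    if len(parts) == 2:
        a, c = parts
        return (a, a, c)      # a x a x c
    if len(parts) == 3:
        return tuple(parts)   # a x b x c
    raise ValueError("Expected one, two or three comma-separated numbers")


print(expand_grid("2"))
print(expand_grid("2, 3"))
print(expand_grid("2.0, 2.5", cast=float))  # same format for cell_spacing
```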

Initialise from CIF

When initialising from a CIF file, the file should be specified in the input file with the option “initial_structure_file”. A “.atoms” file is still required, with the atoms listed in the same order in both the “.atoms” file and the CIF file.

When initialising from a close–packed grid instead, two further options are available in addition to “grid_points” and “cell_spacing”. The stacking sequence is set with “cp_stacking_sequence”, using a string consisting of “A”, “B”, and “C” whose number of layers equals the final value in “grid_points”. The two–dimensional lattice of each layer is chosen from “oblique” or “centred_rectangular” using “cp_2d_lattice”.
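The constraint on the stacking sequence can be checked before a run with a small helper such as the one below. The name `check_stacking_sequence` is hypothetical, and the check covers only the two conditions stated above (allowed letters and number of layers).

```python
def check_stacking_sequence(sequence, grid_points):
    """Validate a close-packed stacking string against the anion grid."""
    if not sequence or set(sequence) - set("ABC"):
        raise ValueError("Stacking sequence may only contain A, B and C")
    if len(sequence) != grid_points[-1]:
        raise ValueError(
            f"{len(sequence)} layers given, but the final grid_points "
            f"value is {grid_points[-1]}"
        )
    return sequence


print(check_stacking_sequence("ABCABC", (2, 2, 6)))
```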


Vacancy Grids

ChemDASH gives the option of using a vacancy grid by setting the option “vacancy_grid” to True. A vacancy grid is a cubic grid of points placed onto the structure, with points that lie within a certain distance of an atom removed. The spacing of the vacancies is set with “vacancy_separation”, and the exclusion radius around each atom, within which the points on the vacancy grid are removed, is set using “vacancy_exclusion_radius”. If a vacancy grid is not used, then the leftover points from the initialisation grid are used as vacancies.
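The construction of a vacancy grid can be pictured with the following sketch, which lays a cubic grid over an orthorhombic cell and discards points that fall inside the exclusion radius of any atom. It is a simplified illustration rather than ChemDASH's routine: the function name `vacancy_grid` is hypothetical, positions are Cartesian, and periodic images of the atoms are ignored.

```python
import itertools
import math


def vacancy_grid(cell_lengths, atom_positions, separation, exclusion_radius):
    """Return grid points at least `exclusion_radius` from every atom."""
    counts = [max(1, int(round(length / separation))) for length in cell_lengths]
    points = []
    for index in itertools.product(*(range(n) for n in counts)):
        point = tuple(i * separation for i in index)
        # Keep the point only if no atom lies within the exclusion radius.
        if all(math.dist(point, atom) >= exclusion_radius for atom in atom_positions):
            points.append(point)
    return points


atoms = [(0.0, 0.0, 0.0)]
grid = vacancy_grid((4.0, 4.0, 4.0), atoms, separation=1.0, exclusion_radius=1.5)
print(len(grid), "vacancy points kept")
```

A smaller “vacancy_separation” gives a denser grid, while a larger “vacancy_exclusion_radius” removes more points around each atom.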


Structural Optimisation

Structural optimisation in ChemDASH is handled by either GULP or VASP. The desired software is set by the input option “calculator”, with “calculator_cores” used to set how many cores are desired for parallel calculations. The option “update_atoms” (default=True) is used to decide whether to swap atoms in optimised geometries (if True), or revert to the original, unoptimised geometry for the swap.

In ChemDASH, it is possible to run structural optimisations in a number of stages, with a different set of optimisation settings for each stage. For example, different stages of the calculation can be used to switch between conjugate gradient and BFGS algorithms, or to switch to higher precision parameters as the calculation progresses. The number of stages in the calculation is set with “num_calc_stages”, and ChemDASH provides the options to set GULP/VASP options for each stage of the calculation (see below).


GULP

The filepath of the GULP executable should be given as “gulp_executable” in the input file. The keywords to be applied to ALL stages of the GULP calculation are listed in the ChemDASH input file as “gulp_keywords”, whilst keywords to apply to a particular stage of the calculation are given as “gulp_calc_<number>_keywords” (e.g., “gulp_calc_1_keywords”). Similarly, for GULP options we use “gulp_options” for all stages and “gulp_calc_<number>_options” for a particular stage in the ChemDASH input file. Both keywords and options are given as comma–separated lists. When optimising using GULP, it is possible to terminate the calculation if the gnorm is above a certain value after a particular stage by giving a value for “gulp_calc_<number>_max_gnorm”.
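The relationship between the shared and per-stage settings can be sketched as follows. The helper `gulp_stage_settings` is hypothetical, and it is an assumption that the shared entries are simply placed before the stage-specific ones; ChemDASH may combine them differently.

```python
def gulp_stage_settings(options, kind, num_calc_stages):
    """Combine shared and per-stage GULP 'keywords' or 'options' lists."""
    def split(value):
        return [item.strip() for item in value.split(",") if item.strip()]

    shared = split(options.get(f"gulp_{kind}", ""))
    stages = []
    for stage in range(1, num_calc_stages + 1):
        specific = split(options.get(f"gulp_calc_{stage}_{kind}", ""))
        stages.append(shared + specific)
    return stages


options = {
    "gulp_keywords": "opti, c6, pot, conp",
    "gulp_options": "time 5 minutes",
    "gulp_calc_1_options": "stepmx 0.1",
    "gulp_calc_2_options": "stepmx 0.5, lbfgs_order 5000, maxcyc 1000",
}
for stage, settings in enumerate(gulp_stage_settings(options, "options", 2), start=1):
    print(stage, settings)
```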

For each GULP calculation, the GULP output files are saved as “structure_<number>_<stage>.<gin|got|res>”. The strings for each stage are given as a comma–separated list in the ChemDASH input option “gulp_files”. GULP uses force fields to optimise structures, and the file containing the force field for the calculation is given by the option “gulp_library”. If any elements in this force field use a shell, these elements need to be listed in the “gulp_shells” input option. GULP optimisations are at risk of running for an extremely long time, even with the GULP option “timeout” enabled. Therefore, there is a ChemDASH input option “calculator_time_limit” that can be used to terminate GULP calculations after the given number of seconds.
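The option list describes “calculator_time_limit” as being passed to the bash "timeout" command. A comparable hard limit can be sketched in pure Python with the `timeout` argument of `subprocess.run`. The helper name `run_with_time_limit` is hypothetical, and the Python interpreter stands in for a GULP command line here.

```python
import subprocess
import sys


def run_with_time_limit(command, time_limit):
    """Run `command`; return False if it was stopped at the time limit."""
    try:
        subprocess.run(
            command,
            timeout=time_limit,  # seconds; the child is killed on expiry
            check=False,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
    except subprocess.TimeoutExpired:
        return False
    return True


# Stand-in for a long optimisation: sleeps far longer than the limit.
slow_job = [sys.executable, "-c", "import time; time.sleep(10)"]
print(run_with_time_limit(slow_job, time_limit=0.5))
```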


VASP

The filepath of the VASP executable should be given as “vasp_executable” in the input file. The settings to be applied to ALL stages of the VASP calculation are listed in the ChemDASH input file as “vasp_settings”, whilst settings to apply to a particular stage of the calculation are given as “vasp_calc_<number>_settings” (e.g., “vasp_calc_1_settings”). The settings required for this input into ChemDASH are the contents of a VASP INCAR file. The format for VASP settings is that of a Python dictionary, which consists of a comma–separated list of “<key>:<value>” pairs. For example,

 vasp_settings=xc:PBE, prec:Normal, encut:600
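A value written in this “<key>:<value>” form maps naturally onto a Python dictionary, as in the sketch below. The function names are hypothetical and the conversion of numeric values is an assumption; ChemDASH's own handling of types may differ.

```python
def convert_value(text):
    """Turn '600' into 600 and '0.05' into 0.05; leave other text unchanged."""
    for cast in (int, float):
        try:
            return cast(text)
        except ValueError:
            pass
    return text


def parse_settings(value):
    """Parse 'key:value, key:value' into a dictionary."""
    settings = {}
    for pair in value.split(","):
        key, sep, val = pair.partition(":")
        if not sep:
            raise ValueError(f"Expected <key>:<value>, got: {pair.strip()!r}")
        settings[key.strip()] = convert_value(val.strip())
    return settings


print(parse_settings("xc:PBE, prec:Normal, encut:600"))
```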

The VASP k–points are provided to ChemDASH using the option “vasp_kpoints”, where one, two or three numbers can be provided to define a $k_{1}\times k_{2}\times k_{3}$ grid. For the pseudopotentials, the option “vasp_pp_dir” requires the filepath of the POTCAR directory. Any elements that do not use the standard POTCAR file should be listed in “vasp_pp_setups” with their extension (i.e., the characters after the chemical symbol), for example:

 vasp_pp_setups=Li:_sv, Mg:_pv

VASP optimisations are run until they successfully converge in a single self-consistent field loop, or until they hit the limit provided by the “vasp_max_convergence_calcs” option.

Swapping Atoms

The method of ranking atoms for directed swapping is controlled by the “atom_rankings” input option. For random swapping this should be set to “random”; otherwise set it to “bvs”, “site_pot” or “bvs+” for the respective methods of directed swapping. Note that “site_pot” and “bvs+” directed swapping are only supported for GULP, i.e., “calculator=gulp”.
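To give a feel for what a ranking of this kind involves, the sketch below orders atoms by how far a bond valence sum lies from the magnitude of the formal oxidation state, so that the worst-matched atoms come first. This is an assumed, simplified metric for illustration only and is not taken from the ChemDASH source; the function name and the numerical values are hypothetical.

```python
def rank_by_bvs(atoms):
    """Sort (label, oxidation_state, bond_valence_sum) tuples, worst fit first."""
    def deviation(atom):
        _, oxidation_state, bond_valence_sum = atom
        return abs(bond_valence_sum - abs(oxidation_state))

    return sorted(atoms, key=deviation, reverse=True)


# Hypothetical bond valence sums for three atoms in a structure.
atoms = [("Sr1", 2, 1.9), ("Ti1", 4, 2.8), ("O1", -2, 2.3)]
for label, _, _ in rank_by_bvs(atoms):
    print(label)
```

In a directed swap, atoms near the top of such a list would be the first candidates for swapping.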

When swapping atoms in ChemDASH, the first choice made is the swap group, which is the set of atoms available for swapping. The possible groups are:

  • cations – non–trivially swap a set of cations,

  • anions – non–trivially swap a set of anions,

  • atoms – non–trivially swap any atoms, but not vacancies,

  • all – non–trivially swap any atoms and vacancies,

  • atoms–vacancies – choose a set of atoms and swap each one with a vacancy.

where the first four groups are the default set of swap groups in ChemDASH. Note that the “all” group differs from the “atoms–vacancies” group in that the “all” group consists of atom–atom swaps and/or atom–vacancy swaps, whereas the “atoms–vacancies” group is restricted to atom–vacancy swaps. In addition, custom swap groups can be specified that restrict swaps to atoms of particular species. For example, “Sr–O” would restrict swaps to Sr and O atoms. Vacancies are denoted as “X”. Custom swap groups can be constructed from any combination of elements, provided there are at least two elements present in the swap group and all of the elements are present in the structure.

The choice of swap groups can be weighted by specifying the weight for each group in dictionary format. If weights are used, then a weight must be specified for each swap group. If no weights are specified, then all swap groups are equally likely to be chosen. An example of the “swap_groups” option is:

 swap_groups=cations:1, atoms:1, all:1, Sr-X:2

In this example, ChemDASH can choose between the cations, atoms, all, and Sr-X groups for each swap, with the Sr-X group being twice as likely to be chosen as the others.
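The weighted choice in this example can be reproduced with `random.choices` from the Python standard library. The sketch below is illustrative only, and the names `parse_swap_groups` and `choose_swap_group` are hypothetical.

```python
import random


def parse_swap_groups(value):
    """Parse 'cations:1, atoms:1' (weighted) or 'cations, atoms' (unweighted)."""
    items = [item.strip() for item in value.split(",") if item.strip()]
    weighted = [":" in item for item in items]
    if any(weighted) and not all(weighted):
        raise ValueError("Give a weight for every swap group, or for none")
    groups = {}
    for item in items:
        name, sep, weight = item.partition(":")
        groups[name.strip()] = float(weight) if sep else 1.0
    return groups


def choose_swap_group(groups, rng):
    """Pick one group with probability proportional to its weight."""
    names = list(groups)
    weights = [groups[name] for name in names]
    return rng.choices(names, weights=weights, k=1)[0]


groups = parse_swap_groups("cations:1, atoms:1, all:1, Sr-X:2")
total = sum(groups.values())
print({name: weight / total for name, weight in groups.items()})
print(choose_swap_group(groups, random.Random(0)))
```

With these weights the Sr-X group is chosen with probability 2/5 and each of the other three groups with probability 1/5.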

Full List of Input Options

| ChemDASH input file option | Description |
| --- | --- |
| atom_rankings | The metric used to rank atoms for swapping. Supported values are: "random" (default), "bvs", "bvs+", "site_pot". Note that site potential and bvs+ directed swapping are only supported for GULP. |
| atoms_file | File in which the species, number and oxidation state of the atoms used in this calculation are specified. |
| bvs_file | Raw Bond Valence Sum file for this calculation. Records the bond valence sum for the atoms in each structure. |
| calculator | The materials modelling code used for calculations. Default: gulp |
| calculator_cores | The number of parallel cores used for the calculator. Default: 1. |
| calculator_time_limit | Used in the bash "timeout" command. GULP calculations will automatically terminate after this amount of time has expired. |
| cell_spacing | The spacing between two ANION grid points. Default: 2.0 Å |
| converge_first_structure | If True, abort the run if the initial structure is not converged. Default: True |
| cp_2d_lattice | Lattice type for anion layers in close packed grids. Supported values are: "oblique" (default) and "centred_rectangular" |
| cp_stacking_sequence | Anion layer stacking sequence for close packed grids. |
| directed_num_atoms | For directed swapping, the number of extra atoms available to choose between from the top of the list for each species. Default: 0 |
| directed_num_atoms_increment | For directed swapping, the amount by which to increase (decrease) the number of extra values available to choose between from the top of the list for each species when a structure is (not) repeated. Default: 0 |
| dopable_atoms | A list of atoms that may be swapped out in a swap-dope run. Default: None |
| doping_threshold | The threshold which defines the probability of a doping step occurring at any given step. Default: 0.1 |
| energy_file | Energy file for this calculation. Records the structure number, energies and volumes of accepted structures. |
| energy_step_file | Energy step file for this calculation. Records the structure number, energies and volumes of accepted structures for plotting. |
| force_vacancy_swaps | If True, vacancies cannot swap with each other, they must be replaced by atoms. Default: True. |
| grid_points | The number of points on each dimension of the ANION grid, to form an a x b x c grid for anions (cation points defined by grid type). Default: 2x2x2 |
| grid_type | Initial layout of cation and anion grids. Supported values are "orthorhombic" (default), "rocksalt", "close_packed". |
| gulp_calc_1_keywords | Comma-separated list of keywords for first GULP calculation. Default: None |
| gulp_calc_1_max_gnorm | If specified, terminate a GULP calculation if the final gnorm exceeds this value after the first stage. |
| gulp_calc_1_options | Options for first GULP calculation. Default: None |
| gulp_calc_2_keywords | Comma-separated list of keywords for second GULP calculation. Default: None |
| gulp_calc_2_max_gnorm | If specified, terminate a GULP calculation if the final gnorm exceeds this value after the second stage. |
| gulp_calc_2_options | Options for second GULP calculation. Default: None |
| gulp_calc_3_keywords | Comma-separated list of keywords for third GULP calculation. Default: None |
| gulp_calc_3_max_gnorm | If specified, terminate a GULP calculation if the final gnorm exceeds this value after the third stage. |
| gulp_calc_3_options | Options for third GULP calculation. Default: None |
| gulp_calc_4_keywords | Comma-separated list of keywords for fourth GULP calculation. Default: None |
| gulp_calc_4_max_gnorm | If specified, terminate a GULP calculation if the final gnorm exceeds this value after the fourth stage. |
| gulp_calc_4_options | Options for fourth GULP calculation. Default: None |
| gulp_calc_5_keywords | Comma-separated list of keywords for fifth GULP calculation. Default: None |
| gulp_calc_5_max_gnorm | If specified, terminate a GULP calculation if the final gnorm exceeds this value after the fifth stage. |
| gulp_calc_5_options | Options for fifth GULP calculation. Default: None |
| gulp_executable | The filepath of the GULP executable to be used. Default: "./gulp". |
| gulp_files | Strings appended to each of the GULP files used to distinguish each calculation. |
| gulp_keywords | Comma-separated list of keywords for all GULP calculations. Default: "opti, pot" |
| gulp_library | Library file containing the forcefield to be used in GULP calculations. NOTE -- this takes precedence over a library specified in "gulp_options". |
| gulp_options | Options for all GULP calculations. Default: None |
| gulp_shells | List of atoms to have a shell attached. |
| ionic_convergence_steps | Integer defining the minimum number of ionic convergence steps required to achieve convergence. Default: 1 |
| initial_mag_moments | List of initial magnetic moments for a spin-polarised run. Default: None |
| initial_structure_file | If specified, read in the initial structure from this CIF file. |
| keep_mag_structure_constant | If True, magnetic structure for each optimisation is kept constant across crystal positions. Default: False |
| max_structures | This run of the code will terminate after this number of structures have been considered in this and all previous runs. |
| neighbourhood_atom_distance_limit | The minimum distance allowed between atoms in the local combinatorial neighbourhood method. Default: 1.0 |
| num_calc_stages | Number of GULP/VASP calculations to be run for each structure. Default: 1. |
| num_neighbourhood_points | The number of points used along each axis in the local combinatorial neighbourhood method. Default: 1 |
| num_structures | The number of structures we will consider in this run of the code. |
| number_weightings | The method used to construct the weightings used to choose the number of atoms to swap. Supported values are "arithmetic" (default), "geometric", "uniform", and "pinned_pair". |
| output_file | Output file for this calculation. Records the swaps for each structure, energies and acceptances. |
| output_trajectory | If True, write ASE trajectory files. Default: True |
| pair_weighting | The initial proportional probability of swapping 2 atoms compared to any other number when using the "pinned_pair" option for "number_weightings". Default: 1.0 |
| pair_weighting_scale_factor | The factor by which we increase the proportional probability of swapping 2 atoms compared to any other number when we explore new basins (we decrease by the inverse factor for repeated basins) when using the "pinned_pair" option for "number_weightings". Default: 1.0 |
| potential_derivs_file | Potential derivs file for this calculation. Records the resolved derivatives of the site potentials for each structure. |
| potentials_file | Potentials file for this calculation. Records the site potentials for each structure. |
| random_dopant_atoms | A pool of dopants to be randomly doped into the structure at random places. |
| random_seed | The value used to seed the random number generator. Alternatively, the code can generate one itself, which is the default behaviour. |
| restart | If True, use data in a numpy archive (specified by restart_file keyword) to continue a previous run. Default: False |
| restart_file | Name of the numpy archive from which to read data in order to continue a previous run. |
| rng_warm_up | Number of values from the RNG to generate and discard after seeding the generator. Default: 0. |
| save_outcar | If True, retain the final OUTCAR file from each structure optimised with VASP as "OUTCAR_[structure_index]". Default: False. |
| search_local_neighbourhood | If True, uses the local combinatorial neighbourhood method to try and lower the energy of structures prior to relaxation. Default: False |
| seed_bits | The number of bits used in the seed of the random number generator. The allowed values are 32 and 64. Default: 64 |
| solid_solution_end_members | A series of entries of end members in a solid solution in the form "label: energy". These are needed for calculating energy differences between structures of varied compositions. |
| swap_groups | The groups of atoms that can be involved in swaps. The default groups are: "cations", "anions", "atoms", and "all" (atoms and vacancies). The input can include these, the additional swap group "atoms-vacancies" (always swap atoms with vacancies), or any custom swap group in the format [Chemical Symbol]-[Chemical Symbol]-[Chemical Symbol]..., e.g., "Sr-X". A weighting can also be specified for each group as follows: "cations:1, atoms:2, all:3". |
| swap_magnetic_moments | If True, the magnetic moments are swapped while atomic positions of atoms are held constant throughout the ChemDASH run. Default: False |
| temp | The Monte-Carlo temperature (strictly, the value of kT in eV). Determines whether swaps to basins of higher energy are accepted. Default: 0.0 |
| temp_scale_factor | The factor by which we increase the temperature after rejected structures (we decrease by the inverse factor for accepted structures). Default: 1.0 |
| testing | If True, no optimisation calculations are run (by accepting all structures regardless). This is useful for testing of swap groups, doping schemes and magnetic structures without waiting for long optimisation times. Default: False |
| update_atoms | If True, swap atoms based on relaxed structures, rather than initial structures. Default: True. |
| vacancy_exclusion_radius | The minimum allowable distance between an atom and a vacancy on the vacancy grid. Default: 2.0 Å |
| vacancy_grid | If True, apply vacancy grids to each structure in which we will swap atoms. Default: True. |
| vacancy_separation | The nearest neighbour distance between two vacancies on the vacancy grid. Default: 1.0 Å |
| vasp_calc_1_settings | Settings for the first stage of the VASP calculation. Default: None. |
| vasp_calc_2_settings | Settings for the second stage of the VASP calculation. Default: None. |
| vasp_calc_3_settings | Settings for the third stage of the VASP calculation. Default: None. |
| vasp_calc_4_settings | Settings for the fourth stage of the VASP calculation. Default: None. |
| vasp_calc_5_settings | Settings for the fifth stage of the VASP calculation. Default: None. |
| vasp_executable | The filepath of the VASP executable to be used. Default: "./vasp" |
| vasp_kpoints | Number of k points to use in VASP calculations. Default: 1 |
| vasp_max_convergence_calcs | Maximum number of VASP calculations performed in the final stage for convergence -- we abandon the calculation after this. Default: 10. |
| vasp_pp_dir | Path to directory containing VASP pseudopotential files. Default: ".". |
| vasp_pp_setups | Pseudopotential file extensions for each element. |
| vasp_settings | Settings for all VASP calculations. Default: None. |
| verbosity | Controls the level of detail in the output. Valid options are: "verbose", "terse". Default: "verbose" |

Notes on Magnetism

ChemDASH in its original form does not treat magnetic systems. This magnetic development version adds several ways of handling magnetism, controlled by the following keywords:

| ChemDASH input file option | Description |
| --- | --- |
| initial_mag_moments | Specifies the initial magnetic moments for VASP calculations. Can be specified as a Python list or in the VASP comma-separated form. Default: None |
| keep_mag_structure_constant | If True, magnetic moments are fixed to their initial site positions as atoms are swapped. Default: False |
| swap_magnetic_moments | If True, atomic positions are fixed and magnetic moments are swapped. Default: False |

By default, the magnetic moments are assigned to the atoms in the initial structure, and the moments follow the atoms as they are swapped. Setting keep_mag_structure_constant fixes the magnetic moments to their original crystal sites. Setting swap_magnetic_moments allows ChemDASH to be used as a magnetic swapping algorithm, although this is at a preliminary stage and should not be used at present.
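The difference between the default behaviour and keep_mag_structure_constant can be seen in a two-atom example. The function `swap_atoms` below is a hypothetical illustration using plain lists, not the ChemDASH or ASE data structures, and the moments are arbitrary values.

```python
def swap_atoms(symbols, moments, i, j, keep_mag_structure_constant=False):
    """Swap atoms i and j; moments follow the atoms unless held on sites."""
    symbols, moments = list(symbols), list(moments)
    symbols[i], symbols[j] = symbols[j], symbols[i]
    if not keep_mag_structure_constant:
        # Default: each moment travels with its atom.
        moments[i], moments[j] = moments[j], moments[i]
    return symbols, moments


symbols, moments = ["Fe", "Ni"], [5.0, 2.0]
print(swap_atoms(symbols, moments, 0, 1))
print(swap_atoms(symbols, moments, 0, 1, keep_mag_structure_constant=True))
```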

A custom version of the ASE module write_cif is included in this version of ChemDASH so that the final magnetic structure can be written to the CIF file. This can be visualised with the VESTA programme.

Solid solutions

Random doping has been implemented in ChemDASH to allow for variable compositions to be explored as atoms are swapped. This implementation was driven by the interest in mixed transition metal oxides of the spinel structure which show different occupancies of octahedral and tetrahedral sites depending on the composition of the spinel structure. By giving a certain probability to random doping at any given step, the idea is that the phase space of these structures can be explored more efficiently than by running separate runs at different compositions.

| ChemDASH input file option | Description |
| --- | --- |
| dopable_atoms | A list of species that are available for doping in the solid solution routine |
| doping_threshold | The threshold which defines the time averaged proportion of dopes over swaps. Default: 0.1 |
| random_dopant_atoms | A pool of dopants to be randomly doped into the structure at random sites within the constraints of dopable_atoms |
| solid_solution_end_members | Entries of end members of the solid solution in the form species: energy |

The implementation is defined by four main parameters: dopable_atoms, doping_threshold, random_dopant_atoms and solid_solution_end_members. The doping pool, random_dopant_atoms, is specified as a list of length n of all possible atoms that can be doped into the structure. The dopable_atoms parameter defines all possible species that can be involved in the doping procedure. When a doping step occurs, the atom that is doped out of the structure takes the place, in the doping pool, of the atom that is doped in. This means that the composition changes are reversible. The choice of atom to dope in is random. The doping_threshold defines the probability that a ChemDASH step is a dope rather than a swap. The solid_solution_end_members define the energies for use in a doping step.
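The doping step described above can be sketched as an exchange between the structure and the doping pool. This is an illustration under stated assumptions rather than the ChemDASH routine: the function names are hypothetical, the species are arbitrary, and the site to dope is assumed to be chosen uniformly at random among the dopable atoms.

```python
import random


def choose_move(doping_threshold, rng):
    """Decide whether this step is a dope or an ordinary swap."""
    return "dope" if rng.random() < doping_threshold else "swap"


def dope(structure, pool, dopable_atoms, rng):
    """Exchange one dopable atom in the structure with one atom from the pool."""
    sites = [i for i, symbol in enumerate(structure) if symbol in dopable_atoms]
    if not sites or not pool:
        return list(structure), list(pool)
    site = rng.choice(sites)
    k = rng.randrange(len(pool))
    new_structure, new_pool = list(structure), list(pool)
    # The atom doped out takes the pool slot of the atom doped in,
    # so a later doping step can reverse the change.
    new_structure[site], new_pool[k] = pool[k], structure[site]
    return new_structure, new_pool


rng = random.Random(0)
structure = ["Co", "Co", "Fe", "O", "O", "O", "O"]
pool = ["Fe", "Mn"]
if choose_move(doping_threshold=1.0, rng=rng) == "dope":
    structure, pool = dope(structure, pool, {"Co", "Fe", "Mn"}, rng)
print(structure, pool)
```

Because the exchange conserves the combined contents of the structure and the pool, every composition reached can be undone by a later doping step.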


