Career planning is easy with the right tool. In the EU, such a tool is supposed to be ResearchCOMP … well, I have just made it interactive and useful at https://vladislavivanistsev.github.io/researchcomp.
There you can make a self-evaluation of your skills and competences. Then you can copy, email, or download the summary to use it in planning your career.
This semester I am co-organising a seminar on computer simulations (3 ECTS, LOTI.05.076). One of its aims is to gather and unite researchers from different institutes. Our common topic is using computers in research, so we are “simulants”, i.e. simulating reality via calculations. Some of the core organisers are pictured in the centre, from left to right: Taavi Repän, Tauno Tiirats, Veronika Zadin, and Juhan Matthias Kahk.
My first talk was about running simulations on HPC, e.g. using Apptainer containers. Probably because of the free pizza, there were two to three dozen participants from the institutes of Chemistry, Physics, and Technology, which is a surprisingly high number for the University of Tartu. It is a great start, and I am looking forward to contributing more to strengthening the collaboration between the institutes.
Updated colors for atoms in ASE in 2024 look like this:
For POV-Ray rendering there are several texture options: ASE2, ASE3, Glass, Glass2, Intermediate, JMOL, Pale, Simple, VMD. I like Intermediate because it has no reflections and glare.
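Texture selection happens when exporting the scene; here is a minimal sketch with a recent ASE (the structure and file names are placeholders):
from ase.build import molecule
from ase.io import write

atoms = molecule('H2O')  # placeholder structure
# One texture name per atom; 'intermediate' gives a matte look
# without reflections and glare.
write('h2o.pov', atoms,
      povray_settings={'textures': ['intermediate'] * len(atoms)})
# Render the generated scene with POV-Ray, e.g.: povray h2o.ini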
Working with cube files can be tedious. I needed to show the change in the electronic density of a MOF. For that I made two cubes, for the neutral and the charged MOF, and then took their difference using cube_tools, like this.
import numpy as np
from cube_tools import cube

# Load the cube files for the neutral and the charged MOF
cube1 = cube('mof_opt_0.0.cube')
cube2 = cube('mof_opt_2.0.cube')
# Subtract the data of cube1 from the data of cube2
cube_diff = cube('mof_opt_2.0.cube')
cube_diff.data = cube2.data - cube1.data
# Find the grid indices along z above 13.3 (the jellium density region)
z_indices_above_threshold = np.where(cube_diff.Z > 13.3)[0]
# Remove the densities above z = 13.3 by setting them to zero
for idx in z_indices_above_threshold:
    cube_diff.data[:, :, idx] = 0
# Save the modified cube data to a new file
cube_diff.write_cube('cdd.cube')
Once I got the charge density difference and opened it in VMD, I realised that one part of my MOF sits right at the border of the periodic cell, so that part of the density was split. So, I used a terminal command to shift the cube, like this: “cube_tools -t -24 -36 0 cdd.cube”. I then had to shift the atomic positions manually, taking the voxel size into account (see the sketch below). The next challenge was hiding half of my MOF to clear the view, which I did with VMD’s Tcl selection syntax.
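The atom shift can be scripted as well; a minimal sketch with ASE’s cube helpers (the -24/-36 voxel shift matches the command above, and the output file name is mine):
import numpy as np
from ase.io.cube import read_cube_data, write_cube

# Read the translated density and the (still unshifted) atoms
data, atoms = read_cube_data('cdd.cube')
# Voxel size along x, y, z: cell lengths divided by the number of grid points
voxel = atoms.cell.lengths() / np.array(data.shape)
# Shift the atoms by the same -24/-36 voxels as cube_tools did for the density
atoms.positions += [-24 * voxel[0], -36 * voxel[1], 0.0]
atoms.wrap()  # wrap the shifted atoms back into the periodic cell
with open('cdd_shifted.cube', 'w') as fd:
    write_cube(fd, atoms, data)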
GFN2-xTB [10.1021/acs.jctc.8b01176] is a strange model. I have been testing GFN1 and GFN2 on OOH adsorption on Pt(111). GFN1 from TBLITE with ASE works well: it converges and optimizes to meaningful structures. GFN2, however, behaves oddly in terms of convergence and optimization; for instance, the O–H bond breaks. I have also tested GFN2 with xtb, whose input is quite complicated in comparison to ASE inputs. Anyway, it worked only when I specified the periodic boundary conditions in both the xtb.inp and Pt-OOH.coord files, and then executed xtb with those two files.
I wish everyone a Merry Christmas and a Happy New Year!
As a present, let me share the discovery of this year.
Ferdium is a program that combines all messengers in a single window! For years I tried to separate work and life by using different messengers. For work, I used fleep.io. Unfortunately, they decided to close all freemium accounts and raise their prices this year. So, I switched to other messengers and eventually mixed them up. Luckily, I found Ferdium! Just see my screenshot – all messengers in one app:
Go to ferdium.org to get it.
By the way, Opera provides similar functionality, but it does not include as many apps. For example, it does not have Element.
That worked but felt way too complicated … like I am not going to use it on a daily basis. It also reminded me of my very first experience with Meta AI in late 2022 (which everyone has already forgotten).
There is undeservedly less attention paid to optimizers than to density functionals (think of Jacob’s ladder). Optimizers are not even covered in the recent review Best-Practice DFT Protocols for Basic Molecular Computational Chemistry. At the same time, in my current projects the most resource-demanding part was geometry optimization – the time spent on optimizing structures was much longer than that of a single-point calculation. Papers that introduce new (AI-based) optimizers promise significant speed-ups. However, there are always some problems:
The tested systems are different from my electrochemical interfaces.
The code is not available or difficult to install.
The code is outdated and contains bugs.
The new optimizers perform worse than the common ones, like QuasiNewton in ASE.
The ASE wiki lists all internal and some external optimizers and provides a comparison. I have checked the most promising ones on a high-entropy alloy slab.
Observation 1. QuasiNewton outperforms all other optimizers. Full stop. I have run a standard GPAW/DFT/PBE/PW optimization with various optimizers:
Observation 2. Pre-optimizing the slab with a cheaper method does not reduce the number of optimization steps. I preoptimized the geometry with TBLITE/DFTB/GFN1-xTB and then continued with GPAW/DFT/PBE/PW. The preoptimization takes just a few minutes, and the obtained geometry looks similar to the DFT one, but it does not reduce the number of DFT optimization steps.
Optimizer     N steps*   Time$       N steps*#   Total time#
BFGS          16         02:44:27    17          03:01:26
LBFGS         15         02:30:35    16          02:55:04
BondMin       12         02:46:27    13          02:45:07
GPMin         12         05:26:23    31          08:14:22
MLMin         38         very long   28          12:31:29
FIRE          38         05:06:56    44          05:56:54
QuasiNewton   8          01:36:23    9           02:00:10
Note * – the printed number of steps might differ from the actual number of calculations because each optimizer reports that number in its own way.
Note $ – the time between the end of the first and last steps.
Note # – started from the TBLITE/DFTB/GFN1-xTB preoptimized geometry.
N.B.! I have run each test only once, in two runs: one starting from slab.xyz and one from the preoptimized geometry. The two runs were on similar nodes, and within each run all optimizations were done on the same node.
Conclusion. Do not believe the claims in articles advertising new optimizers – run your own tests before using them.
A practical finding. The usual problem with calculations that require many optimization steps is that they need to fit into HPC time limits. On restart, ASE usually overwrites the trajectory. Some optimizers (GPMin and the AI-based ones) could benefit from reading the full trajectory. So, I started writing two trajectories and a restart file, like this.
import os
from ase.io import read, Trajectory
from ase.optimize import QuasiNewton
from ase.parallel import parprint
from gpaw import restart

name = 'slab'  # job-name prefix (hypothetical; set it to your system's name)
# Restarting
if os.path.exists(f'{name}_last.gpw') and os.stat(f'{name}_last.gpw').st_size > 0:
    atoms, calc = restart(f'{name}_last.gpw', txt=None)
    parprint('Restart from the gpw geometry.')
elif os.path.exists(f'{name}_full.traj') and os.stat(f'{name}_full.traj').st_size > 0:
    atoms = read(f'{name}_full.traj', -1)
    parprint('Restart with the traj geometry.')
else:
    atoms = read(f'{name}_init.xyz')
    parprint('Start with the initial xyz geometry.')
# When not restarting from a .gpw file, the calculator still has to be
# created and attached to atoms here (as in the full scripts below).
# Optimizing: the default trajectory is overwritten on restart,
# while {name}_full.traj accumulates all steps in append mode.
opt = QuasiNewton(atoms, trajectory=f'{name}.traj', logfile=f'{name}.log')
traj = Trajectory(f'{name}_full.traj', 'a', atoms)
opt.attach(traj.write, interval=1)

def writegpw():
    calc.write(f'{name}_last.gpw')

opt.attach(writegpw, interval=1)
opt.run(fmax=0.05, steps=42)
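With the steps limit, the job stops cleanly before hitting the HPC time limit; on resubmission, the script continues from {name}_last.gpw (or, failing that, from the last frame of {name}_full.traj), and the full trajectory keeps accumulating across restarts.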
Here are some details on the tests.
My gpaw_opt.py for DFT calculations on 24 cores:
# Load modules
from ase import Atom, Atoms
from ase.build import add_adsorbate, fcc100, fcc110, fcc111, fcc211, molecule
from ase.calculators.mixing import SumCalculator
from ase.constraints import FixAtoms, FixedPlane, FixInternals
from ase.data.vdw_alvarez import vdw_radii
from ase.db import connect
from ase.io import write, read
from ase.optimize import BFGS, GPMin, LBFGS, FIRE, QuasiNewton
from ase.parallel import parprint
from ase.units import Bohr
from bondmin import BondMin
from catlearn.optimize.mlmin import MLMin
from dftd4.ase import DFTD4
from gpaw import GPAW, PW, FermiDirac, PoissonSolver, Mixer, restart
from gpaw.dipole_correction import DipoleCorrection
from gpaw.external import ConstantElectricField
from gpaw.utilities import h2gpts
import numpy as np
import os
atoms = read('slab.xyz')
atoms.set_constraint([FixAtoms(indices=[atom.index for atom in atoms if atom.tag in [1,2]])])
# Set calculator
kwargs = dict(poissonsolver={'dipolelayer': 'xy'},
              xc='RPBE',
              kpts=(4, 4, 1),
              gpts=h2gpts(0.18, atoms.get_cell(), idiv=4),
              mode=PW(400),
              basis='dzp',
              parallel={'augment_grids': True, 'sl_auto': True, 'use_elpa': True},
              )
calc = GPAW(**kwargs)
#atoms.calc = SumCalculator([DFTD4(method='RPBE'), calc])
#atoms.calc = calc
# Optimization parameters
maxf = 0.05
# Run optimization
###############################################################################
# 2.A. Optimize structure using MLMin (CatLearn).
initial_mlmin = atoms.copy()
initial_mlmin.set_calculator(calc)
mlmin_opt = MLMin(initial_mlmin, trajectory='results_mlmin.traj')
mlmin_opt.run(fmax=maxf, kernel='SQE', full_output=True)
# 2.B Optimize using GPMin.
initial_gpmin = atoms.copy()
initial_gpmin.set_calculator(calc)
gpmin_opt = GPMin(initial_gpmin, trajectory='results_gpmin.traj', logfile='results_gpmin.log', update_hyperparams=True)
gpmin_opt.run(fmax=maxf)
# 2.C Optimize using LBFGS.
initial_lbfgs = atoms.copy()
initial_lbfgs.set_calculator(calc)
lbfgs_opt = LBFGS(initial_lbfgs, trajectory='results_lbfgs.traj', logfile='results_lbfgs.log')
lbfgs_opt.run(fmax=maxf)
# 2.D Optimize using FIRE.
initial_fire = atoms.copy()
initial_fire.set_calculator(calc)
fire_opt = FIRE(initial_fire, trajectory='results_fire.traj', logfile='results_fire.log')
fire_opt.run(fmax=maxf)
# 2.E Optimize using QuasiNewton.
initial_qn = atoms.copy()
initial_qn.set_calculator(calc)
qn_opt = QuasiNewton(initial_qn, trajectory='results_qn.traj', logfile='results_qn.log')
qn_opt.run(fmax=maxf)
# 2.F Optimize using BFGS.
initial_bfgs = atoms.copy()
initial_bfgs.set_calculator(calc)
bfgs_opt = BFGS(initial_bfgs, trajectory='results_bfgs.traj', logfile='results_bfgs.log')
bfgs_opt.run(fmax=maxf)
# 2.G. Optimize structure using BondMin.
initial_bondmin = atoms.copy()
initial_bondmin.set_calculator(calc)
bondmin_opt = BondMin(initial_bondmin, trajectory='results_bondmin.traj',logfile='results_bondmin.log')
bondmin_opt.run(fmax=maxf)
# Summary of the results
###############################################################################
fire_results = read('results_fire.traj', ':')
parprint('Number of function evaluations using FIRE:', len(fire_results))
lbfgs_results = read('results_lbfgs.traj', ':')
parprint('Number of function evaluations using LBFGS:', len(lbfgs_results))
gpmin_results = read('results_gpmin.traj', ':')
parprint('Number of function evaluations using GPMin:', gpmin_opt.function_calls)
bfgs_results = read('results_bfgs.traj', ':')
parprint('Number of function evaluations using BFGS:', len(bfgs_results))
qn_results = read('results_qn.traj', ':')
parprint('Number of function evaluations using QN:', len(qn_results))
catlearn_results = read('results_mlmin.traj', ':')
parprint('Number of function evaluations using MLMin:', len(catlearn_results))
bondmin_results = read('results_bondmin.traj', ':')
parprint('Number of function evaluations using BondMin:', len(bondmin_results))
My tblite_opt.py for the DFTB calculation on just one core. It takes a few minutes but eventually crashes 🙁
# Load modules
from ase import Atom, Atoms
from ase.build import add_adsorbate, fcc100, fcc110, fcc111, fcc211, molecule
from ase.calculators.mixing import SumCalculator
from ase.constraints import FixAtoms, FixedPlane, FixInternals
from ase.data.vdw_alvarez import vdw_radii
from ase.db import connect
from ase.io import write, read
from ase.optimize import BFGS, GPMin, LBFGS, FIRE, QuasiNewton
from ase.parallel import parprint
from ase.units import Bohr
from tblite.ase import TBLite
import numpy as np
import os
# https://tblite.readthedocs.io/en/latest/users/ase.html
atoms = read('slab.xyz')
atoms.set_constraint([FixAtoms(indices=[atom.index for atom in atoms if atom.tag in [1,2]])])
# Set calculator
calc = TBLite(method="GFN1-xTB", accuracy=1000, electronic_temperature=300, max_iterations=300)
atoms.set_calculator(calc)
qn_opt = QuasiNewton(atoms, trajectory='results_qn.traj', logfile='results_qn.log', maxstep=0.1)
qn_opt.run(fmax=0.1)
To compare structures I have used MDAnalysis, which unfortunately does not work with ASE traj files, so I prepared xyz files with “ase convert -n -1 file.traj file.xyz”.
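The comparison itself is then just a few lines; a sketch, with illustrative file names (superposition=True aligns the two structures before computing the RMSD):
import MDAnalysis as mda
from MDAnalysis.analysis import rms

# Illustrative file names: final frames exported with "ase convert"
dft = mda.Universe('dft_final.xyz')
dftb = mda.Universe('dftb_final.xyz')
# RMSD in Å between the two final geometries
print(rms.rmsd(dft.atoms.positions, dftb.atoms.positions, superposition=True))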
Between installation with conda and compilation of the libraries, there is an intermediate path – installing GPAW with pip – a compromise for those who wish to test specific GPAW branches or packages.
For example, I wish to test the self-interaction correction (SIC) and evaluate Bader charges with pybader. Neither SIC nor pybader is compatible with the most recent GPAW. Here is how to get a workable version.
# numba in pybader is not compatible with python 3.11, so create a conda environment with python 3.10
conda create -n gpaw-pip python=3.10
conda activate gpaw-pip
conda install -c conda-forge libxc libvdwxc
conda install -c conda-forge ase
# ensure that you install the right openmpi (not external)
conda install -c conda-forge openmpi ucx
conda install -c conda-forge compilers
conda install -c conda-forge openblas scalapack
conda install -c conda-forge pytest
pip install pybader
# Get a developer version of GPAW with SIC
git clone -b dm_sic_mom_update https://gitlab.com/alxvov/gpaw.git
cd gpaw
cp siteconfig_example.py siteconfig.py
# In siteconfig.py, set:
#   fftw = True
#   scalapack = True
#   if scalapack:
#       libraries += ['scalapack']
unset CC
python -m pip install -e .
gpaw info
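If the build succeeded, gpaw info should list the linked libraries (libxc, FFTW, ScaLAPACK, Elpa) and the MPI setup, which makes it easy to verify that the conda-provided openmpi was actually picked up.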
A rule of thumb for choosing the initial k-point sampling is that the product ka of the number of k-points k in any direction and the length a of the basis vector in that direction should be:
ka ~ 30 Å for d-band metals
ka ~ 25 Å for simple metals
ka ~ 20 Å for semiconductors
ka ~ 15 Å for insulators
Remember that convergence with respect to this parameter should always be checked.
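In practice, the rule is a one-liner; a sketch (kpts_from_ka is my helper name, not an ASE function):
import math

def kpts_from_ka(cell_lengths, ka=30.0):
    """Rule-of-thumb grid: k_i = ceil(ka / a_i), with ka and a_i in Å."""
    return [max(1, math.ceil(ka / a)) for a in cell_lengths]

# Example: a d-band metal slab in an 8.3 x 8.3 x 30 Å cell,
# with a single k-point along the non-periodic z direction
kpts = kpts_from_ka([8.3, 8.3, 30.0])
kpts[2] = 1
print(kpts)  # [4, 4, 1]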
With the recent update, I can start using kplib (see the paper) to choose optimal generalized k-point grids. The main parameter in kplib is min_distance, which is analogous to the k-point density times 2π. Read more about min_distance at muellergroup.jhu.edu/K-Points.html.
My work was supported by the Estonian Research Council under grants PUT1107, PRG259, and STP52. My research received funding from the European Union’s Horizon 2020 research and innovation programme under the Marie Skłodowska-Curie grant agreement No 101031656. All related posts are tagged with MSCA.