DFT geometry optimizers

These are undeservedly less attention to optimizers than density functionals (concerning Jacob’s ladder). It is not even covered in the recent review: Best-Practice DFT Protocols for Basic Molecular Computational Chemistry. At the same time, in my current projects, the most resource-demanding was geometry optimization – the time spent on optimizing structures was much longer than a single-point calculation. Papers that introduce new (AI-based) optimizers promise significant speed-up. However, there are always some problems:

  1. The tested systems are different from my electrochemical interfaces.
  2. The code is not available or difficult to install.
  3. The code is outdated and contains bugs.
  4. Optimizers perform worse than the common ones, like QuasiNewton in ASE.

ASE wiki lists all internal and some external optimizers and provides their comparison. I have checked the most promising on a high-entropy alloy slab.

Observation 1. QuasiNewton outperforms all other optimizers. Point. I have run a standard GPAW/DFT/PBE/PW optimization with various optimizers:

Observation 2. Pre-optimizing the slab with a cheaper method does not reduce the number of optimization steps. I have preoptimized the geometry with TBLITE/DFTB/GFN1-xTB to continue with GPAW/DFT/PBE/PW. Preoptimization takes just some minutes and the obtained geometry looks similar to the DFT one but that does not reduce the number of DFT optimization steps.

OptimizerN steps*Time$N steps*#Total time#
BFGS1602:44:271703:01:26
LBFGS1502:30:351602:55:04
BondMin1202:46:271302:45:07
GPMin1205:26:233108:14:22
MLMin38verylong2812:31:29
FIRE3805:06:564405:56:54
QuasiNewton801:36:23902:00:10

Note * – the printed number of steps might different from the actuall number of calculations because each calculator has a different way of reporting that number.

Note $ – the time between the end of the first and last steps.

Note # – started from the TBLITE/DFTB/GFN1-xTB preoptimized geometry.

N.B! I have done my test only once in two runs: starting with slab.xyz and preoptized geometry. Runs were on similar nodes and all optimizations were done on the same node.

Conclusion. Do not believe in claims in articles advertizing new optimizers – Run your tests before using them.

A practical finding. The usual problem with calculations that require many optimization steps is that they need to fit into HPC time limits. On the restart, ASE usually rewrites the trajectory. Some optimizers (GPMin and AI-based) could benefit from reading the full trajectory. So, I started writing two trajectories and a restart file like this.

# Restarting
if os.path.exists(f'{name}_last.gpw') == True and os.stat(f'{name}_last.gpw').st_size > 0:
    atoms,calc = restart(f'{name}_last.gpw', txt=None)
    parprint(f'Restart from the gpw geometry.')
elif os.path.exists(f'{name}_full.traj') == True and os.stat(f'{name}_full.traj').st_size > 0:
    atoms = read(f'{name}_full.traj',-1)
    parprint(f'Restart with the traj geometry.')
else:
    atoms = read(f'{name}_init.xyz')
    parprint(f'Start with the initial xyz geometry.')

# Optimizing
opt = QuasiNewton(atoms, trajectory=f'{name}.traj', logfile=f'{name}.log')
traj= Trajectory(f'{name}_full.traj', 'a', atoms)
opt.attach(traj.write, interval=1)
def writegpw():
    calc.write(f'{name}_last.gpw')
opt.attach(writegpw, interval=1)
opt.run(fmax=0.05, steps=42)

Here are some details on the tests.

My gpaw_opt.py for DFT calculations on 24 cores:

# Load modules
from ase import Atom, Atoms
from ase.build import add_adsorbate, fcc100, fcc110, fcc111, fcc211, molecule
from ase.calculators.mixing import SumCalculator
from ase.constraints import FixAtoms, FixedPlane, FixInternals
from ase.data.vdw_alvarez import vdw_radii
from ase.db import connect
from ase.io import write, read
from ase.optimize import BFGS, GPMin, LBFGS, FIRE, QuasiNewton
from ase.parallel import parprint
from ase.units import Bohr
from bondmin import BondMin
from catlearn.optimize.mlmin import MLMin
from dftd4.ase import DFTD4
from gpaw import GPAW, PW, FermiDirac, PoissonSolver, Mixer, restart
from gpaw.dipole_correction import DipoleCorrection
from gpaw.external import ConstantElectricField
from gpaw.utilities import h2gpts
import numpy as np
import os

atoms = read('slab.xyz')
atoms.set_constraint([FixAtoms(indices=[atom.index for atom in atoms if atom.tag in [1,2]])])

# Set calculator
kwargs = dict(poissonsolver={'dipolelayer':'xy'},
              xc='RPBE',
              kpts=(4,4,1),
              gpts=h2gpts(0.18, atoms.get_cell(), idiv=4),
              mode=PW(400),
              basis='dzp',
              parallel={'augment_grids':True,'sl_auto':True,'use_elpa':True},
             )
calc = GPAW(**kwargs)

#atoms.calc = SumCalculator([DFTD4(method='RPBE'), calc])
#atoms.calc = calc

# Optimization paramters
maxf = 0.05

# Run optimization
###############################################################################

# 2.A. Optimize structure using MLMin (CatLearn).
initial_mlmin = atoms.copy()
initial_mlmin.set_calculator(calc)
mlmin_opt = MLMin(initial_mlmin, trajectory='results_mlmin.traj')
mlmin_opt.run(fmax=maxf, kernel='SQE', full_output=True)

# 2.B Optimize using GPMin.
initial_gpmin = atoms.copy()
initial_gpmin.set_calculator(calc)
gpmin_opt = GPMin(initial_gpmin, trajectory='results_gpmin.traj', logfile='results_gpmin.log', update_hyperparams=True)
gpmin_opt.run(fmax=maxf)

# 2.C Optimize using LBFGS.
initial_lbfgs = atoms.copy()
initial_lbfgs.set_calculator(calc)
lbfgs_opt = LBFGS(initial_lbfgs, trajectory='results_lbfgs.traj', logfile='results_lbfgs.log')
lbfgs_opt.run(fmax=maxf)

# 2.D Optimize using FIRE.
initial_fire = atoms.copy()
initial_fire.set_calculator(calc)
fire_opt = FIRE(initial_fire, trajectory='results_fire.traj', logfile='results_fire.log')
fire_opt.run(fmax=maxf)

# 2.E Optimize using QuasiNewton.
initial_qn = atoms.copy()
initial_qn.set_calculator(calc)
qn_opt = QuasiNewton(initial_qn, trajectory='results_qn.traj', logfile='results_qn.log')
qn_opt.run(fmax=maxf)

# 2.F Optimize using BFGS.
initial_bfgs = atoms.copy()
initial_bfgs.set_calculator(calc)
bfgs_opt = LBFGS(initial_bfgs, trajectory='results_bfgs.traj', logfile='results_bfgs.log')
bfgs_opt.run(fmax=maxf)

# 2.G. Optimize structure using BondMin.
initial_bondmin = atoms.copy()
initial_bondmin.set_calculator(calc)
bondmin_opt = BondMin(initial_bondmin, trajectory='results_bondmin.traj',logfile='results_bondmin.log')
bondmin_opt.run(fmax=maxf)

# Summary of the results
###############################################################################

fire_results = read('results_fire.traj', ':')
parprint('Number of function evaluations using FIRE:',
         len(fire_results))

lbfgs_results = read('results_lbfgs.traj', ':')
parprint('Number of function evaluations using LBFGS:',
         len(lbfgs_results))

gpmin_results = read('results_gpmin.traj', ':')
parprint('Number of function evaluations using GPMin:',
         gpmin_opt.function_calls)

bfgs_results = read('results_bfgs.traj', ':')
parprint('Number of function evaluations using BFGS:',
         len(bfgs_results))

qn_results = read('results_qn.traj', ':')
parprint('Number of function evaluations using QN:',
         len(qn_results))

catlearn_results = read('results_mlmin.traj', ':')
parprint('Number of function evaluations using MLMin:',
         len(catlearn_results))

bondmin_results = read('results_bondmin.traj', ':')
parprint('Number of function evaluations using BondMin:',
         len(bondmin_results))

Initial slab.xyz file:

45
Lattice="8.529357696932532 0.0 0.0 4.264678848466266 7.386640443507905 0.0 0.0 0.0 29.190908217261956" Properties=species:S:1:pos:R:3:tags:I:1 pbc="T T F"
Ir       0.00000000       1.62473838      10.00000000        5
Ru       2.81412943       1.62473838      10.00000000        5
Pt       5.62825885       1.62473838      10.00000000        5
Pd       1.40706471       4.06184595      10.00000000        5
Ag       4.22119414       4.06184595      10.00000000        5
Ag       7.03532356       4.06184595      10.00000000        5
Ag       2.81412943       6.49895353      10.00000000        5
Ru       5.62825885       6.49895353      10.00000000        5
Pt       8.44238828       6.49895353      10.00000000        5
Pt       0.00000000       0.00000000      12.29772705        4
Ag       2.81412943       0.00000000      12.29772705        4
Ru       5.62825885       0.00000000      12.29772705        4
Ru       1.40706471       2.43710757      12.29772705        4
Ir       4.22119414       2.43710757      12.29772705        4
Ag       7.03532356       2.43710757      12.29772705        4
Ag       2.81412943       4.87421514      12.29772705        4
Ir       5.62825885       4.87421514      12.29772705        4
Pd       8.44238828       4.87421514      12.29772705        4
Pd       1.40706471       0.81236919      14.59545411        3
Ir       4.22119414       0.81236919      14.59545411        3
Pt       7.03532356       0.81236919      14.59545411        3
Ag       2.81412943       3.24947676      14.59545411        3
Ir       5.62825885       3.24947676      14.59545411        3
Ir       8.44238828       3.24947676      14.59545411        3
Pd       4.22119414       5.68658433      14.59545411        3
Pt       7.03532356       5.68658433      14.59545411        3
Ag       9.84945299       5.68658433      14.59545411        3
Pd       0.00000000       1.62473838      16.89318116        2
Pd       2.81412943       1.62473838      16.89318116        2
Ag       5.62825885       1.62473838      16.89318116        2
Pt       1.40706471       4.06184595      16.89318116        2
Ag       4.22119414       4.06184595      16.89318116        2
Ag       7.03532356       4.06184595      16.89318116        2
Ru       2.81412943       6.49895353      16.89318116        2
Ru       5.62825885       6.49895353      16.89318116        2
Ru       8.44238828       6.49895353      16.89318116        2
Ir       0.00000000       0.00000000      19.19090822        1
Ag       2.81412943       0.00000000      19.19090822        1
Pt       5.62825885       0.00000000      19.19090822        1
Pd       1.40706471       2.43710757      19.19090822        1
Ag       4.22119414       2.43710757      19.19090822        1
Pd       7.03532356       2.43710757      19.19090822        1
Ag       2.81412943       4.87421514      19.19090822        1
Ru       5.62825885       4.87421514      19.19090822        1
Ir       8.44238828       4.87421514      19.19090822        1

My tblite_opt.py for DFTB calcualation with just one core. It takes some minutes but eventually crashes 🙁

# Load modules
from ase import Atom, Atoms
from ase.build import add_adsorbate, fcc100, fcc110, fcc111, fcc211, molecule
from ase.calculators.mixing import SumCalculator
from ase.constraints import FixAtoms, FixedPlane, FixInternals
from ase.data.vdw_alvarez import vdw_radii
from ase.db import connect
from ase.io import write, read
from ase.optimize import BFGS, GPMin, LBFGS, FIRE, QuasiNewton
from ase.parallel import parprint
from ase.units import Bohr
from tblite.ase import TBLite
import numpy as np
import os

# https://tblite.readthedocs.io/en/latest/users/ase.html

atoms = read('slab.xyz')
atoms.set_constraint([FixAtoms(indices=[atom.index for atom in atoms if atom.tag in [1,2]])])

# Set calculator
calc = TBLite(method="GFN1-xTB",accuracy=1000,electronic_temperature=300,max_iterations=300)
atoms.set_calculator(calc)
qn_opt = QuasiNewton(atoms, trajectory='results_qn.traj', logfile='results_qn.log', maxstep=0.1)
qn_opt.run(fmax=0.1)

To compare structures I have used MDanalysis, which unfortunately does not work with ASE traj, so I prepared xyz-files with “ase convert -n -1 file.traj file.xyz”

import MDAnalysis as mda
from MDAnalysis.analysis.rms import rmsd
import sys

def coord(file_name):
    file  = mda.Universe(f"{file_name}.xyz")
    atoms = file.select_atoms("index 1:9")
    return  atoms.positions.copy()

print(rmsd(coord(sys.argv[1]),coord(sys.argv[2])))

An instruction on installation of GPAW. TBLITE can be installed as “conda install -c conda-forge tblite”.

GPAW installation with pip

Between installation with conda and compilation of libraries, an intermediate path – installation of GPAW with pip – is a compromise for those who wish to text specific GPAW branches or packages.

For example, I wish to text self-interaction error correction (SIC) and evaluate Bader charges with pybader. Neither SIC nor pybader is compatible with the recent GPAW. Here is not to get a workable version.

# numba in pybader is not compatible with python 3.11, so create a conda environment with python 3.10
conda create -n gpaw-pip python=3.10 
conda activate gpaw-pip

conda install -c conda-forge libxc libvdwxc
conda install -c conda-forge ase
# ensure that you install the right openmpi (not external)
conda install -c conda-forge openmpi ucx
conda install -c conda-forge compilers
conda install -c conda-forge openblas scalapack
conda install -c conda-forge pytest
pip install pybader

# Get a developer version of GPAW with SIC
git clone -b dm_sic_mom_update https://gitlab.com/alxvov/gpaw.git
cd gpaw
cp siteconfig_example.py siteconfig.py

# In the siteconfig.py rewrite
'''
fftw = True
scalapack = True
if scalapack:
    libraries += ['scalapack']
'''

unset CC
python -m pip install -e .
gpaw info

k-points with kplib and gpaw

Choosing optimal k-points is a tricky task. In GPAW, one can set them manually, using size or density and following a rule of thumb:

calc = GPAW(kpts={'size': (4, 4, 4), 'gamma': True})
# or
calc = GPAW(kpts={'density': 2.5, 'gamma': True})

A rule of thumb for choosing the initial k-point sampling is, that the product, ka, between the number of k-points, k, in any direction, and the length of the basis vector in this direction, a, should be:

  • ka ~ 30 Å, for d band metals
  • ka ~ 25 Å, for simple metals
  • ka ~ 20 Å, for semiconductors
  • ka ~ 15 Å, for insulators

Remember that convergence in this parameter should always be checked.

https://wiki.fysik.dtu.dk/gpaw/tutorialsexercises/structureoptimization/surface/surface.html

The corresponding densities (ka/2π) are:

  • ka/2π ~ 4.8 Å, for d band metals
  • ka/2π ~ 4.0 Å, for simple metals
  • ka/2π ~ 3.2 Å, for semiconductors
  • ka/2π ~ 2.4 Å, for insulators

With the recent update, I can start using kplib (see paper) to choose the optimal generalized k-point grids. The main variable in kplib is min_distance, which is analogous to the density×2π. Read more about the min_distance at muellergroup.jhu.edu/K-Points.html.

Here is an example of my conda environment

conda create -n gpaw23 python=3.9
conda activate gpaw23
conda install -c conda-forge cxx-compiler
pip install kplib # from pypi.org/project/kpLib
conda install -c conda-forge gpaw

Here is a working example:

from ase import Atoms
from ase.parallel import parprint
from gpaw import GPAW, PW
from kpLib import get_kpoints
from pymatgen.io.ase import AseAtomsAdaptor

atoms = Atoms(cell=[[1.608145, -2.785389, 0.0], [1.608145, 2.785389, 0.0], [0.0, 0.0, 5.239962]],
              symbols=['Ga', 'Ga', 'N', 'N'],
              positions=[[ 1.608145  , -0.92846486,  2.61536983],
                         [ 1.608145  ,  0.92846486,  5.23535083],
                         [ 1.608145  , -0.92846486,  4.58957792],
                         [ 1.608145  ,  0.92846486,  1.96959692]],
              pbc=True)
structure = AseAtomsAdaptor.get_structure(atoms)
kpts_data = get_kpoints(structure, minDistance=30, include_gamma=False)
    
parprint("Found lattice with kplib: ")
parprint(f"Nominal kpts: {kpts_data['num_total_kpts']}")
parprint(f"Distinct kpts: {kpts_data['num_distinct_kpts']}")

atoms.calc = GPAW(xc='PBE',
                  mode=PW(400),
                  kpts=kpts_data['coords'],
                  symmetry={'point_group': True,
                            'time_reversal': True,
                            'symmorphic': False,
                            'tolerance': 1e-4},
                  txt='gpaw-out.txt')
energy = atoms.get_total_energy()

parprint(f"Total energy: {energy}")
parprint(f"kpts passed to GPAW: {len(atoms.calc.get_bz_k_points())}")
parprint(f"kpts in GPAW IBZ: {len(atoms.calc.get_ibz_k_points())}")

Installing GPAW with conda

[Updated on 20.04.2022, 15.04.2023, 10.06.2023, 03.10.2023, 04.06.2024]

In short, in a clean environment, everything should work with just five lines:

wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh
bash Miniconda3-latest-Linux-x86_64.sh

Initialize conda. If it is in the .bashch, source it. If not, source “PATHTOCONDA/miniconda3/etc/profile.d/conda.sh”.

conda create --name gpaw -c conda-forge python=3.12
conda activate gpaw
conda install -c conda-forge openmpi ucx
conda install -c conda-forge gpaw=24.1.0=*openmpi*

For details, see the description below.

1. Install conda – software and environment management system.

Here is the official instruction: docs.conda.io/projects/conda/en/latest/user-guide/install/linux.html

On June 2024, run these:

wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh
bash Miniconda3-latest-Linux-x86_64.sh

If you wish to autostart conda, allow it to write to your .bashrc.

P.S. Here are good intros to conda:

N.B! If the locale is not set, add it to your .bashrc export

LC_ALL=en_US.UTF-8

Without it python might give a segmentation fault (core dumped) error.

2. Create a conda virtual environment:

conda create --name gpaw -c conda-forge python=3.12

If needed, remove the environment as:

conda remove --name gpaw --all

You can check the available environments as:

conda env list

3. Activate the virtual environment.

conda activate gpaw

4. Install gpaw:

Ensure that no interfering modules and environments are loaded.

Purge modules by executing:

module purge

To check whether some code (like mpirun) has an alternative path, try:

which codename

or

codename --version

There should be no mpirun, ase, libxc, numpy, scipy, etc. Otherwise, the installation with conda will most probably fail due to conflicting paths.

4.1. It is safer to install using gpaw*.yml file from vliv/conda directory on FEND:

conda env create -f gpaw.yml

Note that there are many yml files with different versions of GPAW.

4.2. Pure installation is simple but might not work:

conda install -c conda-forge openmpi

conda install -c conda-forge gpaw=*=*openmpi*

In 2022, there were problems with openmpi. Downgrading to version 4.1.2 helped:

conda install -c conda-forge openmpi=4.1.2

You might wish to install ucx but be aware that there are many problems with it, e. g. depending on mlx version:

conda install -c conda-forge ucx

If you get an error about GLIBCXX, try upgrading gcc:

conda install -c conda-forge gcc=12.1.0

4.3. To quickly check the installation, run “gpaw -P 2 test” or “gpaw info”.

The installation might fail. In case you succeed, save the yml file as:

conda env export | grep -v "^prefix: " > gpaw.yml

Now you can use it to install gpaw as:

conda env create -f gpaw.yml

To properly test the installation install pytest and follow wiki.fysik.dtu.dk/gpaw/devel/testing.html. That might take hours.

conda install -c conda-forge pytest pytest-xdist 

5. If needed, install extra packages within your specific conda environment (gpaw).

To apply D4 dispersion correction:

conda install -c conda-forge dftd4 dftd4-python

To analyze trajectories:

conda install -c conda-forge mdanalysis

To analyze electronic density (some might not work):

pip install git+https://github.com/funkymunkycool/Cube-Toolz.git
pip install git+https://github.com/theochem/grid.git
pip install git+https://github.com/theochem/denspart.git
pip install pybader
pip install cpmd-cube-tools
conda install -c conda-forge chargemol

To use catlearn:

pip install catlearn

To work with crystal symmetries:

conda install -c conda-forge spglib

Extra for visualization (matplotlib comes with ASE):

conda install -c conda-forge pandas seaborn bokeh jmol

To use notebooks (you might need to install firefox as well):

conda install -c conda-forge jupyterlab nodejs jupyter_contrib_nbextensions 

6. Run calculations by adding these lines to the submission script:

Note1: Check the path and change the USERNAME

Note2: Turn off ucx.

Note3: You may play with the number of openmp threads.

module purge
source "/groups/kemi/USERNAME/miniconda3/etc/profile.d/conda.sh"
conda activate gpaw
export OMP_NUM_THREADS=1
export OMPI_MCA_pml="^ucx"
export OMPI_MCA_osc="^ucx"
mpirun gpaw python script.py

Note4: Check an example in vliv/conda/sub directory.

7. Speeding-up calculations.

Add the “parallel” keyword to GPAW calculator:

parallel = {'augment_grids':True,'sl_auto':True},

For more options see wiki.fysik.dtu.dk/gpaw/documentation/parallel_runs/parallel_runs.html#manual-parallel. For LCAO mode, try ELPA. See wiki.fysik.dtu.dk/gpaw/documentation/lcao/lcao.html#notes-on-performance.

parallel = {'augment_grids':True,'sl_auto':True,'use_elpa':True},

For calculations with vdW-functionals, use libvdwxc:

xc = {'name':'BEEF-vdW', 'backend':'libvdwxc'},

8. If needed, add fixes.

To do Bayesian error estimation (BEE) see doublelayer.eu/vilab/2022/03/30/bayesian-error-estimation-for-rpbe/.

To use MLMin/NEB apply corrections from github.com/SUNCAT-Center/CatLearn/pulls

9. Something worth trying:

Atomic Simulation Recipes:

asr.readthedocs.io/en/latest/

gpaw-tools:

github.com/lrgresearch/gpaw-tools/

www.sciencedirect.com/science/article/pii/S0927025622000155

ase-notebook (won’t install at FEND because of glibc 2.17):

github.com/chrisjsewell/ase-notebook

ase-notebook.readthedocs.io/en/latest/

Optimizers:

gitlab.com/gpatom/ase-gpatom

gitlab.com/egarijo/bondmin/

gpaw benchmarking:

github.com/OleHolmNielsen/GPAW-benchmark-2021

github.com/mlouhivu/gpaw-benchmarks

members.cecam.org/storage/presentation/Ask_Hjorth_Larsen-1622631504.pdf

d4 parameters fitting:

github.com/dftd4/dftd4-fit

k-point grid choosing:

gitlab.com/muellergroup/kplib

Working abroad

Sometimes for some reason some VPN might not work as one expects. Then one can login to a node with a key.

First, check if you have a public key already. On your own computer in a terminal run:

cat ~/.ssh/id_rsa.pub

If this command worked, then skip the next step. If an error appeared, run this:

ssh-keygen

It will ask for a password. This is the password you give to the key. You can make one without password, in which case it will not ask for a password each time you try to use it.

You must have an access to the cluster somehow to run the following command:

ssh-copy-id -i ~/.ssh/id_rsa.pub <your-username>@cluster-address

Well, you can also just copy the string from id_rsa.pub on your machine to the id_rsa.pub on your cluster.

This trick does not work for all machines. So, you might use the established connection to connect to other machines. In my case I wanted to work with some notebooks, so I did:

ssh -N -L localhost:PORT1:localhost:PORT2 <your-username>@cluster-address

and so it worked.

Installing Gromacs and Lammps

FFTW:

./configure –enable-float –enable-shared –enable-sse2
make -j N
make install

===

Gromacs:

$ tar xvfz gromacs-x.y.z.tar.gz
$ ls
gromacs-x.y.z
$ mkdir build
$ cd build
$ cmake ../gromacs-x.y.z -DCMAKE_INSTALL_PREFIX=/home/yourUser/opt/gromacs.x.y.z -DGMX_CPU_ACCELERATION=SSE2 -DGMX_SIMD=SSE2
$ make -j N
$ make install

===

LAMMPS:

$ git clone git://git.lammps.org/lammps-ro.git LAMMPS
$ make yes-molecule
$ make mpi

===

To the .bashrc add:

#Gromacs

source /home/yourUser/opt/gromacs.x.y.z/bin/GMXRC
#or source /your/installation/prefix/here/bin/GMXRC

#LAMMPS
export LD_LIBRARY_PATH=~/LAMMPS/src:$LD_LIBRARY_PATH
export PATH=~/LAMMPS/src:$PATH

gsissh at ubuntu

To access one of the PRACE supercomputers, I was required to use gsissh. The corresponding gsi-openssh-clients package is not in the standard Ubuntu repositories. I have downloaded it from http://toolkit.globus.org/ftppub/gt5/5.2/stable/packages/deb/ubuntu/12.04/pool/contrib/g/gsi-openssh/.Besides that also globus-proxy-utils is needed. This one was easy to install via apt-get install globus-proxy-utils. Finally, everything got working when the bundle of X509 trusted certificates was downloaded from http://software.ligo.org/gridtools/debian/pool/main/o/osg-ca-certs/. Note, for some reason .globus folder did not appear in my home directory. To use gsissh, I had to create it and then move my user certificate to it with proper correct permissions -rw-------.

P.S. Do not forget to create a proxy certificate (grid-proxy-init) before login to your supercomputer.