Restarting calculations and performing multi-step jobs

Although running Q-Chem jobs with saved scratch space is relatively straightforward, as outlined in https://manual.q-chem.com/5.3/sect_running.html and https://manual.q-chem.com/5.3/sec_parallel.html, the advantages of doing so might not be clear. Many key quantities are stored in the saved scratch directory, such as MO coefficients, density matrices, the molecular Hessian, and TDDFT eigenvectors. Some reasons for saving the scratch directory are

  • post-processing disk-intensive or higher-precision data with external tools or scripts,
  • restarting calculations that ran out of time but saved data that can be used as a restart guess, and
  • running large multi-step jobs that aren’t part of a single input file.

The last of these is useful because a common pitfall with multi-step input files (with the @@@ delimiter) is that if an early step completes but a later step fails for some reason, much of your precious computation time is lost. You can use -save, but because the shared scratch is overwritten by each job step, the finalized scratch results from earlier steps aren’t available. Running each step as a separate job also lets you use different command-line options between jobs, such as for controlling parallelism, which isn’t possible with multi-step inputs.

The general strategy for running multi-step jobs in the shell might look like

#!/bin/bash

set -euo pipefail

qchem -save myexpensivejob1.in myexpensivejob1.out myexpensivejob
qchem -save myexpensivejob2.in myexpensivejob2.out myexpensivejob
cp -a "${QCSCRATCH}/myexpensivejob" /permanent/location/myexpensivejob
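
A script like this relies on set -e and the exit status of qchem to stop after a failed step. If that status is not a reliable signal in your setup, an extra check on the output file can help. The sketch below assumes that a normally terminated job prints the usual "Thank you very much for using Q-Chem" farewell line; the function name qchem_finished_ok is hypothetical, and the marker text is worth confirming against an output file from your own installation.

```shell
#!/bin/bash
set -euo pipefail

# Return success only if the output file exists and contains the line
# Q-Chem prints at normal termination (assumed marker text; verify it
# against an output file from your own installation).
qchem_finished_ok() {
    local outfile="$1"
    [ -f "${outfile}" ] &&
        grep -q "Thank you very much for using Q-Chem" "${outfile}"
}

# Example use between steps (commented out so the helper can be sourced):
# qchem -save myexpensivejob1.in myexpensivejob1.out myexpensivejob
# qchem_finished_ok myexpensivejob1.out || { echo "step 1 failed" >&2; exit 1; }
```

Checking the output text is a coarse test: it says the job ended normally, not that the results are what you wanted.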

If keeping the scratch directory for each step is important, a copy needs to be made after each step:

#!/bin/bash

set -euo pipefail

qchem -save myexpensivejob1.in myexpensivejob1.out myexpensivejob
cp -a ${QCSCRATCH}/myexpensivejob /permanent/location/myexpensivejob1
qchem -save myexpensivejob2.in myexpensivejob2.out myexpensivejob
cp -a ${QCSCRATCH}/myexpensivejob /permanent/location/myexpensivejob2

Here the scratch directory is copied to its own location at the end of each step, because ${QCSCRATCH}/myexpensivejob is reused and modified by each successive calculation that points to it.
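
For more than a couple of steps, the run-then-copy pattern can be wrapped in a small function. This is only a sketch: run_and_archive, QCHEM_BIN, and ARCHIVE_ROOT are illustrative names rather than anything Q-Chem defines, and the function assumes QCSCRATCH is already set in the environment.

```shell
#!/bin/bash
set -euo pipefail

# Illustrative variables (not Q-Chem settings): the command to run and the
# directory that receives one snapshot per step.
QCHEM_BIN="${QCHEM_BIN:-qchem}"
ARCHIVE_ROOT="${ARCHIVE_ROOT:-/permanent/location}"

# Run one step against the shared saved-scratch directory, then copy that
# directory to ${ARCHIVE_ROOT}/<step> before the next step modifies it.
run_and_archive() {
    local step="$1"       # input/output file stem, e.g. myexpensivejob1
    local savename="$2"   # scratch subdirectory shared by all steps
    local dest="${ARCHIVE_ROOT}/${step}"

    # Refuse to overwrite an existing snapshot; cp would otherwise nest the
    # new copy inside the old directory.
    if [ -e "${dest}" ]; then
        echo "snapshot already exists: ${dest}" >&2
        return 1
    fi

    "${QCHEM_BIN}" -save "${step}.in" "${step}.out" "${savename}"
    mkdir -p "${ARCHIVE_ROOT}"
    cp -a "${QCSCRATCH}/${savename}" "${dest}"
}

# Example use (commented out so the helper can be sourced):
# for step in myexpensivejob1 myexpensivejob2; do
#     run_and_archive "${step}" myexpensivejob
# done
```

Keeping a full copy per step costs disk space, so it may make sense to archive only the steps whose scratch contents you expect to need again.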

Restarting jobs

The keywords for restarting jobs and for splitting jobs into multiple steps are often used together, as in the following example.

Example: restarting an NMR calculation

Part 1: An NMR job that runs out of iterations during the NMR portion

$molecule
  0 1
  H        0.00000        0.00000        0.00000
  C        1.10000        0.00000        0.00000
  F        1.52324        1.22917        0.00000
  F        1.52324       -0.61459        1.06450
  F        1.52324       -0.61459       -1.06450
$end

$rem
   method               b3lyp
   basis                6-31g*
   moprop               1
   moprop_maxiter_1st   4   ! too small, for demonstration only
$end

Part 2: Restarting the incomplete NMR job

$molecule
read
$end

$rem
   method               b3lyp
   basis                6-31g*
   scf_guess            read
   skip_scfman          true   ! no need to redo the scf
   moprop               1
   moprop_restart       1
   guess_px             1      ! read in last guess for perturbed density
$end

Splitting complex jobs

Example: anharmonic frequencies

This example for calculating VCI(4) frequencies of formaldehyde is adapted from https://manual.q-chem.com/5.3/Ch10.S10.SS4.html by recognizing that the geometry optimization, harmonic frequency calculation, and anharmonic frequency calculation can be performed in separate steps.

Part 1: optimize the geometry

$molecule
   0 1
   C
   O, 1, CO
   H, 1, CH, 2, A
   H, 1, CH, 2, A, 3, D

   CO = 1.2
   CH = 1.0
   A  = 120.0
   D  = 180.0
$end

$rem
jobtype                     opt
method                      edf2
basis                       6-31g*
geom_opt_tol_displacement   1
geom_opt_tol_gradient       1
geom_opt_tol_energy         1
$end

Part 2: form the molecular Hessian and compute harmonic frequencies

$molecule
read
$end

$rem
jobtype       freq
method        edf2
basis         6-31g*
scf_guess     read
skip_scfman   true     ! no need to run SCF again
$end

Part 3: compute anharmonic frequencies and thermochemistry

$molecule
read
$end

$rem
jobtype       freq
method        edf2
basis         6-31g*
scf_guess     read
skip_scfman   false    ! if true, doesn't run SCF for finite difference steps!
skip_drvman   1        ! read in the already-computed harmonic frequencies
anhar         true
vci           4
$end

This also works for examining isotope and temperature effects on frequencies and thermochemistry, which does not require recomputing the full molecular Hessian.
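
One way to drive the three parts from the shell is sketched below: each part runs against the same saved-scratch name, and a stamp file records every part that finished, so rerunning the script after a failure picks up at the first unfinished part. The names run_chain and QCHEM_BIN, the .done stamp files, and the input file stems in the usage comment are all hypothetical. The sketch also assumes that qchem exits with a non-zero status on failure and that the saved scratch directory still holds usable data from the last completed part.

```shell
#!/bin/bash
set -euo pipefail

QCHEM_BIN="${QCHEM_BIN:-qchem}"   # illustrative override, not a Q-Chem setting

# Run each step in order against one saved-scratch name. A step is skipped
# when its stamp file <step>.done exists from an earlier successful run.
run_chain() {
    local savename="$1"; shift
    local step
    for step in "$@"; do
        if [ -e "${step}.done" ]; then
            echo "skipping ${step}: stamp file found"
            continue
        fi
        # Stop at the first failing step and leave no stamp for it
        # (assumes a non-zero exit status signals failure).
        "${QCHEM_BIN}" -save "${step}.in" "${step}.out" "${savename}" || return 1
        touch "${step}.done"
    done
}

# Example use with hypothetical file stems for Parts 1 to 3
# (commented out so the helper can be sourced):
# run_chain formaldehyde part1_opt part2_freq part3_anharm
```

Deleting a stamp file forces that part to run again, for example after editing its input.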

Example: TDDFT Hessian

Part 1: Compute excited state energies with TDDFT

$rem
method = b3lyp
basis = 6-31+g*
cis_n_roots = 6
cis_singlets = true
cis_triplets = true
rpa = true
$end

$molecule
0 1
C           0  0  0.0
O           0  0  1.21
$end

Part 2: Read in the excitation vectors from Part 1 to speed up computing the molecular Hessian of the first excited state

$rem
jobtype = freq
method = b3lyp
basis = 6-31+g*
scf_guess = read
skip_scfman = true       ! no need to run SCF again
cis_n_roots = 6
cis_singlets = true
cis_triplets = true
rpa = true
cis_guess_disk = true
cis_guess_disk_type = 1  ! read both singlet and triplet vectors
cis_state_deriv = 1      ! find frequencies of the lowest (first) excited state
vibman_print = 1
$end

$molecule
read
$end

Another use for splitting a TDDFT or CIS-like calculation is to calculate the energies first and then reuse the amplitudes for plotting or multipole analysis with skip_cis_rpa.
