Restarting calculations and performing multi-step jobs

ericb · July 7, 2020, 12:16am

Although running Q-Chem jobs with saved scratch space is relatively straightforward, as outlined in https://manual.q-chem.com/5.3/sect_running.html and https://manual.q-chem.com/5.3/sec_parallel.html, the advantages of doing so might not be clear. Many key quantities are stored in the saved scratch, such as MO coefficients, density matrices, the molecular Hessian, and TDDFT eigenvectors, among others. Some reasons for saving the scratch directory are

- post-processing disk-intensive or higher-precision data with external tools or scripts,
- restarting calculations that ran out of time but saved data that can be used as a restart guess, and
- running large multi-step jobs that aren’t part of a single input file.

The last of these is useful because a common pitfall of multi-step input files (with the @@@ delimiter) is that if an early step completes but a later step fails for some reason, much of your precious computation time is lost. You can use -save, but because the shared scratch is overwritten by each job step, the finalized scratch results from earlier steps aren’t available. Running the steps as separate jobs also lets you use different command-line options for each one, such as those controlling parallelism, which isn’t possible with multi-step inputs.

The general strategy for running multi-step jobs in the shell might look like

#!/bin/bash

set -euo pipefail

qchem -save myexpensivejob1.in myexpensivejob1.out myexpensivejob
qchem -save myexpensivejob2.in myexpensivejob2.out myexpensivejob
cp -a {${QCSCRATCH},/permanent/location}/myexpensivejob

If keeping the scratch directory for each step is important, a copy needs to be made after each step:

#!/bin/bash

set -euo pipefail

qchem -save myexpensivejob1.in myexpensivejob1.out myexpensivejob
cp -a ${QCSCRATCH}/myexpensivejob /permanent/location/myexpensivejob1
qchem -save myexpensivejob2.in myexpensivejob2.out myexpensivejob
cp -a ${QCSCRATCH}/myexpensivejob /permanent/location/myexpensivejob2

Each copy preserves the scratch directory as it stood at the end of that step, since ${QCSCRATCH}/myexpensivejob is reused and modified by every successive calculation that points to it.
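For more than two steps, the run-then-copy pattern can be wrapped in a loop. The following is only a sketch: the function name run_steps, the QCHEM override variable, and the ".stepN" snapshot naming are my own assumptions, not anything Q-Chem provides.

```shell
#!/bin/bash
set -euo pipefail

# Run each input in order against one shared scratch directory and keep a
# snapshot of the scratch after every step.
# Assumptions (not part of Q-Chem): the QCHEM override variable, the
# function name, and the "<scratch>.stepN" snapshot naming.
run_steps() {
    local scratch_name=$1 archive_dir=$2
    shift 2
    local qchem_cmd=${QCHEM:-qchem}
    local step=0 input
    mkdir -p "$archive_dir"
    for input in "$@"; do
        step=$((step + 1))
        # With "set -e", a failing step stops the chain before the copy.
        "$qchem_cmd" -save "$input" "${input%.in}.out" "$scratch_name"
        cp -a "${QCSCRATCH}/${scratch_name}" "${archive_dir}/${scratch_name}.step${step}"
    done
}

# Usage:
#   run_steps myexpensivejob /permanent/location myexpensivejob1.in myexpensivejob2.in
```

The snapshot directories are assumed not to exist yet; if one does, cp -a copies into it rather than replacing it.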

Restarting jobs

The keywords for restarting jobs and splitting jobs into multiple steps are often used in conjunction:

SCF: scf_guess read
- Related is skip_scfman true, which is commonly set in later parts of a calculation when using already-converged orbitals.
Coupled cluster: cc_restart true
- This only works for orbital-optimized methods in the original ccman and requires having saved the T amplitudes with cc_saveampl.
CIS: cis_guess_disk true
- See cis_guess_disk_type for control over reading singlet and/or triplet amplitudes.
GVB: gvb_restart true
ASCI: asci_restart true
CDFTCI: cdftci_restart 2
Transport: trans_restart 1 in the $trans-method section
Orbital localization: see ercalc
Response properties (NMR and (hyper)polarizabilities): moprop_restart true
AIMD: aimd_init_veloc restart
FSSH: fssh_continue 1

Example: restarting an NMR calculation

Part 1: An NMR job that runs out of iterations during the NMR portion

$molecule
  0 1
  H        0.00000        0.00000        0.00000
  C        1.10000        0.00000        0.00000
  F        1.52324        1.22917        0.00000
  F        1.52324       -0.61459        1.06450
  F        1.52324       -0.61459       -1.06450
$end

$rem
   method               b3lyp
   basis                6-31g*
   moprop               1
   moprop_maxiter_1st   4   ! too small, for demonstration only
$end

Part 2: Restarting the incomplete NMR job

$molecule
read
$end

$rem
   method               b3lyp
   basis                6-31g*
   scf_guess            read
   skip_scfman          true   ! no need to redo the scf
   moprop               1
   moprop_restart       1
   guess_px             1      ! read in last guess for perturbed density
$end
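Since Part 2 is mostly Part 1 with the geometry replaced by read and a few extra $rem lines, the restart input can be generated rather than typed. This is a rough sketch under stated assumptions: make_restart_input is a hypothetical helper (not a Q-Chem tool), it only handles single-job inputs, and it copies every original $rem line unchanged.

```shell
#!/bin/bash
set -euo pipefail

# Write a restart input from an existing one: the $molecule section becomes
# "read" and extra $rem lines are added just before the $rem section's $end.
# make_restart_input is a hypothetical helper, not part of Q-Chem.
make_restart_input() {
    local src=$1 dest=$2
    shift 2
    local extra=""
    if (( $# > 0 )); then
        extra=$(printf '   %s\n' "$@")
    fi
    # Pass the extra lines through the environment so embedded newlines
    # are handled the same way by every awk implementation.
    EXTRA_REM="$extra" awk '
        { key = tolower($1) }
        key == "$molecule" { print; print "read"; section = "molecule"; next }
        key == "$rem"      { print; section = "rem"; next }
        key == "$end" {
            if (section == "rem" && ENVIRON["EXTRA_REM"] != "")
                print ENVIRON["EXTRA_REM"]
            print; section = ""; next
        }
        section == "molecule" { next }   # drop the old geometry
        { print }
    ' "$src" > "$dest"
}

# Usage (file names are placeholders):
#   make_restart_input nmr_part1.in nmr_part2.in \
#       "scf_guess read" "skip_scfman true" "moprop_restart 1" "guess_px 1"
```

Lines that should not carry over, such as the deliberately small moprop_maxiter_1st above, still have to be removed by hand.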

Splitting complex jobs

Example: anharmonic frequencies

This example for calculating VCI(4) frequencies of formaldehyde is adapted from https://manual.q-chem.com/5.3/Ch10.S10.SS4.html by recognizing that the geometry optimization, harmonic frequency calculation, and anharmonic frequency calculation can be performed in separate steps.

Part 1: optimize the geometry

$molecule
   0 1
   C
   O, 1, CO
   H, 1, CH, 2, A
   H, 1, CH, 2, A, 3, D

   CO = 1.2
   CH = 1.0
   A  = 120.0
   D  = 180.0
$end

$rem
jobtype                     opt
method                      edf2
basis                       6-31g*
geom_opt_tol_displacement   1
geom_opt_tol_gradient       1
geom_opt_tol_energy         1
$end

Part 2: form the molecular Hessian and compute harmonic frequencies

$molecule
read
$end

$rem
jobtype       freq
method        edf2
basis         6-31g*
scf_guess     read
skip_scfman   true     ! no need to run SCF again
$end

Part 3: compute anharmonic frequencies and thermochemistry

$molecule
read
$end

$rem
jobtype       freq
method        edf2
basis         6-31g*
scf_guess     read
skip_scfman   false    ! if true, doesn't run SCF for finite difference steps!
skip_drvman   1        ! read in the already-computed harmonic frequencies
anhar         true
vci           4
$end

This also works for examining isotope and temperature effects on frequencies and thermochemistry, which does not require recomputing the full molecular Hessian.
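Because each part relies on the scratch left behind by the previous one, it can be worth confirming that a part actually finished before launching the next. The sketch below checks the output for Q-Chem's closing banner. The exact banner text is an assumption from typical outputs and should be checked against your own, and qchem_finished_ok, run_checked, and the QCHEM override are hypothetical names.

```shell
#!/bin/bash
set -euo pipefail

# Succeed only if the output file contains Q-Chem's closing banner.
# Assumption: a normally terminated job prints a line containing
# "Thank you very much for using Q-Chem"; verify this on your own outputs.
qchem_finished_ok() {
    local output=$1
    [[ -f $output ]] && grep -q "Thank you very much for using Q-Chem" "$output"
}

# Run one step and report failure if its output lacks the banner.
# QCHEM is a hypothetical override for the qchem executable.
run_checked() {
    local input=$1 output=$2 scratch_name=$3
    # Ignore the exit status here; the banner check below decides success.
    "${QCHEM:-qchem}" -save "$input" "$output" "$scratch_name" || true
    if ! qchem_finished_ok "$output"; then
        echo "$input did not finish normally; scratch kept in $scratch_name" >&2
        return 1
    fi
}

# Usage (file names are placeholders):
#   run_checked opt.in opt.out h2co && run_checked freq.in freq.out h2co
```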

Example: TDDFT Hessian

Part 1: Compute excited state energies with TDDFT

$rem
method = b3lyp
basis = 6-31+g*
cis_n_roots = 6
cis_singlets = true
cis_triplets = true
rpa = true
$end

$molecule
0 1
C           0  0  0.0
O           0  0  1.21
$end

Part 2: Read in the excitation vectors from Part 1 to speed up computing the molecular Hessian of the first excited state

$rem
jobtype = freq
method = b3lyp
basis = 6-31+g*
scf_guess = read
skip_scfman = true       ! no need to run SCF again
cis_n_roots = 6
cis_singlets = true
cis_triplets = true
rpa = true
cis_guess_disk = true
cis_guess_disk_type = 1  ! read both singlet and triplet vectors
cis_state_deriv = 1      ! find frequencies of the lowest (first) excited state
vibman_print = 1
$end

$molecule
read
$end

Another use of splitting a TDDFT or CIS-like calculation is first calculating energies, and then reusing the amplitudes for plotting or multipole analysis with skip_cis_rpa.

Brandon_Meza · October 25, 2020, 5:38pm

Ericb,

Thank you so much for this information.

I am working in BOMD, so I am interested in the restarting option for AIMD. I tried aimd_init_veloc restart, but all my files in the $AIMD directory are overwritten.

My question: is there a way to append to the existing $AIMD files when restarting the BOMD calculation?

Thank you so much for your time.

jherbert · October 28, 2020, 6:18am

Um, ‘cat’? The AIMD trajectory data written to the scratch directory is in plain ASCII text format, so you can just concatenate the new data with the previous data.
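A minimal sketch of that concatenation, to be run after each trajectory segment and before the next restart overwrites the files. The helper name append_aimd and both directory paths are hypothetical, and if the files carry header lines, those will be repeated in the combined file.

```shell
#!/bin/bash
set -euo pipefail

# Append every regular file from a run's AIMD directory to the file of the
# same name in an archive directory, creating the archive file if needed.
# append_aimd and both directory arguments are hypothetical; repeated
# header lines, if the files have any, are not removed.
append_aimd() {
    local new_dir=$1 archive_dir=$2 file
    mkdir -p "$archive_dir"
    for file in "$new_dir"/*; do
        [[ -f $file ]] || continue
        cat "$file" >> "$archive_dir/$(basename "$file")"
    done
}

# Usage (paths are placeholders; adjust to where your job writes AIMD data):
#   append_aimd "$QCSCRATCH/mybomdjob/AIMD" /permanent/location/AIMD_full
```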

ramon_trevino · December 8, 2021, 10:16pm

Thanks Ericb,

I have a question about the SCF restart. Can I restart a Frequency job if it did not finish the SCF calculations?

Additionally, is there a way to restart a frequency calculation that has reached some point in the vibrational analysis section (shown below) but did not finish due to the time limit?

                    VIBRATIONAL ANALYSIS                      
                    --------------------                      
                                                              
     VIBRATIONAL FREQUENCIES (CM**-1) AND NORMAL MODES         
  FORCE CONSTANTS (mDYN/ANGSTROM) AND REDUCED MASSES (AMU)    
               INFRARED INTENSITIES (KM/MOL)

Thanks

jherbert · December 9, 2021, 3:03pm

Can I restart a Frequency job if it did not finish the SCF calculations?

Answer is ‘yes’, but if the SCF has not completed then really you will need to restart the SCF. If you have saved the scratchfiles (to “jobname.scr”, say) then just add SCF_GUESS = READ to the $rem and use
qchem -nt <no_threads> -save jobname.in jobname.out jobname.scr
This will use the old scratch files, so you should basically be able to pick up at the SCF iteration where the previous job stopped. This will work, in particular, with JOBTYPE=FREQ.
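If it helps, the two preconditions for that restart (the saved scratch exists, and the input asks to read it) can be checked before resubmitting. This is only a sketch: restart_from_scratch and the QCHEM override are made-up names, and it assumes the saved directory lives under $QCSCRATCH.

```shell
#!/bin/bash
set -euo pipefail

# Rerun a job against its saved scratch directory, refusing to start if the
# scratch is missing or the input does not ask to read the old orbitals.
# restart_from_scratch and the QCHEM override are hypothetical names.
restart_from_scratch() {
    local threads=$1 input=$2 output=$3 scratch_name=$4
    if [[ ! -d ${QCSCRATCH}/${scratch_name} ]]; then
        echo "no saved scratch at ${QCSCRATCH}/${scratch_name}" >&2
        return 1
    fi
    # Accepts "scf_guess read", "SCF_GUESS = READ", and similar spellings.
    if ! grep -Eiq '^[[:space:]]*scf_guess[[:space:]=]+read' "$input"; then
        echo "add SCF_GUESS = READ to the \$rem section of $input" >&2
        return 1
    fi
    "${QCHEM:-qchem}" -nt "$threads" -save "$input" "$output" "$scratch_name"
}

# Usage:
#   restart_from_scratch 8 jobname.in jobname.out jobname.scr
```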

Additionally, is there a way to restart a frequency calculation that has reached some point in the vibrational analysis section (shown below) but did not finish due to the time limit?

Unfortunately I think that the answer to this is ‘no’, at the present time. (Work on this is underway.) Someone can correct me if I’m wrong about this…

ramon_trevino · December 9, 2021, 5:51pm

Thanks jherbert,

The SCF restart seems to have worked.