Restarting calculations and performing multi-step jobs
Although running Q-Chem jobs with saved scratch space is relatively straightforward, as outlined in https://manual.q-chem.com/5.3/sect_running.html and https://manual.q-chem.com/5.3/sec_parallel.html, the advantages of doing so might not be clear. Many key quantities are stored in saved scratch, including MO coefficients, density matrices, the molecular Hessian, and eigenvectors from TDDFT. Some reasons for saving the scratch directory are:
- post-processing disk-intensive or higher-precision data with external tools or scripts,
- restarting calculations that ran out of time but saved data that can be used as a restart guess, and
- running large multi-step jobs that aren't part of a single input file.
The last is useful because a common pitfall with multi-step input files (with the @@@ delimiter) is that if an early step completes but a later step doesn't for some reason, much of your precious computation time is lost. You can use -save, but because the shared scratch is overwritten for each job step, the finalized scratch results from earlier steps aren't available. Running the steps as separate jobs also lets you use different command-line options between jobs, such as for controlling parallelism, which isn't possible with multi-step inputs.
The general strategy for running multi-step jobs in the shell might look like
#!/bin/bash
set -euo pipefail
qchem -save myexpensivejob1.in myexpensivejob1.out myexpensivejob
qchem -save myexpensivejob2.in myexpensivejob2.out myexpensivejob
# copy the final saved scratch directory to permanent storage
cp -a "${QCSCRATCH}/myexpensivejob" /permanent/location/myexpensivejob
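The point about per-step command-line options can be sketched as follows. This is a minimal illustration, assuming the qchem launcher's -nt option for the number of threads; the function name run_two_steps and the thread counts 4 and 16 are hypothetical choices, not recommendations.

```shell
#!/bin/bash
set -euo pipefail

# Hypothetical wrapper: both steps share one saved scratch directory,
# but each step is launched with its own thread count.
# Usage: run_two_steps SCRATCH_NAME
run_two_steps() {
    local scratch_name=$1
    # Assumed flag: -nt sets the number of threads for this step only.
    qchem -save -nt 4 myexpensivejob1.in myexpensivejob1.out "${scratch_name}"
    qchem -save -nt 16 myexpensivejob2.in myexpensivejob2.out "${scratch_name}"
}
```

Calling run_two_steps myexpensivejob would run the first step on 4 threads and the second on 16, which a single multi-step input file cannot express.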
If keeping the scratch directory for each step is important, a copy needs to be made after each step:
#!/bin/bash
set -euo pipefail
qchem -save myexpensivejob1.in myexpensivejob1.out myexpensivejob
cp -a ${QCSCRATCH}/myexpensivejob /permanent/location/myexpensivejob1
qchem -save myexpensivejob2.in myexpensivejob2.out myexpensivejob
cp -a ${QCSCRATCH}/myexpensivejob /permanent/location/myexpensivejob2
Here the scratch directory at the end of each step is kept separately, because ${QCSCRATCH}/myexpensivejob is reused and modified by each successive calculation that points to it.
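The copy-after-each-step pattern extends to any number of steps with a small loop. The sketch below is one possible way to write it, not an official Q-Chem utility: the function name run_steps and its argument order are hypothetical, and it assumes QCSCRATCH is set and that every input file name ends in .in.

```shell
#!/bin/bash
set -euo pipefail

# Hypothetical helper: run each input in order against one shared scratch
# directory and archive a snapshot of that directory after every step.
# Usage: run_steps SCRATCH_NAME ARCHIVE_DIR step1.in [step2.in ...]
run_steps() {
    local scratch_name=$1 archive_dir=$2
    shift 2
    mkdir -p "${archive_dir}"
    local input step
    for input in "$@"; do
        # Name the output file and the snapshot after the input file.
        step=$(basename "${input}" .in)
        # With set -e, a failing step stops the loop here; snapshots of
        # the steps that already completed remain in the archive.
        qchem -save "${input}" "${step}.out" "${scratch_name}"
        cp -a "${QCSCRATCH}/${scratch_name}" "${archive_dir}/${step}"
    done
}
```

For example, run_steps myexpensivejob /permanent/location myexpensivejob1.in myexpensivejob2.in does the same work as the second script above.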
Restarting jobs
The keywords for restarting jobs and splitting jobs into multiple steps are often used in conjunction:
- SCF: scf_guess read
  - Related is skip_scfman true, which is commonly set in later parts of a calculation when using already-converged orbitals.
- Coupled cluster: cc_restart true
  - This only works for orbital-optimized methods in the original ccman and requires having saved the T amplitudes with cc_saveampl.
- CIS: cis_guess_disk true
  - See cis_guess_disk_type for control over reading singlet and/or triplet amplitudes.
- GVB: gvb_restart true
- ASCI: asci_restart true
- CDFTCI: cdftci_restart 2
- Transport: trans_restart 1 in the $trans-method section
- Orbital localization: see ercalc
- Response properties (NMR and (hyper)polarizabilities): moprop_restart true
- AIMD: aimd_init_veloc restart
- FSSH: fssh_continue 1
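All of these keywords read data from a saved scratch directory, so it can be worth checking that the directory exists before launching a restart. The following is a minimal sketch of such a check; the function name have_saved_scratch is hypothetical, and the check only confirms that the directory is present and non-empty, not that it holds the specific files a given restart keyword needs.

```shell
#!/bin/bash
set -euo pipefail

# Hypothetical helper: succeed only if the named saved scratch directory
# exists under $QCSCRATCH and contains at least one entry.
# Usage: have_saved_scratch SCRATCH_NAME
have_saved_scratch() {
    local scratch_dir="${QCSCRATCH}/$1"
    [ -d "${scratch_dir}" ] && [ -n "$(ls -A "${scratch_dir}")" ]
}
```

It could guard a restart like this: if have_saved_scratch myexpensivejob; then qchem -save restart.in restart.out myexpensivejob; fi, where restart.in is a placeholder name.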
Example: restarting an NMR calculation
Part 1: An NMR job that runs out of iterations during the NMR portion
$molecule
0 1
H 0.00000 0.00000 0.00000
C 1.10000 0.00000 0.00000
F 1.52324 1.22917 0.00000
F 1.52324 -0.61459 1.06450
F 1.52324 -0.61459 -1.06450
$end
$rem
method b3lyp
basis 6-31g*
moprop 1
moprop_maxiter_1st 4 ! too small, for demonstration only
$end
Part 2: Restarting the incomplete NMR job
$molecule
read
$end
$rem
method b3lyp
basis 6-31g*
scf_guess read
skip_scfman true ! no need to redo the scf
moprop 1
moprop_restart 1
guess_px 1 ! read in last guess for perturbed density
$end
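The two parts can be chained in the shell. The sketch below assumes that qchem exits with a nonzero status when Part 1 runs out of iterations; if that assumption holds, a plain two-line script under set -e would stop after Part 1, so the first call is placed inside an if. The function name run_with_restart and the input names nmr_part1 and nmr_part2 are hypothetical.

```shell
#!/bin/bash
set -euo pipefail

# Hypothetical helper: run a job and, only if it fails, run a restart
# input against the same saved scratch directory.
# Usage: run_with_restart FIRST RESTART SCRATCH_NAME
# FIRST and RESTART are input names without the .in extension.
run_with_restart() {
    local first=$1 restart=$2 scratch_name=$3
    # Assumption: qchem returns nonzero when the job does not finish.
    if qchem -save "${first}.in" "${first}.out" "${scratch_name}"; then
        echo "${first} finished; no restart needed"
        return 0
    fi
    echo "${first} did not finish; restarting with ${restart}" >&2
    qchem -save "${restart}.in" "${restart}.out" "${scratch_name}"
}
```

With Part 1 saved as nmr_part1.in and Part 2 as nmr_part2.in, the call would be run_with_restart nmr_part1 nmr_part2 nmr.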
Splitting complex jobs
Example: anharmonic frequencies
This example for calculating VCI(4) frequencies of formic acid is adapted from https://manual.q-chem.com/5.3/Ch10.S10.SS4.html by recognizing that the geometry optimization, harmonic frequency calculation, and anharmonic frequency calculation can be performed in separate steps.
Part 1: optimize the geometry
$molecule
0 1
C
O, 1, CO
H, 1, CH, 2, A
H, 1, CH, 2, A, 3, D
CO = 1.2
CH = 1.0
A = 120.0
D = 180.0
$end
$rem
jobtype opt
method edf2
basis 6-31g*
geom_opt_tol_displacement 1
geom_opt_tol_gradient 1
geom_opt_tol_energy 1
$end
Part 2: form the molecular Hessian and compute harmonic frequencies
$molecule
read
$end
$rem
jobtype freq
method edf2
basis 6-31g*
scf_guess read
skip_scfman true ! no need to run SCF again
$end
Part 3: compute anharmonic frequencies and thermochemistry
$molecule
read
$end
$rem
jobtype freq
method edf2
basis 6-31g*
scf_guess read
skip_scfman false ! if true, doesn't run SCF for finite difference steps!
skip_drvman 1 ! read in the already-computed harmonic frequencies
anhar true
vci 4
$end
The same approach works for examining isotope and temperature effects on frequencies and thermochemistry, neither of which requires recomputing the full molecular Hessian.
Example: TDDFT Hessian
Part 1: Compute excited state energies with TDDFT
$rem
method = b3lyp
basis = 6-31+g*
cis_n_roots = 6
cis_singlets = true
cis_triplets = true
rpa = true
$end
$molecule
0 1
C 0 0 0.0
O 0 0 1.21
$end
Part 2: Read in the excitation vectors from step 1 to speed up computing the molecular Hessian of the first excited state
$rem
jobtype = freq
method = b3lyp
basis = 6-31+g*
scf_guess = read
skip_scfman = true ! no need to run SCF again
cis_n_roots = 6
cis_singlets = true
cis_triplets = true
rpa = true
cis_guess_disk = true
cis_guess_disk_type = 1 ! read both singlet and triplet vectors
cis_state_deriv = 1 ! find frequencies of the lowest (first) excited state
vibman_print = 1
$end
$molecule
read
$end
Another use of splitting a TDDFT or CIS-like calculation is first calculating energies, and then reusing the amplitudes for plotting or multipole analysis with skip_cis_rpa.