Are these resources insufficient?

Hello,

I am running a fairly large EOM-CC job and it has crashed with an error. I am guessing that I don’t have enough scratch space to finish the calculation (correct me if I am wrong). Is there a way to estimate how much space is needed to complete this job? Thanks.

The last few lines of the output, including the error:

    34   -1322.39795345   4.46e-12   1.23e-09
    35   -1322.39795345   3.79e-12   7.68e-10
 ------------------------------------------------------------------------------
         -1322.39795345                           CCSD T converged.

End of double precision
 SCF energy                 = -1320.23983636
 MP2 energy                 = -1322.35147842
 CCSD correlation energy    =    -2.15811709
 CCSD total energy          = -1322.39795345

 CCSD  T1^2 = 0.0307  T2^2 = 0.8579  Leading amplitudes:

 Amplitude    Orbitals with energies
 -0.0278       62 (A) A                  ->    65 (A) A
              -0.2242                          0.0160
 -0.0148       56 (A) A                  ->    94 (A) A
              -0.4268                          0.1365
  0.0146       62 (A) A                  ->    72 (A) A
              -0.2242                          0.0467
  0.0145       62 (A) A                  ->    69 (A) A
              -0.2242                          0.0407

 Amplitude    Orbitals with energies
  0.0287       58 (A) A      58 (A) B    ->    85 (A) A      85 (A) B
              -0.3390       -0.3391            0.1055        0.1063
 -0.0287       58 (A) A      58 (A) B    ->    85 (A) B      85 (A) A
              -0.3390       -0.3391            0.1063        0.1055
 -0.0287       58 (A) B      58 (A) A    ->    85 (A) A      85 (A) B
              -0.3391       -0.3390            0.1055        0.1063
  0.0287       58 (A) B      58 (A) A    ->    85 (A) B      85 (A) A
              -0.3391       -0.3390            0.1063        0.1055

 Computing CCSD intermediates for later calculations in double precision

 Q-Chem fatal error occurred in module /path/trunk-12072023/ccman2/qchem/ccman2_main.C, line 26:

 libvmm::page_file_posix<T>::advance(size_t, pos_t&), /path/trunk-12072023/libvmm/libvmm/page_file_posix.C (99), io_exception
fseek (Invalid argument)


 Please submit a crash report at q-chem.com/reporter

The computer on which the job crashed has 8.6 TiB of scratch space.

The input file:

$comment
Cis isomer (rings on the same side on NN but it's a non-planar structure)

EOM-EA-CCSD/cc-pVDZ opt for the 1A' state starting from a charge +1 singlet

Neutral doublet reference for an EOM-EE job.
$end

$molecule
0 2
  C       2.6781911187    -1.7756773868    -1.2763959690
  C       2.4057235373    -0.4046638393    -1.1512733827
  C       3.4588997986    -2.4380817126    -0.3118390466
  C       3.9917334619    -1.7144238678     0.7690647817
  C       3.7479526134    -0.3358675487     0.8827525478
  C       2.9263566315     0.3099010754    -0.0563771273
  N       2.7996008146     1.7528834600     0.0526783944
  N       1.6825562332     2.3225413788     0.0523575229
  C       0.4341513897     1.5835562463     0.0752758955
  C      -0.5897558499     2.0732921462    -0.7541272382
  C      -1.8560458532     1.4742368630    -0.7567668371
  C      -2.1528722867     0.4053131171     0.1236385311
  C      -1.1297503516    -0.0370198315     0.9989495106
  C       0.1475839200     0.5360767829     0.9720632202
  H       2.2747285899    -2.3305092641    -2.1309847067
  H       1.7914396841     0.1112218491    -1.8961246432
  H       3.6619771828    -3.5101034326    -0.4104190551
  H       4.6149450286    -2.2194968328     1.5156033542
  H       4.1824488983     0.2511047765     1.6992698212
  H      -0.3735716924     2.9261719430    -1.4075532441
  H      -2.6441103785     1.8434881736    -1.4229081613
  H      -1.3553981905    -0.8449428646     1.7044090349
  H       0.9173910708     0.1730687125     1.6600598445
  O      -3.3697643668    -0.1613755506     0.1412332454
  Ca     -5.2816560827    -1.0567832255     0.1711274484
$end

$rem
! convergence
  scf_convergence = 9
  cc_convergence = 10
  eom_davidson_conv = 10

! method
  method = eom-ccsd
  ee_states = [9]

! basis
  basis = aug-cc-pVDZ

! EOM properties
  cc_eom_prop = true
  cc_eom_prop_te = true
  state_analysis = true

! Transition properties
  cc_trans_prop = 1             ! 1: ccsd ref -> others
  nto_pairs = 2
  molden_format = true
  wfa_orb_thresh = 1

! memory
  mem_total = 201840
  cc_memory = 161472
$end

Pawel, this could also be caused by a faulty disk, which sometimes leads to I/O errors. What does the error stream report for the job?

Does SMART report a failure? If you have access to the kernel log, are there any unusual kernel messages?

Can you try CC_BACKEND = XM, which uses libxm as an allocator instead of libvmm?
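
For reference, a minimal sketch of that change: one extra line in the $rem section of the input posted above, with all other keywords left as they are (elided here). The lowercase spelling and the placement next to the memory settings are only stylistic assumptions, chosen to match the posted input.

```
$rem
! ... other keywords exactly as in the posted input ...

! memory
  mem_total = 201840
  cc_memory = 161472

! switch the coupled-cluster backend from libvmm to libxm
  cc_backend = xm
$end
```

If the job still fails in the same place with this setting, that would point further toward the disk rather than the allocator.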

Hi Pavel,

The errorstr was empty after the job failed, but the stdoutstr reported:
Error: in the serial run

It seems to be a problem with the drive: the RAID reports that it is in good shape, but the kernel reports an “impending failure” of the drive. I will check what is happening there.

Thanks!

SMART may record the failure but not flag it as critical, depending on the type of failure.

I would unmount the filesystem and run the filesystem check utility (e2fsck for ext filesystems) to make sure that the bad blocks are marked and avoided.