When doing EOM-CCSD calculations for a sizable system, I noticed that there is a slow step (which sometimes can take over a day) right after the ground-state CCSD is done, with the output showing “Computing CCSD intermediates for later calculations in double precision”, which is not very informative. What does the code actually compute at this step? Maybe some extra prints would be helpful.
Also, how well does OpenMP parallelization work for CCMAN2? For a calculation I ran with 32 OpenMP threads (41 active O, 326 active V, 30 GB CC memory), the ccman2 timing shows:
CCSD calculation: CPU 103514.51 s wall 38743.84 s
EOMEE-CCSD calculation: CPU 51362.58 s wall 43177.38 s
Total ccman2 time: CPU 208896.81 s wall 87844.00 s
so the overall CPU/wall ratio is only about 2.4 on 32 threads, which does not seem to scale well. Would requesting more CC memory help improve the parallel performance?
It seems you are doing standard (non-RI or CD) calculations. In this step the requisite transformed integrals are computed, e.g., those listed in Table III of "General implementation of the resolution-of-the-identity and Cholesky representations of electron repulsion integrals within coupled-cluster and equation-of-motion methods: Theory and benchmarks," J. Chem. Phys. 139, No. 13.
The efficiency of the calculations and the parallel performance are very sensitive to the memory settings. It seems you are using a very small memory allocation. I recommend using about 75% of the available RAM for maximum efficiency.
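For example, on a node with 256 GB of RAM (hypothetical numbers; adjust to your hardware), the memory part of the input might look like:

mem_total = 190000 ! total memory in MB, roughly 75% of 256 GB
cc_memory = 180000 ! memory available to CC/EOM routines, in MB

Both MEM_TOTAL and CC_MEMORY are specified in MB; the exact values here are only a sketch of the ~75% guideline.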
You may also consider using single-precision execution.
Here is a typical setup for a large job at USC:
! use SP code
cc_sp_dm = 1
eom_aresp_single_prec = 1
cc_sp_t_conv = 4
cc_sp_e_conv = 6
cc_erase_dp_integrals = 1
Another option would be to use RI/CD, but I would play with the above settings first.
Thanks very much for your reply! I will give it a try based on your suggestions. The implementation paper is also very informative.