When doing EOM-CCSD calculations for a sizable system, I noticed that there is a slow step (which sometimes can take over a day) right after the ground-state CCSD is done, with the output showing “Computing CCSD intermediates for later calculations in double precision”, which is not very informative. What does the code actually compute at this step? Maybe some extra prints would be helpful.
Also, how well does OpenMP parallelization work for CCMAN2? For the calculation I ran with 32 OpenMP threads (41 active O, 326 active V, with 30 GB of CC memory), the ccman2 timing shows:
CCSD calculation: CPU 103514.51 s wall 38743.84 s
EOMEE-CCSD calculation: CPU 51362.58 s wall 43177.38 s
Total ccman2 time: CPU 208896.81 s wall 87844.00 s
which does not seem to scale so well. Would requesting more CC memory help improve the parallel performance?
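To put a number on "does not scale so well": the CPU/wall ratio is a rough measure of how many threads were busy on average. A small sketch using only the timings quoted above (the variable and function names are just for illustration):

```python
# Timings copied from the ccman2 output above: (CPU seconds, wall seconds).
timings = {
    "CCSD": (103514.51, 38743.84),
    "EOMEE-CCSD": (51362.58, 43177.38),
    "Total ccman2": (208896.81, 87844.00),
}
n_threads = 32  # OpenMP threads requested for the run


def cpu_wall_ratio(cpu_s, wall_s):
    """Average number of busy threads, estimated as CPU time / wall time."""
    return cpu_s / wall_s


ratios = {name: cpu_wall_ratio(c, w) for name, (c, w) in timings.items()}

for name, r in ratios.items():
    # Share of the requested threads that were effectively used.
    print(f"{name:>13}: CPU/wall = {r:.2f} "
          f"({100 * r / n_threads:.1f}% of {n_threads} threads)")
```

So the CCSD step keeps fewer than 3 of the 32 threads busy on average, and the EOMEE-CCSD step is close to serial.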
The efficiency and parallel performance of these calculations are very sensitive to the memory settings. It seems you are using very little memory. I recommend using about 75% of the available RAM for maximum efficiency.
You may also consider running in single precision.
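To see why 30 GB may be tight for 41 active O and 326 active V, here is a rough back-of-the-envelope sketch. It assumes dense four-index blocks with no permutational or point-group symmetry, which overestimates what the code actually needs to hold, and the 128 GB node size is a hypothetical example, not a value from the run above. The function names are illustrative only.

```python
GIB = 1024 ** 3  # bytes per GiB


def block_gib(dims, bytes_per_element=8):
    """Size in GiB of a dense tensor block with the given dimensions.

    bytes_per_element: 8 for double precision, 4 for single precision.
    No symmetry is exploited, so this is an upper-bound style estimate.
    """
    n = 1
    for d in dims:
        n *= d
    return n * bytes_per_element / GIB


def memory_budget_mb(total_ram_gb, fraction=0.75):
    """About 75% of the node RAM, expressed in MB."""
    return int(total_ram_gb * 1024 * fraction)


n_occ, n_virt = 41, 326  # active orbitals quoted in the question

blocks = {
    "OOVV": (n_occ, n_occ, n_virt, n_virt),
    "OVVV": (n_occ, n_virt, n_virt, n_virt),
    "VVVV": (n_virt, n_virt, n_virt, n_virt),
}
for name, dims in blocks.items():
    print(f"{name}: {block_gib(dims):7.2f} GiB double, "
          f"{block_gib(dims, 4):7.2f} GiB single")

# Hypothetical 128 GB node: the memory to request at 75% of RAM.
print("75% of 128 GB:", memory_budget_mb(128), "MB")
```

Under these assumptions the largest block alone is well above 30 GB in double precision, so a lot of data has to go through disk, which would be consistent with the poor thread utilization. Single precision halves every block size.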