The wall-clock time (wct) printed at the end of the output for the installation test job does not look correct when using Q-Chem 5.4.2 with BrianQC 1.2.1.
With pure Q-Chem, the total job wct at the end of the output is 27.12 s, dominated by the SCF at a wct of 27 s. With the Q-Chem/GPU run (4 RTX 3090), the total job wct is 21.12 s, but the SCF accounts for only 5 s of it.
The overall results are the same within the difference one would expect for a CPU vs GPU run. Is this a bug or is something happening after the GPU SCF that does not occur with the CPU run?
I tried to upload the output files but was not able to do so, due to the restrictions on what files can be uploaded.
Could you post snippets of the SCF output in both cases, along with the final timings, for comparison? It is possible that the job is too small, so that setup overheads make up a large share of the total time.
For example, I just ran a UHF job, first without and then with BrianQC:
A unrestricted SCF calculation will be
performed using DIIS
SCF converges when DIIS error is below 1.0e-08
---------------------------------------
Cycle Energy DIIS error
---------------------------------------
1 -791.0714228516 1.78e-02
2 -781.2238086150 1.10e-03
3 -781.4506097363 2.83e-04
4 -781.4849381940 1.20e-04
5 -781.4951773957 8.24e-05
6 -781.5060415704 6.08e-05
7 -781.5122876425 3.49e-05
8 -781.5140718598 2.44e-05
9 -781.5147629741 1.08e-05
10 -781.5148858552 7.91e-06
11 -781.5149653048 4.77e-06
12 -781.5150066082 3.36e-06
13 -781.5150343661 2.12e-06
14 -781.5150469314 1.14e-06
15 -781.5150494988 6.37e-07
16 -781.5150498530 2.54e-07
17 -781.5150499106 1.01e-07
18 -781.5150499181 4.74e-08
19 -781.5150499157 1.56e-08
20 -781.5150499165 5.82e-09 Convergence criterion met
---------------------------------------
SCF time: CPU 1627.19s wall 206.00s
<S^2> = 2.020230773
SCF energy in the final basis set = -781.5150499165
Total energy in the final basis set = -781.5150499165
Total job time: 208.07s(wall), 1634.32s(cpu)
---- BrianQC HCore successfully initialized ----
---- BrianQC J/K successfully initialized ----
BrianQC JK build time 4.0000000000 (s)
A unrestricted SCF calculation will be
performed using DIIS
SCF converges when DIIS error is below 1.0e-08
---------------------------------------
Cycle Energy DIIS error
---------------------------------------
BrianQC JK build time 3.0e+00 (s)
1 -791.0714228186 1.78e-02
BrianQC JK build time 2.00e+00 (s)
2 -781.2238086040 1.10e-03
BrianQC JK build time 3.00e+00 (s)
3 -781.4506097186 2.83e-04
BrianQC JK build time 2.00e+00 (s)
4 -781.4849381687 1.20e-04
BrianQC JK build time 2.00e+00 (s)
5 -781.4951773655 8.24e-05
BrianQC JK build time 3.00e+00 (s)
6 -781.5060415358 6.08e-05
BrianQC JK build time 2.00e+00 (s)
7 -781.5122876068 3.49e-05
BrianQC JK build time 3.00e+00 (s)
8 -781.5140718229 2.44e-05
BrianQC JK build time 2.00e+00 (s)
9 -781.5147629329 1.08e-05
BrianQC JK build time 3.00e+00 (s)
10 -781.5148858147 7.91e-06
BrianQC JK build time 2.00e+00 (s)
11 -781.5149652641 4.77e-06
BrianQC JK build time 2.00e+00 (s)
12 -781.5150065681 3.36e-06
BrianQC JK build time 2.00e+00 (s)
13 -781.5150343259 2.12e-06
BrianQC JK build time 2.00e+00 (s)
14 -781.5150468913 1.14e-06
BrianQC JK build time 2.00e+00 (s)
15 -781.5150494596 6.37e-07
BrianQC JK build time 2.00e+00 (s)
16 -781.5150498136 2.54e-07
BrianQC JK build time 2.00e+00 (s)
17 -781.5150498724 1.01e-07
BrianQC JK build time 2.00e+00 (s)
18 -781.5150498789 4.74e-08
BrianQC JK build time 2.00e+00 (s)
19 -781.5150498758 1.56e-08
20 -781.5150498758 5.82e-09 Convergence criterion met
---------------------------------------
SCF time: CPU 1268.92s wall 115.00s
<S^2> = 2.020230773
SCF energy in the final basis set = -781.5150498758
Total energy in the final basis set = -781.5150498758
Total job time: 130.65s(wall), 1289.95s(cpu)
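You can see here that the non-SCF wall time (the difference between the total and SCF timings) is about 2 s without the GPU (208.07 s - 206.00 s) vs. about 15.7 s with the GPU (130.65 s - 115.00 s). That difference is simply the additional setup required for GPU computing with BrianQC. It is also in line with your numbers: 21.12 s total minus 5 s of SCF leaves about 16 s.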
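If you want to check this on your own outputs, the subtraction can be scripted. The following is only a rough sketch: the function name `non_scf_wall` and the regular expressions are my own, written against the `SCF time:` and `Total job time:` lines exactly as they appear in the snippets above, and it assumes one job per output (only the first match of each line is used).

```python
import re

# Patterns written against the timing lines quoted in this thread (assumption:
# other Q-Chem versions may format these lines differently).
SCF_RE = re.compile(r"SCF\s+time:\s+CPU\s+([\d.]+)s\s+wall\s+([\d.]+)s")
JOB_RE = re.compile(r"Total job time:\s+([\d.]+)s\(wall\),\s+([\d.]+)s\(cpu\)")


def non_scf_wall(output_text):
    """Return (scf_wall, total_wall, non_scf_wall) in seconds for one job."""
    scf = SCF_RE.search(output_text)
    job = JOB_RE.search(output_text)
    if scf is None or job is None:
        raise ValueError("timing lines not found in output")
    scf_wall = float(scf.group(2))    # wall time is the second number on the SCF line
    total_wall = float(job.group(1))  # wall time is the first number on the job line
    return scf_wall, total_wall, total_wall - scf_wall


# Timing lines copied from the two UHF runs above.
cpu_run = """SCF time: CPU 1627.19s wall 206.00s
Total job time: 208.07s(wall), 1634.32s(cpu)"""
gpu_run = """SCF time: CPU 1268.92s wall 115.00s
Total job time: 130.65s(wall), 1289.95s(cpu)"""

for label, text in (("CPU only", cpu_run), ("with BrianQC", gpu_run)):
    scf_wall, total_wall, overhead = non_scf_wall(text)
    print(f"{label}: SCF {scf_wall:.2f}s, total {total_wall:.2f}s, non-SCF {overhead:.2f}s")
```

To use it on a real output file, read the file into a string and pass it to `non_scf_wall`. For larger jobs the non-SCF part should become a small fraction of the total.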