NBO 6 Hanging, It is

Hi,

I am trying to run an NBO6 calculation on a molecular dimer of about 80 atoms.
The test in the manual for NBO runs fine, as does an H2 case.
For this job, however, even though I set it up as closely as possible to the cases that work, nothing happens past

Begin NBO analysis for the ground state

QCLOCALSCR/save contains

11.0    16.0    175.0  31.0   4.0    49.0  54.0   6.0    771.0  819.0   93.0   99.0      molecule  rem       tmp2.z2c    zmat2
121.0   1622.0  177.0  320.0  412.0  51.0  58.0   621.0  80.0   90.0    97.0   hostfile  NBO       remfrgm   tmp.genbas  zmat3
1325.0  174.0   179.0  36.0   46.0   53.0  593.0  63.0   803.0  9121.0  976.0  input0    OPT       tmp1.z2c  zmat

When I strace the active process (there’s only 1 running thread @ 100%), I just see sched_yield() = 0 repeated endlessly. GDB shows that same thread in sched_yield() and every other thread waiting in pthread_cond_wait:

(gdb) info threads
  Id   Target Id         Frame 
  36   Thread 0x2b7ebec23780 (LWP 173352) "qcprog.exe_s" 0x00002b7eb229e9f5 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
  35   Thread 0x2b7ebf025800 (LWP 173353) "qcprog.exe_s" 0x00002b7eb229e9f5 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
  34   Thread 0x2b7ec7427880 (LWP 173354) "qcprog.exe_s" 0x00002b7eb229e9f5 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
  33   Thread 0x2b7ebf427900 (LWP 173355) "qcprog.exe_s" 0x00002b7eb229e9f5 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
  32   Thread 0x2b7ebf829980 (LWP 173356) "qcprog.exe_s" 0x00002b7eb229e9f5 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
  31   Thread 0x2b7ebfc2ba00 (LWP 173357) "qcprog.exe_s" 0x00002b7eb229e9f5 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
  30   Thread 0x2b7ec4400a80 (LWP 173358) "qcprog.exe_s" 0x00002b7eb229e9f5 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
  29   Thread 0x2b7ec4802b00 (LWP 173359) "qcprog.exe_s" 0x00002b7eb229e9f5 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
  28   Thread 0x2b7ec4c04b80 (LWP 173360) "qcprog.exe_s" 0x00002b7eb229e9f5 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
  27   Thread 0x2b7ec5006c00 (LWP 173361) "qcprog.exe_s" 0x00002b7eb229e9f5 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
  26   Thread 0x2b7ec5408c80 (LWP 173362) "qcprog.exe_s" 0x00002b7eb229e9f5 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
  25   Thread 0x2b7ec580ad00 (LWP 173363) "qcprog.exe_s" 0x00002b7eb229e9f5 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
  24   Thread 0x2b7ec5c0cd80 (LWP 173364) "qcprog.exe_s" 0x00002b7eb229e9f5 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
  23   Thread 0x2b7ec600ee00 (LWP 173365) "qcprog.exe_s" 0x00002b7eb229e9f5 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
  22   Thread 0x2b7ec6410e80 (LWP 173366) "qcprog.exe_s" 0x00002b7eb229e9f5 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
  21   Thread 0x2b7ec6812f00 (LWP 173367) "qcprog.exe_s" 0x00002b7eb229e9f5 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
  20   Thread 0x2b7ec6c14f80 (LWP 173368) "qcprog.exe_s" 0x00002b7eb229e9f5 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
  19   Thread 0x2b7ec7017000 (LWP 173369) "qcprog.exe_s" 0x00002b7eb229e9f5 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
  18   Thread 0x2b7ec782a080 (LWP 173370) "qcprog.exe_s" 0x00002b7eb229e9f5 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
  17   Thread 0x2b7ec7c2c100 (LWP 173371) "qcprog.exe_s" 0x00002b7eb229e9f5 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
  16   Thread 0x2b7f14401180 (LWP 173372) "qcprog.exe_s" 0x00002b7eb229e9f5 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
  15   Thread 0x2b7f14803200 (LWP 173373) "qcprog.exe_s" 0x00002b7eb229e9f5 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
  14   Thread 0x2b7f14c05280 (LWP 173374) "qcprog.exe_s" 0x00002b7eb229e9f5 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
  13   Thread 0x2b7f15007300 (LWP 173375) "qcprog.exe_s" 0x00002b7eb229e9f5 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
  12   Thread 0x2b7f15409380 (LWP 173376) "qcprog.exe_s" 0x00002b7eb229e9f5 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
  11   Thread 0x2b7f1580b400 (LWP 173377) "qcprog.exe_s" 0x00002b7eb229e9f5 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
  10   Thread 0x2b7f15c0d480 (LWP 173378) "qcprog.exe_s" 0x00002b7eb229e9f5 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
  9    Thread 0x2b7f1600f500 (LWP 173379) "qcprog.exe_s" 0x00002b7eb229e9f5 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
  8    Thread 0x2b7f16411580 (LWP 173380) "qcprog.exe_s" 0x00002b7eb229e9f5 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
  7    Thread 0x2b7f16813600 (LWP 173381) "qcprog.exe_s" 0x00002b7eb229e9f5 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
  6    Thread 0x2b7f16c15680 (LWP 173382) "qcprog.exe_s" 0x00002b7eb229e9f5 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
  5    Thread 0x2b7f17017700 (LWP 173383) "qcprog.exe_s" 0x00002b7eb229e9f5 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
  4    Thread 0x2b7f17419780 (LWP 173384) "qcprog.exe_s" 0x00002b7eb229e9f5 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
  3    Thread 0x2b7f1781c800 (LWP 173385) "qcprog.exe_s" 0x00002b7eb229e9f5 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
  2    Thread 0x2b7f17c1f880 (LWP 173386) "qcprog.exe_s" 0x00002b7eb229e9f5 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
* 1    Thread 0x2b7eb1c6e9c0 (LWP 173337) "qcprog.exe_s" 0x00002b7eb2894727 in sched_yield () from /lib64/libc.so.6
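
For anyone who wants to collect the same diagnostics on their own hung job, here is a minimal sketch. It assumes Linux with procps `ps`, coreutils `timeout`, `strace`, and `gdb` installed, plus permission to attach to the process. The function name `inspect_hang` is hypothetical and not part of Q-Chem.

```shell
# Hypothetical helper: summarise what each thread of a hung process is doing.
inspect_hang() {
    pid="$1"
    # Per-thread CPU usage: a single thread near 100% matches the symptom above.
    ps -L -o lwp,pcpu,stat,comm -p "$pid"
    # Watch system calls for 5 seconds, following all threads (-f), then detach.
    timeout 5 strace -f -p "$pid" 2>&1 | tail -n 20
    # Print the current frame of every thread without an interactive session.
    gdb -p "$pid" -batch -ex "info threads" 2>/dev/null
}

# Example call; the process name is taken from the gdb output in this thread.
# inspect_hang "$(pgrep -o qcprog.exe_s)"
```

The example call is left commented out because it needs a running Q-Chem process to attach to.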

Not sure what’s happening here. Is the NBO6 communication broken in some cases?
Full input:

$molecule
   read
$end

$rem
   method scan
   unrestricted true
   basis def2-TZVPP
   scf_guess read
   scf_algorithm sgm_ls
   mem_total 50000
   nbo 1
   run_nbo6 1
$end

NBOEXE points to nbo6.i4.exe

Thanks - Chris

Can you please attach the $molecule also? The load balance issue is likely a function of NBO6 running as a serial program regardless of how many threads are used by Q-Chem.

Thanks John, sure:

$molecule
0 1
 C     -4.55889600    1.76177400    0.32729100
 C     -4.21009000    0.41020200    0.59884400
 C     -3.60795300    2.79999200    0.62472800
 C     -5.79793500    2.09036700   -0.23806300
 C     -2.95958100    0.05271700    1.12020300
 C     -5.16793600   -0.62255700    0.32403800
 C     -2.80783100    3.67967900    0.89377000
 C     -6.77564700    1.06744300   -0.49181000
 C     -6.15078500    3.43038700   -0.57086300
 H     -2.34505200    0.73989700    1.34694000
 C     -2.59064600   -1.25986500    1.31439600
 C     -6.45640600   -0.27045800   -0.18432600
 C     -4.78693700   -1.95120700    0.55872500
 Si    -1.62714400    4.99618200    1.41673600
 C     -8.03639100    1.43979200   -1.02567100
 C     -7.35698000    3.73747300   -1.09607400
 H     -5.52208000    4.12498800   -0.41990700
 C     -1.29693900   -1.62554600    1.78198400
 C     -3.53165600   -2.30299900    1.00642900
 C     -7.43464800   -1.30823800   -0.36970200
 H     -5.42280800   -2.63910100    0.40404000
 H     -2.35247800    6.23516600    1.79114500
 H     -0.67159500    5.24474700    0.32767500
 H     -0.95432900    4.45467000    2.62544300
 H     -8.69483000    0.77111000   -1.17439400
 C     -8.32038000    2.72257500   -1.32744700
 H     -7.56110100    4.63952900   -1.31297200
 H     -0.67500700   -0.94615100    2.01679300
 C     -0.94074700   -2.92461100    1.89546800
 C     -3.12018200   -3.64362000    1.15284200
 C     -8.25267800   -2.19434900   -0.47927200
 H     -9.16749000    2.94748300   -1.69450200
 H     -0.06746200   -3.14783300    2.19662500
 C     -1.85845900   -3.95886600    1.57004400
 H     -3.73394000   -4.34290000    0.95782400
 Si    -9.47285800   -3.57473100   -0.51673600
 H     -1.59476400   -4.86875000    1.64178300
 H    -10.80004100   -2.98935900   -0.83027900
 H     -9.47099700   -4.20443100    0.82726000
 H     -9.04937700   -4.51634100   -1.58343700
 H     -0.49424000   -0.20562800   -0.80835500
 C      0.40597900    0.08856800   -0.73440900
 C      0.69067100    1.47704700   -0.68178400
 C      1.40558700   -0.81848100   -0.67869100
 C      1.96541000    1.90186100   -0.57304600
 H     -0.01807400    2.10852500   -0.72448500
 H      1.19388500   -1.74261200   -0.71982300
 C      2.76964100   -0.42349400   -0.55931900
 C      3.04563700    0.98567500   -0.49037900
 H      2.14357600    2.83489800   -0.55080900
 C      3.81693000   -1.35336000   -0.52253600
 C      4.38290500    1.41178600   -0.36159800
 C      3.53319200   -2.75882600   -0.64425600
 C      5.16489600   -0.93573800   -0.34722100
 C      5.44978400    0.46743400   -0.24964700
 C      4.69681800    2.81523200   -0.35118300
 C      3.29674300   -3.94877400   -0.76484600
 C      6.22372100   -1.84937600   -0.26038400
 C      6.77993600    0.85422000   -0.03367800
 C      4.99143700    3.98944300   -0.37946000
 Si     2.98141000   -5.74159400   -1.05990800
 H      6.04057500   -2.77400100   -0.37351000
 C      7.51992300   -1.45333200   -0.01603900
 C      7.80810700   -0.05086500    0.11999700
 H      6.98154900    1.78114100    0.00663800
 Si     5.52629400    5.74704600   -0.52248600
 H      1.79328900   -5.84842900   -1.94283000
 H      2.78760200   -6.49110900    0.20601300
 H      4.18770400   -6.23436300   -1.77394000
 C      8.58444200   -2.38568500    0.13744500
 C      9.12791800    0.33680800    0.42954200
 H      4.32931400    6.55104900   -0.87371800
 H      6.54813100    5.81327500   -1.59687700
 H      6.06593200    6.15856800    0.79793400
 H      8.41402400   -3.31360600    0.02154900
 C      9.83299400   -1.96991600    0.44659300
 C     10.11801100   -0.58740800    0.60538800
 H      9.33329900    1.26087100    0.51700900
 H     10.52706500   -2.60941200    0.55822200
 H     10.99604300   -0.30596300    0.83433900
$end

Two things:
(1) I am able to reproduce this crash with the following (much shorter and simpler) sample job. Seems like a bug, I will look into it.
(2) For your job, the SCF has trouble converging unless you set THRESH=12 to deal with the linear dependencies, in which case DIIS (default SCF algorithm) converges in 9 cycles. I would not use SGM-LS unless you are trying to locate a non-Aufbau determinant.

$rem
method scan
unrestricted true
basis def2-TZVPP
! scf_guess read
mem_total 50000
nbo 1
run_nbo6 1
thresh 12
$end

$molecule
0 1
O
H 1 0.95
H 1 0.92 2 104.5
$end

Thanks John. Unfortunately the real job is a non-Aufbau broken-symmetry target.

I’d been trying H2 with varying success, but it seemed like I had it working by the end. Regardless, glad you could reproduce this.

Sure, for non-Aufbau that’s a good algorithm (although I might plug my own student’s STEP algorithm also), but in any case I would set THRESH=12 because otherwise there are linear dependency issues. A warning to that effect gets printed in the output file, and in general, for medium to large molecules, you should take that warning seriously.

As for the specific NBO issue, the problem on my end is that NBO6 has to be downloaded and installed (whereas NBO5 ships with Q-Chem). It appears that the Weinhold group stopped supporting NBO6 some time in 2020, so I cannot download it to try, and I also think that we (at Q-Chem) need to update the manual to reflect this fact. I will post a bug ticket about that.

As for your calculation, can you get what you want from NBO5? To run NBO5 you just need NBO=TRUE, so delete the RUN_NBO6 keyword from your $rem section.
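
One way to make that change without editing the input by hand is sketched below. It assumes a POSIX shell. The file names `dimer_nbo6.in` and `dimer_nbo5.in` are hypothetical, and the short input written at the top only stands in for a real job file.

```shell
# Stand-in input using $rem keywords from this thread (file names are hypothetical).
cat > dimer_nbo6.in <<'EOF'
$rem
   method scan
   basis def2-TZVPP
   nbo 1
   run_nbo6 1
$end
EOF

# Drop any line whose first keyword is run_nbo6 (case-insensitive), leaving
# "nbo 1" in place so the job uses the NBO5 that ships with Q-Chem.
grep -vi '^[[:space:]]*run_nbo6[[:space:]=]' dimer_nbo6.in > dimer_nbo5.in

cat dimer_nbo5.in
```

Writing to a new file rather than editing in place keeps the NBO6 input available for later.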

Thanks again. I think what I want to do at the moment is fully implemented in NBO5, assuming no bug fixes etc. in NBO6.

NBO changed their interface to ESS codes in NBO6, so I assume that NBO7+ would work the same unless one of the newer NBO methods needed some additional data structure that wasn’t enabled in the NBO6 interface.

Anyway, it looks like NBO5 ran OK to completion. Regardless, I hope the NBO6/7 interoperability with Q-Chem can get fixed; there are a number of intriguing developments in the NBO code. Thanks for your help!

My NBO contact suggests that NBO7 is quite a bit different, which may be why the Weinhold group is no longer willing to provide backwards support. I put in an inquiry to see if we can get this done.

Glad to hear you’re able to use the NBO5 interface! If you want to use NBO6 in the future (or if anyone else is looking in this thread after encountering a similar problem), this hanging behavior is a recently-discovered issue with running the NBO6 interface in parallel in Q-Chem 5.4.1, and is related to a problem with Intel 17 compilers. I have submitted a ticket about it in our system, but for now, the best work-around is to manually set the environment variable ‘KMP_INIT_AT_FORK=FALSE’.
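
As a rough illustration of that work-around, the fragment below shows where the variable could go in a job script. It assumes a POSIX-style shell. The NBOEXE path, input and output file names, and thread count are placeholders rather than values from this thread.

```shell
# Work-around reported in this thread for the NBO6 hang in Q-Chem 5.4.1.
# It must be exported before Q-Chem starts so that the child process sees it.
export KMP_INIT_AT_FORK=FALSE

# NBO6 is installed separately; NBOEXE tells Q-Chem where to find it.
# The default path here is hypothetical and only used if NBOEXE is unset.
export NBOEXE="${NBOEXE:-/path/to/nbo6/bin/nbo6.i4.exe}"

# Hypothetical launch line; adjust file names and thread count to your job.
# qchem -nt 36 dimer.in dimer.out

echo "KMP_INIT_AT_FORK=$KMP_INIT_AT_FORK NBOEXE=$NBOEXE"
```

The `echo` line is only there to confirm in the job log that both variables were set as intended.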

Thank you Shannon, confirmed that setting KMP_INIT_AT_FORK to FALSE in my job script gets past the hang and lets NBO 6 run as expected.

Cheers; Chris

Also confirmed that the NBO 6 analysis led to a qualitatively and quantitatively different STERIC analysis, so glad there was a workaround!

Does Q-Chem work with NBO7?

I have not tried it, but Q-Chem staff tell me that the NBO6 interface can also run NBO7; you need to set both NBO = TRUE and RUN_NBO7 = TRUE in the $rem section. Advice via Eric Glendening is: “ask the user to set the NBOEXE environment variable to nbo7.i4.exe rather than to nbo6.i4.exe”. Note that both NBO6 and NBO7 need to be obtained separately; they do not come with Q-Chem.
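
Put together, that advice might look like the sketch below. It is untested, as noted above. The install directory and the file name `nbo7_rem.txt` are hypothetical. Only the executable name and the two $rem keywords come from this thread, and the $rem block is a fragment to merge into a full input, not a complete job.

```shell
# Point the interface at the NBO7 binary instead of nbo6.i4.exe.
# The directory is hypothetical; only the executable name is from this thread.
export NBOEXE="/path/to/nbo7/bin/nbo7.i4.exe"

# $rem keywords as relayed above; add method, basis, etc. for a real job.
cat > nbo7_rem.txt <<'EOF'
$rem
   nbo true
   run_nbo7 true
$end
EOF

cat nbo7_rem.txt
```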

Thanks. I already have NBO7 so that is not a problem.