Installation problems with CUDA-2.2 OpenMM

Post by Siddharth Srinivasan » Wed Jul 29, 2009 12:06 pm

Hi there

I've installed CUDA toolkit 2.2 and confirmed that it works. It's installed in /opt/cuda. My ccmake configuration looks like this:
{{{
BUILD_TESTING ON
CMAKE_BACKWARDS_COMPATIBILITY 2.4
CMAKE_BUILD_TYPE Debug
CMAKE_INSTALL_PREFIX /Network/Cluster/home/siddharth/OpenMM3/
CUDA_BUILD_CUBIN ON
CUDA_BUILD_TYPE Device
CUDA_INSTALL_PREFIX /opt/cuda
CUDA_NVCC_FLAGS -maxrregcount=32;-use_fast_math;-O0
DART_ROOT DART_ROOT-NOTFOUND
DL_LIBRARY /usr/lib64/libdl.so
FOUND_CUBLAS /opt/cuda/lib/libcublas.so
FOUND_CUBLASEMU /opt/cuda/lib/libcublasemu.so
FOUND_CUFFT /opt/cuda/lib/libcufft.so
FOUND_CUFFTEMU /opt/cuda/lib/libcufftemu.so
OPENMM_BUILD_BROOK_LIB OFF
OPENMM_BUILD_CUDA_LIB ON
OPENMM_VERSION 1.0.0
SVNVERSION /usr/bin/svnversion
SVNVERSION_EXE SVNVERSION_EXE-NOTFOUND

}}}
When I build from source, I get the following error:
{{{
[ 48%] Built target TestReferenceVerletIntegrator
Linking CXX shared library ../../../libOpenMMCuda_d.so
/usr/lib/gcc/x86_64-pc-linux-gnu/4.1.2/../../../../x86_64-pc-linux-gnu/bin/ld: skipping incompatible /Network/Cluster/home/siddharth/Downloads/OpenMMPreview3-Source/src/platforms/cuda/cudpp/linux/libcudpp.a when searching for -lcudpp
/usr/lib/gcc/x86_64-pc-linux-gnu/4.1.2/../../../../x86_64-pc-linux-gnu/bin/ld: cannot find -lcudpp
collect2: ld returned 1 exit status
make[2]: *** [libOpenMMCuda_d.so] Error 1
make[1]: *** [platforms/cuda/sharedTarget/CMakeFiles/OpenMMCuda_d.dir/all] Error 2
make: *** [all] Error 2
}}}
When I build with no CUDA support, things work fine. Any suggestions?

Post by Peter Eastman » Wed Jul 29, 2009 12:27 pm

It looks like, for some reason, the linker doesn't like the precompiled CUDPP library. Try compiling CUDPP yourself and see if that makes it happy. This message includes instructions for how to do that on 64-bit Linux:

https://simtk.org/forum/message.php?msg_id=2193

Peter

Post by Siddharth Srinivasan » Wed Jul 29, 2009 2:16 pm

Hi Peter

Thanks for the link to that post. With a few hacks for my own installation, everything seems to work smoothly, at least the compilation. My process was:
1. Install CUDA. Everything runs smoothly on the GPU as well as in emulation mode.
2. Install OpenMM from source in $OPENMM, following the directions in the post. Everything seems to install correctly; it finds CUDA and the appropriate libraries.
3. Set LD_LIBRARY_PATH to include the $OPENMM/lib directory installed above and the CUDA libraries.
4. Install and compile the OpenMM examples, editing the Makefile so that it finds the OpenMM libraries. The examples all run, but always print
{{{
REMARK Using OpenMM platform Reference
MODEL 1
REMARK 250 time=0.000 ps; energy=-297.963 kcal/mole
}}}
indicating that they are running on the Reference platform. How do I get them to use CUDA?

Post by Siddharth Srinivasan » Wed Jul 29, 2009 2:20 pm

I should also say that when I build the tests from the source, I get all the TestReference* and TestCuda* executables, and they all print "Done", but I'm not sure whether that means the CUDA tests actually ran on the GPU.

Post by Peter Eastman » Wed Jul 29, 2009 2:31 pm

That's a good sign. All the TestCuda* tests explicitly specify the CUDA platform, so if those are passing, it means CUDA is working for you.

So we need to figure out why your own program is using the reference platform. I assume you are not specifying a Platform when you create your context, so that it picks one automatically? If so, that suggests the CUDA platform is not getting loaded. You can verify that by calling Platform::getNumPlatforms() and Platform::getPlatform(i).getName() to see exactly which platforms are available.

Make sure you're actually loading the plugin that contains the CUDA platform. At the start of your program you should be calling

Platform::loadPluginsFromDirectory(Platform::getDefaultPluginsDirectory());

If you're doing that and it still isn't getting loaded, next make sure you have all necessary environment variables set correctly, since that would be the most likely reason for the plugin not getting loaded. That includes LD_LIBRARY_PATH and OPENMM_PLUGIN_DIR if you've installed OpenMM somewhere other than /usr/local/openmm.

Peter
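The environment check described above can be sketched in plain C++ without linking against OpenMM. This is only an illustration, assuming the layout mentioned in this thread (libraries in `$OPENMM/lib`, plugins in `$OPENMM/lib/plugins`); the helper names `splitPathList`, `pathListContains`, `readVariable`, and `findProblems` are hypothetical and not part of OpenMM.

```cpp
#include <cassert>
#include <cstdlib>
#include <sstream>
#include <string>
#include <vector>

// Split a colon-separated list such as LD_LIBRARY_PATH into its non-empty entries.
std::vector<std::string> splitPathList(const std::string& value) {
    std::vector<std::string> entries;
    std::istringstream stream(value);
    std::string entry;
    while (std::getline(stream, entry, ':')) {
        if (!entry.empty()) {
            entries.push_back(entry);
        }
    }
    return entries;
}

// Return true if `dir` appears as an exact entry in the colon-separated `list`.
// No normalization is done, so "/a/lib" and "/a/lib/" count as different entries.
bool pathListContains(const std::string& list, const std::string& dir) {
    for (const std::string& entry : splitPathList(list)) {
        if (entry == dir) {
            return true;
        }
    }
    return false;
}

// Read an environment variable, returning an empty string if it is unset.
std::string readVariable(const char* name) {
    assert(name != nullptr);
    const char* value = std::getenv(name);
    return value ? std::string(value) : std::string();
}

// Hypothetical check: compare the two variables against an install rooted at
// `openmmRoot` (given without a trailing slash) and list anything that looks wrong.
std::vector<std::string> findProblems(const std::string& openmmRoot,
                                      const std::string& pluginDir,
                                      const std::string& libraryPath) {
    std::vector<std::string> problems;
    if (pluginDir != openmmRoot + "/lib/plugins") {
        problems.push_back("OPENMM_PLUGIN_DIR does not point to " + openmmRoot + "/lib/plugins");
    }
    if (!pathListContains(libraryPath, openmmRoot + "/lib")) {
        problems.push_back("LD_LIBRARY_PATH does not contain " + openmmRoot + "/lib");
    }
    return problems;
}
```

A caller would pass its own install prefix together with `readVariable("OPENMM_PLUGIN_DIR")` and `readVariable("LD_LIBRARY_PATH")`, and print whatever `findProblems` returns. An empty result only means the variables look consistent; it does not prove that the plugin actually loaded.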

Post by Siddharth Srinivasan » Wed Jul 29, 2009 2:34 pm

Whoops, my bad: I forgot to set OPENMM_PLUGIN_DIR to point to the $OPENMM/lib/plugins directory. My new problem is a segfault after running the HelloArgon example:
{{{
MODEL 250
ATOM 1 AR AR 1 0.255 0.000 0.000 1.00 0.00
ATOM 2 AR AR 1 5.000 0.000 0.000 1.00 0.00
ATOM 3 AR AR 1 9.745 0.000 0.000 1.00 0.00
ENDMDL
MODEL 251
ATOM 1 AR AR 1 0.227 0.000 0.000 1.00 0.00
ATOM 2 AR AR 1 5.000 0.000 0.000 1.00 0.00
ATOM 3 AR AR 1 9.773 0.000 0.000 1.00 0.00
ENDMDL
MODEL 252
ATOM 1 AR AR 1 0.201 0.000 0.000 1.00 0.00
ATOM 2 AR AR 1 5.000 0.000 0.000 1.00 0.00
ATOM 3 AR AR 1 9.799 0.000 0.000 1.00 0.00
ENDMDL
*** glibc detected *** ./HelloArgon: double free or corruption (fasttop): 0x0000000000509cc0 ***
======= Backtrace: =========
/lib/libc.so.6[0x7f636380654b]
/lib/libc.so.6(__libc_free+0x8c)[0x7f6363809c7c]
/Network/Cluster/home/siddharth/OpenMM3/lib/libOpenMM.so[0x7f6363ebfb91]
/lib/libc.so.6(__cxa_finalize+0x9d)[0x7f63637c8a4d]
/Network/Cluster/home/siddharth/OpenMM3/lib/libOpenMM.so[0x7f6363e9d5d3]
======= Memory map: ========
00400000-00406000 r-xp 00000000 00:22 3489873846 /Network/Cluster/home/siddharth/Downloads/openmmExamples/HelloArgon
00506000-00507000 rw-p 00006000 00:22 3489873846 /Network/Cluster/home/siddharth/Downloads/openmmExamples/HelloArgon
00507000-01ca9000 rw-p 00507000 00:00 0 [heap]
7f6358000000-7f6358021000 rw-p 7f6358000000 00:00 0
7f6358021000-7f635c000000 ---p 7f6358021000 00:00 0
7f635f789000-7f635f889000 rw-s 3f014f000 00:1b 2878 /dev/nvidia0
7f635f889000-7f635f989000 rw-s 3f08f3000 00:1b 2878 /dev/nvidia0
7f635f989000-7f635fa89000 rw-s 3f041d000 00:1b 2878 /dev/nvidia0
7f635fa89000-7f635fb89000 rw-s 3f01ac000 00:1b 2878 /dev/nvidia0
7f635fb89000-7f635fb8a000 rw-s 3f0034000 00:1b 2878 /dev/nvidia0
7f635fb8a000-7f635fb8b000 rw-s d9c04000 00:1b 2878 /dev/nvidia0
7f635fb8b000-7f635fb8c000 rw-s 3f05e6000 00:1b 2878 /dev/nvidia0
7f635fb8c000-7f635ff8e000 rw-s 42e9f4000 00:1b 2878 /dev/nvidia0
7f635ff8e000-7f6360390000 rw-s 431d73000 00:1b 2878 /dev/nvidia0
7f6360390000-7f636162e000 r-xp 00000000 00:22 3760723877 /Network/Cluster/home/siddharth/OpenMM3/lib/plugins/libOpenMMCuda.so
7f636162e000-7f636172e000 ---p 0129e000 00:22 3760723877 /Network/Cluster/home/siddharth/OpenMM3/lib/plugins/libOpenMMCuda.so
7f636172e000-7f6361732000 rw-p 0129e000 00:22 3760723877 /Network/Cluster/home/siddharth/OpenMM3/lib/plugins/libOpenMMCuda.so
7f6361732000-7f6361737000 rw-p 7f6361732000 00:00 0
7f6361d18000-7f6361d20000 r-xp 00000000 00:0e 2077892 /lib64/librt-2.6.1.so
7f6361d20000-7f6361e1f000 ---p 00008000 00:0e 2077892 /lib64/librt-2.6.1.so
7f6361e1f000-7f6361e21000 rw-p 00007000 00:0e 2077892 /lib64/librt-2.6.1.so
7f6361e21000-7f6361e35000 r-xp 00000000 00:0e 2077887 /lib64/libpthread-2.6.1.so
7f6361e35000-7f6361f35000 ---p 00014000 00:0e 2077887 /lib64/libpthread-2.6.1.so
7f6361f35000-7f6361f37000 rw-p 00014000 00:0e 2077887 /lib64/libpthread-2.6.1.so
7f6361f37000-7f6361f3b000 rw-p 7f6361f37000 00:00 0
7f6361f3b000-7f636204d000 r-xp 00000000 00:22 25291 /Network/Cluster/home/siddharth/OpenMM3/lib/libOpenMM_d.so
7f636204d000-7f636214d000 ---p 00112000 00:22 25291 /Network/Cluster/home/siddharth/OpenMM3/lib/libOpenMM_d.so
7f636214d000-7f6362156000 rw-p 00112000 00:22 25291 /Network/Cluster/home/siddharth/OpenMM3/lib/libOpenMM_d.so
7f6362156000-7f6362157000 rw-p 7f6362156000 00:00 0
7f6362157000-7f6362193000 r-xp 00000000 00:0e 575127 /opt/cuda/lib/libcudart.so.2.2
7f6362193000-7f6362292000 ---p 0003c000 00:0e 575127 /opt/cuda/lib/libcudart.so.2.2
7f6362292000-7f6362294000 rw-p 0003b000 00:0e 575127 /opt/cuda/lib/libcudart.so.2.2
7f6362294000-7f6363586000 r-xp 00000000 00:22 3760723887 /Network/Cluster/home/siddharth/OpenMM3/lib/plugins/libOpenMMCuda_d.so
7f6363586000-7f6363685000 ---p 012f2000 00:22 3760723887 /Network/Cluster/home/siddharth/OpenMM3/lib/plugins/libOpenMMCuda_d.so
7f6363685000-7f636368c000 rw-p 012f1000 00:22 3760723887 /Network/Cluster/home/siddharth/OpenMM3/lib/plugins/libOpenMMCuda_d.so
7f636368c000-7f6363691000 rw-p 7f636368c000 00:00 0
7f6363691000-7f6363694000 r-xp 00000000 00:0e 2077885 /lib64/libdl-2.6.1.so
7f6363694000-7f6363793000 ---p 00003000 00:0e 2077885 /lib64/libdl-2.6.1.so
7f6363793000-7f6363795000 rw-p 00002000 00:0e 2077885 /lib64/libdl-2.6.1.so
7f6363795000-7f63638e3000 r-xp 00000000 00:0e 2077901 /lib64/libc-2.6.1.so
7f63638e3000-7f63639e2000 ---p 0014e000 00:0e 2077901 /lib64/libc-2.6.1.so
7f63639e2000-7f63639e5000 r--p 0014d000 00:0e 2077901 /lib64/libc-2.6.1.so
7f63639e5000-7f63639e7000 rw-p 00150000 00:0e 2077901 /lib64/libc-2.6.1.so
7f63639e7000-7f63639ec000 rw-p 7f63639e7000 00:00 0
7f63639ec000-7f63639f8000 r-xp 00000000 00:0e 2077779 /lib64/libgcc_s.so.1
7f63639f8000-7f6363af8000 ---p 0000c000 00:0e 2077779 /lib64/libgcc_s.so.1
7f6363af8000-7f6363af9000 rw-p 0000c000 00:0e 2077779 /lib64/libgcc_s.so.1
7f6363af9000-7f6363b7a000 r-xp 00000000 00:0e 2077677 /lib64/libm-2.6.1.so
7f6363b7a000-7f6363c79000 ---p 00081000 00:0e 2077677 /lib64/libm-2.6.1.so
7f6363c79000-7f6363c7b000 rw-p 00080000 00:0e 2077677 /lib64/libm-2.6.1.so
7f6363c7b000-7f6363d5e000 r-xp 00000000 00:0e 2151807443 /usr/lib64/gcc/x86_64-pc-linux-gnu/4.1.2/libstdc++.so.6.0.8
7f6363d5e000-7f6363e5e000 ---p 000e3000 00:0e 2151807443 /usr/lib64/gcc/x86_64-pc-linux-gnu/4.1.2/libstdc++.so.6.0.8
7f6363e5e000-7f6363e64000 r--p 000e3000 00:0e 2151807443 /usr/lib64/gcc/x86_64-pc-linux-gnu/4.1.2/libstdc++.so.6.0.8
7f6363e64000-7f6363e67000 rw-p 000e9000 00:0e 2151807443 /usr/lib64/gcc/x86_64-pc-linux-gnu/4.1.2/libstdc++.so.6.0.8
7f6363e67000-7f6363e79000 rw-p 7f6363e67000 00:00 0
7f6363e79000-7f6363efe000 r-xp 00000000 00:22 24657 /Network/Cluster/home/siddharth/OpenMM3/lib/libOpenMM.so
7f6363efe000-7f6363ffd000 ---p 00085000 00:22 24657 /Network/Cluster/home/siddharth/OpenMM3/lib/libOpenMM.so
7f6363ffd000-7f6364001000 rw-p 00084000 00:22 24657 /Network/Cluster/home/siddharth/OpenMM3/lib/libOpenMM.so
7f6364001000-7f6364003000 rw-p 7f6364001000 00:00 0
7f6364003000-7f6364020000 r-xp 00000000 00:0e 2077900 /lib64/ld-2.6.1.so
7f6364100000-7f6364104000 rw-p 7f6364100000 00:00 0
7f6364109000-7f636410a000 rw-p 7f6364109000 00:00 0
7f636410a000-7f636410b000 rw-s d9c02000 00:1b 2878 /dev/nvidia0
7f636410b000-7f636410c000 rw-s 432945000 00:1b 2878 /dev/nvidia0
7f636410c000-7f636411d000 rw-s 42e1cd000 00:1b 2878 /dev/nvidia0
7f636411d000-7f636411f000 rw-p 7f636411d000 00:00 0
7f636411f000-7f6364120000 r--p 0001c000 00:0e 2077900 /lib64/ld-2.6.1.so
7f6364120000-7f6364121000 rw-p 0001d000 00:0e 2077900 /lib64/ld-2.6.1.so
7fff6c10a000-7fff6c120000 rw-p 7ffffffe9000 00:00 0 [stack]
7fff6c1fd000-7fff6c1fe000 r-xp 7fff6c1fd000 00:00 0 [vdso]
ffffffffff600000-ffffffffff601000 r-xp 00000000 00:00 0 [vsyscall]
Aborted

}}}

Any ideas? It looks like it completed execution and then failed during cleanup?

Post by Siddharth Srinivasan » Wed Jul 29, 2009 2:39 pm

Right, thanks Peter.

I figured out the problem almost immediately after posting: it WAS the OPENMM_PLUGIN_DIR issue, as you suspected. I should add that I'm only trying the examples OpenMM provides; I won't start using my own code until all the examples run successfully. I have the work done in a branch that seems to work for my own code, but I'll wait until I actually try it with CUDA before I update. I now get segfaults using the CUDA platform, as I recently posted.

Post by Peter Eastman » Wed Jul 29, 2009 2:39 pm

Try running it in gdb. It should stop when it hits the error. Then type "bt" to get a trace of exactly where it's failing.

Peter

Post by Siddharth Srinivasan » Wed Jul 29, 2009 2:47 pm

Hmm, deciphering this is a bit beyond my debugging skills, but here is the output:
{{{
Program received signal SIGABRT, Aborted.
0x00007fa4b3fc1b05 in raise () from /lib/libc.so.6
(gdb) bt
#0 0x00007fa4b3fc1b05 in raise () from /lib/libc.so.6
#1 0x00007fa4b3fc33be in abort () from /lib/libc.so.6
#2 0x00007fa4b3ffac97 in ?? () from /lib/libc.so.6
#3 0x00007fa4b400254b in ?? () from /lib/libc.so.6
#4 0x00007fa4b4005c7c in free () from /lib/libc.so.6
#5 0x00007fa4b46bbb91 in __tcf_3 () from /Network/Cluster/home/siddharth/OpenMM3/lib/libOpenMM.so
#6 0x00007fa4b3fc4a4d in __cxa_finalize () from /lib/libc.so.6
#7 0x00007fa4b46995d3 in __do_global_dtors_aux () from /Network/Cluster/home/siddharth/OpenMM3/lib/libOpenMM.so
#8 0x00007fffbc919430 in ?? ()
#9 0x00007fa4b46e5d01 in _fini () from /Network/Cluster/home/siddharth/OpenMM3/lib/libOpenMM.so
#10 0x0000000000000000 in ?? ()
(gdb) list
68
69 // Advance state many steps at a time, for efficient use of OpenMM.
70 integrator.step(10); // (use a lot more than this normally)
71 }
72 }
73
74 int main()
75 {
76 try {
77 simulateArgon();
(gdb)

}}}

Post by Michael Sherman » Wed Jul 29, 2009 3:11 pm

Hi, Sid. Do you also see this problem with the HelloSodiumChloride example? The reason I ask is that the HelloArgon example allocates the OpenMM System, Integrator, and Context on the stack and does no explicit cleanup, while the NaCl one allocates them on the heap and cleans them up at the end. It might be easier to figure out what's going wrong there.

Regards,
Sherm
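The difference between the two allocation styles can be illustrated without OpenMM. In the sketch below, `Tracked` is a hypothetical stand-in for System, Integrator, and Context that just records when its destructor runs; it is not an OpenMM class, and the example makes no claim about the cause of the crash in this thread. It only shows that stack objects are destroyed implicitly in reverse order of declaration, whereas heap objects are destroyed exactly when and in the order the program deletes them.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for an OpenMM object: appends its name to a log
// when it is destroyed, so the destruction order can be inspected.
struct Tracked {
    std::string name;
    std::vector<std::string>& log;
    Tracked(const std::string& n, std::vector<std::string>& l) : name(n), log(l) {}
    ~Tracked() { log.push_back(name); }
};

// Stack style (as described for HelloArgon): no explicit cleanup.
std::vector<std::string> stackStyle() {
    std::vector<std::string> log;
    {
        Tracked system("System", log);
        Tracked integrator("Integrator", log);
        Tracked context("Context", log);
    }  // destructors run here, in reverse order of declaration
    return log;
}

// Heap style (as described for the NaCl example): explicit cleanup at the end.
std::vector<std::string> heapStyle() {
    std::vector<std::string> log;
    Tracked* system = new Tracked("System", log);
    Tracked* integrator = new Tracked("Integrator", log);
    Tracked* context = new Tracked("Context", log);
    // The program chooses the order; here the context goes first.
    delete context;
    delete integrator;
    delete system;
    assert(log.size() == 3);  // all three destructors have run
    return log;
}
```

Both functions return the same order here (Context, Integrator, System), but only the heap version makes that order visible in the source, which is one reason an explicit-cleanup example can be easier to step through in a debugger.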
