Problem with OpenMM + CUDA on a GeForce GTX 670

James Starlight · Post by **James Starlight** » Mon Jan 07, 2013 2:02 am

Dear OpenMM users!

I've run into a problem using OpenMM with CUDA for GPU-accelerated molecular dynamics in GROMACS. I'm on 64-bit Debian with a GeForce GTX 670, using the 304.48 NVIDIA driver and cuda-toolkit 4.29 installed from the Debian packages. I then installed OpenMM 4.1 from the prebuilt binaries. When I ran the test script, I got this error:

Code:

python testInstallation.py
There are 3 Platforms available:

1 Reference - Successfully computed forces
2 Cuda
Traceback (most recent call last):
  File "testInstallation.py", line 29, in <module>
    simulation = Simulation(pdb.topology, system, integrator, platform)
  File "/usr/local/lib/python2.7/dist-packages/simtk/openmm/app/simulation.py", line 77, in __init__
    self.context = mm.Context(system, integrator, platform)
  File "/usr/local/lib/python2.7/dist-packages/simtk/openmm/openmm.py", line 4594, in __init__
    this = _openmm.new_Context(*args)
Exception: cudaMemcpyToSymbol: SetSim copy to cSim failed invalid device symbol

The same error also shows up in GROMACS's mdrun program as well as in some graphical software (e.g., VMD):

Code:

 Info) Creating CUDA device pool and initializing hardware... 
CUDA error: invalid device symbol, CUDAClearDevice.cu line 62
Info) Detected 1 available CUDA accelerator:
Info) [0] GeForce GTX 670     7 SM_3.0 @ 0.98 GHz, 2.0GB RAM, KTO, OIO, ZCP

So the problem seems to lie in some CUDA setting, but I don't know how to fix it. Could you walk me through possible solutions step by step?

Thanks for help

James S.

James Starlight · Post by **James Starlight** » Mon Jan 07, 2013 3:44 am

I've also tried cuda-toolkit-5.0 with the same OpenMM, but the problem was the same.

James S.

Peter Eastman · Post by **Peter Eastman** » Mon Jan 07, 2013 11:33 am

Hi James,

Different CUDA versions are not binary compatible with each other, so if you want to use the prebuilt OpenMM binaries, you have to use the exact CUDA version (both toolkit and driver) they were compiled against. For OpenMM 4.1.1, that means CUDA 4.1, which you can download from https://developer.nvidia.com/cuda-toolkit-41-archive. Any other version will not work correctly.
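As a rough illustration of the version check described above, here is a minimal sketch that pulls the toolkit release out of text in the format printed by `nvcc --version` and compares it to the release the binaries need. The function names and the sample string are made up for this example, and it only looks at the toolkit; the driver version would have to be checked separately.

```python
import re


def parse_nvcc_release(nvcc_output):
    """Return the toolkit release as (major, minor), or None if not found."""
    match = re.search(r"release (\d+)\.(\d+)", nvcc_output)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def matches_required(nvcc_output, required=(4, 1)):
    """True only if the installed toolkit is exactly the required release."""
    return parse_nvcc_release(nvcc_output) == required


# Hypothetical sample line in the format printed by `nvcc --version`.
sample = "Cuda compilation tools, release 5.0, V0.2.1221"
print(parse_nvcc_release(sample))
print(matches_required(sample))
```

On a real machine the text could come from something like `subprocess.check_output(["nvcc", "--version"])`, assuming `nvcc` is on the PATH.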

Alternatively, you could recompile OpenMM from source. If you do that, it should work fine with newer CUDA versions. Also, you could try the beta of OpenMM 5.0 we recently posted. That was compiled against CUDA 5.0.

Peter

James Starlight · Post by **James Starlight** » Mon Jan 07, 2013 11:02 pm

Peter,

Thanks for the suggestions! I installed that version of the cuda-toolkit as well as the SDK and re-installed OpenMM 4.1 from the binaries, but the errors in both GROMACS and OpenMM's test script were the same.

I also tried to build OpenMM from source and got an error during the make install step (there were no errors in the earlier compilation steps).

Code:

[ 44%] Building CXX object platforms/opencl/sharedTarget/CMakeFiles/OpenMMOpenCL.dir/__/src/OpenCLPlatform.cpp.o
In file included from /usr/local/openmm/platforms/opencl/src/OpenCLContext.h:39:0,
                 from /usr/local/openmm/platforms/opencl/src/OpenCLPlatform.cpp:27:
/usr/local/openmm/platforms/opencl/src/cl.hpp: In function ‘cl_int cl::UnloadCompiler()’:
/usr/local/openmm/platforms/opencl/src/cl.hpp:1556:12: error: ‘::clUnloadCompiler’ has not been declared
/usr/local/openmm/platforms/opencl/src/cl.hpp: In constructor ‘cl::Image2D::Image2D(const cl::Context&, cl_mem_flags, cl::ImageFormat, size_t, size_t, size_t, void*, cl_int*)’:
/usr/local/openmm/platforms/opencl/src/cl.hpp:2200:19: error: ‘::clCreateImage2D’ has not been declared
/usr/local/openmm/platforms/opencl/src/cl.hpp: In constructor ‘cl::Image2DGL::Image2DGL(const cl::Context&, cl_mem_flags, GLenum, GLint, GLuint, cl_int*)’:
/usr/local/openmm/platforms/opencl/src/cl.hpp:2245:19: error: ‘::clCreateFromGLTexture2D’ has not been declared
/usr/local/openmm/platforms/opencl/src/cl.hpp: In constructor ‘cl::Image3D::Image3D(const cl::Context&, cl_mem_flags, cl::ImageFormat, size_t, size_t, size_t, size_t, size_t, void*, cl_int*)’:
/usr/local/openmm/platforms/opencl/src/cl.hpp:2299:19: error: ‘::clCreateImage3D’ has not been declared
/usr/local/openmm/platforms/opencl/src/cl.hpp: In constructor ‘cl::Image3DGL::Image3DGL(const cl::Context&, cl_mem_flags, GLenum, GLint, GLuint, cl_int*)’:
/usr/local/openmm/platforms/opencl/src/cl.hpp:2345:19: error: ‘::clCreateFromGLTexture3D’ has not been declared
/usr/local/openmm/platforms/opencl/src/cl.hpp: In member function ‘cl_int cl::CommandQueue::enqueueMarker(cl::Event*) const’:
/usr/local/openmm/platforms/opencl/src/cl.hpp:3389:13: error: ‘::clEnqueueMarker’ has not been declared
/usr/local/openmm/platforms/opencl/src/cl.hpp: In member function ‘cl_int cl::CommandQueue::enqueueWaitForEvents(const std::vector<cl::Event>&) const’:
/usr/local/openmm/platforms/opencl/src/cl.hpp:3396:13: error: ‘::clEnqueueWaitForEvents’ has not been declared
/usr/local/openmm/platforms/opencl/src/cl.hpp: In member function ‘cl_int cl::CommandQueue::enqueueBarrier() const’:
/usr/local/openmm/platforms/opencl/src/cl.hpp:3511:13: error: ‘::clEnqueueBarrier’ has not been declared
make[2]: *** [platforms/opencl/sharedTarget/CMakeFiles/OpenMMOpenCL.dir/__/src/OpenCLPlatform.cpp.o] Error 1
make[1]: *** [platforms/opencl/sharedTarget/CMakeFiles/OpenMMOpenCL.dir/all] Error 2
make: *** [all] Error 2

James Starlight · Post by **James Starlight** » Tue Jan 08, 2013 2:07 am

That problem was solved by replacing the OpenCL options in the cache file.

Now I've run into a problem at the end of the compilation:

Code:

[ 95%] Creating OpenMM Python swig input files...
Traceback (most recent call last):
  File "/usr/local/openmm/python/src/swig_doxygen/swigInputBuilder.py", line 616, in <module>
    main()
  File "/usr/local/openmm/python/src/swig_doxygen/swigInputBuilder.py", line 590, in main
    pythonappendFilename, skipAdditionalMethods)
  File "/usr/local/openmm/python/src/swig_doxygen/swigInputBuilder.py", line 135, in __init__
    root = etree.parse(os.path.join(inputDirname, file)).getroot()
  File "/usr/lib/python2.7/xml/etree/ElementTree.py", line 1183, in parse
    tree.parse(source, parser)
  File "/usr/lib/python2.7/xml/etree/ElementTree.py", line 656, in parse
    parser.feed(data)
  File "/usr/lib/python2.7/xml/etree/ElementTree.py", line 1643, in feed
    self._raiseerror(v)
  File "/usr/lib/python2.7/xml/etree/ElementTree.py", line 1507, in _raiseerror
    raise err
xml.etree.ElementTree.ParseError: mismatched tag: line 172, column 670
make[2]: *** [python/src/swig_doxygen/swig_lib/python/pythonprepend.i] Error 1
make[1]: *** [wrappers/python/CMakeFiles/BuildModule.dir/all] Error 2
make: *** [all] Error 2

Why does this occur? I have updated swig and doxygen. Below is the output of cmake:

Code:

-- The C compiler identification is GNU 4.6.3
-- The CXX compiler identification is GNU 4.6.3
-- Check for working C compiler: /usr/bin/gcc
-- Check for working C compiler: /usr/bin/gcc -- works
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Check for working CXX compiler: /usr/bin/c++
-- Check for working CXX compiler: /usr/bin/c++ -- works
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Could not find 'svnversion' executable; 'about' will be wrong. (Cygwin provides one on Windows.)
-- Found OPENCL: /usr/lib/x86_64-linux-gnu/libOpenCL.so  
-- Found Doxygen: /usr/bin/doxygen (found version "1.8.1.2") 
-- Configuring done
-- Generating done
-- Build files have been written to: /usr/local/openmm

Peter Eastman · Post by **Peter Eastman** » Tue Jan 08, 2013 11:22 am

Hi James,

I'm not certain what's causing that, but my first guess is that you've somehow gotten some corrupt files in your build directory, perhaps by canceling an earlier build. Try deleting the "python" and "wrappers/python" subdirectories of your build directory, rerunning CMake, and then building again.
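The cleanup step could be sketched roughly like this. It is only an illustration: the function name is made up, and it assumes a separate build directory. If you ran CMake inside the source tree, those subdirectories may be part of the source itself, so starting over in a fresh, empty build directory is probably safer than deleting anything.

```python
import os
import shutil
import tempfile


def remove_stale_wrapper_dirs(build_dir):
    """Delete the Python wrapper subdirectories of build_dir, if present.

    Returns the list of directories that were actually removed.
    """
    removed = []
    for sub in ("python", os.path.join("wrappers", "python")):
        path = os.path.join(build_dir, sub)
        if os.path.isdir(path):
            shutil.rmtree(path)
            removed.append(path)
    return removed


# Demonstration on a throwaway directory, not on a real build tree.
demo = tempfile.mkdtemp()
os.makedirs(os.path.join(demo, "python", "src"))
os.makedirs(os.path.join(demo, "wrappers", "python"))
print(remove_stale_wrapper_dirs(demo))
shutil.rmtree(demo)
```

After that, rerun CMake in the build directory and run make again.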

Also note that if you just want to use OpenMM with Gromacs, you don't actually need the Python components. So another option is to just tell it not to build them by turning off the OPENMM_BUILD_PYTHON_WRAPPERS option in CMake.
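If you configure from the command line rather than through the CMake GUI, the option can be passed with CMake's standard `-D` syntax. A small sketch of building that command follows; the helper name and the source path are placeholders for this example, not something that ships with OpenMM.

```python
def cmake_configure_command(source_dir, build_python_wrappers=False):
    """Build a cmake command line that sets OPENMM_BUILD_PYTHON_WRAPPERS."""
    flag = "ON" if build_python_wrappers else "OFF"
    return ["cmake", "-DOPENMM_BUILD_PYTHON_WRAPPERS=" + flag, source_dir]


# Placeholder path; substitute the real location of the OpenMM source.
command = cmake_configure_command("/path/to/openmm-source")
print(" ".join(command))
```

The resulting list could be handed to `subprocess.check_call(command, cwd=build_dir)`, with `build_dir` being whatever build directory you use.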

Peter

James Starlight · Post by **James Starlight** » Tue Jan 08, 2013 1:08 pm

Peter,

Thank you again for the suggestions. Indeed, without the Python modules OpenMM compiled without problems.

But that has not solved the main issue.

In particular, I've tried compiling OpenMM with CUDA 5 and CUDA 4.1, but every test program gives the error below:

Code:

There are 2 Platforms available:

1 Reference - Successfully computed forces
2 Cuda
Traceback (most recent call last):
  File "testInstallation.py", line 29, in <module>
    simulation = Simulation(pdb.topology, system, integrator, platform)
  File "/usr/local/lib/python2.7/dist-packages/simtk/openmm/app/simulation.py", line 77, in __init__
    self.context = mm.Context(system, integrator, platform)
  File "/usr/local/lib/python2.7/dist-packages/simtk/openmm/openmm.py", line 4594, in __init__
    this = _openmm.new_Context(*args)
Exception: cudaMemcpyToSymbol: SetSim copy to cSim failed invalid device symbol

Below is also the log from deviceQuery of CUDA 5:

Code:

Detected 1 CUDA Capable device(s)

Device 0: "GeForce GTX 670"
  CUDA Driver Version / Runtime Version          5.0 / 5.0
  CUDA Capability Major/Minor version number:    3.0
  Total amount of global memory:                 2047 MBytes (2146762752 bytes)
  ( 7) Multiprocessors x (192) CUDA Cores/MP:    1344 CUDA Cores
  GPU Clock rate:                                980 MHz (0.98 GHz)
  Memory Clock rate:                             3004 Mhz
  Memory Bus Width:                              256-bit
  L2 Cache Size:                                 524288 bytes
  Max Texture Dimension Size (x,y,z)             1D=(65536), 2D=(65536,65536), 3D=(4096,4096,4096)
  Max Layered Texture Size (dim) x layers        1D=(16384) x 2048, 2D=(16384,16384) x 2048
  Total amount of constant memory:               65536 bytes
  Total amount of shared memory per block:       49152 bytes
  Total number of registers available per block: 65536
  Warp size:                                     32
  Maximum number of threads per multiprocessor:  2048
  Maximum number of threads per block:           1024
  Maximum sizes of each dimension of a block:    1024 x 1024 x 64
  Maximum sizes of each dimension of a grid:     2147483647 x 65535 x 65535
  Maximum memory pitch:                          2147483647 bytes
  Texture alignment:                             512 bytes
  Concurrent copy and kernel execution:          Yes with 1 copy engine(s)
  Run time limit on kernels:                     Yes
  Integrated GPU sharing Host Memory:            No
  Support host page-locked memory mapping:       Yes
  Alignment requirement for Surfaces:            Yes
  Device has ECC support:                        Disabled
  Device supports Unified Addressing (UVA):      Yes
  Device PCI Bus ID / PCI location ID:           2 / 0
  Compute Mode:
     < Default (multiple host threads can use ::cudaSetDevice() with device simultaneously) >

deviceQuery, CUDA Driver = CUDART, CUDA Driver Version = 5.0, CUDA Runtime Version = 5.0, NumDevs = 1, Device0 = GeForce GTX 670

Unfortunately I could not compile deviceQuery with the 4.1 SDK (there were no problems with the 5.0 SDK) because of this error:

Code:

make[2]: Entering directory `/home/own/SDK/C/src/nbody'
/usr/bin/ld: cannot find -lXi
/usr/bin/ld: cannot find -lXmu
collect2: ld returned 1 exit status
make[2]: *** [../../bin/linux/release/nbody] Error 1
make[2]: Leaving directory `/home/own/SDK/C/src/nbody'
make[1]: *** [src/nbody/Makefile.ph_build] Error 2
make[1]: Leaving directory `/home/own/SDK/C'
make: *** [all] Error 2

So if you also know a way to fix that, I'd be very thankful.

Do you know of any way to make my video card work with OpenMM? Might I try the latest NVIDIA GPU driver (I'm now using 304.54), for instance with CUDA 4.1?

James S.

Peter Eastman · Post by **Peter Eastman** » Tue Jan 08, 2013 1:54 pm

Oh well, it was a good try!

Here are my thoughts on this...

OpenMM includes three different platforms: reference, OpenCL, and CUDA. The CUDA platform in OpenMM 4.1.1 was derived from an older code base that imposed a lot of limitations on it. It doesn't support as many features as the other platforms and is usually slower than OpenCL. In OpenMM 5.0, we threw it out and wrote a completely new CUDA platform (closely modeled after the OpenCL platform).

So if you're using OpenMM 4.1.1, you really should use the OpenCL platform instead. I don't know why the CUDA platform is getting that error, but since all that code has been discontinued, I don't think it's worth much effort trying to debug it. Just use OpenCL.

In OpenMM 5.0, on the other hand, we now have a good CUDA implementation that should be significantly faster than the OpenCL platform on your GTX 670. So if you want to try that, it would be worth it for the improved performance.
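In script terms, the advice above amounts to asking for platforms in a preferred order (OpenCL first on 4.1.1, CUDA first on 5.0). Here is a rough sketch; the two helper functions are invented for this example, and the `simtk.openmm` import path is the one shown in the tracebacks earlier in the thread. Matching is case-insensitive because the test script output above prints the platform as "Cuda".

```python
def pick_platform_name(available, preferred):
    """Return the first preferred platform that is available, else None."""
    lookup = {name.lower(): name for name in available}
    for wanted in preferred:
        if wanted.lower() in lookup:
            return lookup[wanted.lower()]
    return None


def available_platform_names():
    """List platform names reported by OpenMM, or [] if it is not installed."""
    try:
        from simtk.openmm import Platform
    except ImportError:
        return []
    count = Platform.getNumPlatforms()
    return [Platform.getPlatform(i).getName() for i in range(count)]


# Preference order for OpenMM 4.1.1: OpenCL first, Reference as a fallback.
names = ["Reference", "Cuda", "OpenCL"]
print(pick_platform_name(names, ["OpenCL", "Reference"]))
```

The chosen name can then be passed to `Platform.getPlatformByName(name)`, and the resulting platform given to `Simulation(pdb.topology, system, integrator, platform)` as in the test script.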

Peter

James Starlight · Post by **James Starlight** » Tue Jan 08, 2013 11:05 pm

Peter,

Could you tell me where I can download OpenMM 5.0?
I've subscribed to the mailing list but could not find any links.

James

James Starlight · Post by **James Starlight** » Wed Jan 09, 2013 5:21 am

By the way, the latest GROMACS release ships with native OpenMM support built in. Today I compiled that source with the CUDA 5.0 SDK, toolkit, and driver. During the simulation run I got the message below from the mdrun program:

Code:

1 GPU detected:
  #0: NVIDIA GeForce GTX 670, compute cap.: 3.0, ECC:  no, stat: compatible

1 GPU auto-selected to be used for this run: #0

Program mdrun-openmm, VERSION 4.6-beta3
Source code file:
/home/own/gromacs-4.6-beta3/src/kernel/openmm_wrapper.cpp, line: 1367

Fatal error:
OpenMM exception caught while initializating: Error setting device
flags cannot set while device is active in this process

It seems that the GPU was detected correctly, but I have no idea what causes that error. I've tried different CUDA releases, but the output was the same.
Finally, I'm not quite sure whether GROMACS supports OpenCL. In any case, I'd like to test the newest OpenMM.

James
