Hello,
I am using OpenMM with a custom force on a Tesla C1060.
I am simulating about 8000 atoms with an integration time step of 1 fs. Since the current integrator does not allow some of the atoms to be fixed, I have to change the velocities of those atoms at every integration step using the setVelocities function.
In my case, simulating the system for 1 ns takes about 4 days. However, I have seen simulations of much larger systems with other MD packages such as LAMMPS that take much less time.
I was wondering whether you have any benchmark results comparing OpenMM with other software packages.
Thank you
OpenMM computational performance
- Joshua Adelman
- Posts: 20
- Joined: Thu Feb 21, 2008 4:42 pm
RE: OpenMM computational performance
Hi Kasra,
Since you're coming on and off of the GPU every integration step, you are paying a huge overhead which is killing the performance. This is not how OpenMM is intended to be used, so you can't really make a fair comparison to other software that has a built-in fixed atom functionality.
Instead of your current scheme, perhaps you can get away with restraining the subset of atoms using the HarmonicBondForce with a sufficiently high force constant, rather than fixing them. If you then integrate many time steps per integrator.step() call before coming off the GPU to check energies, etc., you should see a significant speedup in your simulations.
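The idea of taking many steps per call can be sketched roughly as below. This is only an illustration: `run_in_chunks`, `steps_per_call`, and `report` are hypothetical names, and `integrator` is assumed to be any object with a `step(n)` method, as OpenMM integrators have.

```python
def run_in_chunks(integrator, total_steps, steps_per_call, report=None):
    """Advance total_steps steps, returning to host code only between chunks.

    `integrator` is assumed to expose step(n), like an OpenMM Integrator.
    `report` is an optional, hypothetical callback that receives the number
    of steps completed so far (a place to check energies, write a frame, ...).
    Returns the number of step() calls that were made.
    """
    if steps_per_call < 1:
        raise ValueError("steps_per_call must be at least 1")
    done = 0
    calls = 0
    while done < total_steps:
        # The last chunk may be shorter than steps_per_call.
        n = min(steps_per_call, total_steps - done)
        integrator.step(n)  # one call covers n steps
        done += n
        calls += 1
        if report is not None:
            report(done)
    return calls
```

With `steps_per_call=1000`, a run of 1,000,000 steps returns to host code 1000 times instead of 1,000,000 times.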
Best wishes,
Josh
- Joshua Adelman
- Posts: 20
- Joined: Thu Feb 21, 2008 4:42 pm
RE: OpenMM computational performance
In my original answer I suggested using HarmonicBondForce to restrain, instead of fix atoms. I made a mistake in that answer; you should use CustomExternalForce to restrain an atom to a position in space.
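A minimal sketch of such a restraint with the Python API might look like the following. It assumes OpenMM 7.6 or later (importable as `openmm`), positions given as plain (x, y, z) tuples in nm, and a force constant `k` in kJ/mol/nm^2; the helper names are hypothetical, and the harmonic energy expression is one common choice rather than the only option.

```python
# Harmonic tether of each restrained atom to a reference point (x0, y0, z0).
RESTRAINT_EXPRESSION = "0.5*k*((x-x0)^2+(y-y0)^2+(z-z0)^2)"


def restraint_parameters(positions, restrained_indices):
    """Return (particle index, [x0, y0, z0]) pairs for the restrained atoms."""
    return [(i, list(positions[i])) for i in restrained_indices]


def build_restraint_force(positions, restrained_indices, k):
    """Build a CustomExternalForce tethering the chosen atoms in place.

    The caller still has to add the returned force to the System with
    system.addForce(force) before creating the Context.
    """
    import openmm  # imported lazily; assumes OpenMM 7.6+ is installed

    force = openmm.CustomExternalForce(RESTRAINT_EXPRESSION)
    force.addGlobalParameter("k", k)  # shared force constant
    for name in ("x0", "y0", "z0"):
        force.addPerParticleParameter(name)  # per-atom reference position
    for index, reference in restraint_parameters(positions, restrained_indices):
        force.addParticle(index, reference)
    return force
```

Because the restraint is evaluated on the device along with the other forces, no velocities need to be downloaded or uploaded between steps.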
Josh
- Peter Eastman
- Posts: 2602
- Joined: Thu Aug 09, 2007 1:25 pm
RE: OpenMM computational performance
Even with the transfers to and from the GPU, that should take seconds to run, or minutes at most, not days!
Make sure you're using the OpenCL platform. CUDA is much slower for custom forces, and reference is much, much slower than that.
What GPU are you using?
What custom force are you using? With what expression, parameters, etc.?
Have you profiled to make sure the slowness actually results from OpenMM, and not from your own code?
Peter
- Kasra Momeni
- Posts: 23
- Joined: Sat Nov 14, 2009 12:06 pm
RE: OpenMM computational performance
Hello Peter,
I am using the CUDA platform, which is about 40 times faster than the reference platform.
I am also using Tesla C1060 cards.
I am using a modified version of the NaCl program. I can e-mail it to you if you would like.
Thank you very much,
Kasra
- Peter Eastman
- Posts: 2602
- Joined: Thu Aug 09, 2007 1:25 pm
RE: OpenMM computational performance
Yes, that would be great.
Peter
- Peter Eastman
- Posts: 2602
- Joined: Thu Aug 09, 2007 1:25 pm
RE: OpenMM computational performance
Hi Kasra,
Looking over your source code, I see several issues.
1. You're writing out a PDB frame to disk for every single time step. This is very expensive. The actual simulation time is probably negligible compared to this. Typically one only writes results to disk once every hundreds or thousands of time steps.
2. You should use OpenCL instead of CUDA. It's much faster for custom forces.
3. Your method of constraining atoms is fairly expensive, because it involves downloading velocities, modifying them, and uploading them again every time step. Consider instead using a harmonic restraint plus a large mass.
4. Consider using a cutoff on the nonbonded force. Without a cutoff, the cost scales as O(n^2), so with 8000 atoms it's very slow.
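For a sense of scale behind points 1 and 4, the numbers quoted in this thread (8000 atoms, 1 fs steps, 1 ns total) work out as follows. This is plain arithmetic rather than OpenMM code; the function names and the write interval of 1000 are illustrative choices.

```python
def pair_count(n_atoms):
    """Number of distinct atom pairs evaluated per step without a cutoff."""
    return n_atoms * (n_atoms - 1) // 2


def frames_written(total_steps, write_interval):
    """Number of frames written when saving once every write_interval steps."""
    return total_steps // write_interval


n_atoms = 8000
total_steps = 1_000_000  # 1 ns at 1 fs per step

print(pair_count(n_atoms))                # 31996000 pairs per step
print(frames_written(total_steps, 1))     # 1000000 frames (every step)
print(frames_written(total_steps, 1000))  # 1000 frames (every 1000 steps)
```

Writing every step means a million frames of 8000 atoms each, while an interval of 1000 cuts the disk output by the same factor of 1000.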
Peter