Hi. We're running the current version of OpenMM with CUDA 10 on a machine with an RTX 2080 GPU. The DHFR benchmark with explicit solvent, PME electrostatics, and a 2 fs time step reports 444 ns/day.
For our first run with a real system, we have a protein in explicit solvent: 79,000 atoms, an octahedral box, PME electrostatics, the CHARMM36 force field, and a 2 fs time step. The input was generated with CHARMM-GUI, and the PME and other non-bonded parameters are what it chose.
We're getting about 55 ns/day. The workstation sits in an ordinary office that isn't really optimized for air conditioning, so the GPU throttling to avoid overheating is probably an issue.
I'm just wondering whether 55 ns/day is in the right ballpark for a system of this size on this type of GPU, or whether we should put some additional work into optimizing things before moving ahead.
Thanks.
GPU performance question
- Peter Eastman
- Posts: 2583
- Joined: Thu Aug 09, 2007 1:25 pm
Re: GPU performance question
That sounds a bit on the slow side, but it depends on your settings. The cost of each time step should be roughly linear in the number of atoms. Your system has a bit over three times as many atoms as the DHFR system. But CHARMM is a bit slower than the Amber force field used in the benchmark, and an octahedral box also slows it down a little. So something in the range of 4-5x slower would be expected. Instead, you're getting about 8x slower, which suggests there is some other difference between the simulations. Here are some of the more likely possibilities.
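The arithmetic behind that estimate can be sketched quickly. Note the DHFR atom count below is an assumption (the commonly cited JAC/DHFR benchmark size, ~23,558 atoms), not a number stated in this thread:

```python
# Back-of-envelope check of expected vs. observed slowdown.
dhfr_atoms = 23558        # assumption: commonly cited JAC/DHFR benchmark size
system_atoms = 79000      # the poster's system
benchmark_rate = 444.0    # ns/day from the DHFR benchmark run
observed_rate = 55.0      # ns/day for the real system

# Per-step cost is roughly linear in atom count.
atom_ratio = system_atoms / dhfr_atoms              # ~3.35x more atoms
observed_slowdown = benchmark_rate / observed_rate  # ~8.07x slower

print(f"atom ratio: {atom_ratio:.2f}x")             # prints 3.35x
print(f"observed slowdown: {observed_slowdown:.2f}x")  # prints 8.07x
```

With force-field and box-shape overhead pushing the expected factor to 4-5x, the observed ~8x leaves a gap that the settings below would need to explain.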
What is the cutoff distance for direct space nonbonded interactions?
What is the PME error tolerance?
What precision mode are you using?
What integrator are you using?
Do you have a barostat? If so, what is its frequency set to?
What constraints are you using?
Re: GPU performance question
Hi all,
Here are the CHARMM-GUI input file parameters. Do they help answer our question of why it seems to be running slowly?
nstep = 50000000 # Number of steps to run
dt = 0.002 # Time-step (ps)
nstout = 1000 # Writing output frequency (steps)
nstdcd = 5000 # Writing coordinates trajectory frequency (steps)
coulomb = PME # Electrostatic cut-off method
ewaldTol = 0.0005 # Ewald error tolerance
vdw = Force-switch # vdW cut-off method
r_on = 1.0 # Switch-on distance (nm)
r_off = 1.2 # Switch-off distance (nm)
temp = 303.15 # Temperature (K)
fric_coeff = 1 # Friction coefficient for Langevin dynamics
pcouple = yes # Turn on/off pressure coupling
p_ref = 1.0 # Pressure (Pref or Pxx, Pyy, Pzz; bar)
p_type = isotropic # MonteCarloBarostat type
p_freq = 100 # Pressure coupling frequency (steps)
cons = HBonds # Constraints method
rest = no # Turn on/off restraints
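One small sanity check on the settings above: a Monte Carlo barostat with p_freq = 100 should add only modest overhead. This is a rough sketch under the assumption that each attempted volume move costs about one extra energy evaluation:

```python
# Rough barostat overhead estimate, assuming each volume-change attempt
# costs roughly one extra energy evaluation on top of the normal step.
p_freq = 100                      # barostat attempt interval (steps)
extra_evals_per_step = 1.0 / p_freq
print(f"barostat overhead: ~{extra_evals_per_step:.0%} extra energy evaluations")  # prints ~1%
```

So the barostat is unlikely to explain the performance gap on its own.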
Thank you
- Peter Eastman
- Posts: 2583
- Joined: Thu Aug 09, 2007 1:25 pm
Re: GPU performance question
You're using a longer cutoff distance than in the benchmark (1.2 nm instead of 0.9 nm), so that may be part of the reason. You can get a sense of how much difference that makes by running the benchmark script with the longer cutoff: add the argument --pme-cutoff=1.2. The vdW switching function may also account for a little of it.
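The cutoff effect can be estimated with a simple volume argument: the direct-space pair work grows roughly with the number of neighbors inside the cutoff sphere, i.e. with the cube of the cutoff. This is only a rough model, not an exact cost function:

```python
# Rough scaling of direct-space nonbonded work with cutoff distance,
# assuming the work is proportional to the cutoff-sphere volume (cutoff**3).
benchmark_cutoff = 0.9   # nm, the benchmark's default cutoff
this_run_cutoff = 1.2    # nm, r_off from the CHARMM-GUI input

work_factor = (this_run_cutoff / benchmark_cutoff) ** 3
print(f"direct-space pair work factor: {work_factor:.2f}x")  # prints 2.37x
```

A ~2.4x increase in direct-space work would account for a substantial share of the gap between the expected 4-5x and the observed ~8x slowdown, though the reciprocal-space (PME grid) part of the calculation is unaffected by the cutoff.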
I can't tell from the parameter list what precision mode or integrator you're using. Can you figure out what those are?
- Patrick Wintrode
- Posts: 3
- Joined: Sat Mar 30, 2019 10:03 am
Re: GPU performance question
Hi. We're using single precision and the Langevin integrator.
Thanks again for your help.