Parallel Runs

Posted: Tue Oct 29, 2019 12:42 pm
by gideonsimpson
I'm looking to do some ensemble sampling with OpenMM, and was hoping to distribute the jobs using the Python multiprocessing module, but I end up with a pickling issue related to SWIG (can't pickle SWIGPyObjects). Has anyone gotten something like this to work?

Re: Parallel Runs

Posted: Tue Oct 29, 2019 12:47 pm
by peastman
Systems, States, and Integrators can all be serialized to XML, which provides a portable format for doing this sort of thing. Just use XmlSerializer.serialize() to encode an object, and XmlSerializer.deserialize() to reconstruct it again. Will that meet your needs, or are there other types of objects you need to transfer?
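
A minimal sketch of how the XML round trip could be combined with multiprocessing: the parent serializes the System and Integrator once, ships plain strings to the workers, and each worker rebuilds its own objects. The helper names (`load_openmm`, `make_tasks`, `run_from_xml`, `run_ensemble`) are hypothetical, and the sketch assumes the starting positions are plain arrays in nanometers. Pinning the CPU platform is also a choice made for this example, not something the approach requires.

```python
import multiprocessing as mp
import pickle


def load_openmm():
    """Import OpenMM under its current name, falling back to legacy simtk."""
    try:
        import openmm
    except ImportError:
        from simtk import openmm
    return openmm


def make_tasks(system_xml, integrator_xml, start_positions, n_steps):
    """Build one picklable task tuple per starting configuration."""
    tasks = [(system_xml, integrator_xml, x, n_steps) for x in start_positions]
    for task in tasks:
        pickle.dumps(task)  # fail early in the parent if anything is unpicklable
    return tasks


def run_from_xml(task):
    """Worker: rebuild the OpenMM objects from XML, run, return a State as XML."""
    system_xml, integrator_xml, positions, n_steps = task
    mm = load_openmm()
    system = mm.XmlSerializer.deserialize(system_xml)
    integrator = mm.XmlSerializer.deserialize(integrator_xml)
    # Pinning the CPU platform is a choice made for this sketch.
    platform = mm.Platform.getPlatformByName("CPU")
    context = mm.Context(system, integrator, platform)
    context.setPositions(positions)
    integrator.step(n_steps)
    state = context.getState(getPositions=True)
    # A serialized State is a plain string, so it pickles on the way back.
    return mm.XmlSerializer.serialize(state)


def run_ensemble(system, integrator, start_positions, n_steps, n_workers=4):
    """Parent: serialize once, then map the XML strings over a process pool."""
    mm = load_openmm()
    tasks = make_tasks(
        mm.XmlSerializer.serialize(system),
        mm.XmlSerializer.serialize(integrator),
        start_positions,
        n_steps,
    )
    with mp.Pool(n_workers) as pool:
        return pool.map(run_from_xml, tasks)
```

On systems where multiprocessing starts workers with "spawn", `run_ensemble` should be called from under an `if __name__ == "__main__":` guard. The returned strings can be turned back into State objects with XmlSerializer.deserialize().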

Re: Parallel Runs

Posted: Mon Jan 27, 2020 5:56 pm
by gideonsimpson
Coming back to this (finally), I'm struggling with something else, though it may be an issue with Python multiprocessing and not OpenMM. The following is an example:


import itertools

from simtk import openmm, unit
from simtk.openmm import *
from simtk.openmm.app import *
from simtk.unit import *
from openmmtools import testsystems

# num_steps was not defined in the original post; 1000 is an assumed value.
num_steps = 1000

x0 = testsystems.AlanineDipeptideVacuum().positions.value_in_unit(nanometer)

def short_run(x_init):
    integrator = LangevinIntegrator(300.0*kelvin, 1/picosecond, 0.002*picoseconds)
    topo = testsystems.AlanineDipeptideVacuum().topology
    sys = testsystems.AlanineDipeptideVacuum().system
    simulation = Simulation(topo, sys, integrator)
    
    simulation.context.setPositions(x_init)
    simulation.step(num_steps)
    
    state = simulation.context.getState(getPositions=True)
    x_final = state.getPositions(asNumpy=True).value_in_unit(nanometer)
    return x_final

import multiprocessing as mp
pool = mp.Pool(4)
results = pool.map(short_run, itertools.repeat(x0,4))
And the code just hangs. Checking lines one by one, the line that seems to cause the trouble is when the Simulation object is created. I had the same problem when I instead tried to directly construct a Context object. Again, it wouldn't surprise me if this is actually an issue with multiprocessing and not openmm, but if anyone has any thoughts, let me know.

Re: Parallel Runs

Posted: Mon Jan 27, 2020 6:00 pm
by peastman
I don't know of any reason creating the Simulation would hang. There are some cases where it might throw an exception, for example if all the processes are trying to use the same GPU and it's set to exclusive mode. Do you know which platform it's using? Try explicitly telling it to use the CPU platform by adding Platform.getPlatformByName('CPU') as a fourth argument to the Simulation constructor.
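
As a rough illustration of that suggestion, the sketch below builds a Simulation with the CPU platform passed as the fourth argument. The helper names (`cpu_properties`, `make_cpu_simulation`) are hypothetical. Limiting each worker to one CPU thread through the 'Threads' platform property goes beyond what is suggested above; it is an assumption aimed at keeping several worker processes from oversubscribing the cores.

```python
def cpu_properties(n_threads=1):
    """Platform properties for the CPU platform; values must be strings."""
    return {"Threads": str(n_threads)}


def make_cpu_simulation(topology, system, integrator, n_threads=1):
    """Create a Simulation that is explicitly bound to the CPU platform."""
    try:
        from openmm import Platform
        from openmm.app import Simulation
    except ImportError:
        # Legacy namespace, as used elsewhere in this thread.
        from simtk.openmm import Platform
        from simtk.openmm.app import Simulation

    platform = Platform.getPlatformByName("CPU")
    # The platform is the fourth positional argument; the properties
    # dictionary is the optional fifth.
    return Simulation(topology, system, integrator, platform,
                      cpu_properties(n_threads))
```

Inside a worker such as short_run, the call would then read `simulation = make_cpu_simulation(topo, sys, integrator)`.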

Re: Parallel Runs

Posted: Wed Jan 29, 2020 10:18 am
by gideonsimpson
Yup, specifying the platform solved the problem.

Re: Parallel Runs

Posted: Wed Jan 29, 2020 12:15 pm
by peastman
So it hangs with some platforms but works with others? Which ones does it hang with? Assuming it's the GPU-based ones, are the different Contexts all using the same GPU, or does each use a different GPU (specifying the 'DeviceIndex' property when you create the Context)? If they all use the same GPU, is that GPU set to exclusive mode?
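
For reference, a sketch of how each worker could be given its own GPU through the 'DeviceIndex' property. The helper names (`device_properties`, `make_gpu_context`) and the round-robin assignment are assumptions for illustration, as is the choice of the CUDA platform; the OpenCL platform accepts the same property.

```python
def device_properties(worker_index, n_gpus):
    """Assign GPUs to workers round-robin; property values must be strings."""
    if n_gpus < 1:
        raise ValueError("n_gpus must be at least 1")
    return {"DeviceIndex": str(worker_index % n_gpus)}


def make_gpu_context(system, integrator, worker_index, n_gpus,
                     platform_name="CUDA"):
    """Create a Context on the GPU assigned to this worker."""
    try:
        from openmm import Context, Platform
    except ImportError:
        # Legacy namespace, as used elsewhere in this thread.
        from simtk.openmm import Context, Platform

    platform = Platform.getPlatformByName(platform_name)
    properties = device_properties(worker_index, n_gpus)
    return Context(system, integrator, platform, properties)
```

With fewer GPUs than workers, several Contexts end up on the same device, which can fail if that GPU is set to exclusive mode.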