2010-07-05 03:57
Submitted by: John Chodera (jchodera)
Assigned to: Peter Eastman (peastman)
Memory leak during VerletIntegrator integration of CustomNonbondedForce on OpenCL platform

Detailed description
The attached C++ example code creates a 150-particle Kob-Andersen system using a CustomNonbondedForce and makes repeated calls to VerletIntegrator::step() to integrate dynamics on the OpenCL platform. On my OS X machines, this leaks memory at a rate of roughly 1 MB/s. The code contains a set of boolean switches that let you add a comparable NonbondedForce term in place of the CustomNonbondedForce, in which case no leak is observed.

Message  ↓
Date: 2010-07-08 18:36
Sender: Peter Eastman

I don't think we have any Macs around here with ATI GPUs, but I'll ask around.

Date: 2010-07-08 17:54
Sender: John Chodera

Actually, hold on a tic. Instruments reported the leak was due to a malloc in clhConstRefSetAddress, found in libclh.dylib. This library is part of the NVIDIA OS X GeForce GL driver series, found in /System/Library/Extensions/GeForceGLDriver.bundle/Contents/MacOS/. The call immediately prior is glrCompExecuteNativeKernel, in GeForceGLDriver. I wonder if this might be an NVIDIA driver bug after all.

Is there a way to run this on an OS X 10.6 machine with a non-NVIDIA GPU, or via the CPU? That might help us identify whether it's an NVIDIA bug or not.

Date: 2010-07-07 20:09
Sender: Peter Eastman

Unfortunately, I don't have any inside contacts to Apple's OpenCL team. There's https://bugreport.apple.com, but there are no guarantees about when or if they'll actually fix it. You could also go through their developer technical support (http://developer.apple.com/programs/mac/support.html) which has a higher chance of success, but requires you to pay.

Date: 2010-07-07 19:59
Sender: John Chodera

Well, I've just had a chance to try this on NCSA Lincoln, a Linux machine on which CUDA 3.0 is installed. It doesn't appear to leak there, which suggests the leak is in either the OS X OpenCL drivers or the NVIDIA-supplied GTX 285 drivers. This is unfortunate.

Do you have any ideas as to how we could coordinate with them to track down the problem and get it fixed?

Date: 2010-07-07 00:53
Sender: John Chodera

I think I figured out how to run the OS X developer tool "Instruments" to look for leaked memory, and during execution, observed a leak described by the attached image "instruments-leak.png".

It's unclear to me whether this memory might be cleaned up at the end of execution. The "Leaks" section didn't always identify it as a proper leak, but the "ObjectAlloc" instrument did show its memory consumption growing during execution. From the looks of it, I can't tell whether we need to instruct the OpenCL drivers to clean something up explicitly or whether this should be handled automatically.

If you think this is a bug in OS X OpenCL's implementation, could you coordinate with the Apple developers on this?


Attached Files:

Size      Name                                     Date              By
2.41 KB   memoryleaktest-customnonbondedforce.tgz  2010-07-05 03:57  jchodera
92.76 KB  instruments-leak.png                     2010-07-07 00:53  jchodera


Field       Old Value                                     Date              By
File Added  307: instruments-leak.png                     2010-07-07 00:53  jchodera
File Added  305: memoryleaktest-customnonbondedforce.tgz  2010-07-05 03:57  jchodera