OpenMM segfault

Peter Eastman
Posts: 2593
Joined: Thu Aug 09, 2007 1:25 pm

Re: OpenMM segfault

Post by Peter Eastman » Wed Apr 17, 2013 12:45 pm

Apparently this is a known bug that was introduced in a recent version of the APP SDK. See http://devgurus.amd.com/message/1288668. It should be fixed in their next update. In the meantime, try downgrading to an earlier version (http://developer.amd.com/tools/heteroge ... d-archive/) and see if that works.

Peter

Silvio a Beccara
Posts: 10
Joined: Tue Nov 24, 2009 4:30 am

Re: OpenMM segfault

Post by Silvio a Beccara » Wed Apr 17, 2013 11:59 pm

Hi Peter,

Thanks for your help. I will try downgrading my AMD APP SDK and see if that works.

Silvio


Christopher Ryan
Posts: 5
Joined: Fri Feb 24, 2012 11:49 am

Re: OpenMM segfault

Post by Christopher Ryan » Mon May 13, 2013 12:21 pm

Hi Peter,

The link you posted (http://devgurus.amd.com/message/1288668) corresponds only to the "Setting of real/effective user Id to 0/0 failed..." error message, which does not interfere with how the programs ultimately run. It seems this message is unrelated to the segfault Silvio and I have encountered, though I apologize if I've misunderstood something.

Also, though OpenCL via the AMD APP SDK runs unofficially on non-AMD CPUs (I think), the Intel SDK for OpenCL seems only to run on Intel hardware. NVIDIA's OpenCL implementation runs only on NVIDIA GPUs, not CPUs. I want to run OpenMM parallelized on an AMD CPU cluster, and it seems that if I can't figure out the segfault I'm seeing then this won't be possible. Any advice would be great!

Thanks,
Chris

P.S. I haven't had time to downgrade the AMD APP SDK, but I can still try if you think it would be helpful. Since installing this software requires root privileges by default, an administrator needs to hack the install script a little to install it in my user directory, so it's not necessarily a quick test for either of us.

Peter Eastman
Posts: 2593
Joined: Thu Aug 09, 2007 1:25 pm

Re: OpenMM segfault

Post by Peter Eastman » Fri May 17, 2013 10:39 am

The error discussed in that thread was exactly the same one:

    FATAL: Module fglrx not found.
    Error! Fail to load fglrx kernel module! Maybe you can switch to root user to load kernel module directly

So far as I know, the only solution to this problem is to downgrade, or else to wait for AMD to post a new update that fixes it. You could try contacting AMD directly and see if they have any other solution.

Sorry about that!

Peter

Christopher Ryan
Posts: 5
Joined: Fri Feb 24, 2012 11:49 am

Re: OpenMM segfault

Post by Christopher Ryan » Fri May 17, 2013 11:00 am

Hi Peter,

I realize that the error message is exactly the same one. However, in that forum post the user notes that the error does not cause the program to run incorrectly:

    But I care about an error that I still have

    Setting of real/effective user Id to 0/0 failed
    FATAL: Module fglrx not found.
    Error! Fail to load fglrx kernel module! Maybe you can switch to root user to load kernel module directly

    what does it mean? examples works, but....

This is also mentioned in the AMD APP SDK release notes that I linked to earlier in this post:

    Known Issues...

    Executing samples on Linux using the CPU runtime reports the following message, but
    continues to execute as expected :

    FATAL: Module fglrx not found.
    Error! Fail to load fglrx kernel module! Maybe you can switch to root user
    to load kernel module directly

This is consistent with what I've seen when running the AMD APP SDK: the example programs finish properly. But segfaults happen for Silvio and me when using OpenMM together with this OpenCL implementation. It seems to me that the segfault we are seeing could therefore be unrelated to the error message, but perhaps not. I'll try downgrading when possible.
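
One way to check whether the fglrx message and the segfault actually go together is to run both an SDK sample and the OpenMM script through a small wrapper and compare the results. The following is only a rough sketch using the Python standard library; the function name `diagnose` and the result keys are made up for illustration, and it assumes a POSIX system, where a child killed by a signal reports a negative return code.

```python
import signal
import subprocess
import sys

# Substring of the message quoted above from the AMD APP SDK release notes.
FGLRX_WARNING = "Module fglrx not found"


def diagnose(cmd):
    """Run cmd (a list of arguments) and report two independent facts:
    whether the fglrx message was printed, and whether the process
    was terminated by SIGSEGV."""
    proc = subprocess.run(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,
    )
    output = proc.stdout + proc.stderr
    return {
        "fglrx_warning": FGLRX_WARNING in output,
        # On POSIX, returncode == -N means the child died from signal N.
        "segfault": proc.returncode == -signal.SIGSEGV,
        "returncode": proc.returncode,
    }


# Hypothetical usage (the script name is a placeholder):
#   print(diagnose([sys.executable, "my_openmm_test.py"]))
```

If an SDK sample reports the warning without a segfault while the OpenMM script reports both, that would support the idea that the message alone does not explain the crash, although it would not rule out a connection.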

Thanks again,
Chris

Peter Eastman
Posts: 2593
Joined: Thu Aug 09, 2007 1:25 pm

Re: OpenMM segfault

Post by Peter Eastman » Mon May 20, 2013 5:21 pm

You think the segfault is unrelated to the error message? I suppose it's possible, though in that case I have no idea what would be causing the segfault. It's happening deep inside the OpenCL compiler.

Have you tried OpenMM 5.1 to see whether this still happens? I have no reason at all to think it would have changed, but it would be good to check just to be sure.
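
Before retesting, it may help to confirm which OpenMM version and which platforms the Python installation actually loads. This is a minimal sketch, not an official diagnostic: the helper names `load_platform_class` and `describe_install` are invented here, and it assumes the Python wrappers are installed either as `openmm` (newer releases) or as `simtk.openmm` (older releases, including the 5.x series).

```python
def load_platform_class():
    """Return OpenMM's Platform class, or None if OpenMM cannot be imported."""
    try:
        from openmm import Platform  # package name used by newer releases
        return Platform
    except ImportError:
        pass
    try:
        from simtk.openmm import Platform  # package name used by older releases
        return Platform
    except ImportError:
        return None


def describe_install():
    """Return the OpenMM version string and the names of the registered
    platforms, or None if OpenMM is not installed."""
    Platform = load_platform_class()
    if Platform is None:
        return None
    names = [
        Platform.getPlatform(i).getName()
        for i in range(Platform.getNumPlatforms())
    ]
    return {"version": Platform.getOpenMMVersion(), "platforms": names}


print(describe_install())
```

If "OpenCL" is missing from the platform list, the OpenCL plugin was not loaded at all, which would be a different problem from a crash inside the OpenCL compiler.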

Peter
