OpenCL unavailable device issue
Posted: Wed Aug 18, 2010 7:40 am
I was wondering if anyone has seen the following happen before, and if there is a simple solution, short of a system restart.
I recently used `kill <pid>` to stop a pyopenmm script running on the OpenCL platform before it had finished. I've done this in the past without issues, but this time, when I went to restart the program after making some small parameter changes, I got the following error:
Traceback (most recent call last):
  File "teststatestability_12.py", line 70, in <module>
    context = openmm.Context(system, integrator, platform)
  File "/home/jadelman/python-2.6.5/lib/python2.6/site-packages/simtk/chem/openmm/openmm.py", line 1257, in __init__
    this = _openmm.new_Context(*args)
Exception: Error initializing context: clCreateContextFromType (-2)
The -2 error corresponds to CL_DEVICE_NOT_AVAILABLE, I believe. The machine has two C1060s; the other device was not affected, and deviceQuery still showed both GPUs.
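For anyone else decoding these numbers, here is a minimal sketch of a lookup. The values are the first few error codes from the Khronos CL/cl.h header; the helper name `cl_error_name` is my own and is not part of OpenMM or any OpenCL binding.

```python
import re

# Subset of the error codes defined in the Khronos OpenCL header CL/cl.h.
CL_ERROR_NAMES = {
    0: "CL_SUCCESS",
    -1: "CL_DEVICE_NOT_FOUND",
    -2: "CL_DEVICE_NOT_AVAILABLE",
    -3: "CL_COMPILER_NOT_AVAILABLE",
    -4: "CL_MEM_OBJECT_ALLOCATION_FAILURE",
    -5: "CL_OUT_OF_RESOURCES",
    -6: "CL_OUT_OF_HOST_MEMORY",
}


def cl_error_name(message):
    """Return the symbolic name for a trailing '(<code>)' in an error message.

    Hypothetical helper: returns None if the message has no trailing code
    or the code is not in the table above.
    """
    match = re.search(r"\((-?\d+)\)\s*$", message)
    if match is None:
        return None
    return CL_ERROR_NAMES.get(int(match.group(1)))


if __name__ == "__main__":
    msg = "Error initializing context: clCreateContextFromType (-2)"
    print(cl_error_name(msg))
```

The table only covers the first few codes, so anything else comes back as None rather than a guess.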
It seems like killing OpenMM might have been ungraceful and left the GPU in some sort of memory-locked state. I've done this before without issue, so it is not a recurring problem as far as I can tell. Our solution thus far has been to restart the machine, which seems to work but is obviously not ideal.