Date:
2015-05-10 18:56
Priority:
3
State:
Open
Submitted by:
Andrey Alekseenko (aland)
Assigned to:
Nobody (None)
Summary:
Incorrect computing of distance in assignToClusters_KMCUDA

Detailed description
In the assignToClusters_KMCUDA kernel, the coalesced prefetch works incorrectly because it assumes that all blockDim.x threads are active, whereas in the last block some threads may be excluded by the (t<N) condition.

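When N is not a multiple of blockDim.x, the line campaign/trunk/dev/kmeansGPU/kmeansGPU.cu:159 is executed in the last block by only (N%blockDim.x) threads, so the cluster centroid coordinates are loaded into shared memory incorrectly if D>(N%blockDim.x). For example, with N=1025 and blockDim.x=256 the last block has a single active thread, so any D>1 is affected.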

The smoke test does not catch this because the test dataset has D=1, so even a single active thread in the block is enough to load the centroid coordinate.

I attach an example of a possible patch, although it has not been tested extensively.
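
For illustration, here is a minimal sketch of one way to avoid the problem. It is not the attached patch and not the CAMPAIGN source. The kernel name assignToClustersSketch, the helper runAssignSketch, the squared Euclidean distance, and the memory layouts (X stored as X[d * N + t], centroids as CTR[k * D + d]) are all assumptions made for the example. The idea is to keep every thread of the block in the cooperative load and at the barriers, and to apply the (t<N) check only to the per-point work. This also keeps __syncthreads() out of divergent code, where its behavior is undefined.

```cuda
#include <cfloat>
#include <vector>
#include <cuda_runtime.h>

// Hypothetical kernel (assumed layout: X[d * N + t], CTR[k * D + d]).
// Needs blockDim.x * sizeof(float) bytes of dynamic shared memory.
__global__ void assignToClustersSketch(int N, int K, int D,
                                       const float *X, const float *CTR,
                                       int *ASSIGN)
{
    extern __shared__ float s_center[];
    const int tid = threadIdx.x;
    const int t = blockDim.x * blockIdx.x + tid;
    const bool active = (t < N);  // false for padding threads in the last block

    float minDist = FLT_MAX;
    int minIndex = 0;
    for (int k = 0; k < K; k++) {
        float dist = 0.0f;
        for (int offsetD = 0; offsetD < D; offsetD += blockDim.x) {
            // Cooperative load: all threads take part, including t >= N.
            if (offsetD + tid < D)
                s_center[tid] = CTR[k * D + offsetD + tid];
            __syncthreads();  // reached by every thread of the block

            if (active) {
                const int dEnd = min(offsetD + (int)blockDim.x, D);
                for (int d = offsetD; d < dEnd; d++) {
                    // Squared Euclidean distance is an assumption here.
                    const float diff = X[d * N + t] - s_center[d - offsetD];
                    dist += diff * diff;
                }
            }
            __syncthreads();  // tile is consumed before it is overwritten
        }
        if (active && dist < minDist) {
            minDist = dist;
            minIndex = k;
        }
    }
    if (active)
        ASSIGN[t] = minIndex;
}

// Hypothetical host helper for trying the kernel; error checks omitted.
std::vector<int> runAssignSketch(int N, int K, int D,
                                 const std::vector<float> &X,
                                 const std::vector<float> &CTR,
                                 int blockSize)
{
    float *dX = nullptr, *dCTR = nullptr;
    int *dAssign = nullptr;
    cudaMalloc(&dX, X.size() * sizeof(float));
    cudaMalloc(&dCTR, CTR.size() * sizeof(float));
    cudaMalloc(&dAssign, N * sizeof(int));
    cudaMemcpy(dX, X.data(), X.size() * sizeof(float), cudaMemcpyHostToDevice);
    cudaMemcpy(dCTR, CTR.data(), CTR.size() * sizeof(float), cudaMemcpyHostToDevice);

    const int blocks = (N + blockSize - 1) / blockSize;
    assignToClustersSketch<<<blocks, blockSize, blockSize * sizeof(float)>>>(
        N, K, D, dX, dCTR, dAssign);

    std::vector<int> assign(N);
    cudaMemcpy(assign.data(), dAssign, N * sizeof(int), cudaMemcpyDeviceToHost);
    cudaFree(dX);
    cudaFree(dCTR);
    cudaFree(dAssign);
    return assign;
}
```

A case that exercises the last block is one where N%blockSize is smaller than D, for example N=5, D=3 and blockSize=4, with centroids that differ only in the last coordinate.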


Message  ↓
Date: 2015-05-10 18:59
Sender: Andrey Alekseenko

The file wasn't uploaded due to some cryptic error, so I
uploaded it to pastebin: http://pastebin.com/WbWn7J6s

No Changes Have Been Made to This Item
