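[Question] OpenMP speedup problem
Platform: (Ex: VC++, GCC, Linux, ...)
VC2015
Library Used: (Ex: OpenGL, ...)
OpenCV, OpenMP
Question:
I am accumulating a sum inside a loop. I have already added reduction and
related settings, but performance is still worse than without OpenMP, and
sometimes the program even stalls for a while.
Input:
ROIs, image data, and the corresponding weights
Expected Output:
Speed increased several-fold. The CPU is an i5-3210M.
Wrong Output:
Slower than the original
Code: (please make good use of the paste sites in the pinned post, and remember to format it)
I am pasting only the core code. The rest is image I/O and weight
computation, which has nothing to do with OpenMP.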
-------------
float tempValueNoMP = 0;
float tempValue = 0;   // accumulator for the OpenMP version (declaration was missing from the excerpt)
double t0 = 0;
double t1 = 0;

t0 = (double)getTickCount();
// without OpenMP
for (int k = 0; k < lSize; k++)
{
    int xMin = ROI[j].x + Position[i][k].x;
    int xMax = xMin + Position[i][k].width;
    int yMin = ROI[j].y + Position[i][k].y;
    int yMax = yMin + Position[i][k].height;
    // weighted four-corner lookup on data
    tempValueNoMP += Weight[i][k] *
        (data.at<float>(yMin, xMin) + data.at<float>(yMax, xMax) -
         data.at<float>(yMin, xMax) - data.at<float>(yMax, xMin));
}
t0 = ((double)getTickCount() - t0) / getTickFrequency();

t1 = (double)getTickCount();
// with OpenMP
#pragma omp parallel reduction(+:tempValue)
{
    #pragma omp for
    for (int k = 0; k < lSize; k++)
    {
        int xMin = ROI[j].x + Position[i][k].x;
        int xMax = xMin + Position[i][k].width;
        int yMin = ROI[j].y + Position[i][k].y;
        int yMax = yMin + Position[i][k].height;
        tempValue += Weight[i][k] *
            (data.at<float>(yMin, xMin) + data.at<float>(yMax, xMax) -
             data.at<float>(yMin, xMax) - data.at<float>(yMax, xMin));
    }
}
t1 = ((double)getTickCount() - t1) / getTickFrequency();

// serial time minus OpenMP time, in seconds (negative means OpenMP was slower)
printf("%.3f\n", t0 - t1);
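Supplement:
I compared tempValue and tempValueNoMP, and the results are identical.
However, t0 always comes out smaller than t1, and the program also feels
sluggish after the change.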
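
For anyone who wants to reproduce the comparison without OpenCV, here is a minimal sketch of the same reduction pattern. It is not the code from the post: Rect4, weightedSumSerial, weightedSumOmp and minParallelSize are hypothetical names made up for illustration, and the four corner values are assumed to have been read out already. It uses the combined "parallel for" form plus an if clause so that short loops stay serial. Whether thread startup cost is actually what makes t1 larger here is only a guess, since the post does not say how large lSize is.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for one term of the sum: the four corner values
// (already looked up) and the weight of the rectangle.
struct Rect4 {
    float topLeft, topRight, bottomLeft, bottomRight, weight;
};

// Serial reference version of the weighted sum.
float weightedSumSerial(const std::vector<Rect4>& terms) {
    float sum = 0.0f;
    for (std::size_t k = 0; k < terms.size(); ++k) {
        sum += terms[k].weight *
               (terms[k].topLeft + terms[k].bottomRight -
                terms[k].topRight - terms[k].bottomLeft);
    }
    return sum;
}

// OpenMP version. minParallelSize is an assumed tuning knob, not a measured
// value: loops shorter than this run on a single thread.
float weightedSumOmp(const std::vector<Rect4>& terms, int minParallelSize = 10000) {
    float sum = 0.0f;
    const int n = static_cast<int>(terms.size());  // signed index for OpenMP 2.0 compilers
    #pragma omp parallel for reduction(+:sum) if(n >= minParallelSize)
    for (int k = 0; k < n; ++k) {
        sum += terms[k].weight *
               (terms[k].topLeft + terms[k].bottomRight -
                terms[k].topRight - terms[k].bottomLeft);
    }
    return sum;
}
```

When timing this, it may be worth calling weightedSumOmp once before starting the clock, because the first parallel region usually also pays for creating the worker threads. If the compiler's OpenMP switch is off, the pragma is ignored and both functions run serially.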
--
※ Origin: PTT (ptt.cc), from: 211.72.181.189
※ Article URL: https://www.ptt.cc/bbs/C_and_CPP/M.1462939000.A.DC8.html
※ Edited: hardman1110 (211.72.181.189), 05/11/2016 12:01:18