[Question] OpenMP speedup problem

Board: C_and_CPP (C/C++) | Author: hardman1110 (笨小孩) | Posted: 2016/05/11 11:56 | Votes: 2 (2 up, 0 down, 4 neutral)

6 comments, 2 participants; thread 1/1

Platform: VC2015

Libraries used: OpenCV, OpenMP

Question: I am accumulating a sum inside a loop. I have already added a reduction clause and related settings, but performance is still worse than without OpenMP, and sometimes the program even stalls for a while.

Input: ROI, image data, and its weights

Expected output: a multi-fold speedup (the CPU is an i5-3210M)

Wrong output: slower than the original

Code: Only the core code is pasted here. The rest is just image I/O and the weight computation, which have nothing to do with OpenMP.

-------------

    float tempValueNoMP = 0;
    double t0 = 0;
    double t1 = 0;
    double t2 = 0;

    t0 = (double)getTickCount();
    // without MP
    for (int k = 0; k < lSize; k++)
    {
        int xMin = 0, xMax = 0, yMin = 0, yMax = 0;
        xMin = ROI[j].x + Position[i][k].x;
        xMax = xMin + Position[i][k].width;
        yMin = ROI[j].y + Position[i][k].y;
        yMax = yMin + Position[i][k].height;
        // weighted sum over the four corners of the rectangle
        tempValueNoMP += Weight[i][k] * (data.at<float>(yMin, xMin) + data.at<float>(yMax, xMax)
                                         - data.at<float>(yMin, xMax) - data.at<float>(yMax, xMin));
    }
    t0 = ((double)getTickCount() - t0) / getTickFrequency();

    t1 = (double)getTickCount();
    // with MP
    #pragma omp parallel reduction( +:tempValue)
    {
        #pragma omp for
        for (int k = 0; k < lSize; k++)
        {
            int xMin = 0, xMax = 0, yMin = 0, yMax = 0;
            xMin = ROI[j].x + Position[i][k].x;
            xMax = xMin + Position[i][k].width;
            yMin = ROI[j].y + Position[i][k].y;
            yMax = yMin + Position[i][k].height;
            tempValue += Weight[i][k] * (data.at<float>(yMin, xMin) + data.at<float>(yMax, xMax)
                                         - data.at<float>(yMin, xMax) - data.at<float>(yMax, xMin));
        }
    }
    t1 = ((double)getTickCount() - t1) / getTickFrequency();

    printf("%.3f\n", t0 - t1);

Supplement: I have compared tempValue and tempValueNoMP and the results are identical, but t0 is always smaller than t1, and the program also feels sluggish after the change.

--
※ Origin: 批踢踢實業坊 (ptt.cc), from: 211.72.181.189
※ Article URL: https://www.ptt.cc/bbs/C_and_CPP/M.1462939000.A.DC8.html
※ Edited: hardman1110 (211.72.181.189), 05/11/2016 12:01:18
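
For reference, the comparison in the post can be reduced to a minimal sketch that does not depend on OpenCV. Everything here is an assumption made for illustration: the function names (`weightedSumSerial`, `weightedSumOmp`, `secondsPerCall`) are hypothetical, plain arrays stand in for `Weight[i][k]` and the four `data.at<float>()` lookups, and the accumulator is a `double` rather than the `float` used in the post. Timing each version over many repeated calls may give a steadier picture than a single run, since a single run can also include the one-time cost of starting the thread team. One common reason a parallel loop loses to the serial one is that this overhead outweighs the loop body when the trip count is small; whether that applies to `lSize` here cannot be told from the post.

```cpp
#include <cassert>
#include <chrono>
#include <vector>

// Serial weighted sum (hypothetical stand-in for the "without MP" loop).
double weightedSumSerial(const std::vector<float>& w, const std::vector<float>& v)
{
    assert(w.size() == v.size());
    double sum = 0.0;
    const int n = static_cast<int>(w.size());
    for (int k = 0; k < n; k++)
        sum += w[k] * v[k];
    return sum;
}

// Same sum with the combined "parallel for" construct and a reduction.
// Each thread accumulates into a private copy of sum; the copies are
// added together when the loop ends. If OpenMP is not enabled at compile
// time, the pragma is ignored and the loop simply runs serially.
double weightedSumOmp(const std::vector<float>& w, const std::vector<float>& v)
{
    assert(w.size() == v.size());
    double sum = 0.0;
    const int n = static_cast<int>(w.size());  // signed index, as OpenMP 2.0 requires
    #pragma omp parallel for reduction(+:sum)
    for (int k = 0; k < n; k++)
        sum += w[k] * v[k];
    return sum;
}

// Average wall-clock seconds per call of f over `repeats` calls.
// The last return value of f is written to *lastResult.
template <typename F>
double secondsPerCall(F f, int repeats, double* lastResult)
{
    const auto start = std::chrono::steady_clock::now();
    double r = 0.0;
    for (int i = 0; i < repeats; i++)
        r = f();
    const auto stop = std::chrono::steady_clock::now();
    *lastResult = r;
    return std::chrono::duration<double>(stop - start).count() / repeats;
}
```

To try it, fill two `std::vector<float>` of the same length as the real `lSize`, pass lambdas that call each function to `secondsPerCall` with a few thousand repeats, and compare the two averages. OpenMP has to be switched on in the compiler (`/openmp` for Visual C++, `-fopenmp` for GCC); otherwise both versions run the same serial loop.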