[Question] CUDA matrix addition error

Board: C_and_CPP (C/C++)  Author: (aada)  Time: 2009/12/19 01:12
Hi everyone, I have a question.
Below is my program, which does matrix addition with CUDA.
It produces some errors, and I don't know where to start fixing them.

#include<stdio.h>
#include<stdlib.h>
#include<time.h>
#include<cuda.h>

__global__ void VecAdd(int* A[2][2], int* B[2][2], int* C[2][2])
{
    int i = blockIdx.x*blockDim.x+threadIdx.x; // multi-block, multi-thread indexing
    int j = blockIdx.y*blockDim.y+threadIdx.y; // multi-block, multi-thread indexing
    C[i][j] = A[i][j]+B[i][j];
}

int main()
{
    int N = 2;
    int h_A[2][2]={{1,2},{3,4}};
    int h_B[2][2]={{5,6},{7,8}};
    int *h_B2[2][2];
    int size = N*sizeof(int);
    int Grid = 1;
    int Block = 10;

    int* d_A;
    cudaMalloc((void**)&d_A, size);
    int* d_B;
    cudaMalloc((void**)&d_B, size);
    int* d_C;
    cudaMalloc((void**)&d_C, size);

    cudaMemcpy(d_A, h_A, size, cudaMemcpyHostToDevice);
    cudaMemcpy(d_B, h_B, size, cudaMemcpyHostToDevice);

    VecAdd<<<Grid, Block>>>(d_A, d_B, d_C);

    cudaMemcpy(h_B2, d_C, size, cudaMemcpyDeviceToHost);

    for(int i=0;i<N;++i)
    {
        for(int j=0;j<N;++j)
        {
            printf("%d+%d=%d\n",h_A[i][j],h_B[i][j],h_B2[i][j]);
        }
    }

    cudaFree(d_A);
    cudaFree(d_B);
    cudaFree(d_C);
    system("PAUSE");
    return 0;
}

matrix_add.cu(11): error: expression must have integral or enum type
matrix_add.cu(43): error: argument of type "int *" is incompatible with parameter of type "int *(*)[2]"
matrix_add.cu(43): error: argument of type "int *" is incompatible with parameter of type "int *(*)[2]"
matrix_add.cu(43): error: argument of type "int *" is incompatible with parameter of type "int *(*)[2]"

How should I change it? Thanks.

--
※ Origin: 批踢踢實業坊 (ptt.cc)
◆ From: 140.122.193.103

12/19 04:10, , 1F
The error messages are very clear: * and ** are not the same thing.
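To make the comment above concrete: a kernel parameter declared as `int* A[2][2]` is adjusted to `int* (*)[2]`, so `A[i][j]` is an `int*`. Adding two pointers triggers the line-11 error, and passing a plain `int*` such as `d_A` triggers the three line-43 errors. The following is a minimal sketch of one possible fix, not the only one. It assumes the matrices are stored row-major in flat `int` arrays. The names `MatAdd`, `matrixAddOnGpu`, and `h_C` are hypothetical and do not come from the original post. It also allocates `N*N*sizeof(int)` bytes rather than `N*sizeof(int)`, uses an `int` result array instead of `int*`, and launches a 2D block so that `threadIdx.y` actually varies.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Element-wise add of two n x n matrices stored row-major in flat arrays.
__global__ void MatAdd(const int* A, const int* B, int* C, int n)
{
    int row = blockIdx.y * blockDim.y + threadIdx.y;
    int col = blockIdx.x * blockDim.x + threadIdx.x;
    if (row < n && col < n) {        // guard: the launch has more threads than elements
        int idx = row * n + col;     // flatten the 2D index
        C[idx] = A[idx] + B[idx];
    }
}

// Hypothetical host helper (not in the original post):
// copies the inputs to the device, launches the kernel, copies the result back.
cudaError_t matrixAddOnGpu(const int* hA, const int* hB, int* hC, int n)
{
    size_t bytes = (size_t)n * n * sizeof(int);   // n*n elements, not n
    int *dA = nullptr, *dB = nullptr, *dC = nullptr;

    cudaError_t err = cudaMalloc((void**)&dA, bytes);
    if (err == cudaSuccess) err = cudaMalloc((void**)&dB, bytes);
    if (err == cudaSuccess) err = cudaMalloc((void**)&dC, bytes);
    if (err == cudaSuccess) err = cudaMemcpy(dA, hA, bytes, cudaMemcpyHostToDevice);
    if (err == cudaSuccess) err = cudaMemcpy(dB, hB, bytes, cudaMemcpyHostToDevice);
    if (err == cudaSuccess) {
        dim3 block(16, 16);          // 2D block so threadIdx.y is used
        dim3 grid((n + block.x - 1) / block.x, (n + block.y - 1) / block.y);
        MatAdd<<<grid, block>>>(dA, dB, dC, n);
        err = cudaGetLastError();    // reports launch errors
    }
    if (err == cudaSuccess) err = cudaMemcpy(hC, dC, bytes, cudaMemcpyDeviceToHost);

    cudaFree(dA);                    // freeing a null pointer is a no-op
    cudaFree(dB);
    cudaFree(dC);
    return err;
}

int main()
{
    const int N = 2;
    int h_A[N][N] = {{1, 2}, {3, 4}};
    int h_B[N][N] = {{5, 6}, {7, 8}};
    int h_C[N][N] = {{0, 0}, {0, 0}};   // int, not int* as in the post

    cudaError_t err = matrixAddOnGpu(&h_A[0][0], &h_B[0][0], &h_C[0][0], N);
    if (err != cudaSuccess) {
        printf("CUDA error: %s\n", cudaGetErrorString(err));
        return 1;
    }
    for (int i = 0; i < N; ++i)
        for (int j = 0; j < N; ++j)
            printf("%d+%d=%d\n", h_A[i][j], h_B[i][j], h_C[i][j]);
    return 0;
}
```

The bounds check in the kernel matters here because the launch creates far more threads than there are matrix elements. The original launch of one block with 10 threads in x had the same mismatch but no guard, so it would have indexed past the end of a 2x2 matrix.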

12/19 09:33, , 2F
...It doesn't look multi-threaded to me; I don't see any thread function. =_=|||

12/19 09:48, , 3F
After searching through some posts just now, I found out that this thing called CUDA exists... = =|||
Article ID (AID): #1BAxVnCd (C_and_CPP)