[Question] Merge 3 tables, summing common variables

Board: R_Language  Author: cywhale  Time: 2015/10/12 16:54 (edited)  Score: 3 (3 push, 0 boo, 4 neutral)
7 comments, 2 participants, thread 1/5
[Question type]:
Performance (I want R to run faster)

I think I once saw a simpler idiom or function for this somewhere, but I can't recall it and couldn't find it, so I ended up writing fairly complicated code. Is there a faster or simpler way to do this?

[Familiarity with R]:
Beginner (have written programs in other languages, just unfamiliar with the syntax)

[Problem description]:
Merge several data tables by the same key. Variables (columns) that appear in more than one table must be summed during the merge, ignoring NA. The tables differ from one another in both number of rows and number of columns.

[Code example]:

library(data.table)
library(dplyr)   # provides the %>% pipe

# testing data, assuming merge by key = "SP"
set.seed(NULL)
x <- matrix(sample(1e6), 1e5) %>% data.table() %>%
  setnames(1:10, sample(LETTERS, 10)) %>% .[, SP := seq_len(nrow(.))]
y <- matrix(sample(1e5), 1e4) %>% data.table() %>%
  setnames(1:10, sample(LETTERS, 10)) %>% .[, SP := seq_len(nrow(.))]
z <- matrix(sample(4e5), 2e4) %>% data.table() %>%
  setnames(1:20, sample(LETTERS, 20)) %>% .[, SP := seq_len(nrow(.))]

# function.. try to write Rcpp function..
library(Rcpp)
cppFunction('
NumericVector addv(NumericVector x, NumericVector y) {
  // element-wise sum that skips NA: NA + v = v, NA + NA = NA
  NumericVector out(x.size());
  NumericVector::iterator x_it, y_it, out_it;
  for (x_it = x.begin(), y_it = y.begin(), out_it = out.begin();
       x_it != x.end(); ++x_it, ++y_it, ++out_it) {
    if (ISNA(*x_it)) {
      *out_it = *y_it;
    } else if (ISNA(*y_it)) {
      *out_it = *x_it;
    } else {
      *out_it = *x_it + *y_it;
    }
  }
  return out;
}')

### merge two data.table with different columns/rows,
### and summing identical column names
outer_join2 <- function(df1, df2, byNames) {
  # non-key columns present in both tables
  tt <- intersect(setdiff(colnames(df1), byNames),
                  setdiff(colnames(df2), byNames))
  # df: full outer join that keeps df2's copy of the common columns
  df <- merge(df2, df1[, -tt, with = FALSE], by = byNames, all = TRUE)
  # dt: the same join, keeping only df1's copy of the common columns
  # (rows line up with df as long as the key is unique in each table)
  dt <- merge(df2[, -tt, with = FALSE], df1[, c(byNames, tt), with = FALSE],
              by = byNames, all = TRUE) %>% .[, tt, with = FALSE]
  for (j in colnames(dt)) {
    set(df, j = j, value = addv(df[[j]], dt[[j]]))
  }
  return(df)
}

# get results, following c大's post #1LaHm_aH (R_Language)
system.time(Reduce(function(x, y) outer_join2(x, y, byNames = "SP"), list(x, y, z)))

This takes quite a few lines of code. The speed seems acceptable, but I'm not sure whether there is a better way to write it. Thanks!

--
※ Origin: PTT (ptt.cc), from: 140.112.65.48
※ Article URL: https://www.ptt.cc/bbs/R_Language/M.1444640089.A.EE0.html
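One possibly simpler route to the result described above, offered as a minimal sketch and not as the approach from the referenced post: stack the tables with rbindlist(fill = TRUE) and sum each column per key. The tiny tables and the helper name sum_na are hypothetical, made up for illustration. The sketch assumes the key SP is unique within each table and that all non-key columns are numeric. It has not been benchmarked against the Rcpp version.

```r
library(data.table)

# Tiny hypothetical tables sharing the key "SP"; column "A" appears in all three.
x <- data.table(SP = 1:3, A = c(1, 2, NA), B = c(10, 20, 30))
y <- data.table(SP = 2:4, A = c(5, NA, 7), C = c(100, 200, 300))
z <- data.table(SP = c(1L, 4L), A = c(1, 1))

# Hypothetical helper: sum that ignores NA but stays NA when every input is NA
# (plain sum(na.rm = TRUE) would return 0 in that case).
# as.numeric() keeps the result type the same (double) for every group.
sum_na <- function(v) {
  if (all(is.na(v))) NA_real_ else sum(as.numeric(v), na.rm = TRUE)
}

# Stack the tables (columns missing from a table are filled with NA),
# then aggregate every non-key column per key.
res <- rbindlist(list(x, y, z), fill = TRUE)[, lapply(.SD, sum_na), keyby = SP]
print(res)
```

In this toy data, column A for SP = 2 becomes 2 + 5 = 7, while A for SP = 3 stays NA because both inputs are NA, which matches the behaviour of addv. If a key were repeated inside one table, its rows would also be summed together, which a join would not do.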

1F (10/15 17:40): This series is archived in z-4-13-5

2F (10/15 21:51): I saw it in 'z-4-13-5' not 14

3F (10/15 21:57): cy is right, my mistake. I was probably in too much of a hurry to leave work XDD

4F (10/15 21:58): Could cy please edit my comment for me, so that later readers don't follow my wrong

5F (10/15 21:58): directions and fail to find it?
※ Edited: cywhale (36.228.159.121), 10/15/2015 22:07:23

6F (10/15 22:08): OK, done ^_^

7F (10/15 22:08): Thanks!!
Article ID (AID): #1M6tLPxW (R_Language)