Re: [Question] parSapply in snow cannot find a function
※ Quoting 《f496328mm (為什麼會流淚)》:
: > sapply(c(1:10), function(x) actv_fun(data,bo_matrix,x))
: [1] 0.5 0.5 0.5 3.0 1.5 17.5 9.0 0.5 2.5 2.5
: > parSapply(cl,c(1:10), function(x) actv_fun(data,bo_matrix,x))
: Error in checkForRemoteErrors(val) :
: 6 nodes produced errors; first error: could not find function "actv_fun"
: http://imgur.com/nMZbBme

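: The exact same call runs fine with sapply,
: but why does it fail once I switch to parSapply
: and report that the function does not exist?
: It works with sapply, doesn't it?
: Don't tell me parSapply only works with built-in functions?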
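The error comes from how snow works: each worker is a fresh R session that knows nothing about the objects in your master session, so actv_fun (and data, bo_matrix) do not exist there. parSapply is not limited to built-in functions; the objects just have to be sent to the workers first with clusterExport. A minimal sketch, using the parallel package (which provides the same functions as snow) and made-up stand-ins for actv_fun, data and bo_matrix, since the real ones were not posted:

```r
library(parallel)

# Hypothetical stand-ins for the objects in the quoted question
data <- 1:10
bo_matrix <- diag(10)
actv_fun <- function(data, bo_matrix, x) sum(data * bo_matrix[, x])

# Two workers are enough for a demonstration
cl <- makeCluster(2)

# Copy the function and the data it needs to every worker;
# names are passed as a character vector
clusterExport(cl, c("actv_fun", "data", "bo_matrix"), envir = environment())

res_par <- parSapply(cl, 1:10, function(x) actv_fun(data, bo_matrix, x))
stopCluster(cl)

# Sequential version for comparison
res_seq <- sapply(1:10, function(x) actv_fun(data, bo_matrix, x))
```

If actv_fun itself calls functions from other packages, those packages also have to be loaded on the workers, for example with clusterEvalQ.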
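I usually do this with doSNOW together with plyr and foreach.
As an example, here I parse XML files in parallel, using xml2, purrr, stringr and pipeR:
(To try it yourself, download a few of the xml.gz files from here and put them in a folder named vd:
http://tisvcloud.freeway.gov.tw/history/vd/20160501/ )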
library(xml2)
library(foreach)
library(doSNOW)
library(plyr)
library(purrr)
library(stringr)
library(pipeR)
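# Use about 90% of the available threads (adjust to taste, or use them all);
# max(1, ...) keeps at least one worker on a single-core machine
# I am less familiar with MPI, so this uses socket workers
cl <- makeCluster(max(1, floor(parallel::detectCores() * 0.9)), type = "SOCK")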
registerDoSNOW(cl)
# load the required packages on every worker
clusterEnv <- clusterEvalQ(cl, {
  library(plyr)
  library(stringr)
  library(purrr)
  library(pipeR)
  library(xml2)
})
# Export functions and variables to the workers (nothing needs exporting here,
# so this is only a template): pass the object names as a character vector
# clusterExport(cl, c("x", "y", "some_variables", "some_functions"))
# list the xml.gz files in the vd folder
vdFiles <- list.files("vd", "\\.xml\\.gz$", full.names = TRUE)
# llply runs in parallel once a backend has been registered with
# registerDoSNOW() and .parallel = TRUE is passed
vdValueDataTable <- llply(vdFiles, function(xmlFileName){
  # try to read the file
  xmlFile <- try({xmlFileName %>>% read_xml(encoding = 'UTF-8') %>>%
    xml_children %>>% xml_children})
  # if reading failed or the file is empty, return NULL
  if (inherits(xmlFile, "try-error") || length(xmlFile) == 0)
    return(NULL)
  # row indices used to expand dat1 before the columns are combined
  repRows <- map(seq_along(xmlFile), ~ rep(., each =
    xml_length(xmlFile[[.]])*3)) %>>% do.call(what = c)
  # output character matrix
  xmlFile %>>% {
    xmlFile %>>%
      # find the vdid, status and datacollecttime
      (~ dat1 <- xml_attrs(.) %>>% do.call(what = rbind) %>>%
        (x ~ x[repRows, ])) %>>% xml_children %>>%
      # find the vsrdir, vsrid, speed and laneoccupy
      (~ dat2 <- xml_attrs(.) %>>% do.call(what = rbind) %>>%
        (x ~ x[rep(1:nrow(x), each = 3), ])) %>>% xml_children %>>%
      # find the carid and volume
      xml_attrs %>>% do.call(what = rbind) %>>%
      cbind(dat1, dat2)
  } %>>% (.[.[, "laneoccupy"] != "-99" & .[, "volume"] != "-99" &
    .[, "speed"] != "-99" & .[, "volume"] != "0" &
    .[, "status"] == "0", ]) %>>%
    (.[ , c("vdid", "carid", "datacollecttime", "speed",
      "laneoccupy", "volume")])
}, .parallel = TRUE) %>>% do.call(what = rbind)
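For comparison, the same kind of setup can be driven with foreach directly instead of llply. This is only a rough sketch: the file names are hypothetical and nothing is actually read from disk. The .packages argument asks foreach to load a package on each worker (tools here, chosen purely for illustration), and foreach also has an .export argument for variables the loop body needs. It is worth stopping the cluster once the work is done.

```r
library(foreach)
library(doSNOW)

# Two socket workers are enough for a demonstration
cl <- makeCluster(2, type = "SOCK")
registerDoSNOW(cl)

# Hypothetical file names, used only for illustration
files <- c("vd/a.xml.gz", "vd/b.xml.gz", "vd/c.xml.gz")

# .packages loads the listed packages on every worker before the body runs,
# so file_ext() from the tools package can be called without a prefix
exts <- foreach(f = files, .combine = c, .packages = "tools") %dopar% {
  file_ext(f)
}

# release the workers when finished
stopCluster(cl)
```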
--
R data-wrangling package series:
magrittr #1LhSWhpH (R_Language) http://tinyurl.com/j3ql84c
data.table #1LhW7Tvj (R_Language) http://tinyurl.com/hr77hrn
dplyr (part 1) #1LhpJCfB (R_Language) http://tinyurl.com/jtg4hau
dplyr (part 2) #1Lhw8b-s (R_Language)
tidyr #1Liqls1R (R_Language) http://tinyurl.com/jq3o2g3
--
※ Origin: PTT (ptt.cc), from: 180.218.152.118
※ Article URL: https://www.ptt.cc/bbs/R_Language/M.1462552397.A.E80.html
※ Edited: celestialgod (180.218.152.118), 05/07/2016 00:35:47