[Question] Problems getting PTT text mining to work

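Board: R_Language | Author: (王者迪西) | Posted: 2016/12/04 02:07 (9 years ago), edited 9 years ago | Push/boo score: 0 (0 pushes, 0 boos, 2 neutral)
2 comments, 1 participant, thread 1/1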
[Question type]:
I have a report to write, so I wanted to try text mining on PTT. I followed the steps in 陳嘉葳's tutorial and ran into a pile of problems.

[Software familiarity]:
Beginner (this is my first time writing R; I only picked it up because of the report XD)

[Problem description]:
1. I can't install the tmcn package. Installing it prints:

    * installing *source* package 'tmcn' ...
    ** libs
    *** arch - i386
    Warning: running command 'make -f "C:/PROGRA~1/R/R-32~1.5/etc/i386/Makeconf" -f "C:/PROGRA~1/R/R-32~1.5/share/make/winshlib.mk" SHLIB_LDFLAGS='$(SHLIB_CXXLDFLAGS)' SHLIB_LD='$(SHLIB_CXXLD)' SHLIB="tmcn.dll" OBJECTS="tmcn_encoding_isbig5.o tmcn_encoding_isgb18030.o tmcn_encoding_isgb2312.o tmcn_encoding_isgbk.o tmcn_encoding_isutf8.o"' had status 127
    ERROR: compilation failed for package 'tmcn'
    * removing 'C:/Program Files/R/R-3.2.5/library/tmcn'

    The downloaded source packages are in
        ‘C:\Users\家\AppData\Local\Temp\RtmpS8wKpT\downloaded_packages’
    Warning messages:
    1: running command '"C:/PROGRA~1/R/R-32~1.5/bin/i386/R" CMD INSTALL -l "C:\Program Files\R\R-3.2.5\library" C:\Users\家\AppData\Local\Temp\RtmpS8wKpT/downloaded_packages/tmcn_0.1-4.tar.gz' had status 1
    2: In install.packages("tmcn", repos = "http://R-Forge.R-project.org", :
      installation of package ‘tmcn’ had non-zero exit status

2. I can't install Rwordseg. Installing it prints:

    * installing *source* package 'Rwordseg' ...
    ** R
    ** demo
    ** inst
    ** preparing package for lazy loading
    Error : .onLoad failed in loadNamespace() for 'rJava', details:
      call: fun(libname, pkgname)
      error: JAVA_HOME cannot be determined from the Registry
    Error : package 'rJava' could not be loaded
    ERROR: lazy loading failed for package 'Rwordseg'
    * removing 'C:/Program Files/R/R-3.2.5/library/Rwordseg'

    The downloaded source packages are in
        ‘C:\Users\家\AppData\Local\Temp\RtmpS8wKpT\downloaded_packages’
    Warning messages:
    1: running command '"C:/PROGRA~1/R/R-32~1.5/bin/i386/R" CMD INSTALL -l "C:\Program Files\R\R-3.2.5\library" C:\Users\家\AppData\Local\Temp\RtmpS8wKpT/downloaded_packages/Rwordseg_0.2-1.tar.gz' had status 1
    2: In install.packages("Rwordseg", repos = "http://R-Forge.R-project.org") :
      installation of package ‘Rwordseg’ had non-zero exit status

3. Running my code fails with the error below (the code is in the example section):

    Error in curl::curl_fetch_memory(url, handle = handle) :
      Couldn't resolve host name

4. I'm looking for code that achieves the same result as 《用R進行中文 text Mining》 ("Chinese text mining with R").

[Code example]:

    library(httr)  # GET(), content(), set_cookies()
    library(XML)   # htmlParse(), xpathSApply(), xmlValue()

    # `data` holds the article links collected in an earlier step (not shown)
    data <- data[data != "www.ptt.cc"]  # drop entries that are only the host name
    setwd("C:\\Users\\家\\Desktop\\R Test\\新增資料夾")
    doc.size <- length(data)
    doc.list <- c()
    for (k in seq_along(data)) {
      # the over18 cookie skips PTT's age-confirmation page
      res <- GET(data[k], config = set_cookies("over18" = "1"))
      # parse the body explicitly with XML so that xpathSApply() can query it
      html <- htmlParse(content(res, as = "text", encoding = "UTF-8"),
                        asText = TRUE, encoding = "UTF-8")
      doc <- xpathSApply(html, "//div[@id='main-content']", xmlValue)
      if (length(as.character(doc)) == 1) {
        # use the 4th "/"-separated piece of the link as the file name
        name <- strsplit(data[k], "/")[[1]][4]
        write(doc, gsub("html", "txt", name))
      }
    }

[Environment]:
R 3.2.5, 32-bit

[Keywords]:
ptt text mining

--
※ Sent from: 批踢踢實業坊 (ptt.cc), from: 123.204.172.160
※ Article URL: https://www.ptt.cc/bbs/R_Language/M.1480788478.A.0F3.html
※ Edited: h920032 (123.204.172.160), 12/04/2016 03:02:16
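
A possible starting point for items 1 and 3, offered as a rough sketch rather than a confirmed fix. For item 1, an exit status of 127 from R's `system()` generally means the command (here `make`) could not be run, which on Windows usually points to Rtools being missing or not on the PATH. For item 3, one plausible cause is that some entries in `data` are not complete URLs; a network, DNS, or proxy problem would produce the same message. The helper names `has_make`, `normalize_ptt_url`, and `url_to_txt_name` are hypothetical and are not part of the tutorial or of any package.

```r
# Item 1: Sys.which() returns "" when a program is not on the PATH.
has_make <- function() nzchar(unname(Sys.which("make")))

if (!has_make()) {
  message("`make` was not found on the PATH. On Windows, installing the ",
          "Rtools release that matches your R version normally provides it.")
}

# Item 3 (assumption): links may be stored without a scheme, e.g.
# "www.ptt.cc/bbs/...". Prepend "https://" only when no scheme is present.
normalize_ptt_url <- function(u) {
  u <- trimws(u)
  ifelse(grepl("^https?://", u), u, paste0("https://", u))
}

# Derive the output file name from the last path component, so it does not
# depend on how many "/"-separated pieces come before it.
url_to_txt_name <- function(u) sub("\\.html$", ".txt", basename(u))

links <- c("www.ptt.cc/bbs/R_Language/M.1480788478.A.0F3.html")
full  <- normalize_ptt_url(links)
print(full)
print(url_to_txt_name(full))

# Optional: check whether the host name resolves at all from this machine.
# curl::nslookup() returns NULL instead of raising an error when error = FALSE.
if (requireNamespace("curl", quietly = TRUE)) {
  ip <- curl::nslookup("www.ptt.cc", error = FALSE)
  if (is.null(ip)) {
    message("www.ptt.cc could not be resolved; check the network, DNS, or proxy.")
  }
}
```

If the host name does not resolve even from this check, the problem is likely outside the R code, and fixing the links alone will not help.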

12/04 08:44, , 1F
For item 2, you're missing rJava.
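
To expand on the rJava hint: the message "JAVA_HOME cannot be determined from the Registry" typically appears when no Java installation matching the bitness of the running R session can be found (32-bit R needs a 32-bit Java). The sketch below only reports what the current session sees; `check_java_setup` is a hypothetical helper name, and the path in the commented-out line is a made-up example that would need to be replaced with the real install location.

```r
# Collect the facts that matter for loading rJava on Windows.
check_java_setup <- function() {
  java_home <- Sys.getenv("JAVA_HOME")  # "" when the variable is not set
  list(
    r_bits           = 8L * .Machine$sizeof.pointer,  # 32 or 64
    java_home        = java_home,
    java_home_set    = nzchar(java_home),
    java_home_exists = nzchar(java_home) && dir.exists(java_home)
  )
}

info <- check_java_setup()
message("R session: ", info$r_bits, "-bit (", R.version$arch, ")")
if (!info$java_home_set) {
  message("JAVA_HOME is not set; rJava will try the Windows registry instead.")
} else if (!info$java_home_exists) {
  message("JAVA_HOME points to a folder that does not exist: ", info$java_home)
}

# After installing a Java of the same bitness as R, something like this
# may be needed before loading rJava (the path is a hypothetical example):
# Sys.setenv(JAVA_HOME = "C:/Program Files (x86)/Java/jre1.8.0_xx")
# install.packages("rJava")
# library(rJava)
```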

12/04 08:45, , 2F
For word segmentation, consider using jiebaR.
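
A minimal sketch of what the jiebaR suggestion could look like, assuming the package is installed from CRAN. jiebaR does not depend on Java, so it avoids the rJava problem from item 2. The sample sentence is made up for illustration, and the default `worker()` settings are used; this is not the tutorial's own code.

```r
# Run only when jiebaR is available, so the script does not fail otherwise.
if (requireNamespace("jiebaR", quietly = TRUE)) {
  cutter <- jiebaR::worker()  # segmenter with the package's default settings
  text   <- "今天天氣很好，我想去看電影"  # made-up sample sentence
  words  <- jiebaR::segment(text, cutter)  # character vector of tokens
  print(words)
  # Simple term counts, most frequent first
  print(sort(table(words), decreasing = TRUE))
} else {
  message("jiebaR is not installed; try install.packages(\"jiebaR\").")
}
```

To use it with the scraped articles, the text read back from each saved .txt file would take the place of `text`.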
Article ID (AID): #1OGmd-3p (R_Language)