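[Question] import/read csv/xls file (large data file)
[Question type]:
Programming help (I want to do something in R, but I don't know how to write it in R)
Sorry for asking so many questions at once. I did some googling myself, but I still can't make sense of it.
[Software familiarity]:
Beginner (I have written programs in other languages, but I'm not familiar with the syntax)
[Problem description]:
I want to import/read a csv or xlsx file, but I'm not clear on the underlying concepts in R.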
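1. If my data is at D:\destop\datatest.csv, does my R script also need to be in D:/destop/?
2. Is there a way to set the path only once, so that I can keep the data in that same folder and import it conveniently?
3. Why do I so often see library(readxl) when I want to use read_excel? For example:
library(readxl)
C1_data <- read_excel("D:\\destop\\datatest.xlsx")
4. When does the slash in a path need to be \\, and when /?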
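5. Suppose the file I want to import/read is very large, such as the 1.48 GB CSV linked below (this is my main question):
https://aqsdr1.epa.gov/aqsweb/aqstmp/airdata/hourly_44201_2016.zip
https://aqsdr1.epa.gov/aqsweb/aqstmp/airdata/annual_all_2016.zip <- a smaller file for testing
Is there a way to read in only specific rows and columns (containing both numbers and strings)?
For example, I want to import all of the data, but keep only the rows whose county.name column is one of "cook","DuPage",
"Kane","Kenosha","Lake","McHenry","Porter","Will".
6. The header strings originally contain spaces, and after import the spaces have become dots (.).
When I work with the data afterwards, should I type . or a space?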
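For questions 1, 2, 4 and 6, a minimal sketch may help. It assumes base R only, and it uses a temporary folder and a made-up three-column file as stand-ins for D:/destop and the real data. The script does not have to live next to the data: setwd() sets the working directory once, and relative file names are resolved against it afterwards. Inside an R string a backslash is an escape character, so a Windows path is written either with \\ or with /. By default read.csv() converts headers into syntactically valid names (check.names = TRUE), which is where the dots come from. As for question 3, read_excel comes from the readxl package rather than base R, which is why library(readxl) has to be called first.

```r
# Made-up demo file in a temporary folder (stand-in for a folder such as D:/destop)
data_dir <- tempdir()
csv_path <- file.path(data_dir, "datatest.csv")   # file.path() joins parts with "/"
writeLines(c('"State Code","County Name","Sample Measurement"',
             '17,"Cook",0.031',
             '18,"Porter",0.027'), csv_path)

# Set the folder once; relative file names are then looked up inside it
old_wd <- setwd(data_dir)
dotted <- read.csv("datatest.csv")                       # default check.names = TRUE
spaced <- read.csv("datatest.csv", check.names = FALSE)  # keep the header text as-is
setwd(old_wd)                                            # restore the previous folder

names(dotted)   # "State.Code" "County.Name" "Sample.Measurement"
names(spaced)   # "State Code" "County Name" "Sample Measurement"

# Dotted names work with $ directly; names with spaces need backticks or [[ ]]
dotted$County.Name
spaced$`County Name`
spaced[["County Name"]]
```

So with the default import you type the dot (County.Name); with check.names = FALSE you keep the space but have to quote the name.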
[Code example]:
#----- Source: https://aqsdr1.epa.gov/aqsweb/aqstmp/airdata/hourly_44201_2016.zip
# The following are supposed to be the column headers of the data set
# 'State Code' 'County Code' 'Site Num' 'Parameter Code' 'POC'
# 'Latitude' 'Longitude' 'Datum' 'Parameter Name' 'Date Local'
# 'Time Local' 'Date GMT' 'Time GMT' 'Sample Measurement' 'Units of Measure'
# 'MDL' 'Uncertainty' 'Qualifier' 'Method Type' 'Method Code'
# 'Method Name' 'State Name' 'County Name' 'Date of Last Change'
# Import the whole data set
Ozone <- read.csv("D:\\destop\\datatest.csv")
# Keep only the rows whose County.Name is one of the counties below.
# %in% is case-sensitive, so both sides are compared in lower case.
counties <- c("cook", "DuPage", "Kane", "Kenosha", "Lake", "McHenry",
              "Porter", "Will")
Ozone <- subset(Ozone, tolower(County.Name) %in% tolower(counties))
# How can I read in only these rows at import time? :S
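One possible way to filter while importing, sketched with base R only: read the file in chunks, keep the matching rows of each chunk, and bind the kept rows together at the end, so the full 1.48 GB never sits in memory at once. The function name read_csv_filtered, its arguments, and the tiny demo file are hypothetical and only for illustration. The sketch assumes the zip has already been extracted and that no quoted field contains an embedded line break. Packages such as data.table (fread) are a common alternative, but they are not used here.

```r
# Hypothetical helper: read a CSV chunk by chunk and keep only the rows whose
# `column` value is in `values` (compared case-insensitively).
# Assumes no quoted field contains an embedded line break.
read_csv_filtered <- function(path, column, values, chunk_size = 100000L) {
  con <- file(path, open = "r")
  on.exit(close(con))                       # always close the connection
  header_line <- readLines(con, n = 1L)     # first line holds the column names
  kept <- list()
  repeat {
    lines <- readLines(con, n = chunk_size) # next block of raw text lines
    if (length(lines) == 0L) break          # end of file reached
    chunk <- read.csv(text = c(header_line, lines),
                      check.names = FALSE, stringsAsFactors = FALSE)
    hit <- tolower(chunk[[column]]) %in% tolower(values)
    kept[[length(kept) + 1L]] <- chunk[hit, , drop = FALSE]
  }
  do.call(rbind, kept)                      # NULL if the file has no data rows
}

# Tiny made-up file standing in for the extracted EPA CSV
demo_path <- tempfile(fileext = ".csv")
writeLines(c('"State Name","County Name","Sample Measurement"',
             '"Illinois","Cook",0.031',
             '"Illinois","Adams",0.040',
             '"Indiana","Porter",0.027',
             '"Wisconsin","Kenosha",0.035',
             '"Illinois","Peoria",0.022'), demo_path)

counties <- c("cook", "DuPage", "Kane", "Kenosha", "Lake", "McHenry",
              "Porter", "Will")
Ozone <- read_csv_filtered(demo_path, "County Name", counties, chunk_size = 2L)
Ozone[["County Name"]]   # "Cook" "Porter" "Kenosha"
```

The chunk size of 2 is only there to exercise the loop on the demo file; for a real file something like the default 100000 lines per chunk is a more sensible starting point.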
[Environment]:
RStudio
[Keywords]:
read.csv
read_excel
--
※ Posted from: PTT (ptt.cc), from: 123.193.92.13
※ Article URL: https://www.ptt.cc/bbs/R_Language/M.1492445728.A.44D.html
※ Edited: peterwu76 (123.193.92.13), 04/18/2017 00:16:43
※ Edited: peterwu76 (123.193.92.13), 04/18/2017 00:17:04
※ Edited: peterwu76 (123.193.92.13), 04/18/2017 00:18:58
※ Edited: peterwu76 (123.193.92.13), 04/18/2017 01:30:27