Re: [問題] 處理資料問題(pkg:dplyr)
※ 引述《psinqoo (零度空間)》之銘言:
: 延伸問題
: 正被搞暈中
: RAW DATA長這樣
好讀版:http://pastebin.com/tPH8i43p
library(data.table)
library(reshape2)
library(dplyr)
library(tidyr)
library(magrittr)
dat = data.table(ID = rep(LETTERS[1:2], 4:3),
location = c('TAI', 'JP', 'CH', 'KOE', 'JP', 'GOK', 'TA'),
year = c(2012, 2013, 2014, 2011, 2011, 2015, 2012),
sex = rep(c("F", "M"), 4:3))
: ID 地點 日期 性別
: A TAI 2012 F
: A JP 2013 F
: A CH 2014 F
: A KOE 2011 F
: B JP 2011 M
: B GOK 2015 M
: B TA 2012 M
: 變成下面這樣
: 第一種 形式
: ID 地點一 地點二 地點三 地點四 性別
: A KOE TAI JP CH F
: B JP TA GOK M
dat %>% group_by(ID, sex) %>%
mutate(location_name = paste0("location_", 1:n())) %>%
select(-year) %>% spread(location_name, location, fill = "")
# ID sex location_1 location_2 location_3 location_4
# 1 A F TAI JP CH KOE
# 2 B M JP GOK TA
: 第二種
: ID 2011 2012 2013 2014 2015 性別
: A KOE TAI JP CH F
: B JP TA GOK M
dat %>% spread(year, location, fill="")
## dcast.data.table(dat, ID + sex ~ year, value.var = "location")
# ID sex 2011 2012 2013 2014 2015
# 1: A F KOE TAI JP CH
# 2: B M JP TA GOK
: 第三種
: ID 地點 性別
: A KO,TAI,JP,CH F
: B JP,TA,GOK M
## method 1
dat %>% spread(year, location, fill="") %>%
setnames(as.character(2011:2015), paste0("year_", 2011:2015)) %>%
mutate(location = paste(year_2011, year_2012,year_2013,year_2014,year_2015,
sep = ",")) %>%
mutate(location = gsub(',+', ',',gsub(',$', '', location))) %>%
select(ID, sex, location)
# ID sex location
# 1: A F KOE,TAI,JP,CH
# 2: B M JP,TA,GOK
## method 2 (little tricky)
dat3 = dat %>% mutate(ones = "c1") %>% select(-year) %>%
dcast.data.table(ID + sex ~ ones,
fun.aggregate = function(x){paste(x, collapse = ',')},
value.var = "location") %>% setnames("c1", "location")
dat3
# ID sex location
# 1: A F TAI,JP,CH,KOE
# 2: B M JP,GOK,TA
## 3 (method 2 with order of year)
# PS: output will be the same as method 1
dat3_2 = dat %>% mutate(ones = "c1") %>%
arrange(ID, year) %>% select(-year) %>%
dcast.data.table(ID + sex ~ ones,
fun.aggregate = function(x){paste(x, collapse = ',')},
value.var = "location") %>% setnames("c1", "location")
dat3_2
# ID sex location
# 1: A F KOE,TAI,JP,CH
# 2: B M JP,TA,GOK
--
※ 發信站: 批踢踢實業坊(ptt.cc), 來自: 1.163.8.105
※ 文章網址: https://www.ptt.cc/bbs/R_Language/M.1438841468.A.E39.html
推
08/06 15:59, , 1F
08/06 15:59, 1F
= = 我有種被你考核的感覺 囧
推
08/07 08:14, , 2F
08/07 08:14, 2F
哈哈,這個一定不遺餘力
※ 編輯: celestialgod (42.72.254.183), 08/07/2015 09:19:41
討論串 (同標題文章)
本文引述了以下文章的的內容:
完整討論串 (本文為第 4 之 4 篇):
R_Language 近期熱門文章
PTT數位生活區 即時熱門文章