Re: [Question] Data processing problem (pkg:dplyr)

Board: R_Language  Author: (攸藍)  Time: 2015/08/06 14:11 (edited)  Push/boo score: 2 (2 pushes, 0 boos, 0 neutral)
2 comments, 1 participant, latest thread 4/4
※ Quoting psinqoo (零度空間):
: Follow-up question
: This one has me thoroughly confused
: The raw data looks like this

Easier-to-read version: http://pastebin.com/tPH8i43p

library(data.table)
library(reshape2)
library(dplyr)
library(tidyr)
library(magrittr)
dat = data.table(ID = rep(LETTERS[1:2], 4:3),
                 location = c('TAI', 'JP', 'CH', 'KOE', 'JP', 'GOK', 'TA'),
                 year = c(2012, 2013, 2014, 2011, 2011, 2015, 2012),
                 sex = rep(c("F", "M"), 4:3))

: ID  Location  Date  Sex
: A   TAI       2012  F
: A   JP        2013  F
: A   CH        2014  F
: A   KOE       2011  F
: B   JP        2011  M
: B   GOK       2015  M
: B   TA        2012  M
: I want to turn it into the following
: Form 1
: ID  Location1  Location2  Location3  Location4  Sex
: A   KOE        TAI        JP         CH         F
: B   JP         TA         GOK                   M

dat %>% group_by(ID, sex) %>%
  mutate(location_name = paste0("location_", 1:n())) %>%
  select(-year) %>%
  spread(location_name, location, fill = "")
#   ID sex location_1 location_2 location_3 location_4
# 1  A   F        TAI         JP         CH        KOE
# 2  B   M         JP        GOK         TA

: Form 2
: ID  2011  2012  2013  2014  2015  Sex
: A   KOE   TAI   JP    CH          F
: B   JP    TA                GOK   M

dat %>% spread(year, location, fill="")
## dcast.data.table(dat, ID + sex ~ year, value.var = "location")
#    ID sex 2011 2012 2013 2014 2015
# 1:  A   F  KOE  TAI   JP   CH
# 2:  B   M   JP   TA             GOK

: Form 3
: ID  Location       Sex
: A   KOE,TAI,JP,CH  F
: B   JP,TA,GOK      M

## method 1
dat %>% spread(year, location, fill="") %>%
  setnames(as.character(2011:2015), paste0("year_", 2011:2015)) %>%
  mutate(location = paste(year_2011, year_2012, year_2013, year_2014,
                          year_2015, sep = ",")) %>%
  mutate(location = gsub(',+', ',', gsub(',$', '', location))) %>%
  select(ID, sex, location)
#    ID sex      location
# 1:  A   F KOE,TAI,JP,CH
# 2:  B   M     JP,TA,GOK

## method 2 (a little tricky)
dat3 = dat %>% mutate(ones = "c1") %>% select(-year) %>%
  dcast.data.table(ID + sex ~ ones,
                   fun.aggregate = function(x){paste(x, collapse = ',')},
                   value.var = "location") %>%
  setnames("c1", "location")
dat3
#    ID sex      location
# 1:  A   F TAI,JP,CH,KOE
# 2:  B   M     JP,GOK,TA

## method 3 (method 2, ordered by year)
# PS: the output is the same as method 1
dat3_2 = dat %>% mutate(ones = "c1") %>% arrange(ID, year) %>%
  select(-year) %>%
  dcast.data.table(ID + sex ~ ones,
                   fun.aggregate = function(x){paste(x, collapse = ',')},
                   value.var = "location") %>%
  setnames("c1", "location")
dat3_2
#    ID sex      location
# 1:  A   F KOE,TAI,JP,CH
# 2:  B   M     JP,TA,GOK

--
※ Origin: PTT (ptt.cc), From: 1.163.8.105
※ Article URL: https://www.ptt.cc/bbs/R_Language/M.1438841468.A.E39.html
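A side note on Form 3: the comma-joined column can also be produced without any reshaping, by sorting and then collapsing within each group. This is only a minimal sketch and not one of the methods in the post. It assumes a plain data.frame copy of dat and uses only dplyr; the name res is made up for illustration.

```r
library(dplyr)

# Same toy data as in the post, built here as a plain data.frame (assumption)
dat <- data.frame(
  ID = rep(LETTERS[1:2], 4:3),
  location = c("TAI", "JP", "CH", "KOE", "JP", "GOK", "TA"),
  year = c(2012, 2013, 2014, 2011, 2011, 2015, 2012),
  sex = rep(c("F", "M"), 4:3),
  stringsAsFactors = FALSE
)

# Sort by year within each ID, then paste the locations of each
# (ID, sex) group into one comma-separated string
res <- dat %>%
  arrange(ID, year) %>%
  group_by(ID, sex) %>%
  summarise(location = paste(location, collapse = ",")) %>%
  ungroup()

res
```

Because nothing is spread into wide columns first, there are no empty cells to clean up with gsub afterwards. Dropping the arrange() step should keep the original row order instead, as in method 2.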

08/06 15:59, , 1F
So impressive, so impressive~~~~ I bow down in admiration
08/06 15:59, 1F
= = I feel like you're putting me through an exam 囧

08/07 08:14, , 2F
Don't put it that way~~ you're doing the whole board a favor
08/07 08:14, 2F
Haha, I'll certainly spare no effort on that
※ Edited: celestialgod (42.72.254.183), 08/07/2015 09:19:41
Article ID (AID): #1Lmlfyuv (R_Language)