Re: [Question] Data stacking

Board: R_Language | Author: (攸藍) | Time: 2015/08/07 17:07
※ Quoting naturalsmen (日日夜夜):
: ※ Quoting celestialgod (攸藍):
: (snipped)
: Let me borrow this thread to ask:
: How can I stop R from automatically sorting the columns after spread?
: If the num in spread(num, sth.) holds names that contain numbers greater than 9,
: for example student1 ~ student10,
: R orders them as student1 student10 student2 ... student9.
: How can this be solved?

I have two clumsy methods: one is to reorder the columns afterwards, the other is to rename the keys.
There is also one good method: convert the key column to a factor.
spread sorts a character key as text, which is why stu10 lands before stu2; with a factor key the columns follow the level order instead.

Easier-to-read version: http://pastebin.com/BPWGgByi

library(data.table)
library(dplyr)
library(tidyr)
library(magrittr)

# Long-format data: one row per (stu, cate) pair
DT = data.table(stu = paste0("stu", 1:20), X = rnorm(20), Y = rnorm(20)) %>%
  gather(cate, values, -stu)

# The problem: a character key is spread in text order
DT %>% spread(stu, values) %>% tbl_dt(FALSE)
#   cate        stu1      stu10     stu11      stu12       stu13     stu14
# 1    X -0.08476976  0.5428922 1.9929332 -0.6145632 -0.06098296 -1.066283
# 2    Y  0.59710869 -1.0037766 0.3508158  0.4587201 -0.13639207  1.385517
# Variables not shown: stu15 (dbl), stu16 (dbl), stu17 (dbl), stu18 (dbl),
#   stu19 (dbl), stu2 (dbl), stu20 (dbl), stu3 (dbl), stu4 (dbl), stu5 (dbl),
#   stu6 (dbl), stu7 (dbl), stu8 (dbl), stu9 (dbl)

## factorize
# (DT is regenerated here, so the values differ from the output above)
DT = data.table(stu = paste0("stu", 1:20), X = rnorm(20), Y = rnorm(20)) %>%
  gather(cate, values, -stu)

# Give stu explicit levels in the desired order before spreading
DT %>% mutate(stu = factor(stu, levels = paste0("stu", 1:20))) %>%
  spread(stu, values) %>% tbl_dt(FALSE)
#   cate      stu1      stu2      stu3      stu4       stu5       stu6
# 1    X 1.6890231 -1.300332 -1.378376 -1.874321 0.54141060 -1.2848391
# 2    Y 0.2796895  1.635385  1.048334  0.424909 0.09111916 -0.4147811
# Variables not shown: stu7 (dbl), stu8 (dbl), stu9 (dbl), stu10 (dbl), stu11
#   (dbl), stu12 (dbl), stu13 (dbl), stu14 (dbl), stu15 (dbl), stu16 (dbl),
#   stu17 (dbl), stu18 (dbl), stu19 (dbl), stu20 (dbl)

## method 1 with select
DT_spread = DT %>% spread(stu, values)
# Columns that are not student columns (here: cate)
sele_names = setdiff(names(DT_spread), paste0("stu", 1:20))
# Column positions in the desired order: non-student columns first, then stu1..stu20
col_num = match(c(sele_names, paste0("stu", 1:20)), names(DT_spread))
DT_spread %>% select(col_num) %>% tbl_dt(FALSE)
#   cate      stu1      stu2      stu3      stu4       stu5       stu6
# 1    X 1.6890231 -1.300332 -1.378376 -1.874321 0.54141060 -1.2848391
# 2    Y 0.2796895  1.635385  1.048334  0.424909 0.09111916 -0.4147811
# Variables not shown: stu7 (dbl), stu8 (dbl), stu9 (dbl), stu10 (dbl), stu11
#   (dbl), stu12 (dbl), stu13 (dbl), stu14 (dbl), stu15 (dbl), stu16 (dbl),
#   stu17 (dbl), stu18 (dbl), stu19 (dbl), stu20 (dbl)

## method 2
# Rename the keys with zero-padded numbers so that text order equals numeric order
DT %<>% mutate(stu_num = as.integer(gsub("stu(\\d*)", "\\1", stu)),
               stu = sprintf("stu%02i", stu_num)) %>%
  select(-stu_num)
DT %>% spread(stu, values) %>% tbl_dt(FALSE)
#   cate     stu01     stu02     stu03     stu04      stu05      stu06
# 1    X 1.6890231 -1.300332 -1.378376 -1.874321 0.54141060 -1.2848391
# 2    Y 0.2796895  1.635385  1.048334  0.424909 0.09111916 -0.4147811
# Variables not shown: stu07 (dbl), stu08 (dbl), stu09 (dbl), stu10 (dbl),
#   stu11 (dbl), stu12 (dbl), stu13 (dbl), stu14 (dbl), stu15 (dbl), stu16
#   (dbl), stu17 (dbl), stu18 (dbl), stu19 (dbl), stu20 (dbl)

--
※ Origin: PTT (ptt.cc), From: 1.163.8.105
※ Article URL: https://www.ptt.cc/bbs/R_Language/M.1438938453.A.1C3.html
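
A supplementary sketch for the factor approach in the post above: when the number of students is not known in advance, the levels can be derived from the data instead of hard-coding paste0("stu", 1:20). The helper name natural_levels is hypothetical (it is not from the original post or any package), and it assumes every label ends in an integer. It uses base R only.

```r
# Hypothetical helper, not part of the original post: return the unique
# labels ordered by their trailing integer (assumes labels like "stu7").
natural_levels <- function(x) {
  u <- unique(as.character(x))
  # Strip the non-digit prefix and keep the trailing digits
  num <- as.integer(sub("^[^0-9]*([0-9]+)$", "\\1", u))
  u[order(num)]
}

stu <- paste0("stu", 1:12)

# Plain character sorting puts stu10 before stu2
sort(stu)

# Factor levels in numeric order, derived from the labels themselves
f <- factor(stu, levels = natural_levels(stu))
levels(f)
```

With the pipeline from the post, this would presumably be used as mutate(stu = factor(stu, levels = natural_levels(stu))) before spread(stu, values).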

08/08 08:47, , 1F
It took me a while to understand methods 1 and 2, hahaha. Thanks, c!
※ Edited: celestialgod (1.163.8.105), 08/08/2015 09:13:04
Article ID (AID): #1Ln7LL73 (R_Language)