Re: [Question] How to tidy amount-and-location data such as 1胃, 2腸 (1 stomach, 2 intestine)

Board: R_Language | Author: (攸藍) | Time: 2015/07/10 15:21 (edited) | Push/boo: 0 (001)
1 comment, 1 participant, thread 2/3
※ Quoting helixc (@_2;):
: [Software familiarity]: beginner / novice
: [Problem description]:
: I have dissection data for a frog species and want to analyze its diet.
: The records look like this (腸 = intestine):
: ID,Food A,Food B,Food C,Food E
: C146,,,,3腸
: B287,,,,10腸
: C140,,,,4腸
: C133,,,1腸,
: C132,1腸,,,
: B305,,,1腸,
: C112,,2腸,,1腸
: C120,,,,1腸
: C128,,,,1腸
: I want to reshape it into data like this:
: ID, Food type, Amount, Location
: C146, E, 3, 腸
: B287, E, 10, 腸
: C140, E, 4, 腸
: C133, C, 1, 腸

library(data.table)
library(dplyr)
library(tidyr)
library(magrittr)
library(stringr)

# read everything as character so that cells like "3腸" stay intact
tmp_dt = fread("ID,Food A,Food B,Food C,Food E
C146,,,,3腸
B287,,,,10腸
C140,,,,4腸
C133,,,1腸,
C132,1腸,,,
B305,,,1腸,
C112,,2腸,,1腸
C120,,,,1腸
C128,,,,1腸", colClasses = rep("character", 5))

## method 1
## wide -> long, drop empty cells, then pull out the digits and the last character
output_dt = tmp_dt %>% gather(foodType, tmpCol, -ID) %>%
  filter(tmpCol != "") %>%
  mutate(Amount = str_extract(tmpCol, "\\d*"),
         Location = str_sub(tmpCol, nchar(tmpCol), nchar(tmpCol))) %>%
  select(-tmpCol) %>%
  transform(foodType = as.character(foodType)) %>%
  # keep only the last character of "Food A", "Food B", ...
  transform(foodType = str_sub(foodType, nchar(foodType), nchar(foodType)))

## method 2
## insert a comma between the digits and the location, then split on it
output_dt2 = tmp_dt %>% gather(foodType, tmpCol, -ID) %>%
  filter(tmpCol != "") %>%
  transform(foodType = as.character(foodType),
            tmpCol = sub("(\\d*)(.)", "\\1,\\2", tmpCol)) %>%
  separate(tmpCol, c("Amount", "Location"), sep = ",") %>%
  transform(foodType = str_sub(foodType, nchar(foodType), nchar(foodType)))

## method 3 (no sub needed; the sep argument of separate can split by position instead)
## note: newer tidyr releases reportedly count negative positions differently
## (sep = -1 for the last character), so verify this on your version
output_dt3 = tmp_dt %>% gather(foodType, tmpCol, -ID) %>%
  filter(tmpCol != "") %>%
  transform(foodType = as.character(foodType)) %>%
  separate(tmpCol, c("Amount", "Location"), -2) %>%
  transform(foodType = str_sub(foodType, nchar(foodType), nchar(foodType)))

output: (all three give the same result)
#       ID foodType Amount Location
#  1: C132        A      1       腸
#  2: C112        B      2       腸
#  3: C133        C      1       腸
#  4: B305        C      1       腸
#  5: C146        E      3       腸
#  6: B287        E     10       腸
#  7: C140        E      4       腸
#  8: C112        E      1       腸
#  9: C120        E      1       腸
# 10: C128        E      1       腸

--
※ Origin: 批踢踢實業坊 (ptt.cc), from: 123.205.27.107
※ Article URL: https://www.ptt.cc/bbs/R_Language/M.1436512890.A.854.html
※ Edited: celestialgod (123.205.27.107), 07/10/2015 15:34:05
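For anyone reading this on a recent tidyr (1.0.0 or later, where gather() is superseded), the same wide-to-long reshaping can probably be sketched with pivot_longer() and tidyr::extract(). This is only an illustration under assumptions: the three-row `raw` data frame is a cut-down copy of the sample data, and the names `raw` and `tidy` are made up here. The regex assumes every non-empty cell is digits followed by a location label.

```r
library(dplyr)
library(tidyr)

# Hypothetical cut-down sample: three frogs, same column layout as the post
raw <- data.frame(
  ID       = c("C146", "C133", "C112"),
  `Food A` = c("", "", ""),
  `Food B` = c("", "", "2腸"),
  `Food C` = c("", "1腸", ""),
  `Food E` = c("3腸", "", "1腸"),
  check.names = FALSE,        # keep the space in "Food A" etc.
  stringsAsFactors = FALSE
)

tidy <- raw %>%
  # wide -> long: one row per (ID, food column)
  pivot_longer(-ID, names_to = "foodType", values_to = "tmpCol") %>%
  # drop the empty cells
  filter(tmpCol != "") %>%
  # leading digits -> Amount, the rest -> Location;
  # convert = TRUE turns Amount into an integer column
  tidyr::extract(tmpCol, c("Amount", "Location"),
                 regex = "^(\\d+)(.+)$", convert = TRUE) %>%
  # "Food E" -> "E"
  mutate(foodType = sub("^Food ", "", foodType))

print(tidy)
```

Unlike the last-character approach, the regex here would also cope with location labels longer than one character, at the cost of assuming the amount always comes first.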

07/10 20:16, 1F
So many new functions to learn, thanks.
Article code (AID): #1Ldt9wXK (R_Language)