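[Question] Classifying customers by purchase status
[Question type]:
My classification already works, but is there a more concise way to do it, or a different approach? Thanks for sharing!
[Problem description]:
Customers need to be split into three groups according to the years in which they made purchases: returning customers, new customers, and lost customers.
[Code example]: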
http://ideone.com/RJ4ixk
number <- 50
# Filled as a matrix first, so every column (including ID) ends up as character
data <- matrix(nrow = number, ncol = 3)
colnames(data) <- c("ID", "Shop_year", "Region")
set.seed(1)
data[, 1] <- sample(1:25, size = number, replace = TRUE)  # customer ID
data[, 2] <- sample(c("year2013", "year2014", "year2015"), size = number,
                    replace = TRUE, prob = c(0.1, 0.3, 0.6))  # purchase year
data[, 3] <- sample(c("北區一", "北區二", "北區三"), size = number,
                    replace = TRUE, prob = c(0.5, 0.3, 0.2))  # region (North 1 to 3)
data <- data.frame(data)
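library(reshape)  # provides cast()
library(dplyr)
# Wide table: one row per ID, one column per year, cells = number of purchases
result <- cast(data, ID ~ Shop_year, fun.aggregate = length, value = "Region")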
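## Split into three groups
# New: bought in 2015 but not in 2014
data_new <- filter(result, year2015 != 0 & year2014 == 0)
data_new$level <- "new"
# Lost: no purchase in 2015
data_lost <- filter(result, year2015 == 0)
data_lost$level <- "lost"
# Returning: bought in both 2014 and 2015
data_now <- filter(result, year2015 != 0 & year2014 != 0)
data_now$level <- "returning"
## Combine the three groups
data_combine <- rbind.data.frame(data_now, data_new, data_lost)
## Add the group label back onto the original data
data_final <- merge(data, data_combine)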
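One possibly shorter route, as a rough sketch rather than a definitive answer: skip the wide table and label the rows directly with `%in%` in base R. The helper name `classify_customers`, the toy data frame, and the English labels "returning", "new", and "lost" are all made up for illustration; the rules are the same three used above (no 2015 purchase means lost, 2015 and 2014 means returning, 2015 without 2014 means new).

```r
# Hypothetical helper (not from the original post): adds a `level` column
# to a data frame that has `ID` and `Shop_year` columns.
classify_customers <- function(df) {
  # IDs that appear at least once in each year of interest
  ids_2015 <- df$ID[df$Shop_year == "year2015"]
  ids_2014 <- df$ID[df$Shop_year == "year2014"]

  bought_2015 <- df$ID %in% ids_2015
  bought_2014 <- df$ID %in% ids_2014

  # Same rules as the filter() calls: lost first, then returning, else new
  df$level <- ifelse(!bought_2015, "lost",
                     ifelse(bought_2014, "returning", "new"))
  df
}

# Toy data with made-up values, in the same shape as `data`
toy <- data.frame(
  ID        = c(1, 1, 2, 3, 3, 4),
  Shop_year = c("year2014", "year2015", "year2015",
                "year2013", "year2014", "year2013"),
  stringsAsFactors = FALSE
)

toy_labeled <- classify_customers(toy)
print(toy_labeled)
```

Calling `classify_customers(data)` on the simulated data should give one label per original row, so the `cast()`, `rbind()`, and `merge()` steps are not needed. It also works whether `Shop_year` is stored as character or as a factor, since the comparison is against the year strings.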
--
※ Posted from: PTT (ptt.cc), from: 59.120.192.175
※ Article URL: https://www.ptt.cc/bbs/R_Language/M.1444027929.A.FAD.html
※ Edited: blueevil (59.120.192.175), 10/05/2015 15:01:52
※ Edited: blueevil (59.120.192.175), 10/05/2015 16:33:39