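[Question] Asking about the %in% operator
[Question type]:
As in the title
[Software familiarity]:
Beginner, about two months of experience with R
[Problem description]:
I am currently practicing data mining and have an app click log on hand.
The format is as follows: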
uid system command DataTimes
1 ios 0 2013/5/7 10:44
1 ios 10 2013/5/7 10:45
2 android 0 2013/5/7 10:50
2 android 10 2013/5/7 10:51
3 ios 0 2013/5/7 10:58
3 ios 20 2013/5/7 10:59
.
.
.
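My question:
I want to remove users with too few actions from the whole dataset,
so I built a uidlist
containing the uids with more than n clicks.
Then I ran the following command: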
data1 <- data[data$uid %in% uidlist,]
Opening data1 confirms that only the rows of users with more than n clicks remain.
But when I then ran
barchart(data1$uid)
the resulting bar chart still shows the uids that were removed.
It looks something like this:
uid
1 ============
2 ====
3 =
4
5 ===================
6
7 ===
8 =
9 =======
0 5 10
Freq
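uid 4 and 6, for example, were clearly removed but are still displayed.
I don't know why.
Why does this happen, and how do I actually remove them?
If anything is unclear or incomplete, I will add details right away. Thanks!!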
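A minimal reproduction of the symptom, assuming `uid` is stored as a factor (the default for character columns read into a data frame in R 3.2.1). The toy data and the names `toy`, `keep` and `toy1` are made up for illustration, and `table()` stands in for the bar chart:

```r
# Toy click log; uid is explicitly a factor (hypothetical data)
toy <- data.frame(
  uid = factor(c("1", "1", "1", "2", "2", "3", "4", "5", "5", "5")),
  command = c(0, 10, 20, 0, 10, 0, 0, 0, 10, 20)
)

# Keep only the uids with more than one click
keep <- c("1", "2", "5")
toy1 <- toy[toy$uid %in% keep, ]

# The rows for uid 3 and 4 are gone ...
nrow(toy1)

# ... but the factor still lists them as levels, with a count of zero
counts <- table(toy1$uid)
print(counts)
levels(toy1$uid)
```

The filtering with `%in%` works as intended here; only the level set of the factor is left unchanged, which is what a plot built from those levels would display.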
[Environment]:
R version 3.2.1 (2015-06-18)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows 7 x64 (build 7601) Service Pack 1
locale:
[1] LC_COLLATE=Chinese (Traditional)_Taiwan.950
[2] LC_CTYPE=Chinese (Traditional)_Taiwan.950
[3] LC_MONETARY=Chinese (Traditional)_Taiwan.950
[4] LC_NUMERIC=C
[5] LC_TIME=Chinese (Traditional)_Taiwan.950
attached base packages:
[1] stats graphics grDevices utils datasets methods
[7] base
other attached packages:
[1] lattice_0.20-31 arules_1.1-9 Matrix_1.2-1
loaded via a namespace (and not attached):
[1] tools_3.2.1 grid_3.2.1
[Keywords]:
--
※ Origin: 批踢踢實業坊(ptt.cc), From: 140.96.194.58
※ Article URL: https://www.ptt.cc/bbs/R_Language/M.1439177335.A.663.html
→ 08/10 11:48, 1F
I just tried it, but I don't quite understand what droplevels means...
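As a rough illustration of what `droplevels()` does: it removes factor levels that no longer occur in the data. The sketch below uses made-up objects (`f`, `df` and the derived names are hypothetical); applied to the data frame from the question it would read `data1 <- droplevels(data1)`.

```r
# A factor whose levels include values that get filtered out (toy example)
f <- factor(c("1", "2", "3", "4", "5"))
f_kept <- f[f %in% c("1", "2", "5")]
levels(f_kept)             # still all five levels

# droplevels() discards the levels that have no remaining observations
f_clean <- droplevels(f_kept)
levels(f_clean)

# Re-creating the factor has the same effect
f_clean2 <- factor(f_kept)

# On a data frame, droplevels() cleans every factor column at once
df <- data.frame(uid = f, n = 1:5)
df1 <- droplevels(df[df$uid %in% c("1", "2", "5"), ])
levels(df1$uid)
```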
→ 08/10 11:59, 2F
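# Count the clicks per uid
dd1 <- table(data2$uid)
names(dd1)
# A table passed to data.frame() expands into two columns (value and frequency),
# so the result has three columns, which are renamed here
dfuid <- data.frame(uid = names(dd1), idcnt = dd1)
names(dfuid) <- c("uid", "uid2", "idcnt")
dfuid <- dfuid[, c("uid", "idcnt")]
# Use the first quartile of the click counts as the threshold
ss <- summary(dfuid$idcnt)
str(ss)
threshold1 <- ss[[2]]
# Keep the uids whose click count reaches the threshold
dfuid2 <- dfuid[dfuid$idcnt >= threshold1, ]
summary(dfuid2$idcnt)
uidlist <- dfuid2$uid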
>uidlist
[1] 122164545fwsewe 1125rwe60c02d25f2
.
.
.(a long list of uids)
.
.
[57]re98635rtg546re5 5t65e4rt4e6rt4e
78 Levels: 122164545fwsewe 1125rwe60c02d25f2 1805ea5f796f6034 ...
mdend5ihiqwtn6yri5h7kurkx9ypajutfx
Is this what you mean?
※ Edited: remember69 (140.96.194.58), 08/10/2015 13:39:28
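The printout above reports 78 levels even though `uidlist` only holds the uids that passed the threshold, so `uidlist` itself appears to be a factor carrying unused levels. A small sketch with made-up ids (`ids`, `cnt`, `kept` and `x` are hypothetical names) shows the same effect, and that `%in%` compares by label, so the extra levels should not affect the filtering itself:

```r
ids <- factor(c("a", "b", "c", "d"))
cnt <- c(5, 1, 7, 2)

# Subsetting a factor keeps the full level set
kept <- ids[cnt >= 5]
length(kept)     # number of ids actually kept
nlevels(kept)    # number of levels, including unused ones

# %in% matches on the labels, not on the level set
x <- factor(c("a", "b", "c", "a"))
hits <- x %in% kept

# Converting to character avoids carrying levels around
kept_chr <- as.character(kept)
```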
→ 08/10 14:03, 3F
→ 08/10 14:03, 4F
→ 08/10 14:03, 5F
→ 08/10 14:04, 6F
→ 08/10 14:04, 7F
→ 08/10 14:05, 8F
→ 08/10 14:06, 9F
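Wow, it really works, problem solved. Thank you both.
Sorry for not describing the uid precisely at first; I thought numbers would be easier to read.
So why are the removed uids kept? Is that the default behavior, or are they kept because it is a data frame?
※ Edited: remember69 (140.96.194.58), 08/10/2015 14:34:12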
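On the question of why the removed uids are kept: this comes from how factors work rather than from the data frame. A factor stores integer codes plus a separate `levels` attribute, and `[` leaves that attribute alone unless asked otherwise. A minimal sketch with a made-up factor `f`:

```r
f <- factor(c("1", "4", "5", "6"))

# Internally: integer codes plus a character vector of levels
codes <- unclass(f)
attr(codes, "levels")

# Plain subsetting keeps every level, used or not
g <- f[f %in% c("1", "5")]
levels(g)

# drop = TRUE asks `[` to discard the unused levels while subsetting
h <- f[f %in% c("1", "5"), drop = TRUE]
levels(h)
```

Keeping the levels by default can be useful, since tables built from different subsets then line up on the same categories; when that is not wanted, `droplevels()` or `drop = TRUE` removes them.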
→ 08/10 14:35, 10F
Upvote 08/10 14:36, 11F
→ 08/10 14:44, 12F
→ 08/10 14:48, 13F
→ 08/10 15:36, 14F
→ 08/10 15:38, 15F
→ 08/10 20:02, 16F
Upvote 08/16 11:08, 17F