Re: [問題] Google Interview Question (4)
看板Prob_Solve (計算數學 Problem Solving)作者stimim (qqaa)時間11年前 (2013/03/08 16:51)推噓1(1推 0噓 1→)留言2則, 1人參與討論串9/13 (看更多)
剛剛和 Leon 討論了一下,發現我們對題目的了解有一點不同
原題目:
Given a document and a query of K words, how do u find the smallest window
that covers all the words at least once in that document? (given you know the
inverted lists of all K words, that is, for each word, you have a list of all
its occurrrences). This one is really hard. Could someone propose an
algorithm in O(n)?
我們可以確定:
1. document is given
2. 知道我們關心的是哪 K 個字
3. 這 K 個字的 occurrence list 也是給定的
問題是,n 指的是什麼?
1. n = document size (in number of words)
在這種情況下, O(n) 是可能的,方法基本上就是 Leon 提出來的。
補充一個 O(document size) 的做法
// all indices starts from 1
Given occ (list of list of occurrences)
array = [-1, ..., -1] (size = document size)
i = 0
for list in occ
i = i + 1
for index in list
array[index] = i
count = [0, 0, ..., 0] (size = K)
nKinds = 0
start = 1
end = 1
minWindowSize = INF
while end <= document size
while end <= document size and nKinds < K
word = array[end]
end = end + 1
if word == -1 // we are not interested
continue
count[word] = count[word] + 1
if count[word] == 1
nKinds = nKinds + 1
while nKinds == K
word = array[start]
if word == -1 // we are not interested
start = start + 1
continue
count[word] = count[word] - 1
if count[word] == 0 // check this window
nKinds = nKinds - 1
if end - start < minWindowSize
minWindowSize = end - start
start = start + 1
return minWindowSizejk
2. n = 這 K 個字出現的總次數
Ex:
K = 3, 要找出包含 {a, b, c} 的最小 window
occurrences:
a = { 1, 1000, 2500}
b = {400, 1001, 2000}
c = {500, 1002, 1500}
document size = 2500 甚至更高 (其他的位置是其他的字)
n = 3 + 3 + 3 = 9 而不是 2500
在這種情況下,已知可以做到 O(n logK) (RockLee 寫的方法)
不過,在這種狀況下,給定原本的 document 似乎就沒什麼用了
而且在讀入 input 時,
讀入原本的 document 會花掉 O(document size) 的時間
所以在和 Leon 討論時,他對這種解釋方法有一點懷疑
--
※ 發信站: 批踢踢實業坊(ptt.cc)
◆ From: 140.112.175.115
※ 編輯: stimim 來自: 140.112.175.115 (03/08 17:09)
推
03/09 03:21, , 1F
03/09 03:21, 1F
→
03/09 03:22, , 2F
03/09 03:22, 2F
討論串 (同標題文章)
以下文章回應了本文:
完整討論串 (本文為第 9 之 13 篇):
Prob_Solve 近期熱門文章
PTT數位生活區 即時熱門文章