Re: [問題] Google Interview Question (4)

看板Prob_Solve (計算數學 Problem Solving)作者 (Achilles)時間11年前 (2013/03/09 15:15), 編輯推噓0(000)
留言0則, 0人參與, 最新討論串10/13 (看更多)
※ 引述《stimim (qqaa)》之銘言: : 剛剛和 Leon 討論了一下,發現我們對題目的了解有一點不同 : 原題目: : Given a document and a query of K words, how do u find the smallest window : that covers all the words at least once in that document? (given you know the : inverted lists of all K words, that is, for each word, you have a list of all : its occurrrences). This one is really hard. Could someone propose an : algorithm in O(n)? : 我們可以確定: : 1. document is given : 2. 知道我們關心的是哪 K 個字 : 3. 這 K 個字的 occurrence list 也是給定的 : 問題是,n 指的是什麼? : 1. n = document size (in number of words) : 在這種情況下, O(n) 是可能的,方法基本上就是 Leon 提出來的。 恕刪. 我覺得那個 N 的定義比較有可能是 document length.. 不過下面那個定義, 我想我的方法應該也可以啦. Here is the modification. : 2. n = 這 K 個字出現的總次數 : Ex: : K = 3, 要找出包含 {a, b, c} 的最小 window : occurrences: : a = { 1, 1000, 2500} : b = {400, 1001, 2000} : c = {500, 1002, 1500} : document size = 2500 甚至更高 (其他的位置是其他的字) : n = 3 + 3 + 3 = 9 而不是 2500 : 在這種情況下,已知可以做到 O(n logK) (RockLee 寫的方法) : 不過,在這種狀況下,給定原本的 document 似乎就沒什麼用了 : 而且在讀入 input 時, : 讀入原本的 document 會花掉 O(document size) 的時間 : 所以在和 Leon 討論時,他對這種解釋方法有一點懷疑 So, when you build the occurance list from document, actually you will have scaned this array first.. {1, 400, 500, 1000, .... 2500} which represents {a, b, c, a..} We need to keep this too. Call it "Show_list" Then, use my algorithm, the left boundary are choosen sequentally from "Show_list". The first step will be, choosing [1] as left, and use max-min{b, c} as right. The second step is, drop [1] and use "Show_list(2)" as a new left boudary. Now, you drop at most one element in the window. The need O(1) to find the right boundary and get back this element. Hope it helps. -- ※ 發信站: 批踢踢實業坊(ptt.cc) ◆ From: 142.136.126.103
文章代碼(AID): #1HEk6O5q (Prob_Solve)
討論串 (同標題文章)
文章代碼(AID): #1HEk6O5q (Prob_Solve)