Re: [Question] Crawler error

Board: Python | Author: timTan (用口頭禪區分年記) | Time: 2013/05/20 18:37

2 comments from 2 participants. Post 3 of 4 in this thread.

※ Quoting darklimit ():

Even with a random pause before continuing, I still get this error:

    error: [Errno 10054] An existing connection was forcibly closed by the remote host.

If I catch the exception and continue, the genre and rating values that should line up with each nameTag all get shifted out of order. How should I handle this? Thanks.

    for i in idlist:
        headers = {'User-Agent': 'Mozilla/5.0 (Windows; U; Windows NT 6.1;en-US; rv:1.9.1.6) Gecko/20091201 Firefox/3.5.6'}
        req = urllib2.Request("http://www.imdb.com/title/tt"+i+"/", headers=headers)
        try:
            # Open the Request object so the User-Agent header is sent.
            html = urllib2.urlopen(req, timeout=30)
            htmls = html.read()
            html.close()
            # And sleep here for every connection.
        except HTTPError, e:
            # Handle the error, then break.
            # Ideally, save the data you have already processed at this point,
            # so you can stop safely and pick up where you left off next time.
            print e.code
            print e.read()
            break
        except URLError, e:
            print 'Reason: ', e.reason
            break
        soup = BeautifulSoup(htmls)
        nameTag = [a.get_text() for a in soup.find_all("title")]
        genreTag = [a.get_text() for a in soup.find_all("span", {"itemprop": "genre"})]
        ratingTag = soup.find_all("span", {"itemprop": "ratingValue"})
        for tag in nameTag:
            titlelist.append(nameTag)
        for tag in genreTag:
            genrelist.append(genreTag)
            break
        for tag in ratingTag:
            val = ''.join(tag.find(text=True))
            valuelist.append(val)
        rsleep = random.randint(10, 40)
        time.sleep(rsleep)
    return zip(titlelist, genrelist, valuelist)

-- ※ Posted from: PTT (ptt.cc) ◆ From: 118.160.190.62
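A rough Python 3 sketch of the same advice, for anyone reading this later: stop at the first network error instead of continuing, and hand back both what was collected and the ids still left to do, so nothing gets out of step. The names fetch_page, crawl, pause and USER_AGENT are made up for this illustration and are not from the original post; the parsing step is left as a pluggable function, and each result is stored together with its id so title, genre and rating cannot drift apart.

```python
import random
import time
from urllib.request import Request, urlopen

USER_AGENT = "Mozilla/5.0"  # placeholder value, an assumption for this sketch


def fetch_page(title_id, timeout=30):
    """Download one title page and return its raw bytes."""
    req = Request("http://www.imdb.com/title/tt" + title_id + "/",
                  headers={"User-Agent": USER_AGENT})
    with urlopen(req, timeout=timeout) as resp:
        return resp.read()


def crawl(idlist, fetch=fetch_page, parse=None, pause=(10, 40)):
    """Return (records, remaining_ids).

    records holds one (title_id, parsed_page) tuple per successful fetch.
    On the first network error the loop stops and remaining_ids lists
    everything not yet processed, starting with the id that failed.
    """
    ids = list(idlist)
    records = []
    for pos, title_id in enumerate(ids):
        try:
            html = fetch(title_id)
        except OSError as exc:
            # In Python 3, URLError, HTTPError, timeouts and connection
            # resets (such as Errno 10054) are all subclasses of OSError.
            print("stopping at", title_id, "->", exc)
            return records, ids[pos:]
        records.append((title_id, parse(html) if parse else html))
        # Rest for a random interval after every connection.
        time.sleep(random.uniform(*pause))
    return records, []
```

Both return values can be written to disk before exiting; feeding remaining_ids back into crawl on the next run resumes from the id that failed.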


-- ※ Posted from: PTT (ptt.cc) ◆ From: 114.42.51.172



Article ID (AID): #1HcVpYS4 (Python)
