[Question] Crawler error

Board: Python | Author: darklimit | Time: 2013/05/19 18:38 | Votes: 0 (0 push, 0 boo, 8 →)

8 comments, 3 participants; post 1 of 4 in this thread

Even with a random sleep between requests, I still get this error:

error: [Errno 10054] An existing connection was forcibly closed by the remote host.

If I catch the exception and continue, the genre and rating values that should correspond to each nameTag all get scrambled from that point on. How should I handle this? Thanks.

    for i in idlist:
        headers = {'User-Agent':'Mozilla/5.0 (Windows; U; Windows NT 6.1;en-US; rv:1.9.1.6) Gecko/20091201 Firefox/3.5.6'}
        req = urllib2.Request("http://www.imdb.com/title/tt"+i+"/", headers=headers)
        try:
            html = urllib2.urlopen("http://www.imdb.com/title/tt"+i+"/", timeout=30)
            htmls = html.read()
            html.close()
            soup = BeautifulSoup(htmls)
            nameTag = [a.get_text() for a in soup.find_all("title")]
            genreTag = [a.get_text() for a in soup.find_all("span", {"itemprop": "genre"})]
            ratingTag = soup.find_all("span", {"itemprop": "ratingValue"})
            for tag in nameTag:
                titlelist.append(nameTag)
            for tag in genreTag:
                genrelist.append(genreTag)
                break
            for tag in ratingTag:
                val = ''.join(tag.find(text=True))
                valuelist.append(val)
        except HTTPError, e:
            print e.code
            print e.read()
            #continue
        except URLError, e:
            print 'Reason: ', e.reason
            #continue
        rsleep = random.randint(10, 40)
        time.sleep(rsleep)
    return zip(titlelist, genrelist, valuelist)

--
※ Origin: 批踢踢實業坊 (ptt.cc)
◆ From: 118.160.190.62
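One way to keep titles, genres, and ratings from drifting apart is to build a single record per id and append it exactly once, whether the request succeeded or not, instead of appending to three separate lists. The sketch below is a minimal illustration of that idea, not the poster's code: it is written for Python 3 (so `urllib2` becomes `urllib.request`), and the names `collect_rows`, `fetch_row`, `fetch_html`, `retries`, and `pause` are all hypothetical. It also passes the `Request` object to `urlopen`, on the assumption that the custom headers in the post were meant to be sent.

```python
import random
import time
import urllib.request

HEADERS = {
    "User-Agent": "Mozilla/5.0 (Windows; U; Windows NT 6.1; en-US; rv:1.9.1.6) "
                  "Gecko/20091201 Firefox/3.5.6"
}


def fetch_html(movie_id, timeout=30):
    """Download one title page, sending the custom headers (hypothetical helper)."""
    url = "http://www.imdb.com/title/tt" + movie_id + "/"
    req = urllib.request.Request(url, headers=HEADERS)
    # Pass the Request object, not the bare URL, so the headers are used.
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return resp.read()


def collect_rows(ids, fetch_row, retries=3,
                 pause=lambda: random.randint(10, 40)):
    """Return one (id, row) pair per id; row is None if every attempt failed.

    fetch_row is a caller-supplied function (an assumption of this sketch)
    that downloads and parses one page and returns (title, genres, rating).
    """
    rows = []
    for movie_id in ids:
        row = None
        for attempt in range(1, retries + 1):
            try:
                row = fetch_row(movie_id)
            except OSError as exc:
                # URLError, HTTPError, socket timeouts and connection resets
                # (Errno 10054) are all OSError subclasses in Python 3.
                print("attempt", attempt, "failed for", movie_id, ":", exc)
            else:
                break  # success: stop retrying (the finally clause still runs)
            finally:
                time.sleep(pause())  # wait after every attempt
        # Append exactly once per id so later rows never shift position.
        rows.append((movie_id, row))
    return rows
```

In use, `fetch_row` would call `fetch_html`, parse the result with BeautifulSoup as in the post, and return the three values as one tuple. Ids whose row is `None` can be filtered out or retried later, and the remaining rows stay matched to their ids.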

→ 05/19 22:30, 1F
→ 05/20 01:09, 2F
→ 05/20 12:58, 3F
→ 05/20 15:38, 4F
→ 05/20 15:39, 5F
→ 05/20 15:44, 6F
→ 05/20 15:51, 7F
→ 05/20 15:52, 8F

Article ID (AID): #1HcAkOFw (Python)

