Re: [Question] Problem importing data into a database with Python

Board: Python  Author: aweimeow (喵喵喵喵ヽ( ・∀・)ノ)  Time: 2016/09/07 15:56


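※ Quoting DankeTe (Aniz):
: Hi everyone, I ran into the errors below while scraping URLs and related content.
: I Googled it first,
: but I still couldn't figure it out, and I hope someone can help me.
: This image shows the code
: http://imgur.com/a/Mhw3H
: The image below shows the error
: http://imgur.com/a/MEvi5
: Code
: http://pastebin.com/NNucqisB

There was a bit too much to cover, so I'm replying as a separate post. After reading your code, I think what you want to do is:

* Scrape the title and link of every item in the page's latest-news list
* Store them in a database

My machine doesn't have MySQL installed, so I only debugged the first part XD

Line 15:

    train_table = soup.findAll('tr',{'class':'gray01 text_12_1pt form01'})

Written this way, it only matches rows whose class attribute is exactly that string, so it should be changed to

    train_table = soup.findAll('tr', ['gray01', 'text_12_1pt'])

Line 21:

    [tag['href'] for tag in train_link.findAll('a',{'href':True})][0]

There are a few problems here. The first is the variable name: you declared train_link on Line 8, but the loop on Line 20 also uses train_link, so the train_link in the for loop ends up as a string. Use a different name instead, for example tr. The second problem is that the [statement][0] pattern is pointless here. In this case each tr contains only one a tag with an href attribute, so this is all you need:

    tr.findAll('a', {'href': True})[0]['href']

What this line says is: findAll returns a list, but we know it has only one element, so we take the first one, and from that I only want the value of its href key.

Another minor point: consider following a coding style such as PEP8. For

    train_news_title.findAll('span',{'title':True})][0]

I prefer to put a space after , and :

    train_news_title.findAll('span', {'title': True})][0]

See PEP8: E231 missing whitespace after ','

Line 24:

    for train_news_title in train_table:

This loop is the same as the one on Line 20, so I'd merge the two and handle them together.

Line 29-31:

    if len(train_link) == len(train_news_title):

I couldn't quite tell what this part is meant to do, which is why I'm calling it out. If you want to make sure the two have the same number of items, you can use an assert statement:

    assert len(train_link) == len(train_news_title)

That way an AssertionError is raised when the counts differ.

Line 32:

    for j in range(number1):

Following from the above, I removed number1. You can replace it with:

    for j in range(len(train_link)):

That's about it. Once again, I'd suggest taking a look at the PEP8 coding style :)

Revised code: http://pastebin.com/ggxm25zd
Demo: http://i.imgur.com/5i82Fok.png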
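As a small illustration of the Line 29-32 suggestions, here is a minimal sketch of the assert check followed by the index-based loop. The two lists are hypothetical sample values standing in for the scraped results (the real ones come from the page), and the zip variant at the end is an alternative I'm adding for comparison, not something from the revised pastebin.

```python
# Hypothetical sample data standing in for the scraped results;
# the real values would come from the crawled page.
train_link = ["news1.html", "news2.html", "news3.html"]
train_news_title = ["Title A", "Title B", "Title C"]

# Raises AssertionError if the two lists have different lengths.
assert len(train_link) == len(train_news_title)

# Index-based loop, as suggested for Line 32.
rows = []
for j in range(len(train_link)):
    rows.append((train_news_title[j], train_link[j]))

# Alternative (an assumption, not in the original post):
# zip pairs the items directly, so no index variable is needed.
rows_zip = list(zip(train_news_title, train_link))

print(rows)
```

Both loops build the same list of (title, link) pairs. Note that zip silently stops at the shorter list, so the assert is still worth keeping if a length mismatch should be treated as an error.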