[Question] Selenium cannot get the src links

Board: Python  Author: fragmentwing (片翼碎夢)  Time: 2023/03/06 12:43


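Solved: I had simply mixed up where the class was.

As the title says. The site I want to crawl for images is this one: https://unsplash.com/
This is the correct class location: https://imgur.com/Ri0YcfK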
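Since the fix came down to which class the locator targets, here is a minimal sketch of a less order-sensitive way to match class names. The XPath in this post, `contains(@class,'tB6UZ a5VGX')`, is a substring test on the raw class attribute, so it only matches when those two names sit next to each other in exactly that order. A CSS class selector checks each name independently. The helper name `css_for_classes` is hypothetical, and the class names are just the ones from this post, which the site may change at any time.

```python
def css_for_classes(tag, class_names):
    """Build a CSS selector such as 'img.a.b' from a tag name and class names."""
    # Ignore empty or whitespace-only entries so the selector stays valid.
    parts = [name.strip() for name in class_names if name.strip()]
    return tag + "".join("." + name for name in parts)


selector = css_for_classes("img", ["tB6UZ", "a5VGX"])
print(selector)  # img.tB6UZ.a5VGX

# With a live driver it would be used like this (not executed here):
# tags = driver.find_elements(By.CSS_SELECTOR, selector)
```

The selector matches an img element that carries both classes, regardless of their order or of other classes in between.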

I started from this article and modified it: https://reurl.cc/OVEXz9
Separately, after I updated that article's code to the current syntax so that it runs, for some reason it only saves one image (probably because I'm too unfamiliar with this kind of crawling tool...).

My code is as follows:

from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.common.by import By
from selenium.webdriver.chrome.options import Options
import os
import time
import numpy as np

options = Options()
folder_path = os.getcwd()
# Build the path with os.path.join so the backslashes are not read as escapes
driver_path = os.path.join(folder_path, "chromedriver_win32", "chromedriver.exe")
# Hand the driver path to Service (Options does not appear to have a
# chrome_executable_path setting, so assigning it there has no effect)
service = Service(executable_path=driver_path)
driver = webdriver.Chrome(service=service, options=options)
driver.maximize_window()

img_url_dic = {}
driver.get("https://unsplash.com/s/photos/burger")
# print(driver.page_source)

position = 0
picture_number = 0
# Scroll down in steps, pausing for a short random time after each step
for i in range(10):
    position += i*500 + np.random.randint(100)
    js = "document.documentElement.scrollTop=%d" % position
    driver.execute_script(js)
    time.sleep(np.random.random())

# Collect the src of every img whose class attribute contains this string
tags = driver.find_elements(By.XPATH, "//img[contains(@class,'tB6UZ a5VGX')]")
src = []
for tag in tags:
    src.append(tag.get_attribute('src'))
# print(src)
for i, element in enumerate(src):
    print(i, element)
src_len = len(src)
print(f'{src_len=}')
driver.close()

--
※ Origin: 批踢踢實業坊 (ptt.cc), from: 223.138.74.61 (Taiwan)
※ Article URL: https://www.ptt.cc/bbs/Python/M.1678077784.A.DE0.html
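The code in this post appends whatever `get_attribute('src')` returns, which can include `None` (when an element has no `src`) and repeated URLs. Below is a minimal sketch of cleaning such a list before counting or downloading. The helper name `clean_srcs` and the sample values are hypothetical, and whether this particular page actually produces such entries is an assumption.

```python
def clean_srcs(raw_srcs):
    """Drop missing or non-http values and duplicates, keeping first-seen order."""
    seen = set()
    cleaned = []
    for url in raw_srcs:
        # Skip None, empty strings, and inline placeholders such as data: URIs.
        if not url or not url.startswith("http"):
            continue
        if url in seen:
            continue
        seen.add(url)
        cleaned.append(url)
    return cleaned


# Hypothetical sample standing in for the `src` list built in the post.
sample = [
    None,
    "https://example.com/a.jpg",
    "data:image/gif;base64,AAAA",
    "https://example.com/a.jpg",
    "https://example.com/b.jpg",
]
print(clean_srcs(sample))
```

In the post's script this would be applied as `src = clean_srcs(src)` just before the `enumerate` loop.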