Python Selenium: Taobao Auto Login & Image Search
dev109
2021. 7. 28. 20:51
This post shows how to log in to Taobao and run an image search with Python and Selenium, then collect the link and price of the first product in the results.
from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
from selenium.webdriver.common.by import By
from fake_useragent import UserAgent
from urllib.parse import quote
import random
import time

options = webdriver.ChromeOptions()
# Hide the most common automation fingerprints from the page
options.add_argument("--disable-blink-features=AutomationControlled")
user_ag = UserAgent().random
options.add_argument('user-agent=%s' % user_ag)
options.add_experimental_option("excludeSwitches", ["enable-automation"])
options.add_experimental_option("useAutomationExtension", False)
# 2 = block image loading
options.add_experimental_option("prefs", {"profile.managed_default_content_settings.images": 2})
# Selenium 4 takes the chromedriver path through a Service object
driver = webdriver.Chrome(service=Service('chromedriver.exe'), options=options)
# Make navigator.webdriver report undefined so the page cannot detect automation through it
driver.execute_cdp_cmd("Page.addScriptToEvaluateOnNewDocument", {
    "source": """
        Object.defineProperty(navigator, 'webdriver', {
            get: () => undefined
        })
    """
})
wait = WebDriverWait(driver, 5)
url = 'https://login.taobao.com/member/login.jhtml'
driver.get(url=url)
time.sleep(2)
id_input = wait.until(EC.presence_of_element_located((By.ID, "fm-login-id")))
id_input.send_keys('your_taobao_id')  # placeholder: replace with your Taobao ID
pw_input = wait.until(EC.presence_of_element_located((By.ID, "fm-login-password")))
pw_input.send_keys('your_taobao_password')  # placeholder: replace with your Taobao password
wait.until(EC.presence_of_element_located((By.CLASS_NAME, "fm-button"))).click()
# Wait for the login to complete
time.sleep(random.randint(5, 10))
taobao_name_tag = wait.until(
    EC.presence_of_element_located((By.CLASS_NAME, "site-nav-login-info-nick")))
print(f" >>>> Logged in as: {taobao_name_tag.text}")
url = 'https://s.taobao.com/search?q=' + quote('压片糖果')
driver.get(url)
time.sleep(2)
# Image search
img_path = 'path/to/image'  # placeholder: path of the image file to upload
img_search = wait.until(EC.presence_of_element_located((By.XPATH, '//*[@id="J_IMGSeachUploadBtn"]')))
img_search.send_keys(img_path)
time.sleep(2)
search_link = driver.current_url
# find_element_by_xpath was removed in Selenium 4.3; use find_element(By.XPATH, ...) instead
product_link = driver.find_element(By.XPATH, '//*[@id="imgsearch-itemlist"]/div/div/div/div[1]/div[1]/div/div[1]/a').get_attribute('href')
product_price = driver.find_element(By.XPATH, '//*[@id="imgsearch-itemlist"]/div/div/div/div[1]/div[2]/div[1]/strong').text
print('Search link > ', search_link)
print('First product link > ', product_link)
print('First product price > ', product_price)
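
The script above prints the first product's price as raw text. If you want a number you can compare or store, something like the sketch below may help. `parse_price` is a hypothetical helper that is not part of the original script, and the exact text Taobao renders in that element is an assumption here: the sketch only assumes the text contains a decimal number, possibly with a currency symbol or thousands separators around it.

```python
import re
from decimal import Decimal
from typing import Optional

# Hypothetical helper, not part of the original script.
# Matches the first run of digits, allowing thousands separators and a decimal part.
_PRICE_RE = re.compile(r"\d[\d,]*(?:\.\d+)?")


def parse_price(text: str) -> Optional[Decimal]:
    """Return the first number found in text as a Decimal, or None if there is none."""
    match = _PRICE_RE.search(text)
    if match is None:
        return None
    # Drop thousands separators before converting.
    return Decimal(match.group().replace(",", ""))
```

With the script above, this could be called as `parse_price(product_price)`. Decimal is used instead of float so that prices are not affected by binary rounding.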