python-selenium Tutorial

motuoka in Shanghai

2024-05-08

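🖐🏻 Disclaimer

This tutorial is for learning and discussion only. Commercial or illegal use is strictly prohibited, and the author bears no responsibility for any consequences of such use. Readers are expected to comply with the applicable laws and regulations.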

# Installing dependencies

pip install selenium

pip install webdriver_manager

webdriver_manager automatically downloads the WebDriver build that matches the browser version.
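
Because the Selenium API changed noticeably between major versions, it can help to confirm which versions pip actually installed. A minimal sketch using only the standard library; the helper name `installed_version` is made up for this example:

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version string of a package, or None if it is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


# Report both packages this tutorial depends on
for name in ('selenium', 'webdriver-manager'):
    print(name, installed_version(name) or 'not installed')
```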

# Source code

from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.chrome.service import Service
# ChromeDriverManager downloads the matching version of the WebDriver automatically
from webdriver_manager.chrome import ChromeDriverManager
import json


def create_chrome_driver(*, headless=False, version='109.0.5414.74'):
    options = Options()
    if headless:
        options.add_argument('--headless')
    # Browser User-Agent request header
    options.add_argument(
        'user-agent=Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/109.0.0.0 Safari/537.36')
    # Ignore SSL certificate warnings
    options.add_argument('--ignore-certificate-errors')
    options.add_argument('--ignore-ssl-errors')
    options.add_experimental_option('detach', True)  # keep the browser open after the script exits
    # Hide the automation banner and suppress noisy log output
    options.add_experimental_option(
        'excludeSwitches', ['enable-automation', 'enable-logging'])
    options.add_experimental_option('useAutomationExtension', False)
    # Selenium 4 takes the driver path through a Service object, not as a positional argument
    service = Service(ChromeDriverManager(driver_version=version).install())
    browser = webdriver.Chrome(service=service, options=options)
    # Make navigator.webdriver report undefined on every new page
    browser.execute_cdp_cmd(
        'Page.addScriptToEvaluateOnNewDocument',
        {'source': 'Object.defineProperty(navigator,"webdriver",{get: () => undefined})'})
    return browser


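def add_cookies(browser, cookie_file):
    # The browser must already be on the cookies' domain, or add_cookie() rejects them
    with open(cookie_file, 'r') as file:
        cookies_list = json.load(file)
        for cookie_dict in cookies_list:
            # Only cookies flagged as secure are restored
            if cookie_dict.get('secure'):
                browser.add_cookie(cookie_dict)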
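
add_cookies expects a JSON file containing a list of cookie dictionaries. One way to produce such a file, sketched here on the assumption that it is written from an already logged-in session, is to dump `browser.get_cookies()`. The names `save_cookies` and `load_secure_cookies` are illustrative and not part of the module above:

```python
import json


def save_cookies(browser, cookie_file):
    """Write the current session's cookies to a JSON file as a list of dicts."""
    with open(cookie_file, 'w', encoding='utf-8') as file:
        json.dump(browser.get_cookies(), file, ensure_ascii=False, indent=2)


def load_secure_cookies(cookie_file):
    """Read the JSON file back and keep only cookies flagged as secure."""
    with open(cookie_file, 'r', encoding='utf-8') as file:
        return [cookie for cookie in json.load(file) if cookie.get('secure')]
```

Selenium only accepts a cookie for the domain the browser is currently on, so the target site should be opened with driver.get(...) before the cookies are added.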

# Usage

import SeleniumUtil
driver = SeleniumUtil.create_chrome_driver()
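
Since create_chrome_driver sets the detach option, the browser window can outlive the script. For unattended runs where the session should always be closed, a context manager is one option. This is only a sketch: `managed_driver` is a hypothetical helper, and `factory` stands for any zero-argument callable that returns a driver, such as SeleniumUtil.create_chrome_driver.

```python
from contextlib import contextmanager


@contextmanager
def managed_driver(factory):
    """Create a driver with factory() and always call quit() on the way out."""
    driver = factory()
    try:
        yield driver
    finally:
        # quit() ends the WebDriver session even if the body raised
        driver.quit()
```

It would be used as `with managed_driver(SeleniumUtil.create_chrome_driver) as driver:` followed by the usual driver calls in the indented body.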

# Detailed usage

Opening a page

driver.get('https://www.baidu.com')

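Finding elements

from selenium.common.exceptions import NoSuchElementException
from selenium.webdriver.common.by import By

# Selenium 4.3 removed the find_element_by_* helpers; use find_element(By.<strategy>, value)
driver.find_element(By.ID, "aTab2")
driver.find_element(By.CLASS_NAME, "aTab2")
# find_element raises when nothing matches; this wrapper method returns None instead
def findElement(self, by, value):
    try:
        return self.driver.find_element(by, value)
    except NoSuchElementException:
        return None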
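
An alternative to catching the exception, sketched here, relies on find_elements (plural), which returns an empty list when nothing matches. The function name `first_or_none` is made up for this example:

```python
def first_or_none(driver, by, value):
    """Return the first element matching the locator, or None when there is no match."""
    matches = driver.find_elements(by, value)
    return matches[0] if matches else None
```

A call would look like `first_or_none(driver, By.ID, 'aTab2')`, with By imported from selenium.webdriver.common.by.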

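Events
- Click
click()
- Typing text
  - Select all (COMMAND or CTRL, depending on the operating system)
  send_keys(Keys.COMMAND, 'a')
  - Type
  send_keys('Hello there')
- Uploading a file (call it on the file input element)
send_keys("path to the local file")
- Executing JavaScript
execute_script("document.querySelector('.modal-dialog button.close').click();")
- Passing arguments to JavaScript
  
  Pass them from Python: execute_script('JS code', arg1, arg2)
  
  Read them in JS: var arg1 = arguments[0]; var arg2 = arguments[1];
- Getting a JavaScript return value
  - Synchronous
    
    Return the value in JS: var a = 1; return a;
    
    Read it directly in Python: result = driver.execute_script('JS code')
  - Asynchronous
    
    Take the callback in JS and call it with the result: var callback = arguments[arguments.length - 1]; callback('hello world'); The script does not need to be wrapped in a function. execute_async_script wraps it in a function automatically and passes a callback as the last argument, so the JS simply reads it from arguments.
    
    Read it directly in Python: result = driver.execute_async_script('JS code'); result is then 'hello world'.
- Explicit wait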
```
# Wait up to 5 seconds, polling every 0.5 seconds
from selenium.webdriver.common.keys import Keys  # needed for the send_keys key combinations
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait  # explicit-wait helper
from selenium.webdriver.support import expected_conditions as EC  # explicit-wait conditions
# until() returns the located element, or raises TimeoutException once the 5 seconds are up
lightbox_img = WebDriverWait(driver, 5, 0.5).until(
    EC.presence_of_element_located((By.CLASS_NAME, 'lightbox-img')))
```
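
To make the timeout and polling interval concrete, the loop below imitates what an explicit wait does: it calls a condition repeatedly until the condition returns something truthy or the time runs out. It is a simplified stand-in written for illustration, not Selenium's implementation. `wait_until` and its parameters are hypothetical names, and it raises the built-in TimeoutError rather than Selenium's TimeoutException.

```python
import time


def wait_until(condition, timeout=5.0, poll=0.5):
    """Call condition() every `poll` seconds until it is truthy or `timeout` elapses."""
    deadline = time.monotonic() + timeout
    while True:
        result = condition()
        if result:
            return result  # hand back the truthy value, as until() does
        if time.monotonic() >= deadline:
            raise TimeoutError(f'condition not met within {timeout} seconds')
        time.sleep(poll)
```

Returning the condition's value is what lets an explicit wait hand back the located element directly, so no second lookup is needed.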

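The argument-passing and return-value rules for execute_script and execute_async_script described in the events list can be sketched as two small helpers. The helper names and the JavaScript snippets are illustrative; `driver` is assumed to be a Selenium WebDriver, though any object exposing those two methods would work.

```python
def add_in_browser(driver, a, b):
    """Synchronous call: Python arguments arrive in JS as arguments[0], arguments[1]."""
    script = 'var a = arguments[0]; var b = arguments[1]; return a + b;'
    return driver.execute_script(script, a, b)


def greet_later(driver, name, delay_ms=500):
    """Asynchronous call: the callback is always the last entry in `arguments`."""
    script = (
        'var name = arguments[0];'
        'var delay = arguments[1];'
        'var callback = arguments[arguments.length - 1];'
        'setTimeout(function () { callback("hello " + name); }, delay);'
    )
    return driver.execute_async_script(script, name, delay_ms)
```

If the callback is not invoked before the script timeout expires, execute_async_script fails with a timeout; the limit can be raised with driver.set_script_timeout(seconds).
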