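A scraper only needs the content of a page. CSS stylesheets control the page's appearance and where elements are placed, and they have no effect on the content itself, so you can stop the browser from loading CSS to cut scraping time. The code is as follows: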
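    from selenium import webdriver

    # Recent Selenium 4 releases no longer accept firefox_profile=...,
    # so the preference is set on the Options object instead.
    options = webdriver.FirefoxOptions()
    # 2 = block stylesheets
    options.set_preference("permissions.default.stylesheet", 2)
    driver = webdriver.Firefox(options=options)
    driver.get('https://www.douban.com/')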
1) Firefox: if you do not need to scrape images, you can disable image loading to improve efficiency. The code is as follows:
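    from selenium import webdriver

    # Recent Selenium 4 releases no longer accept firefox_profile=...,
    # so the preference is set on the Options object instead.
    options = webdriver.FirefoxOptions()
    # 2 = block images
    options.set_preference("permissions.default.image", 2)
    driver = webdriver.Firefox(options=options)
    driver.get('https://www.douban.com/')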
2) Chrome: disabling images and JavaScript
    from selenium import webdriver

    options = webdriver.ChromeOptions()
    prefs = {
        'profile.default_content_setting_values': {
            'images': 2,       # 2 = block images
            'javascript': 2    # 2 = block JavaScript
        }
    }
    options.add_experimental_option('prefs', prefs)
    # Selenium 4 takes options=...; the older chrome_options=... keyword was removed.
    driver = webdriver.Chrome(options=options)
    driver.get('http://www.dianping.com/search/category/7/10/p1')
    # Applies to element lookups made after this call
    driver.implicitly_wait(20)
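The Chrome preferences above can be wrapped in a small helper so that each resource type is toggled independently. This is only a sketch: `build_chrome_prefs` and `make_chrome_driver` are hypothetical names, not part of Selenium, and starting the driver assumes Selenium 4 and a local Chrome installation.

```python
def build_chrome_prefs(block_images=True, block_javascript=False):
    """Return a Chrome 'prefs' dict; the value 2 means "block"."""
    values = {}
    if block_images:
        values["images"] = 2
    if block_javascript:
        values["javascript"] = 2
    return {"profile.default_content_setting_values": values}


def make_chrome_driver(prefs):
    """Start Chrome with the given prefs (hypothetical helper)."""
    # Imported lazily so build_chrome_prefs can be used without Selenium installed.
    from selenium import webdriver

    options = webdriver.ChromeOptions()
    options.add_experimental_option("prefs", prefs)
    return webdriver.Chrome(options=options)
```

For example, `make_chrome_driver(build_chrome_prefs(block_javascript=True))` would start a browser with both images and JavaScript blocked. Keep in mind that pages which render their content with JavaScript may come back empty when it is disabled.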
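Firefox can disable JavaScript in the same way, through the javascript.enabled preference:

    from selenium import webdriver

    # Recent Selenium 4 releases no longer accept firefox_profile=...,
    # so the preference is set on the Options object instead.
    options = webdriver.FirefoxOptions()
    options.set_preference("javascript.enabled", False)
    driver = webdriver.Firefox(options=options)
    driver.get('https://www.douban.com/')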
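The three Firefox preferences used in this post (stylesheets, images, JavaScript) can be collected in one place. The following is a minimal sketch; `FIREFOX_BLOCKING_PREFS`, `build_firefox_prefs` and `make_firefox_driver` are hypothetical names, and whether a given preference is still honored may depend on the Firefox version.

```python
# Preference name and blocking value for each resource type mentioned in this post.
FIREFOX_BLOCKING_PREFS = {
    "css": ("permissions.default.stylesheet", 2),
    "images": ("permissions.default.image", 2),
    "javascript": ("javascript.enabled", False),
}


def build_firefox_prefs(*resources):
    """Return {preference_name: value} for the resource types to block."""
    prefs = {}
    for resource in resources:
        if resource not in FIREFOX_BLOCKING_PREFS:
            raise ValueError(f"unknown resource type: {resource!r}")
        name, value = FIREFOX_BLOCKING_PREFS[resource]
        prefs[name] = value
    return prefs


def make_firefox_driver(prefs):
    """Start Firefox with the given preferences (hypothetical helper)."""
    # Imported lazily so build_firefox_prefs can be used without Selenium installed.
    from selenium import webdriver

    options = webdriver.FirefoxOptions()
    for name, value in prefs.items():
        options.set_preference(name, value)
    return webdriver.Firefox(options=options)
```

A call such as `make_firefox_driver(build_firefox_prefs("css", "images"))` would then start Firefox with stylesheets and images blocked.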
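In Chrome, images can also be blocked with a different preference key, set on the same ChromeOptions object:

    # Do not load images, to speed up page access
    options.add_experimental_option("prefs", {"profile.managed_default_content_settings.images": 2})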
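To check whether blocking a resource type actually shortens load time on a given site, the page load can be timed with and without the setting. The helper below is a rough sketch, not a benchmark: `time_page_load` is a hypothetical name, and caching or network variance can easily skew the numbers.

```python
import time


def time_page_load(load_page, url, repeats=3):
    """Call load_page(url) `repeats` times and return the average duration in seconds."""
    if repeats < 1:
        raise ValueError("repeats must be at least 1")
    durations = []
    for _ in range(repeats):
        start = time.perf_counter()
        load_page(url)  # e.g. driver.get, which returns once the page has loaded
        durations.append(time.perf_counter() - start)
    return sum(durations) / len(durations)
```

With a running driver, `time_page_load(driver.get, 'https://www.douban.com/')` gives an average that can be compared between a default browser and one started with images or CSS blocked.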