Introduction to deploying selenium crawler program under Linux system

Introduction to deploying selenium crawler program under Linux system

Preface

I need to deploy the selenium crawler program to the Linux server for work. I would like to share this with you. If you are interested, you can take a look.


1. What is selenium?

Selenium is a tool used for web application testing. Selenium tests run directly in the browser, just like real users are operating, and crawlers use it to crawl some data dynamically loaded by js

2. Usage steps

1. Import library

The code is as follows

from selenium.webdriver import Chrome
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.chrome.options import Options # Use a headless browser from selenium.webdriver import ChromeOptions
chrome_options = Options()
options = ChromeOptions()
options.add_experimental_option('excludeSwitches', ['enable-automation']) # => Remove the browser being controlled by the automated testing software options.add_experimental_option('useAutomationExtension', False)
chrome_options.add_argument("--headless") # => Configure headless mode for Chrome chrome_options.add_argument('--no-sandbox')
chrome_options.add_argument('--disable-gpu')
chrome_options.add_argument('--disable-dev-shm-usage')

2. Test code

The code is as follows:

s = Service(r"/home/driver/chromedriver")
driver = Chrome(
     service=s, options=chrome_options
 )
 driver.get("https://www.baidu.com")
 print(diiver.title)

3. Deployment Procedure

1. Install Chrome

The command is as follows:

yum install https://dl.google.com/linux/direct/google-chrome-stable_current_x86_64.rpm
Check the version of Chrome: google-chrome --version

2. Install chromedriver

The command is as follows:

Download the chromedriver driver address according to the corresponding chrome version: https://npm.taobao.org/mirrors/chromedriver
My version number is: 96.0.4664.45
wget https://npm.taobao.org/mirrors/chromedriver/96.0.4664.45/chromedriver_linux64.zip 
yum install -y unzip zip
unzip chromedriver_linux64.zip # Unzip the zip file mkdir driver #Create a new folder to store the driver chmod 777 driver/chromedriver # This is the permission. I give it 777 here

3. Run the test code

Create a new test.py file

vi test.py 

insert image description here

Save test.py and run it.

insert image description here

Seeing this, my request is successful.

Summarize

This is the end of this article about deploying selenium crawler program under Linux system. For more relevant content about linux selenium crawler program, please search previous articles of 123WORDPRESS.COM or continue to browse the related articles below. I hope you will support 123WORDPRESS.COM in the future!

You may also be interested in:
  • Configure selenium environment based on linux and implement operation

<<:  The difference between animation and transition

>>:  MySQL essential basics: grouping function, aggregate function, grouping query detailed explanation

Recommend

React Fragment Introduction and Detailed Usage

Table of contents Preface Motivation for Fragment...

How to delete special character file names or directories in Linux

Delete a file by its inode number First use ls -i...

MySQL backup and recovery design ideas

background First, let me explain the background. ...

How to implement scheduled backup of MySQL database

1. Create a shell script vim backupdb.sh Create t...

Teach you how to build the vue3.0 project architecture step by step

Table of contents Preface: 1. Create a project wi...

How to implement parallel downloading of large files in JavaScript

Table of contents 1. HTTP Range Request 1.1 Range...

Practical operation of using any font in a web page with demonstration

I have done some research on "embedding non-...

JavaScript implementation of magnifying glass details

Table of contents 1. Rendering 2. Implementation ...

Solutions to the problem of table nesting and border merging

【question】 When the outer table and the inner tab...

JS implements a stopwatch timer

This article example shares the specific code of ...

Build a Docker private warehouse (self-signed method)

In order to centrally manage the images we create...