Predictive Hacks

Selenium: A Beginner’s Guide To Browser Automations In Python

Selenium is an open-source tool for automating tasks in a web browser. Its main purpose is testing, but it is also useful to developers and data scientists for jobs such as scraping and automating boring chores like posting to pages on Facebook. In this post, we will show you the basics of Selenium and how to use it in a simple Python example.

First things first, you need to install selenium:

pip install selenium

You also need to download the ChromeDriver release that matches your Google Chrome version (from the official ChromeDriver download page) and place it in your working directory.
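Before launching the browser, it can help to confirm that the driver binary really is where the script expects it. The sketch below assumes the file is named chromedriver (or chromedriver.exe on Windows) and sits in the given directory. The helper name find_chromedriver is made up for this example and is not part of Selenium.

```python
import os


# Hypothetical helper (not part of Selenium): look for the chromedriver
# binary in a directory and return its absolute path, or None if missing.
def find_chromedriver(directory="."):
    for name in ("chromedriver", "chromedriver.exe"):
        candidate = os.path.join(directory, name)
        if os.path.isfile(candidate):
            return os.path.abspath(candidate)
    return None


if __name__ == "__main__":
    path = find_chromedriver()
    if path is None:
        print("chromedriver not found in the working directory")
    else:
        print(f"Using chromedriver at {path}")
```

If this prints that the file was not found, the later call that starts Chrome will most likely fail too, so it is worth fixing the path first.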

Selenium works just like a human. We need to identify where the buttons or forms are and then perform an action like typing or pressing a button. This is all you need to know to understand its logic.

Example: How to automatically search on Google

Let’s start with this simple example. First, we import the libraries we need and set up chromedriver.

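import os

from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.common.action_chains import ActionChains

# Selenium 4 takes the driver path through a Service object
# (this assumes chromedriver sits in the working directory)
driver = webdriver.Chrome(service=Service(os.path.abspath("chromedriver")))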

Now, if we run the following, a web browser will open at google.com.

driver.get('https://www.google.com/')

To continue, we need to press the “I agree” button, so we have to identify its element in the website's HTML. We will then give that element's XPath (don’t worry if you don’t know what that is) to Selenium.

If you right-click in the browser and choose Inspect, DevTools will appear. Then click the pointer-like symbol in the top-left corner of DevTools, as you can see in the screenshot below. Now, if you hover over the element you want to find (in our case the “I agree” button) and click on it, its element will be highlighted in DevTools.

[Screenshot: picking the “I agree” button with the DevTools element selector]

Finally, right-click the highlighted part of the HTML and copy its full XPath, as shown below. This is the value we are going to use in Selenium.

[Screenshot: copying the element's full XPath from the DevTools right-click menu]

A full XPath pinpoints exactly one element of the HTML code, although it can break whenever the page layout changes; you can also locate an element by its name or even its text. For our example, we are going to use find_element with the "xpath" locator (the string behind Selenium's By.XPATH constant), so we need the full XPath. This call replaces find_element_by_xpath, which was removed in Selenium 4.3.

# Here you should add the XPath of the 'I agree' button
item = driver.find_element("xpath", "/html/body/div[2]/div[2]/div[3]/span/div/div/div/div[3]/button[2]/div")

# Then we perform an action on it. We want to click it, so we use the following
item.click()
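To get a feel for what an XPath expression selects without opening a browser, here is a small sketch using Python's standard-library xml.etree.ElementTree, which supports a limited subset of XPath. The markup is made up and far simpler than Google's real page; it only illustrates the difference between a positional path (like a copied full XPath) and paths based on an attribute or on the element's text.

```python
import xml.etree.ElementTree as ET

# Made-up, simplified markup standing in for a consent dialog.
PAGE = """
<html>
  <body>
    <div>
      <p>Before you continue</p>
    </div>
    <div>
      <button id="reject">Reject all</button>
      <button id="agree">I agree</button>
    </div>
  </body>
</html>
"""

root = ET.fromstring(PAGE.strip())  # root is the <html> element

# Positional path, similar in spirit to a copied full XPath:
# the second <div> under <body>, then its second <button>.
# ("./body/..." here plays the role of "/html/body/..." in the browser.)
by_position = root.find("./body/div[2]/button[2]")

# Attribute-based path: any <button> whose id is "agree".
by_id = root.find(".//button[@id='agree']")

# Text-based path: any <button> whose text is exactly "I agree".
by_text = root.find(".//button[.='I agree']")

for element in (by_position, by_id, by_text):
    print(element.get("id"), "->", element.text)
```

All three expressions land on the same button here. The positional one would stop working if another div were inserted above it, while the attribute-based and text-based ones would keep working as long as the id or the label stays the same.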

That is basically how Selenium works: you identify the HTML elements and then perform an action on them. If you run the above, the “I agree” button should be clicked. What we need to do now is identify the search bar, click on it, type something, and hit Enter, just like a human.

# Identify the search bar
# (find_element("xpath", ...) replaces find_element_by_xpath, removed in Selenium 4.3)
item = driver.find_element("xpath", '/html/body/div[1]/div[3]/form/div[1]/div[1]/div[1]/div/div[2]/input')
item.click()

# ActionChains lets you "press" keys. Here we queue the text
# "predictive hacks" with .send_keys and then perform the action.
actions = ActionChains(driver)
actions.send_keys("predictive hacks")
actions.perform()

# Using send_keys we can also press ENTER as follows
actions.send_keys(Keys.ENTER)
actions.perform()

You should now get the search results for “predictive hacks” automatically!

Let’s add one more step and take a screenshot of the results.

driver.save_screenshot('screenshot.png')
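One caveat: save_screenshot captures whatever is on screen at that moment, so if the results have not finished loading, the image may still show the previous page. Selenium ships WebDriverWait (in selenium.webdriver.support.ui) for exactly this purpose. The hand-rolled loop below is only a sketch of the same polling idea, written without Selenium so that it runs anywhere; wait_until and FakeDriver are hypothetical names used for illustration.

```python
import time


# Hypothetical helper: call condition(driver) repeatedly until it returns
# something truthy, or raise TimeoutError once the timeout has expired.
def wait_until(driver, condition, timeout=10.0, poll=0.5):
    deadline = time.monotonic() + timeout
    while True:
        result = condition(driver)
        if result:
            return result
        if time.monotonic() >= deadline:
            raise TimeoutError(f"condition not met within {timeout} seconds")
        time.sleep(poll)


# Stand-in for a real driver so the sketch runs without a browser:
# its title switches to a results-page title on the third read.
class FakeDriver:
    def __init__(self):
        self._reads = 0

    @property
    def title(self):
        self._reads += 1
        if self._reads >= 3:
            return "predictive hacks - Google Search"
        return "Google"


fake = FakeDriver()
wait_until(fake, lambda d: "predictive hacks" in d.title, timeout=2.0, poll=0.01)
print("results page detected after", fake._reads, "title checks")
```

With a real driver, the same call placed just before save_screenshot would hold the script until the page title contains the query. That the results page title includes the search text is an assumption about Google's page, so adjust the condition if it does not hold.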

Putting it all together

We combined everything into a function that, given some text, searches Google for it and saves a screenshot of the results.

import os

from selenium import webdriver
from selenium.webdriver.chrome.service import Service
from selenium.webdriver.common.keys import Keys
from selenium.webdriver.common.action_chains import ActionChains

# Selenium 4 takes the driver path through a Service object
driver = webdriver.Chrome(service=Service(os.path.abspath("chromedriver")))
driver.get('https://www.google.com/')

# Click the 'I agree' button once, before any search
item = driver.find_element("xpath", "/html/body/div[2]/div[2]/div[3]/span/div/div/div/div[3]/button[2]/div")
item.click()

def search_and_screenshot(text):
    driver.get('https://www.google.com/')
    # Identify the search bar and click on it
    item = driver.find_element("xpath", '/html/body/div[1]/div[3]/form/div[1]/div[1]/div[1]/div/div[2]/input')
    item.click()

    # Type the search text
    actions = ActionChains(driver)
    actions.send_keys(text)
    actions.perform()

    # Press ENTER to run the search
    actions.send_keys(Keys.ENTER)
    actions.perform()

    driver.save_screenshot(f'screenshot_{text}.png')

Let’s test it

search_and_screenshot('machine learning')
search_and_screenshot('data science')
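The search text goes straight into the screenshot's file name, so a query containing characters such as / or ? could produce an invalid or unexpected path. Here is a small sketch of one way to clean it up first. safe_filename is a hypothetical helper, and replacing every run of non-alphanumeric characters with an underscore is just one reasonable convention.

```python
import re


# Hypothetical helper: turn arbitrary search text into a tame file name.
def safe_filename(text, prefix="screenshot_", ext=".png"):
    # Lower-case the text, then collapse every run of characters other
    # than a-z and 0-9 into a single underscore.
    slug = re.sub(r"[^a-z0-9]+", "_", text.lower()).strip("_")
    # Fall back to a fixed name if nothing usable is left.
    return f"{prefix}{slug or 'untitled'}{ext}"


print(safe_filename("machine learning"))  # screenshot_machine_learning.png
print(safe_filename("C++ / Python?"))     # screenshot_c_python.png
```

Inside the function, the last line could then become driver.save_screenshot(safe_filename(text)). Note that this simple version drops non-ASCII characters entirely, so queries in other alphabets would all map to the fallback name.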

Summing it up

Selenium can help us automate repetitive tasks in a web browser. In this post, we saw just a simple example, but Selenium is far more powerful than that. You can create helpful apps for marketing, or you can scrape data for your next data science project. The only limit is your imagination! A well-known application of Selenium is the InstaPy project, which uses it to auto-like and auto-follow profiles on Instagram.
