Build a Google Shopping Scraper in Python for Price and Product Data

Table of contents

Google Shopping Scraper

Google Shopping provides a wealth of product and price data, making it a valuable resource for e-commerce analysis. In this tutorial, we’ll guide you through building a Google Shopping scraper using Python. You’ll learn how to extract product details, prices, and seller information, enabling you to monitor market trends and competitor pricing.

Why Build a Google Shopping Scraper?

Accessing structured product and price data can be incredibly useful for:

  • Competitor Analysis: Monitor pricing trends and offers from various sellers.
  • Market Research: Understand market dynamics by analyzing product availability and seller competition.
  • Inventory Management: Keep track of products listed by different vendors.

Tools You’ll Need

We’ll use the following tools to build our scraper:

  • Python: The programming language for scripting.
  • Selenium: A powerful library to interact with web pages.
  • selenium-stealth: To help evade detection mechanisms by websites.
  • ChromeDriver: To control the Chrome browser for web scraping.
  • CSV: To save the scraped data for analysis.

Prerequisites

Make sure you have Python installed on your system. You can install the required libraries using pip:

pip install selenium selenium-stealth

Download and install ChromeDriver compatible with your Chrome browser version.

Writing the Scraper

Here is the full Python code for scraping Google Shopping:

from selenium import webdriver
from selenium_stealth import stealth
from selenium.webdriver.common.by import By
from selenium.webdriver.chrome.options import Options
from selenium.webdriver.support.wait import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
import time
import traceback
import csv

# Set up Chrome options
options = webdriver.ChromeOptions()
options.add_argument("start-maximized")
# Uncomment the line below to run in headless mode
# options.add_argument("--headless")

options.add_experimental_option("excludeSwitches", ["enable-automation"])
options.add_experimental_option('useAutomationExtension', False)
driver = webdriver.Chrome(options=options)

# Enable stealth mode to avoid detection
stealth(driver,
        languages=["en-US", "en"],
        vendor="Google Inc.",
        platform="Win32",
        webgl_vendor="Intel Inc.",
        renderer="Intel Iris OpenGL Engine",
        fix_hairline=True,
        )

# Set up CSV file
csv_file = "products_data.csv"
csv_columns = ["Product Title", "Num of Vendor", "Vendor Name", "Price", "Image URL", "Vendor Link", "Product Rating"]

# Write the header to the CSV file
with open(csv_file, mode="w", newline="", encoding="utf-8") as file:
    writer = csv.DictWriter(file, fieldnames=csv_columns)
    writer.writeheader()

# Open Google Shopping search page
driver.get("https://www.google.com/search?sca_esv=272b628add5e1644&sxsrf=ADLYWIKOv6zMptOSkrduUIogA-_1QFM4Vw:1736252785259&q=winter+coats&udm=28&fbs=AEQNm0Aa4sjWe7Rqy32pFwRj0UkWxyMMuf0D-HOMEpzq2zertRy7G-dme1ONMLTCBvZzSliUAtuJwXTPBVWQOpeBsM3fUSkUhBhk1d2K-btGU92EzjqhYy18vZp6Kq7rwx5djl86CZ3SmcnTCNSLLpUAYm1ku78BstqWpCyrkw60NVNGecT-nd3TIWJpCihPwU0PqL2HsfQ2bCEzBN5b0AAdvbM5G21Byw&ved=1t:220175&ictx=111&biw=1541&bih=945&dpr=1#ip=1")

# Scraping data
all_box = driver.find_elements(By.CSS_SELECTOR, ".SsM98d")

for box in all_box:
    try:
        box.click()
        # Wait for the product title to load
        element = WebDriverWait(driver, 10).until(
            EC.presence_of_element_located(
                (By.CSS_SELECTOR, "div.bi9tFe.PZPZlf[jsname='ZOnBqe'][data-attrid='product_title']")
            )
        )

        time.sleep(3)  # Allow additional time for content to load
        product_title = driver.find_element(By.CSS_SELECTOR, 'div.bi9tFe').text
        image = driver.find_element(By.CSS_SELECTOR, '.q5hmpb img.KfAt4d').get_attribute('src')
        vendors = driver.find_elements(By.CSS_SELECTOR, 'a.P9159d')

        for vendor in vendors:
            vendor_name = vendor.find_element(By.CSS_SELECTOR, '.uWvFpd').text
            try:
                price = vendor.find_element(By.CSS_SELECTOR, '.GBgquf span span').text
            except:
                price = vendor.find_element(By.CSS_SELECTOR, 'span.Pgbknd.xUrPFc').text

            try:
                href = vendor.get_attribute('href')
            except Exception as e:
                href = None

            try:
                rating = vendor.find_element(By.CSS_SELECTOR, 'span.NFq8Ad').text
            except Exception as e:
                rating = ""

            # Save data to CSV
            with open(csv_file, mode="a", newline="", encoding="utf-8") as file:
                writer = csv.DictWriter(file, fieldnames=csv_columns)
                writer.writerow({
                    "Product Title": product_title,
                    "Num of Vendor": len(vendors),
                    "Vendor Name": vendor_name,
                    "Price": price,
                    "Image URL": image,
                    "Vendor Link": href,
                    "Product Rating": rating,
                })

            print(f"nProduct Title: {product_title}nNum of Vendor: {len(vendors)}nVendor Name: {vendor_name}nPrice: {price}nImage URL: {image}nVendor Link: {href}nRating: {rating}n")

    except Exception as e:
        print("An error occurred:")
        traceback.print_exc()

# Close the browser
driver.quit()

Code Explanation:

  • Chrome Options:
    • Configures the browser to run maximized and disables automation flags to avoid detection.
  • Stealth Mode:
    • Makes the browser appear human-like to bypass anti-bot measures.
  • CSV Setup:
    • Prepares the file to store scraped data with headers like “Product Title,” “Price,” etc.
    • Screenshot 2025 01 07 215833
  • Google Shopping URL:
    • Opens Google search with a specific query (winter+coats).
    • Customizable: Replace the query with your desired product search (e.g., summer+shoes).
  • Product Scraping:
    • Locates product elements, clicks to expand details, and extracts:
      • Product Title, Vendor Name, Price, Image URL, Vendor Link, and Rating.
  • Error Handling:
    • Logs errors to avoid crashing when data is missing or inaccessible.
  • Save to CSV:
    • Appends the extracted data into a CSV file for future use.
  • Close Browser:
    • Ensures resources are freed by quitting the browser after scraping.

Important Notes

  • Dynamic Selectors: Google frequently changes its HTML structure. You’ll need to update the CSS selectors in your scraper when the page structure changes.
  • Legal Considerations: Scraping data from websites may violate their terms of service. Use this script responsibly.

Conclusion

Building a Google Shopping scraper in Python can provide you with valuable insights into product pricing and market competition. While the process is straightforward, keeping the scraper updated with Google’s ever-changing HTML is crucial. With this tool, you can extract and analyze data efficiently, giving you an edge in the competitive e-commerce landscape.

Responses

Related Projects

yahoo search
Bing search 1
b9929b09 167f 4365 9087 fddf3278a679
Google Maps