General Web Scraping

How to scrape product reviews from Etsy.com using Python?

Posted by Javed Roland on 12/17/2024 at 7:47 am

Scraping product reviews from Etsy.com is a useful way to analyze customer sentiment, product popularity, and review trends. Python works well for this, using libraries like requests and BeautifulSoup for static pages or Selenium for dynamically rendered content. Etsy organizes its reviews into structured sections on product pages, typically containing the reviewer’s name, star rating, and the review text. However, ensure that your scraping activities align with Etsy’s terms of service and respect user privacy.
The first step is to inspect the product review section using browser developer tools. Identify the tags and classes associated with review details, such as ratings and text. The class names in the examples here (review, reviewer-name, star-rating, review-text) are placeholders, so replace them with whatever you find in the page’s actual markup. Here’s an example of scraping reviews using BeautifulSoup:

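import requests
from bs4 import BeautifulSoup

# Target URL for an Etsy product page (placeholder listing ID)
url = "https://www.etsy.com/listing/123456789/example-product"
headers = {
    "User-Agent": "Mozilla/5.0"
}

def text_or_default(parent, tag, class_name, default="N/A"):
    # Return the stripped text of the first matching child, or a default if it is missing
    element = parent.find(tag, class_=class_name)
    return element.get_text(strip=True) if element else default

# Fetch the page; the timeout keeps the script from hanging indefinitely
response = requests.get(url, headers=headers, timeout=10)
if response.status_code == 200:
    soup = BeautifulSoup(response.content, "html.parser")
    # Extract review details (class names are placeholders; adjust to the real markup)
    reviews = soup.find_all("div", class_="review")
    for review in reviews:
        reviewer_name = text_or_default(review, "span", "reviewer-name")
        rating = text_or_default(review, "span", "star-rating")
        review_text = text_or_default(review, "p", "review-text")
        print(f"Reviewer: {reviewer_name}, Rating: {rating}, Review: {review_text}")
else:
    print(f"Failed to fetch the Etsy product page (status {response.status_code}).")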

If the reviews are dynamically loaded, you can use Selenium to render the content before extracting it. Below is an example using Selenium:

from selenium import webdriver
from selenium.webdriver.common.by import By

def first_text(parent, class_name, default="N/A"):
    # Return the text of the first matching child, or a default if none is found
    matches = parent.find_elements(By.CLASS_NAME, class_name)
    return matches[0].text.strip() if matches else default

# Initialize Selenium WebDriver
driver = webdriver.Chrome()
try:
    # Let element lookups poll for up to 10 seconds before giving up
    driver.implicitly_wait(10)
    driver.get("https://www.etsy.com/listing/123456789/example-product")
    # Extract review details (class names are placeholders; adjust to the real markup)
    reviews = driver.find_elements(By.CLASS_NAME, "review")
    for review in reviews:
        reviewer_name = first_text(review, "reviewer-name")
        rating = first_text(review, "star-rating")
        review_text = first_text(review, "review-text")
        print(f"Reviewer: {reviewer_name}, Rating: {rating}, Review: {review_text}")
finally:
    # Close the browser even if extraction fails
    driver.quit()

For large-scale scraping, consider adding a delay between requests with time.sleep() and rotating proxies to reduce the chance of being rate-limited or blocked. You can also store the scraped data in a CSV file or a SQLite database, and load it with a library like pandas for easier analysis.
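
As a rough illustration of the delay and CSV ideas above, here is a minimal sketch using only the standard library. The function names (polite_sleep, save_reviews_csv) and the column names are my own choices for the example, not anything Etsy or a library defines, and the delay values are arbitrary starting points.

```python
import csv
import random
import time


def polite_sleep(base_seconds=2.0, jitter_seconds=1.0):
    """Pause for base_seconds plus a random extra of up to jitter_seconds."""
    delay = base_seconds + random.uniform(0, jitter_seconds)
    time.sleep(delay)
    return delay


def save_reviews_csv(reviews, path):
    """Write a list of review dicts to a CSV file with a header row."""
    fieldnames = ["reviewer", "rating", "text"]  # hypothetical column names
    with open(path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(reviews)
```

You would call polite_sleep() after each page fetch and save_reviews_csv(rows, "etsy_reviews.csv") once at the end, where rows is a list of dicts with the three keys above. The resulting file can then be loaded with pandas.read_csv.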


3 Replies

Antonio Elfriede

Member
12/19/2024 at 7:21 am

A significant improvement to the scraper would be handling pagination. Etsy reviews are often split across multiple pages, requiring the scraper to navigate through each page. By identifying and automating the “Next” button in Selenium or parsing the pagination URLs in BeautifulSoup, you can ensure complete review collection. Adding a delay between requests can help avoid rate-limiting or detection.
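
To make the pagination idea concrete, here is a small sketch of the looping logic. It assumes reviews can be paged through a "page" query parameter, which is a guess on my part and may not match how Etsy actually loads further reviews; if they are loaded by JavaScript, the Selenium "Next" button route applies instead. fetch_reviews is a hypothetical function you supply that takes a URL and returns a list of parsed reviews.

```python
import time
from urllib.parse import parse_qsl, urlencode, urlparse, urlunparse


def page_url(base_url, page, param="page"):
    """Return base_url with the page query parameter set to the given number."""
    parts = urlparse(base_url)
    query = dict(parse_qsl(parts.query))
    query[param] = str(page)  # "page" is an assumed parameter name
    return urlunparse(parts._replace(query=urlencode(query)))


def collect_all_reviews(fetch_reviews, base_url, max_pages=50, delay_seconds=2.0):
    """Call fetch_reviews(url) for pages 1, 2, ... until a page returns no reviews."""
    collected = []
    for page in range(1, max_pages + 1):
        batch = fetch_reviews(page_url(base_url, page))
        if not batch:
            break  # an empty page is treated as the end of the reviews
        collected.extend(batch)
        time.sleep(delay_seconds)  # pause between requests
    return collected
```

The max_pages cap is there so a page that never comes back empty cannot keep the loop running forever.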

Medine Daniyal

Member
12/21/2024 at 11:20 am

Rotating user-agent headers and using proxies can improve the scraper’s robustness. Etsy may block repeated requests from the same IP address, so distributing traffic across multiple proxies can prevent detection. Libraries like fake_useragent or proxy rotation services can make your scraper appear more like a real user.
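
A minimal sketch of what that rotation could look like without any extra library. The user-agent strings and proxy addresses are example values only (the proxy hosts are made up), and next_request_kwargs is just a helper name I picked. The headers, proxies and timeout keys are the standard keyword arguments of requests.get.

```python
import itertools
import random

# Example values only: substitute current browser strings and your own proxy endpoints.
USER_AGENTS = [
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
    "(KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36",
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/605.1.15 "
    "(KHTML, like Gecko) Version/17.0 Safari/605.1.15",
]
PROXIES = ["http://proxy1.example.com:8080", "http://proxy2.example.com:8080"]

_proxy_cycle = itertools.cycle(PROXIES)


def next_request_kwargs():
    """Build keyword arguments for requests.get: random User-Agent, next proxy in turn."""
    proxy = next(_proxy_cycle)
    return {
        "headers": {"User-Agent": random.choice(USER_AGENTS)},
        "proxies": {"http": proxy, "https": proxy},
        "timeout": 10,
    }
```

Each request would then look like requests.get(url, **next_request_kwargs()), so consecutive requests go out through different proxies with a varying User-Agent.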

Bituin Oskar

Member
01/17/2025 at 5:30 am

Storing the scraped reviews in a database like MongoDB or PostgreSQL is beneficial for organizing and querying large datasets. For instance, you can analyze trends over time, compare ratings across products, or identify recurring themes in customer reviews. A structured storage solution also facilitates integration with visualization tools like Tableau or Matplotlib.
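
Here is a small sketch of the storage idea. To keep it runnable without a database server it uses Python's built-in sqlite3 as a stand-in for PostgreSQL or MongoDB, so treat it as an illustration of the approach rather than a production setup. The table layout and function names are assumptions made for the example.

```python
import sqlite3


def store_reviews(conn, product_id, reviews):
    """Create the reviews table if needed and insert (reviewer, rating, text) tuples."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS reviews ("
        "product_id TEXT, reviewer TEXT, rating INTEGER, review_text TEXT)"
    )
    conn.executemany(
        "INSERT INTO reviews VALUES (?, ?, ?, ?)",
        [(product_id, name, rating, text) for name, rating, text in reviews],
    )
    conn.commit()


def average_rating_by_product(conn):
    """Return {product_id: average rating} across all stored reviews."""
    rows = conn.execute(
        "SELECT product_id, AVG(rating) FROM reviews GROUP BY product_id"
    )
    return dict(rows.fetchall())
```

With conn = sqlite3.connect("reviews.db"), you can call store_reviews after scraping each listing and then compare products with average_rating_by_product(conn). The same SQL carries over to PostgreSQL with minor changes.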
