How to scrape product reviews from Etsy.com using Python?
Scraping product reviews from Etsy.com is a valuable way to analyze customer sentiment, product popularity, and review trends. Python is an excellent tool for extracting such data, using libraries like requests and BeautifulSoup for static pages or Selenium for dynamically rendered content. Etsy organizes its reviews into structured sections on product pages, typically containing the reviewer’s name, star rating, and the review text. However, ensure that your scraping activities align with Etsy’s terms of service and respect user privacy.
The first step is to inspect the product review section using browser developer tools. Identify the tags and classes associated with review details, such as ratings and text. The class names in the example are placeholders, so replace them with the ones you find in the inspector. Here’s an example of scraping reviews using BeautifulSoup:

```python
import requests
from bs4 import BeautifulSoup

# Target URL for an Etsy product's review page
url = "https://www.etsy.com/listing/123456789/example-product"
headers = {"User-Agent": "Mozilla/5.0"}


def text_of(parent, tag, class_name):
    """Return the stripped text of the first match, or "" if it is missing."""
    element = parent.find(tag, class_=class_name)
    return element.get_text(strip=True) if element else ""


# Fetch the page (the timeout keeps the script from hanging indefinitely)
response = requests.get(url, headers=headers, timeout=10)

if response.status_code == 200:
    soup = BeautifulSoup(response.content, "html.parser")
    # Extract review details; the class names here are placeholders
    reviews = soup.find_all("div", class_="review")
    for review in reviews:
        reviewer_name = text_of(review, "span", "reviewer-name")
        rating = text_of(review, "span", "star-rating")
        review_text = text_of(review, "p", "review-text")
        print(f"Reviewer: {reviewer_name}, Rating: {rating}, Review: {review_text}")
else:
    print(f"Failed to fetch the Etsy product page (status {response.status_code}).")
```
If the reviews are dynamically loaded, you can use Selenium to render the content before extracting it. Below is an example using Selenium:
```python
from selenium import webdriver
from selenium.webdriver.common.by import By

# Initialize Selenium WebDriver
driver = webdriver.Chrome()
try:
    driver.get("https://www.etsy.com/listing/123456789/example-product")
    # Let element lookups retry for up to 10 seconds while content renders
    driver.implicitly_wait(10)
    # Extract review details; the class names here are placeholders
    reviews = driver.find_elements(By.CLASS_NAME, "review")
    for review in reviews:
        reviewer_name = review.find_element(By.CLASS_NAME, "reviewer-name").text.strip()
        rating = review.find_element(By.CLASS_NAME, "star-rating").text.strip()
        review_text = review.find_element(By.CLASS_NAME, "review-text").text.strip()
        print(f"Reviewer: {reviewer_name}, Rating: {rating}, Review: {review_text}")
finally:
    # Close the browser even if a lookup fails
    driver.quit()
```
For large-scale scraping, consider adding a delay between requests with time.sleep() and rotating proxies to avoid detection. You can also store the scraped data in a CSV file or a SQLite database for easier analysis, for example with pandas or Python's built-in sqlite3 module.
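As a rough sketch of the delay and storage ideas above, the snippet below pauses for a random interval between requests and writes the collected reviews to a CSV file. The helper names `fetch_with_delay` and `save_reviews_csv`, the delay range, and the column names are all assumptions made for illustration. They are not part of any library, and proxy rotation is left out.

```python
import csv
import random
import time
from typing import Callable, Dict, Iterable, List


def fetch_with_delay(
    urls: Iterable[str],
    fetch: Callable[[str], str],
    min_delay: float = 2.0,
    max_delay: float = 5.0,
) -> List[str]:
    """Call fetch(url) for each URL, sleeping a random interval between calls."""
    pages = []
    for index, url in enumerate(urls):
        if index > 0:
            # Pause before every request except the first one
            time.sleep(random.uniform(min_delay, max_delay))
        pages.append(fetch(url))
    return pages


def save_reviews_csv(reviews: List[Dict[str, str]], path: str) -> None:
    """Write review dictionaries to a CSV file with a header row."""
    # Hypothetical column names; match them to the fields you actually scrape
    fieldnames = ["reviewer", "rating", "text"]
    with open(path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(reviews)
```

In practice, `fetch` could be a small wrapper such as `lambda u: requests.get(u, headers=headers, timeout=10).text`. Passing it in as a parameter lets you swap in a proxy-aware fetcher or a test stub later without changing the loop.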