How to Scrape Amazon Reviews Using Python

Published on: September 2, 2025

Amazon reviews aren’t just feedback—they’re raw, unfiltered customer intelligence. If you’re building a product, managing a brand, or analyzing competitors, Amazon review data can uncover patterns, pain points, and persuasive language directly from your target audience.

In this guide, we’ll walk you through how to scrape Amazon reviews using Python, step-by-step. You’ll build a scraper that extracts:

  • Review titles and content
  • Ratings and verification tags
  • Helpful vote counts and review images
  • Reviewer names and timestamps

Plus, we’ll handle pagination, rate limiting, error handling, and show you how to store your results in a clean CSV. This tutorial is ideal if you want to extract real-world review data and turn it into actionable insights.

Let’s dive in.

Why Scraping Amazon Reviews Is So Valuable

Amazon review data is a goldmine for product teams, marketers, researchers, and entrepreneurs. Whether you’re launching a competing product, improving your own, or gathering voice-of-customer insights, learning how to scrape Amazon reviews puts you directly in touch with how real customers think and feel.

You can use Amazon review data to:

  • Track product feedback over time
  • Identify competitors’ weaknesses
  • Analyze verified vs unverified purchases
  • Perform sentiment analysis at scale
  • Understand how customers describe pain points

A well-built Amazon review scraper allows you to extract all this information efficiently and structure it for analysis.

What Kind of Data Can You Extract?

When you scrape reviews from Amazon, here’s what you can typically extract:

  • Review title – Quick summary of sentiment or focus
  • Review content – The full body of the review text
  • Rating – Star rating, e.g., 4.0 out of 5
  • Date – When the review was posted
  • Author – Reviewer name
  • Verification tag – Whether it’s a verified purchase
  • Helpful votes – Social validation of the review
  • Images – Uploaded photos that often provide strong visual context

These fields are instrumental if you’re doing sentiment analysis, competitor benchmarking, or even training an AI model on customer feedback.
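Pulled together, a single scraped review record might look like this (all values below are made up for illustration):

```python
# Illustrative example of one scraped review record (values are made up).
sample_review = {
    "title": "Great battery life",
    "rating": "4.0",
    "content": "Lasted two full days on a single charge. Would buy again.",
    "author": "Jane D.",
    "date": "Reviewed in the United States on July 14, 2025",
    "verified": "Verified Purchase",
    "helpful": "12 people found this helpful",
    "images": ["https://m.media-amazon.com/images/I/example.jpg"],
}
```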

Less Talk, More Data?

Powerful proxies and a fully supported API – we’ve got everything you need for fast Amazon data scraping!

How to Scrape Amazon Reviews Using Python

To scrape Amazon reviews, Python is your best tool. It’s powerful, flexible, and comes with excellent scraping libraries like Requests, BeautifulSoup, and Pandas. In this guide, we’ll walk through a robust Python script that:

  • Collects review data from Amazon product pages
  • Handles pagination
  • Avoids rate limiting
  • Catches and logs common errors
  • Saves everything to CSV

Required Libraries

First, install the required libraries (pip install requests beautifulsoup4 pandas lxml), then import them. When it comes to scraping Amazon, our tools of choice are no different from other e-commerce scraping projects – we’re choosing supporting tools designed for bulk, scale, and efficiency.

import requests
from bs4 import BeautifulSoup
import pandas as pd
import time
import random
import logging

Set Up Custom Headers

Rotating your headers, especially the User-Agent, helps avoid detection. Here, we define a list of headers; one will be randomly selected for each request.

headers_list = [
    {
        "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64)",
        "Accept-Language": "en-US,en;q=0.9",
    },
    # Add more user agents if needed
]

Request + Error Handling

This function sends a GET request to the Amazon reviews page and returns the parsed HTML using BeautifulSoup. It includes error handling to catch timeouts or failed responses.

def get_soup(url):
    try:
        response = requests.get(url, headers=random.choice(headers_list), timeout=10)
        response.raise_for_status()
        return BeautifulSoup(response.text, "lxml")
    except requests.exceptions.RequestException as e:
        logging.error(f"Request failed: {e}")
        return None

Extract Review Details

This function takes in a review HTML element and extracts all the useful fields (title, rating, text, date, author, verified tag, helpful votes, and any uploaded images). It includes fallback strategies in case Amazon’s layout varies.

def extract_review(review):
    try:
        title_elem = review.select_one(".review-title span:not([class])")
        title = title_elem.text.strip() if title_elem else None

        rating_elem = review.select_one(".review-rating span")
        rating = rating_elem.text.replace("out of 5 stars", "").strip() if rating_elem else None

        content_elem = review.select_one("[data-hook='review-body']")
        content = " ".join(content_elem.stripped_strings) if content_elem else None

        author_elem = review.select_one(".a-profile-name")
        author = author_elem.text.strip() if author_elem else None

        date_elem = review.select_one(".review-date")
        date = date_elem.text.strip() if date_elem else None

        verified_elem = review.select_one("span.a-size-mini")
        verified = verified_elem.text.strip() if verified_elem else None

        helpful_elem = review.select_one("[data-hook='helpful-vote-statement']")
        helpful = helpful_elem.text.strip() if helpful_elem else None

        image_elements = review.select(".review-image-tile img")
        images = [img.get("data-src", img.get("src")) for img in image_elements] if image_elements else []

        return {
            "title": title,
            "rating": rating,
            "content": content.replace("Read more", "") if content else None,
            "author": author,
            "date": date,
            "verified": verified,
            "helpful": helpful,
            "images": images
        }

    except Exception as e:
        logging.warning(f"Failed to extract a review: {e}")
        return None

Handle Pagination

Amazon splits reviews across pages. This function loops through each review page, scrapes reviews using get_soup(), and appends the results into a list. A short delay between requests reduces the risk of throttling.

def get_reviews(asin, max_pages=5):
    all_reviews = []

    for page in range(1, max_pages + 1):
        url = f"https://www.amazon.com/product-reviews/{asin}/?pageNumber={page}"
        print(f"Scraping page {page}...")
        soup = get_soup(url)
        if not soup:
            continue

        reviews = soup.select("#cm_cr-review_list [data-hook='review']")
        if not reviews:
            break  # No more reviews

        for review in reviews:
            parsed = extract_review(review)
            if parsed:
                all_reviews.append(parsed)

        # Rate limiting between requests
        time.sleep(random.uniform(2.0, 5.0))

    return all_reviews

Export to CSV

Here’s a quick function to export the results to CSV. This is a simple way to get your Amazon review data into Excel or another tool of your choice.

def save_reviews_to_csv(reviews, asin):
    df = pd.DataFrame(reviews)
    df.to_csv(f"amazon_reviews_{asin}.csv", index=False)
    print(f"Saved {len(reviews)} reviews to amazon_reviews_{asin}.csv")

Main Function

This is the entry point for the script. It defines the ASIN to scrape, runs the scraper, and saves the data. You can read our guide on scraping Amazon ASIN data to learn more.

def main():
    asin = "B098FKXT8L"  # Replace with your product's ASIN
    reviews = get_reviews(asin, max_pages=10)
    save_reviews_to_csv(reviews, asin)

if __name__ == "__main__":
    main()

Handling Common Errors During Scraping

When scraping reviews from Amazon, here are common issues you may encounter:

  • Request timeouts – Use try/except with timeout=10.
  • Empty response – Often caused by IP bans or bot detection.
  • Parsing errors – Missing HTML elements (Amazon changes their layout frequently).
  • Invalid selector – Add if element checks before calling .text.

Tip: Use logging.warning() or logging.error() to track these without breaking your script.

⚠️ A Word of Warning

Amazon updates its front-end HTML frequently. That means your Amazon review scraper might work today, but break tomorrow.

If you’re looking for a more stable, production-grade solution, check out Rayobyte’s Web Scraper API. We handle dynamic content, proxy rotation, error handling, and structure the data for you — so you can focus on results, not fixing broken scrapers.

Don’t Forget Pagination!

If you’re scraping a large volume of review data, you’ll need to paginate through all available review pages. The example above covers this, but for more advanced control over:

  • Infinite scrolling
  • JavaScript-loaded reviews
  • Dynamic delays

…you might want to build a custom Amazon web crawler to better source and log the exact URLs you are interested in tracking.
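Dynamic delays, for instance, can be as simple as an exponential backoff that grows after each failed attempt. A sketch (the function name and defaults are our own choices, not a standard API):

```python
import random
import time

def backoff_delay(attempt, base=2.0, cap=60.0):
    """Exponential backoff with jitter; returns seconds to wait before retrying.

    attempt 0 -> roughly 1-3s, attempt 1 -> roughly 2-6s, capped at `cap` seconds.
    """
    return min(cap, base * (2 ** attempt)) * random.uniform(0.5, 1.5)

# Usage inside a retry loop:
# for attempt in range(5):
#     soup = get_soup(url)
#     if soup:
#         break
#     time.sleep(backoff_delay(attempt))
```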

Why Proxies Matter When Scraping Amazon

Amazon tracks how often an IP address makes requests. If your scraper hits too many pages in a short time from the same IP, you may:

  • Get blocked outright (e.g., HTTP 403 errors)
  • See CAPTCHA challenges
  • Receive fake or misleading content

Using a rotating proxy ensures that each request appears to come from a different IP, reducing the chance of bans. Rayobyte offers geo-located residential proxies to help you stay under the radar.
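With plain requests, routing each call through a rotating pool can be sketched like this (the proxy URLs below are placeholders; substitute your provider’s hosts and credentials):

```python
import random

# Hypothetical proxy endpoints -- replace with your provider's hosts/credentials.
PROXY_POOL = [
    "http://user:pass@proxy1.example.com:8000",
    "http://user:pass@proxy2.example.com:8000",
]

def pick_proxy():
    """Pick a random proxy and format it the way requests' `proxies=` argument expects."""
    proxy = random.choice(PROXY_POOL)
    return {"http": proxy, "https": proxy}

# Usage inside get_soup(), for example:
# response = requests.get(url, headers=..., proxies=pick_proxy(), timeout=10)
```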

What to Do With Amazon Review Data

Once you’ve collected the data, the next step is analysis. Consider:

  • Performing sentiment analysis to understand customer perception.
  • Building dashboards showing top complaints, keywords, or satisfaction trends.
  • Grouping reviews by date to monitor feedback over time.
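As a starting point, the CSV exported earlier can be loaded back into pandas to track average rating over time. A sketch, assuming the rating and date formats produced by the scraper above:

```python
import pandas as pd

def rating_trend(csv_path):
    """Average star rating per month, from the CSV written by save_reviews_to_csv()."""
    df = pd.read_csv(csv_path)
    df["rating"] = pd.to_numeric(df["rating"], errors="coerce")
    # Amazon dates read like "Reviewed in the United States on July 14, 2025";
    # keep only the part after the final " on " before parsing.
    df["date"] = pd.to_datetime(df["date"].str.split(" on ").str[-1], errors="coerce")
    df = df.dropna(subset=["rating", "date"])
    return df.groupby(df["date"].dt.to_period("M"))["rating"].mean()
```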


Ready to Scrape Amazon Reviews?

By learning how to scrape Amazon reviews using Python, you’re unlocking powerful insights buried in customer feedback. Whether you’re tracking your own products or keeping an eye on the competition, Amazon review scraping can give you the edge.

But if you’re tired of fixing broken scrapers and dodging blocks, let Rayobyte do the heavy lifting.
Try our Web Scraper API and get structured, real-time Amazon review data with none of the hassle. Between our API and high-grade, ethical proxies, we’ve got the scraping tools and expertise you need!

