How to Scrape Amazon Reviews Using Python
Amazon reviews aren’t just feedback—they’re raw, unfiltered customer intelligence. If you’re building a product, managing a brand, or analyzing competitors, Amazon review data can uncover patterns, pain points, and persuasive language directly from your target audience.

In this guide, we’ll walk you through how to scrape Amazon reviews using Python, step-by-step. You’ll build a scraper that extracts:
- Review titles and content
- Ratings and verification tags
- Helpful vote counts and review images
- Reviewer names and timestamps
Plus, we’ll cover pagination, rate limiting, and error handling, and show you how to store your results in a clean CSV. This tutorial is ideal if you want to extract real-world review data and turn it into actionable insights.
Let’s dive in.
Why Scraping Amazon Reviews Is So Valuable
Amazon review data is a goldmine for product teams, marketers, researchers, and entrepreneurs. Whether you’re launching a competing product, improving your own, or gathering voice-of-customer insights, learning how to scrape Amazon reviews puts you directly in touch with how real customers think and feel.

You can use Amazon review data to:
- Track product feedback over time
- Identify competitors’ weaknesses
- Analyze verified vs unverified purchases
- Perform sentiment analysis at scale
- Understand how customers describe pain points
A well-built Amazon review scraper allows you to extract all this information efficiently and structure it for analysis.
What Kind of Data Can You Extract?
When you scrape reviews from Amazon, here’s what you can typically extract:
- Review title – Quick summary of sentiment or focus
- Review content – The full body of the review text
- Rating – Star rating, e.g., 4.0 out of 5
- Date – When the review was posted
- Author – Reviewer name
- Verification tag – Whether it’s a verified purchase
- Helpful votes – Social validation of the review
- Images – Uploaded photos that often provide strong visual context
These fields are instrumental if you’re doing sentiment analysis, competitor benchmarking, or even training an AI model on customer feedback.
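Put together, a single scraped review record might look like the following. The values here are illustrative, not from a real product page, but the field names match the scraper built later in this guide.

```python
# An illustrative example of one extracted review record.
# The field names match the scraper below; the values are made up.
sample_review = {
    "title": "Great battery life",
    "rating": "4.0",
    "content": "Lasted a full week on one charge. Very happy with it.",
    "author": "Jane D.",
    "date": "Reviewed in the United States on March 3, 2024",
    "verified": "Verified Purchase",
    "images": ["https://m.media-amazon.com/images/I/example.jpg"],
}

print(sample_review["title"], "-", sample_review["rating"], "stars")
```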

How to Scrape Amazon Reviews Using Python
To scrape Amazon reviews, Python is your best tool. It’s powerful, flexible, and comes with excellent scraping libraries like Requests, BeautifulSoup, and Pandas. In this guide, we’ll walk through a robust Python script that:
- Collects review data from Amazon product pages
- Handles pagination
- Avoids rate limiting
- Catches and logs common errors
- Saves everything to CSV
Required Libraries
First, let’s set up the required libraries (install requests, beautifulsoup4, lxml, and pandas with pip if you haven’t already). When it comes to scraping Amazon, our tools of choice are no different from those of other e-commerce scraping projects – we’re choosing supporting tools designed for bulk, scale, and efficiency.
```python
import requests
from bs4 import BeautifulSoup
import pandas as pd
import time
import random
import logging
```
Set Up Custom Headers
Rotating your headers, especially the `User-Agent`, helps avoid detection. Here, we define a list of headers that will be randomly selected for each request.
```python
headers_list = [
    {
        "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64)",
        "Accept-Language": "en-US,en;q=0.9",
    },
    # Add more user agents if needed
]
```
Request + Error Handling
This function sends a GET request to the Amazon reviews page and returns the parsed HTML using BeautifulSoup. It includes error handling to catch timeouts or failed responses.
```python
def get_soup(url):
    try:
        response = requests.get(url, headers=random.choice(headers_list), timeout=10)
        response.raise_for_status()
        return BeautifulSoup(response.text, "lxml")
    except requests.exceptions.RequestException as e:
        logging.error(f"Request failed: {e}")
        return None
```
Extract Review Details
This function takes in a review HTML element and extracts all the useful fields (title, rating, text, date, author, verified tag, and any uploaded images). It includes fallback strategies in case Amazon’s layout varies.
```python
def extract_review(review):
    try:
        title_elem = review.select_one(".review-title span:not([class])")
        title = title_elem.text.strip() if title_elem else None

        rating_elem = review.select_one(".review-rating span")
        rating = rating_elem.text.replace("out of 5 stars", "").strip() if rating_elem else None

        content_elem = review.select_one("[data-hook='review-body']")
        content = " ".join(content_elem.stripped_strings) if content_elem else None

        author_elem = review.select_one(".a-profile-name")
        author = author_elem.text.strip() if author_elem else None

        date_elem = review.select_one(".review-date")
        date = date_elem.text.strip() if date_elem else None

        verified_elem = review.select_one("span.a-size-mini")
        verified = verified_elem.text.strip() if verified_elem else None

        image_elements = review.select(".review-image-tile img")
        images = [img.get("data-src", img.get("src")) for img in image_elements] if image_elements else []

        return {
            "title": title,
            "rating": rating,
            "content": content.replace("Read more", "") if content else None,
            "author": author,
            "date": date,
            "verified": verified,
            "images": images,
        }
    except Exception as e:
        logging.warning(f"Failed to extract a review: {e}")
        return None
```
Handle Pagination
Amazon splits reviews across pages. This function loops through each review page, scrapes reviews using `get_soup()`, and appends the results into a list. A short delay between requests reduces the risk of throttling.
```python
def get_reviews(asin, max_pages=5):
    all_reviews = []
    for page in range(1, max_pages + 1):
        url = f"https://www.amazon.com/product-reviews/{asin}/?pageNumber={page}"
        print(f"Scraping page {page}...")
        soup = get_soup(url)
        if not soup:
            continue
        reviews = soup.select("#cm_cr-review_list > .a-section")
        if not reviews:
            break  # No more reviews
        for review in reviews:
            parsed = extract_review(review)
            if parsed:
                all_reviews.append(parsed)
        # Rate limiting between requests
        time.sleep(random.uniform(2.0, 5.0))
    return all_reviews
```
Export to CSV
Here’s a quick function to export the results to CSV. This is a simple way to get your Amazon data into Excel, or any other tool of your choice.
```python
def save_reviews_to_csv(reviews, asin):
    df = pd.DataFrame(reviews)
    df.to_csv(f"amazon_reviews_{asin}.csv", index=False)
    print(f"Saved {len(reviews)} reviews to amazon_reviews_{asin}.csv")
```
Main Function
This is the entry point for the script. It defines the ASIN to scrape, runs the scraper, and saves the data. You can read our guide on scraping Amazon ASIN data to learn more.
```python
def main():
    asin = "B098FKXT8L"  # Replace with your product's ASIN
    reviews = get_reviews(asin, max_pages=10)
    save_reviews_to_csv(reviews, asin)

if __name__ == "__main__":
    main()
```
Handling Common Errors During Scraping
When scraping reviews from Amazon, here are common issues you may encounter:
- Request timeouts – Use `try/except` with `timeout=10`.
- Empty response – Often caused by IP bans or bot detection.
- Parsing errors – Missing HTML elements (Amazon changes their layout frequently).
- Invalid selector – Add `if element` checks before calling `.text`.

Tip: Use `logging.warning()` or `logging.error()` to track these without breaking your script.
⚠️ A Word of Warning
Amazon updates its front-end HTML frequently. That means your Amazon review scraper might work today, but break tomorrow.
If you’re looking for a more stable, production-grade solution, check out Rayobyte’s Web Scraper API. We handle dynamic content, proxy rotation, error handling, and structure the data for you — so you can focus on results, not fixing broken scrapers.
Don’t Forget Pagination!
If you’re scraping a large volume of review data, you’ll need to paginate through all available review pages. The example above covers this, but for more advanced control over:
- Infinite scrolling
- JavaScript-loaded reviews
- Dynamic delays
…you might want to build a custom Amazon web crawler to better source and log the exact URLs you are interested in tracking.
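Even without a full crawler, a small helper that generates and logs the exact review-page URLs (using the same URL pattern as the scraper above) keeps pagination transparent and failures easy to reproduce:

```python
def review_page_urls(asin, max_pages=5):
    """Yield the review-page URL for each page of an ASIN's reviews."""
    for page in range(1, max_pages + 1):
        yield f"https://www.amazon.com/product-reviews/{asin}/?pageNumber={page}"

# Print (or log) the URLs before fetching them
for url in review_page_urls("B098FKXT8L", max_pages=2):
    print(url)
```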
Why Proxies Matter When Scraping Amazon
Amazon tracks how often an IP address makes requests. If your scraper hits too many pages in a short time from the same IP, you may:
- Get blocked with 403 errors
- See CAPTCHA challenges
- Receive fake or misleading content
Using a rotating proxy ensures that each request appears to come from a different IP, reducing the chance of bans. Rayobyte offers geo-located residential proxies to help you stay under the radar.
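As a sketch, routing each request through a randomly chosen proxy with Requests looks like this. The proxy URLs below are placeholders; swap in the real endpoints and credentials from your proxy provider.

```python
import random
import requests

# Placeholder proxy endpoints -- replace with your provider's
# real hostnames and credentials.
proxy_pool = [
    "http://user:pass@proxy1.example.com:8000",
    "http://user:pass@proxy2.example.com:8000",
]

def pick_proxies():
    """Choose one proxy at random and map it to both schemes."""
    proxy = random.choice(proxy_pool)
    return {"http": proxy, "https": proxy}

def get_with_proxy(url, headers=None, timeout=10):
    # Requests tunnels the call through whichever proxy was picked
    return requests.get(url, headers=headers, proxies=pick_proxies(), timeout=timeout)
```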
What to Do With Amazon Review Data
Once you’ve collected the data, the next step is analysis. Consider:
- Performing sentiment analysis to understand customer perception.
- Building dashboards showing top complaints, keywords, or satisfaction trends.
- Grouping reviews by date to monitor feedback over time.
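As a starting point, here’s a minimal analysis sketch with Pandas. It uses a few made-up rows in place of the CSV produced above so it runs standalone; with real data you’d load `amazon_reviews_<asin>.csv` instead.

```python
import pandas as pd

# Made-up sample rows standing in for the scraped CSV
reviews = pd.DataFrame({
    "rating": ["5.0", "3.0", "5.0", "1.0"],
    "date": [
        "Reviewed in the United States on January 5, 2024",
        "Reviewed in the United States on February 2, 2024",
        "Reviewed in the United States on February 9, 2024",
        "Reviewed in the United States on March 1, 2024",
    ],
})

# Convert the star rating to a number and pull the date out of
# Amazon's "Reviewed in ... on <date>" string.
reviews["rating"] = reviews["rating"].astype(float)
reviews["posted"] = pd.to_datetime(
    reviews["date"].str.extract(r" on (.+)$")[0], format="%B %d, %Y"
)

# Average rating per month: a quick feedback-over-time check
monthly = reviews.groupby(reviews["posted"].dt.to_period("M"))["rating"].mean()
print(monthly)
```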
Read more:
- How to Speed Up Sentiment Analysis with Web Scraping
- Customer Sentiment Analysis Tools
- A Guide to AI Sentiment Analysis
Ready to Scrape Amazon Reviews?
By learning how to scrape Amazon reviews using Python, you’re unlocking powerful insights buried in customer feedback. Whether you’re tracking your own products or keeping an eye on the competition, Amazon review scraping can give you the edge.
But if you’re tired of fixing broken scrapers and dodging blocks, let Rayobyte do the heavy lifting.
Try our Web Scraper API and get structured, real-time Amazon review data with none of the hassle. Between our API and high-grade, ethical proxies, we’ve got the scraping tools and expertise you need!
Amazon Insights at Scale
We’ve got a wide range of products and expertise to help you gain the most valuable data for your business.
