  • What data can I scrape from Nordstrom.com for product reviews?

    Posted by Indiana Valentim on 12/19/2024 at 11:28 am

    Scraping Nordstrom.com can give you insight into customer opinions, ratings, and feedback on individual items. With PHP, you send an HTTP request for the page, load the returned HTML into a parser, and query it for the elements you need. Inspect the structure of the review section first to identify the tags that hold review details such as customer names, ratings, and review text. The example below applies the same approach to a product listing page and extracts each product's name, price, and rating.

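    <?php
    // Listing page to scrape.
    $url = "https://www.nordstrom.com/s/womens-shoes";
    $options = [
        CURLOPT_URL => $url,
        CURLOPT_RETURNTRANSFER => true,  // return the body instead of printing it
        CURLOPT_FOLLOWLOCATION => true,  // follow redirects
        CURLOPT_TIMEOUT => 30,           // give up after 30 seconds
        CURLOPT_USERAGENT => "Mozilla/5.0"
    ];
    $ch = curl_init();
    curl_setopt_array($ch, $options);
    $html = curl_exec($ch);
    $status = curl_getinfo($ch, CURLINFO_RESPONSE_CODE);
    curl_close($ch);
    // Stop early on failure: loadHTML() cannot parse false or an empty string.
    if ($html === false || $html === "" || $status !== 200) {
        exit("Request failed (HTTP status $status)\n");
    }
    $dom = new DOMDocument();
    libxml_use_internal_errors(true);  // silence warnings from imperfect real-world HTML
    $dom->loadHTML($html);
    libxml_clear_errors();
    $xpath = new DOMXPath($dom);
    // The class names in these queries must match the markup of the live page.
    $products = $xpath->query("//div[contains(@class, 'product-card')]");
    foreach ($products as $product) {
        // item(0) is null when nothing matches, so ?? supplies the fallback text.
        $name = trim($xpath->query(".//span[contains(@class, 'product-title')]", $product)->item(0)->nodeValue ?? "Name not available");
        $price = trim($xpath->query(".//span[contains(@class, 'product-price')]", $product)->item(0)->nodeValue ?? "Price not available");
        $rating = trim($xpath->query(".//span[contains(@class, 'rating')]", $product)->item(0)->nodeValue ?? "No rating available");
        echo "Name: $name, Price: $price, Rating: $rating\n";
    }
    ?>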
    

    This PHP script uses cURL to fetch a product listing page and DOMDocument with XPath to parse it. It extracts each product's title, price, and rating, and falls back to a default string when an element is missing. Note that it collects listing data rather than review text. To handle pagination, extend the script to find the link to the next page and repeat the request. Adding error handling, for example checking that the request succeeded and that each query matched at least one node, makes failures visible when the page structure changes instead of silently producing empty output.
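
    For the review details themselves (customer name, rating, review text), the same DOMDocument and XPath approach can be pointed at the review section. What follows is only a minimal sketch: the class names review, review-author, review-rating, and review-text are hypothetical placeholders rather than Nordstrom's real markup, and the helper names has_class, first_text, and parse_reviews are made up for illustration. It parses an HTML string, so it can be tried on a saved page before any request is sent.

    ```php
    <?php
    // Sketch only: "review", "review-author", "review-rating", and "review-text"
    // are hypothetical class names. Replace them after inspecting the real page.

    // Build an XPath test that matches one whole class token, so that "review"
    // does not also match "review-text".
    function has_class(string $class): string
    {
        return "contains(concat(' ', normalize-space(@class), ' '), ' $class ')";
    }

    // Return the trimmed text of the first node matching $query under $context,
    // or $default when nothing matches.
    function first_text(DOMXPath $xpath, string $query, DOMNode $context, string $default): string
    {
        $nodes = $xpath->query($query, $context);
        if ($nodes === false || $nodes->length === 0) {
            return $default;
        }
        return trim($nodes->item(0)->textContent);
    }

    // Extract author, rating, and text from every review block found in $html.
    function parse_reviews(string $html): array
    {
        if (trim($html) === '') {
            return [];  // loadHTML() rejects empty input
        }
        $dom = new DOMDocument();
        $previous = libxml_use_internal_errors(true);  // silence markup warnings
        $dom->loadHTML($html);
        libxml_clear_errors();
        libxml_use_internal_errors($previous);

        $xpath = new DOMXPath($dom);
        $reviews = [];
        foreach ($xpath->query('//div[' . has_class('review') . ']') as $node) {
            $reviews[] = [
                'author' => first_text($xpath, './/span[' . has_class('review-author') . ']', $node, 'Anonymous'),
                'rating' => first_text($xpath, './/span[' . has_class('review-rating') . ']', $node, 'No rating'),
                'text'   => first_text($xpath, './/span[' . has_class('review-text') . ']', $node, ''),
            ];
        }
        return $reviews;
    }

    // Usage with an inline sample instead of a live page.
    $sample = '<div class="review"><span class="review-author">Ana</span>'
            . '<span class="review-rating">5</span>'
            . '<span class="review-text"> Great fit. </span></div>';
    print_r(parse_reviews($sample));
    ```

    Returning an array of plain associative arrays keeps the parsing separate from the fetching, so the same function works on HTML from cURL, a file, or a test fixture.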

  • Heli Burhan

    Member
    12/20/2024 at 7:07 am

    A good way to enhance the scraper is to add pagination support so it gathers data from more than one page. Nordstrom often splits product listings across multiple pages, so the scraper needs to find each page's “Next” link and follow it until there are no more pages. Following these links lets you collect a complete dataset for a category. Adding randomized delays between requests spreads out the load on the server and makes the traffic look less automated, which lowers the chance of being blocked. The result is a more comprehensive view of Nordstrom’s product catalog.
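
    A rough sketch of that loop in PHP, assuming the next-page link is an anchor with rel="next" or a class containing "next" (a guess at the markup, not checked against the live site). The function names fetch_page, find_next_url, and crawl_pages, the page cap, and the 1 to 3 second delay range are all illustrative choices.

    ```php
    <?php
    // Sketch only: the "next" selectors, page limit, and delay range are assumptions.

    // Fetch a page with cURL; return the HTML, or null on a transport error
    // or any status other than 200.
    function fetch_page(string $url): ?string
    {
        $ch = curl_init();
        curl_setopt_array($ch, [
            CURLOPT_URL => $url,
            CURLOPT_RETURNTRANSFER => true,
            CURLOPT_FOLLOWLOCATION => true,
            CURLOPT_TIMEOUT => 30,
            CURLOPT_USERAGENT => "Mozilla/5.0",
        ]);
        $html = curl_exec($ch);
        $status = curl_getinfo($ch, CURLINFO_RESPONSE_CODE);
        return ($html !== false && $status === 200) ? $html : null;
    }

    // Find the "Next" link in $html and resolve it against $baseUrl.
    // Handles absolute, root-relative ("/path"), and query-only ("?page=2") links;
    // any other form returns null in this sketch.
    function find_next_url(string $html, string $baseUrl): ?string
    {
        if (trim($html) === '') {
            return null;
        }
        $dom = new DOMDocument();
        $previous = libxml_use_internal_errors(true);
        $dom->loadHTML($html);
        libxml_clear_errors();
        libxml_use_internal_errors($previous);

        $xpath = new DOMXPath($dom);
        $links = $xpath->query("//a[@rel='next' or contains(@class, 'next')]/@href");
        if ($links === false || $links->length === 0) {
            return null;
        }
        $href = trim($links->item(0)->nodeValue);
        if (preg_match('#^https?://#i', $href)) {
            return $href;
        }
        $parts = parse_url($baseUrl);
        if (!isset($parts['scheme'], $parts['host'])) {
            return null;
        }
        $origin = $parts['scheme'] . '://' . $parts['host']
                . (isset($parts['port']) ? ':' . $parts['port'] : '');
        if (strpos($href, '/') === 0) {
            return $origin . $href;
        }
        if (strpos($href, '?') === 0) {
            return $origin . ($parts['path'] ?? '/') . $href;
        }
        return null;
    }

    // Follow "Next" links from $startUrl, passing each page's HTML to $onPage.
    // $fetch is injectable so the loop can be exercised without network access.
    // Returns the number of pages processed.
    function crawl_pages(string $startUrl, callable $onPage, ?callable $fetch = null,
                         int $maxPages = 5, array $delayMs = [1000, 3000]): int
    {
        $fetch = $fetch ?? 'fetch_page';
        $url = $startUrl;
        $seen = [];   // guards against pagination loops
        $count = 0;
        while ($url !== null && $count < $maxPages && !isset($seen[$url])) {
            $seen[$url] = true;
            $html = $fetch($url);
            if ($html === null) {
                break;  // stop on the first failed request
            }
            $onPage($html, $url);
            $count++;
            $url = find_next_url($html, $url);
            if ($url !== null && $count < $maxPages) {
                // Random pause between requests; usleep() takes microseconds.
                usleep(random_int($delayMs[0], $delayMs[1]) * 1000);
            }
        }
        return $count;
    }
    ```

    A call such as crawl_pages($url, function ($html, $pageUrl) { /* parse $html here */ }); would then run the per-page parsing on each page in turn. The page cap and the visited-URL check are there so a broken or circular “Next” link cannot keep the loop running indefinitely.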
