
  • How can I scrape product reviews from Shopee Thailand using Node.js and Puppeteer?

    Posted by Evelia Judith on 12/11/2024 at 10:50 am

    Puppeteer is great for scraping Shopee Thailand, especially when the content is dynamically loaded via JavaScript. You’ll need to load the page and extract product reviews once the page is fully rendered. A common challenge is handling infinite scrolling, where products or reviews are loaded as you scroll down. Use page.evaluate() to grab reviews, ratings, and other metadata after the page has rendered.

    const puppeteer = require('puppeteer');
    (async () => {
        const browser = await puppeteer.launch({ headless: false });
        try {
            const page = await browser.newPage();
            await page.goto('https://shopee.co.th/product-page-url');
            // Wait for the review section to load
            await page.waitForSelector('.shopee-review-item');
            // Extract review data (class names may differ on the live site; check them in DevTools)
            const reviews = await page.evaluate(() => {
                // Return the trimmed text of a child element, or null if it is missing
                const text = (root, selector) => root.querySelector(selector)?.textContent.trim() ?? null;
                return Array.from(document.querySelectorAll('.shopee-review-item'), review => ({
                    username: text(review, '.shopee-user-name'),
                    rating: text(review, '.shopee-star-rating'),
                    comment: text(review, '.shopee-review-item__content')
                }));
            });
            console.log(reviews);
        } finally {
            // Close the browser even if a step above throws
            await browser.close();
        }
    })();
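
    For the infinite-scrolling case mentioned above, here is a minimal sketch of a scroll loop. It assumes reviews are appended to the DOM as you scroll. scrollUntilStable and its options (maxRounds, stableRounds, pauseMs) are hypothetical names, and the defaults are guesses rather than tuned values.

```javascript
// Sketch: keep scrolling until the number of matching elements stops growing.
// `page` is a Puppeteer Page; the option names and defaults are illustrative.
async function scrollUntilStable(page, selector, { maxRounds = 30, stableRounds = 3, pauseMs = 1000 } = {}) {
    let lastCount = 0;
    let unchanged = 0; // consecutive rounds without new elements
    for (let round = 0; round < maxRounds && unchanged < stableRounds; round++) {
        // Scroll one viewport down, then give lazy-loaded content time to arrive
        await page.evaluate(() => window.scrollBy(0, window.innerHeight));
        await new Promise(resolve => setTimeout(resolve, pauseMs));
        // Count the elements currently in the DOM
        const count = await page.$$eval(selector, els => els.length);
        unchanged = count > lastCount ? 0 : unchanged + 1;
        lastCount = count;
    }
    return lastCount;
}
```

    You would call it after page.goto() and before extracting, for example: await scrollUntilStable(page, '.shopee-review-item');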
    
  • 5 Replies
  • Anapa Jerilyn

    Member
    12/11/2024 at 11:21 am

    I map each cURL header to a key-value pair in the headers dictionary in Python. For complex authentication, I use Python’s requests.auth module or manual header configuration.

  • Prashant Sanjiv

    Member
    12/12/2024 at 7:22 am

    To scrape reviews from Shopee Thailand, navigate to the product page and extract the review data from it. Puppeteer lets you emulate user actions such as scrolling, so that the reviews load before you extract them. You can collect product ratings and user comments by targeting the right classes with page.$eval() or page.$$eval().

    const puppeteer = require('puppeteer');
    (async () => {
        const browser = await puppeteer.launch({ headless: true });
        const page = await browser.newPage();
        await page.goto('https://shopee.co.th/product-page-url');
        // Scroll one viewport down to trigger lazy loading of reviews
        await page.evaluate(() => {
            window.scrollBy(0, window.innerHeight);
        });
        // Wait until at least one review has rendered before scraping
        await page.waitForSelector('.review-item');
        // Scrape reviews; a missing child element yields null instead of throwing
        const reviews = await page.$$eval('.review-item', items => {
            return items.map(item => {
                const name = item.querySelector('.review-author')?.innerText ?? null;
                const rating = item.querySelector('.rating-star')?.innerText ?? null;
                const reviewText = item.querySelector('.review-text')?.innerText ?? null;
                return { name, rating, reviewText };
            });
        });
        console.log(reviews);
        await browser.close();
    })();
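
    If you want to reuse the extraction step with different class names, one option is to pass the selectors to page.$$eval() as an extra argument. This is only a sketch: extractReviews is a hypothetical helper, and the selectors in the usage comment are copied from the snippet above without being verified against the live site.

```javascript
// Runs inside the page via $$eval, so it must not reference outer variables.
// `selectors` maps output field names to CSS selectors (illustrative values).
function extractReviews(items, selectors) {
    return items.map(item => {
        const review = {};
        for (const [field, selector] of Object.entries(selectors)) {
            const el = item.querySelector(selector);
            // Missing elements become null instead of throwing a TypeError
            review[field] = el ? el.innerText.trim() : null;
        }
        return review;
    });
}

// Usage with a Puppeteer page:
// const reviews = await page.$$eval('.review-item', extractReviews, {
//     name: '.review-author',
//     rating: '.rating-star',
//     reviewText: '.review-text',
// });
```

    Extra arguments after the page function are serialized and handed to it in the browser, which is why the selectors are passed in rather than captured from the enclosing scope.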
    
  • Rutendo Urvashi

    Member
    12/12/2024 at 10:03 am

    Puppeteer is ideal for scraping Shopee Thailand because it handles JavaScript-rendered pages seamlessly. Scraping reviews involves targeting the right HTML elements that contain user ratings and comments. The challenge is often dealing with multiple review pages or infinite scrolling, so you may need to automate scrolling and capture reviews as they load.

    const puppeteer = require('puppeteer');
    (async () => {
        const browser = await puppeteer.launch({ headless: true });
        const page = await browser.newPage();
        await page.goto('https://shopee.co.th/product-page-url');
        // Jump to the bottom of the page to trigger loading of the reviews
        await page.evaluate(() => {
            window.scrollTo(0, document.body.scrollHeight);
        });
        // Wait until at least one review has rendered before extracting
        await page.waitForSelector('.shopee-review-item');
        // Extract reviews; a missing child element yields null instead of throwing
        const reviews = await page.$$eval('.shopee-review-item', reviews => {
            return reviews.map(review => ({
                user: review.querySelector('.review-username')?.innerText ?? null,
                rating: review.querySelector('.shopee-star-rating')?.innerText ?? null,
                text: review.querySelector('.shopee-review-item__content')?.innerText ?? null
            }));
        });
        console.log(reviews);
        await browser.close();
    })();
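
    When you capture reviews repeatedly as they load, the same review tends to show up in more than one batch. A rough sketch of de-duplicating across batches follows. mergeReviews is a hypothetical helper, and keying on user plus text is an assumption; a stable review ID would be a better key if the page exposes one.

```javascript
// Merge a freshly scraped batch into `seen` (a Map), skipping reviews
// that were already captured in an earlier batch.
function mergeReviews(seen, batch) {
    let added = 0;
    for (const review of batch) {
        // Assumed key: the same user posting the same text is the same review
        const key = JSON.stringify([review.user, review.text]);
        if (!seen.has(key)) {
            seen.set(key, review);
            added++;
        }
    }
    return added; // 0 means this batch contained nothing new
}

// Usage: call once per scroll round or review page, then read the result:
// const seen = new Map();
// mergeReviews(seen, batch);
// const allReviews = [...seen.values()];
```

    A return value of 0 is also a convenient signal to stop scrolling.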
    
  • Adelbert Nana

    Member
    12/13/2024 at 5:52 am

    Puppeteer can simulate human actions such as clicking and scrolling, which helps you capture all the reviews on a Shopee Thailand product page. Once on the page, you can extract details like the review text, the rating, and the reviewer’s name by querying the appropriate DOM elements.

    const puppeteer = require('puppeteer');
    (async () => {
        const browser = await puppeteer.launch({ headless: true });
        const page = await browser.newPage();
        await page.goto('https://shopee.co.th/product-page-url');
        // Wait for reviews to load
        await page.waitForSelector('.shopee-review-item');
        // Scrape review data; a missing child element yields null instead of throwing
        const reviews = await page.evaluate(() => {
            const reviewItems = document.querySelectorAll('.shopee-review-item');
            return Array.from(reviewItems).map(item => {
                const username = item.querySelector('.shopee-review-item__user-name')?.innerText ?? null;
                const rating = item.querySelector('.shopee-star-rating')?.innerText ?? null;
                const comment = item.querySelector('.shopee-review-item__content')?.innerText ?? null;
                return { username, rating, comment };
            });
        });
        console.log(reviews);
        await browser.close();
    })();
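
    For the clicking part, here is a sketch of stepping through paginated reviews. It assumes the review section has a "next page" control. collectAcrossPages, nextSelector, maxPages, and pauseMs are hypothetical names, and I have not checked what the pagination button on Shopee is actually called.

```javascript
// Sketch: collect item text from several review pages by clicking "next".
// `page` is a Puppeteer Page; all option names here are illustrative.
async function collectAcrossPages(page, { itemSelector, nextSelector, maxPages = 5, pauseMs = 1000 }) {
    const all = [];
    let previousKey = null;
    for (let pageNo = 0; pageNo < maxPages; pageNo++) {
        await page.waitForSelector(itemSelector);
        const texts = await page.$$eval(itemSelector, els => els.map(el => el.innerText.trim()));
        // If clicking "next" changed nothing, treat this as the last page
        const key = JSON.stringify(texts);
        if (key === previousKey) break;
        previousKey = key;
        all.push(...texts);
        // page.$ resolves to null when no element matches
        const nextButton = await page.$(nextSelector);
        if (!nextButton) break;
        await nextButton.click();
        // Fixed pause for the next page of reviews to render
        await new Promise(resolve => setTimeout(resolve, pauseMs));
    }
    return all;
}
```

    The fixed pause is the crude part. Waiting for a specific change in the DOM would be more reliable if you know what changes between pages.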
    
  • Zaheer Arethusa

    Member
    12/14/2024 at 6:29 am

    Scraping reviews from Shopee Thailand with Puppeteer involves interacting with the product page, loading additional reviews if necessary, and then parsing the content using custom selectors. Reviews load asynchronously, so use Puppeteer’s waiting functions to make sure you only scrape after they have appeared in the DOM.

    const puppeteer = require('puppeteer');
    (async () => {
        const browser = await puppeteer.launch({ headless: true });
        const page = await browser.newPage();
        await page.goto('https://shopee.co.th/product-page-url');
        // Wait for the first reviews to render
        await page.waitForSelector('.shopee-review-item');
        // Scroll one viewport down to trigger loading of further reviews
        await page.evaluate(() => {
            window.scrollBy(0, window.innerHeight);
        });
        // Give newly requested reviews a moment to render (2 s is an arbitrary pause)
        await new Promise(resolve => setTimeout(resolve, 2000));
        // Scrape review data; a missing child element yields null instead of throwing
        const reviews = await page.$$eval('.shopee-review-item', reviewItems => {
            return reviewItems.map(review => ({
                name: review.querySelector('.shopee-review-item__user-name')?.innerText ?? null,
                rating: review.querySelector('.shopee-star-rating')?.innerText ?? null,
                reviewText: review.querySelector('.shopee-review-item__content')?.innerText ?? null
            }));
        });
        console.log(reviews);
        await browser.close();
    })();
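
    As an example of the waiting functions, page.waitForFunction() can wait until more reviews exist than before a scroll. This is a sketch under assumptions: waitForMoreItems is a hypothetical helper, the 10 second timeout is arbitrary, and the timeout check assumes the rejection is an error named 'TimeoutError'.

```javascript
// Sketch: resolve true once more than `previousCount` elements match
// `selector`, or false if that does not happen within `timeoutMs`.
async function waitForMoreItems(page, selector, previousCount, timeoutMs = 10000) {
    try {
        await page.waitForFunction(
            // Runs in the browser; the arguments are passed in from Node below
            (sel, prev) => document.querySelectorAll(sel).length > prev,
            { timeout: timeoutMs },
            selector,
            previousCount
        );
        return true;
    } catch (err) {
        // Assumed: a timeout surfaces as an error whose name is 'TimeoutError'
        if (err.name === 'TimeoutError') return false;
        throw err; // anything else is a real failure
    }
}
```

    After a scroll you could call await waitForMoreItems(page, '.shopee-review-item', countBefore) and stop scrolling once it returns false.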
    
