Forum Replies Created

  • To scrape Shopee Thailand reviews, you can use Puppeteer to simulate human actions such as clicking and scrolling, so that reviews loaded on demand are rendered before you capture them. Once the page has loaded, you can extract details such as the review text, rating, and reviewer’s name by querying the appropriate DOM elements.

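    const puppeteer = require('puppeteer');

    (async () => {
        const browser = await puppeteer.launch({ headless: true });
        try {
            const page = await browser.newPage();
            // Placeholder URL: replace with a real product page
            await page.goto('https://shopee.co.th/product-page-url', { waitUntil: 'networkidle2' });
            // Scroll to the bottom so that lazily loaded reviews, if any, are rendered
            await page.evaluate(() => window.scrollTo(0, document.body.scrollHeight));
            // Wait for reviews to load (class names may differ on the live site)
            await page.waitForSelector('.shopee-review-item');
            // Scrape review data; a missing element yields null instead of throwing
            const reviews = await page.evaluate(() => {
                const text = (root, selector) =>
                    root.querySelector(selector)?.innerText?.trim() ?? null;
                return Array.from(document.querySelectorAll('.shopee-review-item')).map(item => ({
                    username: text(item, '.shopee-review-item__user-name'),
                    // May be empty if the rating is rendered as icons rather than text
                    rating: text(item, '.shopee-star-rating'),
                    comment: text(item, '.shopee-review-item__content'),
                }));
            });
            console.log(reviews);
        } finally {
            // Close the browser even if navigation or scraping fails
            await browser.close();
        }
    })().catch(console.error);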
    
  • Scraping Lazada Thailand with BeautifulSoup may also involve navigating through multiple result pages to collect every product. Pagination is often handled by JavaScript, but requests can still fetch each page’s HTML for parsing. The example below fetches the first page of product listings and parses each product’s title and price; ratings, where available, can be extracted the same way. To go deeper, you could follow the link on each listing to its individual product page.

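    import requests
    from bs4 import BeautifulSoup

    url = 'https://www.lazada.co.th/catalog/?q=tv'
    headers = {'User-Agent': 'Mozilla/5.0'}
    response = requests.get(url, headers=headers, timeout=10)
    response.raise_for_status()  # Stop early on HTTP errors
    soup = BeautifulSoup(response.text, 'html.parser')

    # Get all products listed on the first page
    # (these class names may change when the site is updated)
    products = soup.find_all('div', class_='c2prKC')
    for product in products:
        title_tag = product.find('div', class_='c16H9d')
        price_tag = product.find('span', class_='c13VH6')
        if title_tag is None or price_tag is None:
            continue  # Skip cards that lack a title or price
        title = title_tag.get_text(strip=True)
        price = price_tag.get_text(strip=True)
        print(f'Title: {title}, Price: {price}')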
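
    The multi-page navigation mentioned in this reply could be sketched roughly as follows. This is only an illustration under stated assumptions: the `page` query parameter, the helper names (`page_url`, `fetch_html`, `parse_products`, `crawl_pages`), and the rule of stopping at the first empty page are hypothetical rather than documented Lazada behavior, and the class names are simply the ones used in the example above.

```python
import time
from urllib.parse import urlencode

BASE_URL = 'https://www.lazada.co.th/catalog/'
HEADERS = {'User-Agent': 'Mozilla/5.0'}


def page_url(query, page):
    """Build a listing URL (assumes the site accepts a `page` parameter)."""
    return BASE_URL + '?' + urlencode({'q': query, 'page': page})


def fetch_html(url):
    """Download one page of HTML."""
    # Imported here so the pagination logic can be tried without requests installed
    import requests
    response = requests.get(url, headers=HEADERS, timeout=10)
    response.raise_for_status()
    return response.text


def parse_products(html):
    """Extract title/price pairs using the class names from the example above."""
    # Imported here so the pagination logic can be tried without bs4 installed
    from bs4 import BeautifulSoup
    soup = BeautifulSoup(html, 'html.parser')
    items = []
    for card in soup.find_all('div', class_='c2prKC'):
        title = card.find('div', class_='c16H9d')
        price = card.find('span', class_='c13VH6')
        if title is not None and price is not None:
            items.append({'title': title.get_text(strip=True),
                          'price': price.get_text(strip=True)})
    return items


def crawl_pages(query, max_pages=3, delay=1.0, fetch=fetch_html, parse=parse_products):
    """Collect products page by page, stopping at the first empty page."""
    results = []
    for page in range(1, max_pages + 1):
        products = parse(fetch(page_url(query, page)))
        if not products:
            break  # Assumed signal that there are no further pages
        results.extend(products)
        time.sleep(delay)  # Pause between requests to avoid hammering the site
    return results
```

    Calling `crawl_pages('tv', max_pages=2)` would request the first two result pages. The `fetch` and `parse` arguments are parameters so the loop can be tried with stub functions before touching the live site. If the listings turn out to be rendered by JavaScript, the returned list may be empty, in which case a browser-based tool such as Puppeteer is likely the better fit.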