Forum Replies Created

  • To scrape customer reviews from Tesco Lotus Thailand using Puppeteer, you’ll need to handle dynamic content. After navigating to a product page, wait for the reviews section to finish loading before extracting review details such as user ratings and comments with CSS selectors. It’s also important to handle delays in loading reviews, since some content may be lazy-loaded as you scroll. Puppeteer’s waitForSelector() method ensures the review container is present before scraping, and optional chaining guards against reviews that are missing a field. Note that the URL and selectors below are placeholders — inspect the actual page markup and adjust them.

    const puppeteer = require('puppeteer');
    (async () => {
        const browser = await puppeteer.launch({ headless: true });
        const page = await browser.newPage();
        await page.goto('https://www.tescolotus.com/product-page', { waitUntil: 'networkidle2' });
        // Ensure the review section is visible
        await page.waitForSelector('.review-section');
        // Extract reviews
        const reviewData = await page.evaluate(() => {
            const reviews = [];
            const reviewElements = document.querySelectorAll('.review');
            reviewElements.forEach(review => {
                const reviewerName = review.querySelector('.reviewer-name')?.innerText;
                const rating = review.querySelector('.rating')?.innerText;
                const comment = review.querySelector('.review-comment')?.innerText;
                reviews.push({ reviewerName, rating, comment });
            });
            return reviews;
        });
        console.log(reviewData);
        await browser.close();
    })();
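
    Ratings scraped this way come back as raw text. If you dump reviewData to JSON and post-process it in Python, a minimal sketch for converting a rating string to a number — the “4 out of 5” and “4.5/5” formats are assumptions, so check what the site actually renders:

    ```python
    import re

    def parse_rating(text):
        """Extract a numeric rating from strings like '4 out of 5' or '4.5/5'.

        Returns None when the input is missing or contains no number
        (e.g. the selector matched nothing). The accepted formats are
        assumptions; adjust the pattern to the real markup.
        """
        if not text:
            return None
        match = re.search(r'(\d+(?:\.\d+)?)', text)
        return float(match.group(1)) if match else None

    print(parse_rating('4 out of 5'))  # 4.0
    print(parse_rating('4.5/5'))       # 4.5
    print(parse_rating(None))          # None
    ```

    Returning None instead of raising keeps one malformed review from aborting a whole batch.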
    
  • Scraping Central’s Thai storefront (central.co.th) requires attention to both static and dynamic content, especially for prices and product availability. Scrapy excels at parsing static HTML, but it does not execute JavaScript, so if prices or stock status load dynamically you may need an integration such as scrapy-playwright or Splash. Scraping prices and availability can also require extra logic to account for out-of-stock items or sale prices, and you should handle pagination so you cover every product in a category. The XPath expressions below are placeholders — verify them against the live page.

    import scrapy
    class CentralPriceScraper(scrapy.Spider):
        name = 'central_price_scraper'
        start_urls = ['https://www.central.co.th/en/shop/']
        def parse(self, response):
            for product in response.xpath('//div[@class="product-item"]'):
                title = product.xpath('.//h2[@class="product-title"]/text()').get()
                price = product.xpath('.//span[@class="price"]/text()').get()
                availability = product.xpath('.//div[@class="availability-status"]/text()').get()
            yield {
                'product': (title or '').strip(),
                'price': (price or '').strip(),
                'availability': (availability or '').strip(),
            }
            # Follow pagination
            next_page = response.xpath('//a[contains(@class, "next-page")]/@href').get()
            if next_page:
                yield response.follow(next_page, self.parse)
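
    The price and availability strings the spider yields usually need normalization before storage or comparison. A hedged sketch — the “฿1,290.50” price format and the “out of stock” label are assumptions about the markup, not confirmed from the site:

    ```python
    import re

    def normalize_price(raw):
        """Convert a price string such as '฿1,290.50' to a float, or None.

        Strips everything except digits and the decimal point; the baht
        symbol and comma thousands separators are assumed formats.
        """
        if not raw:
            return None
        digits = re.sub(r'[^\d.]', '', raw)
        return float(digits) if digits else None

    def in_stock(raw):
        """Map an availability label to a boolean.

        The 'out of stock' wording is an assumption; adjust to the real label.
        """
        if not raw:
            return None
        return 'out of stock' not in raw.lower()

    print(normalize_price('฿1,290.50'))  # 1290.5
    print(in_stock('Out of stock'))      # False
    ```

    You could call these from an item pipeline so the spider itself stays focused on extraction.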