Forum Replies Created

Adelbert Nana

Member

12/13/2024 at 5:52 am in reply to: How can I scrape product reviews from Shopee Thailand using Node.js and Puppeteer?

To scrape Shopee Thailand reviews, you can use Puppeteer to simulate user actions such as clicking and scrolling, which helps load all the reviews on a product page. Once the reviews are rendered, you can extract details like the review text, the rating, and the reviewer’s name by querying the matching DOM elements. The class names in the example below are illustrative and may differ on the live site.

```
const puppeteer = require('puppeteer');

(async () => {
    const browser = await puppeteer.launch({ headless: true });
    try {
        const page = await browser.newPage();
        // Replace with the URL of the product page you want to scrape
        await page.goto('https://shopee.co.th/product-page-url');
        // Wait for reviews to load
        await page.waitForSelector('.shopee-review-item');
        // Scrape review data; a missing element yields null instead of throwing
        const reviews = await page.evaluate(() => {
            const text = (root, selector) => {
                const el = root.querySelector(selector);
                return el ? el.innerText.trim() : null;
            };
            const reviewItems = document.querySelectorAll('.shopee-review-item');
            return Array.from(reviewItems).map(item => ({
                username: text(item, '.shopee-review-item__user-name'),
                // May be empty if the stars are rendered as icons rather than text
                rating: text(item, '.shopee-star-rating'),
                comment: text(item, '.shopee-review-item__content'),
            }));
        });
        console.log(reviews);
    } finally {
        // Close the browser even if scraping fails
        await browser.close();
    }
})();
```

Adelbert Nana

Member
12/13/2024 at 5:50 am in reply to: How can I scrape product data from Lazada Thailand using Python and BeautifulSoup?
Scraping Lazada Thailand with BeautifulSoup can also involve navigating through multiple pages to collect all products. Pagination controls are often driven by JavaScript, but requests can still fetch each page’s HTML for parsing as long as the listings are present in that HTML. In this example, we fetch the product listings on the first page and parse the title and price of each product. If you want to go deeper, you could follow the link to each individual product page and parse further details there, such as ratings.
```
import requests
from bs4 import BeautifulSoup

url = 'https://www.lazada.co.th/catalog/?q=tv'
headers = {'User-Agent': 'Mozilla/5.0'}
response = requests.get(url, headers=headers, timeout=10)
response.raise_for_status()  # Stop early on HTTP errors
soup = BeautifulSoup(response.text, 'html.parser')

# Get all products listed on the first page (class names may change over time)
products = soup.find_all('div', class_='c2prKC')
for product in products:
    title_tag = product.find('div', {'class': 'c16H9d'})
    price_tag = product.find('span', {'class': 'c13VH6'})
    # Skip cards that are missing a title or a price
    if title_tag is None or price_tag is None:
        continue
    title = title_tag.text.strip()
    price = price_tag.text.strip()
    print(f'Title: {title}, Price: {price}')
```
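
To go beyond the first page, one option is to request one URL per results page. Here is a minimal sketch of that idea, assuming the catalog selects a page through a `page` query parameter. That parameter name is an assumption, not something confirmed for Lazada, so check the URLs the site actually uses. `build_page_urls` is a hypothetical helper written for this illustration.

```python
from urllib.parse import parse_qsl, urlencode, urlparse, urlunparse


def build_page_urls(base_url, last_page, param='page'):
    """Return one URL per results page, from page 1 to last_page.

    Assumption: the site picks the page from a query parameter
    named `param`; this is not confirmed for Lazada.
    """
    parts = urlparse(base_url)
    # Keep existing query parameters (e.g. q=tv), dropping any old page value
    query = [(k, v) for k, v in parse_qsl(parts.query) if k != param]
    urls = []
    for number in range(1, last_page + 1):
        page_query = urlencode(query + [(param, str(number))])
        urls.append(urlunparse(parts._replace(query=page_query)))
    return urls


urls = build_page_urls('https://www.lazada.co.th/catalog/?q=tv', 3)
for page_url in urls:
    print(page_url)
```

Each generated URL can then be passed to `requests.get` in place of the single `url`, ideally with a short pause between requests to avoid hammering the server.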