
  • Use Node.js to scrape seller ratings from JD.com product pages

    Posted by Ariah Alfred on 12/13/2024 at 8:30 am

    JD.com, one of the largest e-commerce websites in China, renders much of its product-page content with JavaScript, so scraping seller ratings means working with dynamically loaded pages. Node.js with Puppeteer suits this task because Puppeteer drives a real browser and can read content after the page’s scripts have run. Seller ratings on JD.com are typically displayed near the product description or alongside the seller’s name, usually as numerical values or percentages, and they are a useful signal of the seller’s reliability and quality.
    The script uses Puppeteer to navigate to a product page on JD.com and waits for the seller ratings section to appear in the DOM. It then selects the element containing the rating and extracts its text content. Error handling accounts for cases where the ratings section is missing or fails to load. The product URL and the .seller-info .rating selector are placeholders, so inspect the live page structure to find the actual ones. Below is a complete implementation of the script using Puppeteer:

    const puppeteer = require('puppeteer');

    (async () => {
        const browser = await puppeteer.launch({ headless: true });
        try {
            const page = await browser.newPage();
            // Navigate to a JD.com product page (placeholder product ID)
            await page.goto('https://item.jd.com/123456.html', { waitUntil: 'networkidle2' });
            // Wait up to 10 seconds for the seller ratings section to load
            await page.waitForSelector('.seller-info .rating', { timeout: 10000 });
            // Extract the seller rating text
            const sellerRating = await page.evaluate(() => {
                const ratingElement = document.querySelector('.seller-info .rating');
                return ratingElement ? ratingElement.innerText.trim() : 'Seller rating not found';
            });
            console.log('Seller Rating:', sellerRating);
        } catch (error) {
            // Reached when navigation fails or the ratings section never appears
            console.error('Failed to scrape seller rating:', error.message);
        } finally {
            // Always release the browser, even after an error
            await browser.close();
        }
    })();
    
  • 4 Replies
  • Lencho Coenraad

    Member
    12/14/2024 at 9:53 am

    The script could be improved by adding support for scraping seller ratings across multiple products. By extracting links to similar products on the page, the script could navigate through them and gather data at scale.
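
    A rough sketch of that idea follows. It assumes related products are linked with anchors whose href contains item.jd.com and reuses the placeholder .seller-info .rating selector from the original script; neither is confirmed against the live site. The helper names (normalizeProductUrls, scrapeRating, scrapeRelatedRatings) are made up for this example, and `page` is an already opened Puppeteer page.

```javascript
// Keep only JD.com product links, normalize them, and drop duplicates.
function normalizeProductUrls(hrefs, limit = 10) {
    const seen = new Set();
    for (const href of hrefs) {
        if (seen.size >= limit) break;
        // Accept http, https, and protocol-relative (//item.jd.com/...) links.
        const match = /^(?:https?:)?\/\/item\.jd\.com\/(\d+)\.html/.exec(href);
        if (match) seen.add(`https://item.jd.com/${match[1]}.html`);
    }
    return [...seen];
}

// Scrape one product page; returns null when the rating cannot be read.
async function scrapeRating(page, url) {
    try {
        await page.goto(url, { waitUntil: 'networkidle2' });
        // Placeholder selector taken from the original script.
        await page.waitForSelector('.seller-info .rating', { timeout: 10000 });
        return await page.$eval('.seller-info .rating', el => el.innerText.trim());
    } catch (error) {
        return null; // section missing or page failed to load
    }
}

// Collect product links from a starting page, then visit them one by one.
async function scrapeRelatedRatings(page, startUrl) {
    await page.goto(startUrl, { waitUntil: 'networkidle2' });
    // Assumed link pattern; adjust after inspecting the real page.
    const hrefs = await page.$$eval('a[href*="item.jd.com"]',
        links => links.map(a => a.getAttribute('href')));
    const results = [];
    for (const url of normalizeProductUrls(hrefs)) {
        results.push({ url, rating: await scrapeRating(page, url) });
    }
    return results;
}
```

    Visiting the pages sequentially on one tab keeps the request rate low; the limit argument caps how many related products are followed per starting page.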

  • Javed Roland

    Member
    12/17/2024 at 7:47 am

    Integrating a feature to handle location-specific seller ratings would enhance the script. JD.com may show different ratings or policies depending on the buyer’s region, so simulating location-based inputs would provide more accurate data.
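
    One possible starting point is to override the browser's reported geolocation, sketched below. This assumes JD.com reads the browser geolocation at all, which is unverified; the site may instead derive the region from the account address, cookies, or the request IP, in which case a proxy in the target region would be needed. The REGIONS table and the function names are hypothetical, and the coordinates are only approximate city centers.

```javascript
// Illustrative city-center coordinates; not values JD.com is known to use.
const REGIONS = new Map([
    ['beijing', { latitude: 39.9042, longitude: 116.4074 }],
    ['shanghai', { latitude: 31.2304, longitude: 121.4737 }],
]);

// Look up coordinates for a region name (case-insensitive).
function regionCoordinates(name) {
    const coords = REGIONS.get(name.trim().toLowerCase());
    if (!coords) throw new Error(`Unknown region: ${name}`);
    return coords;
}

// `browser` is a launched Puppeteer Browser instance.
async function openPageForRegion(browser, regionName) {
    const coords = regionCoordinates(regionName);
    // Grant the geolocation permission so the page is not blocked by a prompt.
    const context = browser.defaultBrowserContext();
    await context.overridePermissions('https://item.jd.com', ['geolocation']);
    const page = await browser.newPage();
    // Report the chosen coordinates to any script that asks for the location.
    await page.setGeolocation(coords);
    return page;
}
```

    The returned page can then be passed to the same navigation and extraction steps as in the original script, once per region, so the ratings can be compared.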

  • Elen Honorata

    Member
    12/18/2024 at 7:22 am

    The script could store the extracted ratings in a database or JSON file instead of printing them to the console. This would make it easier to analyze and share the data, especially when working with large datasets.
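
    As a minimal sketch of the JSON-file option, the helper below appends each scraped rating to a JSON array on disk using only Node's built-in fs module. The function name saveRating, the record fields, and the file name in the usage note are assumptions made for this example.

```javascript
const fs = require('fs');

// Append one record to a JSON array file, creating the file if needed.
// Returns the number of records stored after the write.
function saveRating(filePath, record) {
    let records = [];
    if (fs.existsSync(filePath)) {
        records = JSON.parse(fs.readFileSync(filePath, 'utf8'));
    }
    // Timestamp each record so repeated runs can be told apart.
    records.push({ ...record, scrapedAt: new Date().toISOString() });
    fs.writeFileSync(filePath, JSON.stringify(records, null, 2));
    return records.length;
}
```

    In the original script, the console.log call could be replaced with something like saveRating('ratings.json', { url, rating: sellerRating }). Rewriting the whole file on every call is fine for small runs; for large datasets a database or a line-per-record file would likely scale better.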

  • Subhash Siddiqa

    Member
    12/19/2024 at 5:45 am

    Adding user-agent rotation and proxy support would make the scraper more resilient to detection by JD.com’s anti-bot measures, reducing the risk of interruptions and IP bans during longer scraping runs.
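
    A simple round-robin version of this could look like the sketch below. The user-agent strings are examples only, the proxy hosts are hypothetical placeholders, and the helper names are invented for this post. The puppeteer module is passed in as an argument, so the helper functions can be tried without launching a browser.

```javascript
// Example pools; replace with current user-agent strings and your own proxies.
const USER_AGENTS = [
    'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36',
    'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36',
];
const PROXIES = [
    'http://proxy1.example.com:8080', // hypothetical
    'http://proxy2.example.com:8080', // hypothetical
];

// Round-robin selection: request 0 gets item 0, request 1 gets item 1, and so on.
function rotate(pool, requestIndex) {
    return pool[requestIndex % pool.length];
}

// Chromium takes the proxy as a command-line flag at launch time.
function buildLaunchOptions(proxyUrl) {
    return { headless: true, args: [`--proxy-server=${proxyUrl}`] };
}

// Launch a browser behind the next proxy and set the next user agent.
// `puppeteer` is the module returned by require('puppeteer').
async function openRotatedPage(puppeteer, requestIndex) {
    const browser = await puppeteer.launch(
        buildLaunchOptions(rotate(PROXIES, requestIndex)));
    const page = await browser.newPage();
    await page.setUserAgent(rotate(USER_AGENTS, requestIndex));
    return { browser, page };
}
```

    Because the proxy is a launch flag, each proxy change needs a new browser instance. Proxies that require credentials would also need a page.authenticate call before the first navigation.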
