Use Node.js to scrape seller ratings from JD.com product pages
Scraping seller ratings from JD.com, one of the largest e-commerce websites in China, means handling content that is rendered dynamically by JavaScript. Node.js with Puppeteer suits this task because Puppeteer drives a headless Chromium browser, so the script can read content only after the page's scripts have rendered it. Seller ratings on JD.com are typically displayed near the product description or alongside the seller’s name. They help a buyer judge the reliability and quality of the seller and are often shown as numerical values or percentages.
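Since ratings may appear either as a plain score or as a percentage, it can help to normalize the scraped text before storing or comparing it. The sketch below is only an illustration: the helper name parseRating and the sample strings are assumptions, not JD.com's confirmed format.

```javascript
// Hypothetical helper: turns scraped rating text into a structured value.
// The accepted formats ("9.6", "98%") are assumptions about the page text.
function parseRating(text) {
  const cleaned = String(text).trim();
  // Capture the first number and an optional trailing percent sign.
  const match = cleaned.match(/(\d+(?:\.\d+)?)\s*(%?)/);
  if (!match) {
    return null; // no numeric rating found in the text
  }
  return {
    value: Number(match[1]),
    unit: match[2] === '%' ? 'percent' : 'score',
  };
}

console.log(parseRating('98%'));
console.log(parseRating(' 9.6 '));
console.log(parseRating('Seller rating not found'));
```

Returning null for text without a number keeps a "not found" message from being mistaken for a rating further down the pipeline.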
To begin, the script uses Puppeteer to navigate to a product page on JD.com. It waits for the seller ratings section to load so that the data is present in the DOM. The element holding the rating is located with a CSS selector found by inspecting the page structure; the selector used here, .seller-info .rating, should be checked against the live page. The script then extracts that element's text content. Error handling covers the cases where the ratings section is missing or fails to load, and the browser is closed either way. Below is a complete implementation of the script using Puppeteer:

const puppeteer = require('puppeteer');

(async () => {
  const browser = await puppeteer.launch({ headless: true });
  try {
    const page = await browser.newPage();

    // Navigate to a JD.com product page
    await page.goto('https://item.jd.com/123456.html', { waitUntil: 'networkidle2' });

    // Wait for the seller ratings section to load (throws on timeout)
    await page.waitForSelector('.seller-info .rating');

    // Extract seller ratings
    const sellerRating = await page.evaluate(() => {
      const ratingElement = document.querySelector('.seller-info .rating');
      return ratingElement ? ratingElement.innerText.trim() : 'Seller rating not found';
    });

    console.log('Seller Rating:', sellerRating);
  } catch (error) {
    // Reached when navigation fails or the ratings section never appears
    console.error('Failed to read seller rating:', error.message);
  } finally {
    // Always release the browser, even after an error
    await browser.close();
  }
})();
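The missing-section case described above can also be isolated in a small helper that returns null instead of throwing. This is a minimal sketch: the name readRatingText and the 10-second default timeout are assumptions, and page is assumed to be a Puppeteer Page object.

```javascript
// Hypothetical helper: returns the trimmed text of `selector`, or null when
// the element never appears. `page` is assumed to be a Puppeteer Page.
async function readRatingText(page, selector, timeoutMs = 10000) {
  try {
    // waitForSelector rejects if no match appears before the timeout.
    await page.waitForSelector(selector, { timeout: timeoutMs });
    // $eval runs the callback in the page on the first matching element.
    return await page.$eval(selector, (el) => el.innerText.trim());
  } catch (err) {
    console.warn(`Could not read "${selector}": ${err.message}`);
    return null;
  }
}

// Usage inside the script's async function, once `page` exists:
//   const rating = await readRatingText(page, '.seller-info .rating');
//   console.log('Seller Rating:', rating ?? 'Seller rating not found');
```

Taking page as a parameter keeps the helper independent of how the browser is launched, so it can be exercised with a stub object instead of a live page.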