
  • How to scrape product information from BestBuy.com using JavaScript?

    Posted by Silvija Mailcun on 12/19/2024 at 11:10 am

    Scraping product information from BestBuy.com using JavaScript can be accomplished with Node.js and a library like Puppeteer. Puppeteer is an excellent choice for handling dynamic content, as it allows you to automate a headless browser and scrape data such as product names, prices, and ratings. The process involves launching a browser instance, navigating to the desired product page, and extracting the required details using DOM selectors. This approach ensures that all JavaScript-rendered content is fully loaded before scraping. Below is an example script to scrape BestBuy product data using Node.js and Puppeteer.

    const puppeteer = require('puppeteer');

    (async () => {
        // Launch a headless browser instance
        const browser = await puppeteer.launch({ headless: true });
        const page = await browser.newPage();

        // Navigate to the search results page and wait for network activity to settle
        const url = 'https://www.bestbuy.com/site/searchpage.jsp?st=laptops';
        await page.goto(url, { waitUntil: 'networkidle2' });

        // Extract name, price, and rating from each product card in the page context
        const products = await page.evaluate(() => {
            const productData = [];
            const items = document.querySelectorAll('.sku-item');
            items.forEach(item => {
                // Optional chaining guards against missing elements; fall back to placeholder text
                const name = item.querySelector('.sku-title')?.textContent.trim() || 'Name not available';
                const price = item.querySelector('.priceView-customer-price > span')?.textContent.trim() || 'Price not available';
                const rating = item.querySelector('.sr-only')?.textContent.trim() || 'No rating';
                productData.push({ name, price, rating });
            });
            return productData;
        });

        console.log(products);
        await browser.close();
    })();
    

    This script automates a headless browser to navigate BestBuy’s product search page, extract data such as names, prices, and ratings, and print the results to the console. To handle pagination, you can implement logic that finds and clicks the “Next” button for additional pages. Adding delays between requests and routing traffic through proxies reduces the risk of being blocked by BestBuy’s anti-scraping mechanisms. The extracted data can be stored in a database or written to a file for further analysis, as in the sketch below.
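
    For example, the results could be written to disk instead of only logged, with a small delay helper reused between page loads. This is a minimal sketch meant to slot into the async function above after the products array has been collected; the filename and delay length are arbitrary choices for illustration.

    const fs = require('fs');

    // Small helper to pause between requests (duration is an arbitrary choice)
    const delay = ms => new Promise(resolve => setTimeout(resolve, ms));

    // Inside the async function, after `products` has been collected:
    fs.writeFileSync('bestbuy-products.json', JSON.stringify(products, null, 2));
    await delay(3000); // wait a few seconds before navigating to another page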

  • Jeanne Dajana

    Member
    12/20/2024 at 8:32 am

    To improve the scraper, add pagination support so the dataset is more comprehensive. BestBuy’s product listings often span multiple pages, and handling the “Next” button programmatically lets you gather every product in a category. Using Puppeteer’s click function, you can simulate clicking the “Next” button and scrape additional pages in a loop, as in the sketch below. Introducing a delay between page loads keeps the scraper from overloading the server and ensures your dataset includes all relevant products across multiple pages.
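
    One way to wire that up is a loop that scrapes the current page, looks for a “Next” link, clicks it, and waits for the next page to load. The sketch below reuses the same '.sku-item' cards as the script above; the 'a.sku-list-page-next' selector is an assumption and should be verified against the live page markup before relying on it.

    const puppeteer = require('puppeteer');

    (async () => {
        const browser = await puppeteer.launch({ headless: true });
        const page = await browser.newPage();
        await page.goto('https://www.bestbuy.com/site/searchpage.jsp?st=laptops', { waitUntil: 'networkidle2' });

        const allProducts = [];
        let hasNextPage = true;

        while (hasNextPage) {
            // Wait for product cards to render, then scrape the current page
            await page.waitForSelector('.sku-item');
            const pageProducts = await page.evaluate(() =>
                Array.from(document.querySelectorAll('.sku-item')).map(item => ({
                    name: item.querySelector('.sku-title')?.textContent.trim() || 'Name not available',
                    price: item.querySelector('.priceView-customer-price > span')?.textContent.trim() || 'Price not available',
                }))
            );
            allProducts.push(...pageProducts);

            // The "Next" selector below is an assumption; inspect the page to confirm it
            const nextButton = await page.$('a.sku-list-page-next');
            if (nextButton) {
                await Promise.all([
                    page.waitForNavigation({ waitUntil: 'networkidle2' }),
                    nextButton.click(),
                ]);
                // Pause between page loads to avoid overloading the server
                await new Promise(resolve => setTimeout(resolve, 2000));
            } else {
                hasNextPage = false;
            }
        }

        console.log(`Scraped ${allProducts.length} products across all pages`);
        await browser.close();
    })();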
