
  • Extract shipping policies from Ceneo Poland using Node.js

    Posted by Monica Nerva on 12/13/2024 at 10:59 am

    Ceneo is one of the most popular price comparison websites in Poland, helping customers find the best deals on various products. Shipping policies on Ceneo are usually set by individual sellers, so scraping them means targeting the delivery details attached to each offer. These policies typically include shipping costs, estimated delivery times, and options like courier or pickup services. Using Node.js and Puppeteer, you can handle dynamic content on Ceneo’s website so that the delivery details are fully loaded before they are extracted.
    The script begins by navigating to a product page and waiting for the shipping policies section to appear. Puppeteer gives you direct control over the browser session, so you can query the specific elements that display shipping-related information. If some details are hidden behind lazy loading or expandable sections, you can also simulate scrolling or clicks before extracting. Below is a basic implementation for extracting shipping policies from Ceneo Poland; the URL is a placeholder, and the CSS selectors should be checked against the live page markup:

    const puppeteer = require('puppeteer');

    (async () => {
        const browser = await puppeteer.launch({ headless: true });
        try {
            const page = await browser.newPage();
            // Navigate to a Ceneo product page (placeholder URL; substitute a real product link)
            await page.goto('https://www.ceneo.pl/product-page', { waitUntil: 'networkidle2' });
            // Wait for the shipping policies section to load (throws if it has not appeared after 15 s)
            await page.waitForSelector('.shipping-info', { timeout: 15000 });
            // Extract one { name, cost, time } record per delivery option.
            // Verify these class names against the live page before relying on them.
            const shippingPolicies = await page.evaluate(() => {
                const textOf = (root, selector, fallback) =>
                    root.querySelector(selector)?.innerText.trim() || fallback;
                return Array.from(document.querySelectorAll('.shipping-info .option'), option => ({
                    name: textOf(option, '.delivery-type', 'No name'),
                    cost: textOf(option, '.delivery-cost', 'No cost'),
                    time: textOf(option, '.delivery-time', 'No time specified'),
                }));
            });
            console.log('Shipping Policies:', shippingPolicies);
        } catch (err) {
            console.error('Failed to extract shipping policies:', err.message);
            process.exitCode = 1;
        } finally {
            // Always release the browser, even when navigation or extraction fails
            await browser.close();
        }
    })();
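
The post mentions scrolling and clicking on expandable sections, but the script itself does not perform either step. A minimal sketch of how that might look is below. The helper name `revealHiddenDetails` and the `.show-more-delivery` selector are hypothetical and would need to be replaced with whatever toggle element the live page actually uses.

```javascript
// Clicks every element matching `toggleSelector` so that collapsed content is
// rendered, then scrolls to the bottom of the page to trigger lazy loading.
// ASSUMPTION: '.show-more-delivery' is a hypothetical selector, not taken from Ceneo.
// `page` is expected to be a Puppeteer Page object.
async function revealHiddenDetails(page, toggleSelector = '.show-more-delivery') {
    const toggles = await page.$$(toggleSelector);
    let clicked = 0;
    for (const toggle of toggles) {
        try {
            await toggle.click();
            clicked++;
        } catch (err) {
            // The element may be detached or not clickable; skip it and continue.
        }
    }
    // Scroll to the bottom so content loaded on scroll has a chance to appear.
    await page.evaluate(() => window.scrollTo(0, document.body.scrollHeight));
    return clicked; // number of toggles that were successfully clicked
}
```

In the script above, this could be called as `await revealHiddenDetails(page);` after `waitForSelector` and before `page.evaluate`.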
    
  • 2 Replies
  • Jaroslav Bohumil

    Member
    12/17/2024 at 8:31 am

    The script could include additional error handling to manage cases where the shipping information section fails to load. Adding retries or fallback messages would ensure robustness in the scraping process.
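
One way the retry and fallback idea could be sketched is shown below. The helper names `withRetries` and `loadShippingSection`, the attempt count, and the delay are illustrative assumptions; only `.shipping-info` comes from the original script.

```javascript
// Generic retry helper: runs `task` up to `attempts` times, pausing `delayMs`
// between failed tries, and rethrows the last error if every attempt fails.
async function withRetries(task, { attempts = 3, delayMs = 2000 } = {}) {
    let lastError;
    for (let attempt = 1; attempt <= attempts; attempt++) {
        try {
            return await task(attempt);
        } catch (err) {
            lastError = err;
            if (attempt < attempts) {
                await new Promise(resolve => setTimeout(resolve, delayMs));
            }
        }
    }
    throw lastError;
}

// Loads a product page and waits for the shipping section.
// Returns true on success, or logs a fallback message and returns false
// when the section never appears. `page` is a Puppeteer Page object.
async function loadShippingSection(page, url, options = {}) {
    try {
        await withRetries(async () => {
            await page.goto(url, { waitUntil: 'networkidle2' });
            await page.waitForSelector('.shipping-info', { timeout: 10000 });
        }, options);
        return true;
    } catch (err) {
        console.warn(`Shipping section not available at ${url}: ${err.message}`);
        return false;
    }
}
```

The caller can then skip extraction when `loadShippingSection` returns `false` instead of letting the whole run crash.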

  • Lugos Shanti

    Member
    12/19/2024 at 11:04 am

    Extending the script to scrape shipping policies for multiple products would make it more versatile. This can be achieved by collecting product links from a category page and iterating through them.
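
A rough sketch of that approach follows. The `a.product-link` selector and the helper names `collectProductLinks` and `scrapeMany` are hypothetical, and `scrapeOne` stands in for whatever per-product extraction function you already have.

```javascript
// Collects unique product URLs from a category page.
// ASSUMPTION: 'a.product-link' is a hypothetical selector; inspect the
// category page to find the real one. `page` is a Puppeteer Page object.
async function collectProductLinks(page, categoryUrl, linkSelector = 'a.product-link') {
    await page.goto(categoryUrl, { waitUntil: 'networkidle2' });
    const links = await page.$$eval(linkSelector, anchors => anchors.map(a => a.href));
    return [...new Set(links)]; // drop duplicate links
}

// Visits each URL in turn and applies `scrapeOne(page, url)`.
// A failure on one product is recorded and does not stop the loop.
async function scrapeMany(page, urls, scrapeOne, delayMs = 1000) {
    const results = [];
    for (const url of urls) {
        try {
            results.push({ url, policies: await scrapeOne(page, url) });
        } catch (err) {
            results.push({ url, error: err.message });
        }
        // Pause between requests to avoid hammering the site.
        await new Promise(resolve => setTimeout(resolve, delayMs));
    }
    return results;
}
```

Processing the pages one at a time with a pause keeps the request rate low; running them in parallel would be faster but is more likely to trigger rate limiting.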
