How to scrape hotel prices from Expedia.com using JavaScript?
You can scrape hotel prices from Expedia.com in JavaScript with Node.js and Puppeteer, a library that drives a headless Chrome or Chromium browser. Because Expedia renders its listings with client-side JavaScript, a real browser is needed so that the hotel cards are present in the DOM before extraction. The process has three steps: navigate to the desired Expedia page, wait for the content to load, and then read elements such as hotel names, prices, and ratings using DOM selectors. Puppeteer is also useful when results are paginated or updated dynamically, since it can click controls and wait for new content. Below is an example script to scrape hotel prices from Expedia.
const puppeteer = require('puppeteer');

(async () => {
  const browser = await puppeteer.launch({ headless: true });
  try {
    const page = await browser.newPage();
    const url = 'https://www.expedia.com/Hotels';

    // Wait until network activity has mostly settled so client-rendered listings are in the DOM.
    await page.goto(url, { waitUntil: 'networkidle2' });

    // Runs in the browser context; the class names below depend on Expedia's current markup.
    const hotels = await page.evaluate(() => {
      const hotelData = [];
      const items = document.querySelectorAll('.uitk-card');
      items.forEach(item => {
        // Fall back to placeholder text when a field is missing from a card.
        const name = item.querySelector('.uitk-heading-5')?.textContent.trim() || 'Name not available';
        const price = item.querySelector('.uitk-price')?.textContent.trim() || 'Price not available';
        const rating = item.querySelector('.uitk-star-rating-score')?.textContent.trim() || 'No rating';
        hotelData.push({ name, price, rating });
      });
      return hotelData;
    });

    console.log(hotels);
  } finally {
    // Close the browser even if navigation or extraction throws.
    await browser.close();
  }
})();
This script uses Puppeteer to collect each hotel's name, price, and rating from Expedia. The waitUntil: 'networkidle2' option makes page.goto wait until there have been no more than two open network connections for at least 500 ms, which gives the listings time to render before extraction starts. When a field is missing from a card, the script stores placeholder text instead. As written, it reads a single page only; adding pagination lets the scraper move through multiple pages of listings and collect a fuller dataset. Random delays between requests reduce the risk of being flagged by Expedia's anti-scraping mechanisms. The extracted data can then be saved in a structured format such as JSON or CSV for further analysis.
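The pagination, random-delay, and saving steps mentioned above could be sketched roughly like this. It is a minimal sketch, not a tested Expedia scraper: the helper names (randomDelay, scrapePages, toCsv, saveResults), the output file names, and the nextSelector option are all assumptions made for illustration, and it assumes that clicking the "next" control triggers a full navigation. If the site loads more results in place instead, the waitForNavigation call would need to be swapped for a wait on the new content.

```javascript
const fs = require('fs');
const path = require('path');

// Resolve after a random pause between minMs and maxMs (inclusive).
function randomDelay(minMs, maxMs) {
  const ms = minMs + Math.floor(Math.random() * (maxMs - minMs + 1));
  return new Promise(resolve => setTimeout(resolve, ms));
}

// Collect results page by page.
// `page` is a Puppeteer Page and `extract` is the function handed to page.evaluate().
// `nextSelector` is a HYPOTHETICAL selector for the "next page" control; inspect the
// live page to find the real one.
async function scrapePages(page, extract, options = {}) {
  const { nextSelector, maxPages = 5, minDelayMs = 1000, maxDelayMs = 3000 } = options;
  const all = [];
  for (let pageNo = 1; pageNo <= maxPages; pageNo++) {
    all.push(...(await page.evaluate(extract)));

    // Stop at the page limit or when there is no "next" control left.
    const next = pageNo < maxPages ? await page.$(nextSelector) : null;
    if (!next) break;

    // Pause for a random interval before requesting the next page.
    await randomDelay(minDelayMs, maxDelayMs);

    // Start waiting for the navigation before clicking so it is not missed.
    await Promise.all([
      page.waitForNavigation({ waitUntil: 'networkidle2' }),
      next.click(),
    ]);
  }
  return all;
}

// Turn an array of objects into CSV text, quoting every field and doubling embedded quotes.
function toCsv(rows, columns) {
  const quote = value => `"${String(value ?? '').replace(/"/g, '""')}"`;
  const lines = [columns.join(',')];
  for (const row of rows) {
    lines.push(columns.map(column => quote(row[column])).join(','));
  }
  return lines.join('\n') + '\n';
}

// Write the results as both JSON and CSV into `dir` (file names are arbitrary).
function saveResults(hotels, dir) {
  fs.writeFileSync(path.join(dir, 'hotels.json'), JSON.stringify(hotels, null, 2));
  fs.writeFileSync(path.join(dir, 'hotels.csv'), toCsv(hotels, ['name', 'price', 'rating']));
}
```

To use it with the script above, move the callback passed to page.evaluate into a named function, call scrapePages(page, thatFunction, { nextSelector: '...' }) in place of the single page.evaluate call, and pass the returned array to saveResults(hotels, '.'). Quoting every CSV field keeps hotel names that contain commas or quotes from breaking the columns.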