
  • Tracking discount percentages on e-commerce websites with Ruby

    Posted by Orrin Ajay on 12/18/2024 at 10:03 am

    Tracking discount percentages on e-commerce websites is a common price-monitoring task. Discounts usually appear next to the product title and price, either as a percentage or as a sale price shown beside the original one. With Ruby and the Nokogiri library, you can extract this data from static pages. For pages that load prices with JavaScript, Nokogiri alone only sees the initial HTML, so you need a real browser driven by an automation tool such as Selenium WebDriver (typically in headless mode) to render the elements first. When a site shows only the original and sale prices, the discount is (original - sale) / original × 100, which is simple to compute in Ruby.
    Here’s an example of tracking discounts using Ruby and Nokogiri:

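    require 'nokogiri'
    require 'open-uri'

    # Pull the first number out of text like "$1,299.00"; returns nil if there is none.
    def parse_price(text)
      number = text[/\d[\d,]*(?:\.\d+)?/]
      number && number.delete(',').to_f
    end

    url = 'https://example.com/products'
    doc = Nokogiri::HTML(URI.open(url))
    doc.css('.product-item').each do |product|
      title = product.css('.product-title').text.strip
      original_price = parse_price(product.css('.original-price').text)
      sale_price = parse_price(product.css('.sale-price').text)
      # Skip products with a missing price or a zero original price (would divide by zero).
      next if original_price.nil? || sale_price.nil? || original_price.zero?
      discount = ((original_price - sale_price) / original_price * 100).round(2)
      puts "Product: #{title}, Discount: #{discount}%"
    end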
    

    For large-scale scraping, routing requests through a proxy service and rate-limiting them reduces the risk of being blocked by anti-scraping measures. How do you handle products with missing or inconsistent pricing data?
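
    On the rate-limiting point, here is a minimal sketch of one way to space out requests. The Throttle class, its 0.2-second interval, and the page URLs are all made up for illustration; the clock and sleeper arguments exist only so the waiting logic can be checked without real delays. It uses the standard library only and prints instead of fetching.

```ruby
# Hypothetical helper: guarantees at least `min_interval` seconds between calls.
class Throttle
  def initialize(min_interval,
                 clock: -> { Process.clock_gettime(Process::CLOCK_MONOTONIC) },
                 sleeper: ->(seconds) { sleep(seconds) })
    @min_interval = min_interval
    @clock = clock
    @sleeper = sleeper
    @last_call = nil
  end

  # Waits out the rest of the interval if needed, then runs the block
  # and returns its value.
  def call
    if @last_call
      remaining = @min_interval - (@clock.call - @last_call)
      @sleeper.call(remaining) if remaining.positive?
    end
    @last_call = @clock.call
    yield
  end
end

throttle = Throttle.new(0.2)
urls = [
  'https://example.com/products?page=1',
  'https://example.com/products?page=2'
]
urls.each do |url|
  throttle.call do
    # A real scraper would download the page here (for example with URI.open);
    # this sketch only prints the URL.
    puts "Fetching #{url}"
  end
end
```

    A fixed interval is the simplest policy. Adding random jitter or backing off after error responses would be natural extensions, but they are not shown here.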

  • 3 Replies
  • David Maja

    Member
    12/20/2024 at 9:53 am

    When prices are inconsistent or missing, I add error handling to skip such products. Logging these cases helps me review and update the scraper logic later.
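
    A rough sketch of what skip-and-log handling could look like, not necessarily the setup described above. The parse_price and discount_for helpers and the sample items are hypothetical. The sketch assumes US-style price strings (comma for thousands, dot for decimals) and treats a sale price higher than the original as inconsistent, which is a judgment call.

```ruby
require 'logger'

# Hypothetical helper: pull the first number out of a price string such as "$1,299.00".
# Returns nil when the text contains no digits.
def parse_price(text)
  number = text.to_s[/\d[\d,]*(?:\.\d+)?/]
  number && number.delete(',').to_f
end

# Returns the discount percentage, or nil (after logging a warning) when the
# prices are missing or do not make sense together.
def discount_for(item, logger)
  original = parse_price(item[:original_price])
  sale = parse_price(item[:sale_price])

  if original.nil? || sale.nil?
    logger.warn("Skipping #{item[:title].inspect}: missing price")
    return nil
  end
  if original.zero? || sale > original
    logger.warn("Skipping #{item[:title].inspect}: inconsistent prices (#{original} -> #{sale})")
    return nil
  end

  ((original - sale) / original * 100).round(2)
end

logger = Logger.new($stderr)
# Made-up records standing in for text already extracted from the page.
items = [
  { title: 'Desk Lamp', original_price: '$40.00', sale_price: '$30.00' },
  { title: 'Notebook', original_price: '', sale_price: '$4.50' },
  { title: 'Backpack', original_price: '$20.00', sale_price: '$25.00' }
]

items.each do |item|
  discount = discount_for(item, logger)
  puts "Product: #{item[:title]}, Discount: #{discount}%" if discount
end
```

    With the sample data, only the Desk Lamp line is printed (25.0%). The other two items produce warnings on standard error, which can be redirected to a file for later review.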

  • Gualtiero Wahyudi

    Member
    12/25/2024 at 7:55 am

    For dynamic content, I rely on Selenium-WebDriver with Ruby to load all elements before extracting prices. It’s slower but ensures I don’t miss any data.
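
    For reference, a sketch of how that approach might look with the selenium-webdriver gem, assuming a local Chrome install. The fetch_rendered_html name, the 10-second timeout, and the reuse of the '.product-item' selector from the original post are illustrative choices, not a description of anyone's actual scraper.

```ruby
# Sketch: render a JavaScript-heavy page in headless Chrome and return its HTML.
# Assumes the selenium-webdriver gem and Chrome are installed.
def fetch_rendered_html(url, wait_selector: '.product-item', timeout: 10)
  require 'selenium-webdriver' # loaded lazily so this file still loads without the gem

  options = Selenium::WebDriver::Chrome::Options.new
  options.add_argument('--headless=new')
  driver = Selenium::WebDriver.for(:chrome, options: options)
  begin
    driver.get(url)
    # Block until at least one matching node exists; raises a timeout
    # error if none appears within `timeout` seconds.
    wait = Selenium::WebDriver::Wait.new(timeout: timeout)
    wait.until { driver.find_elements(css: wait_selector).any? }
    driver.page_source
  ensure
    driver.quit
  end
end

# Usage (needs a browser, so it is left commented out):
# html = fetch_rendered_html('https://example.com/products')
# doc = Nokogiri::HTML(html)
```

    Waiting for a specific selector rather than sleeping for a fixed time keeps the slowdown to what the page actually needs. The returned HTML can then go through the same Nokogiri parsing as a static page.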

  • Bituin Oskar

    Member
    01/17/2025 at 5:35 am

    Using caching for previously scraped pages saves time and bandwidth, especially when monitoring discounts that don’t change frequently.
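
    One simple way to do this is a file cache keyed by URL with a time-to-live. The sketch below is a made-up illustration: PageCache, the 600-second TTL, and the cache directory name are all assumptions, and the block passed to fetch stands in for the real download.

```ruby
require 'digest'
require 'fileutils'
require 'tmpdir'

# Hypothetical file-based page cache: one file per URL, reused until it is
# older than `ttl` seconds.
class PageCache
  def initialize(dir, ttl: 3600)
    @dir = dir
    @ttl = ttl
    FileUtils.mkdir_p(@dir)
  end

  # Returns the cached body for `url` if it is still fresh; otherwise runs
  # the block to fetch it, stores the result, and returns it.
  def fetch(url)
    path = File.join(@dir, "#{Digest::SHA256.hexdigest(url)}.html")
    return File.read(path) if File.exist?(path) && (Time.now - File.mtime(path)) < @ttl

    body = yield
    File.write(path, body)
    body
  end
end

cache = PageCache.new(File.join(Dir.tmpdir, 'discount_cache'), ttl: 600)
html = cache.fetch('https://example.com/products') do
  # Stand-in for a real download such as URI.open(url).read.
  '<div class="product-item">sample</div>'
end
puts "Got #{html.length} bytes"
```

    The TTL is the main tuning knob: a longer value saves more bandwidth but delays noticing a price change, so it should match how often the monitored discounts actually move.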
