
  • How to scrape product prices from Target.com using C#?

    Posted by Lugos Shanti on 12/19/2024 at 11:02 am

    You can scrape product names, prices, and ratings from Target.com in C# with two libraries: HttpClient for the web request and HtmlAgilityPack for parsing the HTML. The process is to send an HTTP GET request to a Target category page, load the returned HTML into a document, and select the elements that hold each data point with XPath. This approach is lightweight and works whenever the data is present in the HTML the server returns, that is, on static pages. Below is an example that scrapes product details from a Target category page.

    using System;
    using System.Net.Http;
    using System.Threading.Tasks;
    using HtmlAgilityPack;

    class TargetScraper
    {
        static async Task Main(string[] args)
        {
            string url = "https://www.target.com/c/coffee-makers/-/N-5xtd3";

            // Dispose the client when done; a long-running app should reuse one instance.
            using HttpClient client = new HttpClient();
            client.DefaultRequestHeaders.Add("User-Agent", "Mozilla/5.0");

            HttpResponseMessage response = await client.GetAsync(url);
            response.EnsureSuccessStatusCode(); // throw on 4xx/5xx instead of parsing an error page
            string html = await response.Content.ReadAsStringAsync();

            HtmlDocument doc = new HtmlDocument();
            doc.LoadHtml(html);

            // SelectNodes returns null, not an empty collection, when nothing matches.
            var products = doc.DocumentNode.SelectNodes("//div[contains(@class, 'product-card')]");
            if (products == null)
            {
                Console.WriteLine("No product cards found; the page may be rendered by JavaScript or the class names may have changed.");
                return;
            }

            foreach (var product in products)
            {
                // Each field falls back to a default message when its node is missing.
                var name = product.SelectSingleNode(".//a[contains(@class, 'product-title')]")?.InnerText.Trim() ?? "Name not available";
                var price = product.SelectSingleNode(".//span[contains(@class, 'product-price')]")?.InnerText.Trim() ?? "Price not available";
                var rating = product.SelectSingleNode(".//span[contains(@class, 'rating')]")?.InnerText.Trim() ?? "No rating";
                Console.WriteLine($"Name: {name}, Price: {price}, Rating: {rating}");
            }
        }
    }
    

    This script fetches the HTML of a Target category page, parses it, and extracts product names, prices, and ratings. If a product has no name, price, or rating, the script substitutes a default message. The class names in the XPath expressions (product-card, product-title, product-price, rating) should be checked against the live page in your browser's developer tools, since they change when the site is updated. The code only sees content present in the initial HTML response; for content rendered by JavaScript you may need a browser automation library such as Selenium for C#. The scraper can also be extended to handle pagination by finding and following the “Next” button. Adding error handling keeps the program from crashing when a request fails or the HTML structure changes.
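
    For the dynamic-content case, a minimal sketch of the Selenium route might look like the following. It assumes the Selenium.WebDriver and Selenium.Support NuGet packages and a local Chrome installation. The RenderedPageFetcher class, the FetchRenderedHtml method, and the 15-second timeout are illustrative choices, not part of the original script.

```csharp
using System;
using OpenQA.Selenium;
using OpenQA.Selenium.Chrome;
using OpenQA.Selenium.Support.UI;

static class RenderedPageFetcher
{
    // Loads the page in headless Chrome and returns the HTML after scripts have run.
    // waitForCssSelector is a hypothetical parameter: a selector that only matches
    // once the product list has been rendered.
    public static string FetchRenderedHtml(string url, string waitForCssSelector)
    {
        var options = new ChromeOptions();
        options.AddArgument("--headless=new"); // run without a visible window

        using var driver = new ChromeDriver(options);
        driver.Navigate().GoToUrl(url);

        // Poll until at least one matching element exists;
        // throws WebDriverTimeoutException if none appears in time.
        var wait = new WebDriverWait(driver, TimeSpan.FromSeconds(15));
        wait.Until(d => d.FindElements(By.CssSelector(waitForCssSelector)).Count > 0);

        return driver.PageSource;
    }
}
```

    The returned string can be passed to doc.LoadHtml(html) in place of the HttpClient response, for example with "div[class*='product-card']" as the selector to wait for, so the rest of the parsing code stays the same.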

  • 2 Replies
  • Agrafena Oscar

    Member
    12/20/2024 at 6:52 am

    One improvement is to handle pagination so the scraper collects data from every page of results. Target.com shows a limited number of products per page, so a single request returns only part of the category. If the “Next” control is a link, its href can be extracted and requested to load the following page, repeating until no such link remains. Adding a delay between requests reduces the chance of being flagged as a bot and avoids overwhelming the server.
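
    The pagination loop described here could be sketched roughly as follows. This assumes the “Next” control is an ordinary anchor with an href and either rel="next" or an aria-label containing "Next"; that markup is a guess and needs checking against the real page. If the button is driven by JavaScript, this will not find it. The Pagination class, its method names, and the maxPages and delay parameters are hypothetical.

```csharp
using System;
using System.Collections.Generic;
using System.Net.Http;
using System.Threading.Tasks;
using HtmlAgilityPack;

static class Pagination
{
    // Returns the absolute URL of the next page, or null when no "Next" link is found.
    public static Uri FindNextPageUrl(string html, Uri currentUrl)
    {
        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        // Assumed markup: <a href="..." rel="next"> or <a href="..." aria-label="Next page">.
        HtmlNode link = doc.DocumentNode.SelectSingleNode(
            "//a[@href and (@rel='next' or contains(@aria-label, 'Next'))]");

        string href = link?.GetAttributeValue("href", "");
        if (string.IsNullOrWhiteSpace(href)) return null;

        // Decode entities such as &amp; and resolve relative links against the current page.
        return new Uri(currentUrl, HtmlEntity.DeEntitize(href));
    }

    // Fetches up to maxPages pages, pausing between requests, and returns each page's HTML.
    public static async Task<List<string>> FetchAllPagesAsync(
        HttpClient client, Uri startUrl, int maxPages, TimeSpan delay)
    {
        var pages = new List<string>();
        var visited = new HashSet<string>(); // guards against a "Next" link that loops back
        Uri current = startUrl;

        while (current != null && pages.Count < maxPages && visited.Add(current.AbsoluteUri))
        {
            string html = await client.GetStringAsync(current);
            pages.Add(html);

            current = FindNextPageUrl(html, current);
            if (current != null) await Task.Delay(delay); // wait before the next request
        }
        return pages;
    }
}
```

    A call such as Pagination.FetchAllPagesAsync(client, new Uri(url), 10, TimeSpan.FromSeconds(2)) would cap the crawl at ten pages with a two-second pause between requests; both numbers are arbitrary starting points.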

  • Salma Dominique

    Member
    12/20/2024 at 8:12 am

    To make the scraper more robust, add error handling for missing elements and unexpected changes in the page structure. For example, if a product listing is missing a price or rating, the scraper should log the issue and continue with the remaining products. Try-catch blocks can handle network failures and timeouts during the HTTP request. Logging errors and skipped items makes the scraper easier to debug and refine, and shows quickly when Target has changed its markup.
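
    As a rough illustration of these two ideas, the sketch below wraps the request in try-catch and logs missing fields instead of failing. The SafeScrape class and both method names are hypothetical, and writing to Console.Error stands in for whatever logging you actually use.

```csharp
using System;
using System.Net.Http;
using System.Threading.Tasks;
using HtmlAgilityPack;

static class SafeScrape
{
    // Returns the page HTML, or null (after logging) when the request fails or times out.
    public static async Task<string> GetHtmlOrNullAsync(HttpClient client, string url)
    {
        try
        {
            using HttpResponseMessage response = await client.GetAsync(url);
            response.EnsureSuccessStatusCode();
            return await response.Content.ReadAsStringAsync();
        }
        catch (HttpRequestException ex)
        {
            // Connection failures and non-success status codes land here.
            Console.Error.WriteLine($"Request failed for {url}: {ex.Message}");
        }
        catch (TaskCanceledException)
        {
            // HttpClient reports an elapsed client.Timeout as TaskCanceledException.
            Console.Error.WriteLine($"Request timed out for {url}");
        }
        return null;
    }

    // Returns the trimmed text of the first node matching xpath,
    // or null (after logging) when the node is missing.
    public static string ReadField(HtmlNode product, string xpath, string fieldName)
    {
        HtmlNode node = product.SelectSingleNode(xpath);
        if (node == null)
        {
            Console.Error.WriteLine($"Missing {fieldName} in a product card; continuing.");
            return null;
        }
        return HtmlEntity.DeEntitize(node.InnerText).Trim();
    }
}
```

    Inside the product loop, a call like SafeScrape.ReadField(product, ".//span[contains(@class, 'product-price')]", "price") would replace the inline null-coalescing expression, so every missing field leaves a trace in the log.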
