
  • What data can be extracted from MarksandSpencer.com using Python?

    Posted by Jacinda Thilini on 12/21/2024 at 11:57 am

    Scraping data from MarksandSpencer.com using Python allows you to gather details such as product names, prices, and availability across categories like clothing, food, and homeware. Marks & Spencer is a leading UK retailer with a wide product range, which makes it a useful source for market research. The first step is to inspect the HTML structure of the product pages and identify the elements that hold product titles and prices. With a library such as requests to fetch pages and a parser such as html.parser to read the HTML, you can then build a reliable scraper.
    One challenge when scraping Marks & Spencer is handling multiple categories and subcategories, as the website is vast. Automating navigation across categories ensures that the scraper captures a comprehensive dataset. Additionally, tracking customer reviews and ratings alongside product data adds depth to the analysis. Below is an example Python script for scraping product data from MarksandSpencer.com.

    import requests
    from html.parser import HTMLParser

    class MarksAndSpencerParser(HTMLParser):
        """Collects product names and prices from product listing markup."""

        def __init__(self):
            super().__init__()
            self.in_product_name = False
            self.in_product_price = False
            self.products = []
            self.current_product = {}

        def handle_starttag(self, tag, attrs):
            attrs = dict(attrs)
            classes = attrs.get("class") or ""
            # The "product-title" and "price" class names are assumptions about the
            # page markup; inspect the live HTML and adjust them if they differ.
            if tag == "h2" and "product-title" in classes:
                self.in_product_name = True
            if tag == "span" and "price" in classes:
                self.in_product_price = True

        def handle_endtag(self, tag):
            if self.in_product_name and tag == "h2":
                self.in_product_name = False
            if self.in_product_price and tag == "span":
                self.in_product_price = False
                # A closing price tag marks the end of one product entry.
                if self.current_product:
                    self.products.append(self.current_product)
                    self.current_product = {}

        def handle_data(self, data):
            text = data.strip()
            if not text:
                return  # ignore whitespace-only text nodes
            if self.in_product_name:
                self.current_product["name"] = text
            if self.in_product_price:
                self.current_product["price"] = text

    url = "https://www.marksandspencer.com/"
    # A browser-like User-Agent makes the request less likely to be rejected outright.
    headers = {"User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64)"}
    response = requests.get(url, headers=headers, timeout=10)
    response.raise_for_status()

    parser = MarksAndSpencerParser()
    parser.feed(response.text)

    for product in parser.products:
        print(f"Product: {product.get('name', 'N/A')}, Price: {product.get('price', 'N/A')}")
    

    This script extracts product names and prices from MarksandSpencer.com. Adding functionality for tracking stock levels or analyzing reviews would give the scraper more comprehensive insights. Introducing random delays between requests also keeps the load on the site low and reduces the chance of being rate-limited or blocked; a sketch of how that might look when crawling several category pages follows.
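    As a rough sketch of that idea, the loop below reuses the MarksAndSpencerParser class from the script above across a few category pages with a randomized pause between requests. The category URLs and the delay range are placeholders, not the site's real structure:

    import random
    import time

    import requests

    # Placeholder category paths; the real ones come from inspecting the site's navigation.
    CATEGORY_URLS = [
        "https://www.marksandspencer.com/l/women",
        "https://www.marksandspencer.com/l/men",
        "https://www.marksandspencer.com/l/home-and-furniture",
    ]

    headers = {"User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64)"}
    all_products = []

    for url in CATEGORY_URLS:
        response = requests.get(url, headers=headers, timeout=10)
        response.raise_for_status()
        parser = MarksAndSpencerParser()  # the parser class defined in the script above
        parser.feed(response.text)
        all_products.extend(parser.products)
        # Random pause between requests keeps the crawl polite and less conspicuous.
        time.sleep(random.uniform(2, 5))

    print(f"Collected {len(all_products)} products from {len(CATEGORY_URLS)} categories")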

  • 2 Replies
  • Rita Lari

    Member
    12/27/2024 at 8:10 am

    Integrating product image URLs into the dataset enhances the scraper’s output, especially for e-commerce analysis. For instance, collecting images alongside product details provides a richer dataset that can be used for visual comparisons. Another improvement could involve identifying and categorizing items based on specific tags, such as “best-seller” or “new arrival.” These features add value to the scraper by enabling targeted insights into product popularity and trends. Combining these capabilities makes the scraper more versatile and effective.
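    A rough way to sketch both ideas is to extend the MarksAndSpencerParser class from the opening post so it also records an image URL and any badge label per product. The "product-image" and "badge" class names below are guesses rather than the site's actual markup, and the sketch assumes the image and badge appear before the price inside each product tile:

    class ProductDetailParser(MarksAndSpencerParser):
        """Also captures a thumbnail URL and a badge label (e.g. "best-seller") per product."""

        def __init__(self):
            super().__init__()
            self.in_badge = False

        def handle_starttag(self, tag, attrs):
            super().handle_starttag(tag, attrs)
            attrs = dict(attrs)
            classes = attrs.get("class") or ""
            # Thumbnail: take the src of <img class="product-image" ...> (placeholder class name).
            if tag == "img" and "product-image" in classes:
                self.current_product["image_url"] = attrs.get("src", "")
            # Badge labels such as "best-seller" or "new arrival" (placeholder class name).
            if tag == "span" and "badge" in classes:
                self.in_badge = True

        def handle_endtag(self, tag):
            super().handle_endtag(tag)
            if self.in_badge and tag == "span":
                self.in_badge = False

        def handle_data(self, data):
            super().handle_data(data)
            text = data.strip()
            if self.in_badge and text:
                self.current_product["badge"] = text

    Feeding a category page to ProductDetailParser instead of the base class yields the same name and price entries, plus image_url and badge keys wherever those elements exist.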

  • Giiwedin Vesna

    Member
    01/16/2025 at 2:10 pm

    Capturing delivery and return policy details adds another layer of insight to the scraper’s output. For example, understanding delivery timelines or restrictions can provide valuable information for logistics analysis. Another enhancement could be tracking product availability in specific regions or stores, which is particularly useful for localized marketing strategies. These features make the scraper more comprehensive and valuable for business planning. Adding advanced features ensures the scraper remains relevant for evolving data needs.
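    One way to sketch the delivery and returns part is a small parser run against individual product pages. The product URL and the "delivery-info" / "returns-info" class names below are placeholders, since the real page structure would need to be inspected first:

    import requests
    from html.parser import HTMLParser

    class PolicyParser(HTMLParser):
        """Collects the text of delivery and returns sections on a product page."""

        def __init__(self):
            super().__init__()
            self.current_field = None
            self.policies = {"delivery": "", "returns": ""}

        def handle_starttag(self, tag, attrs):
            classes = dict(attrs).get("class") or ""
            if "delivery-info" in classes:
                self.current_field = "delivery"
            elif "returns-info" in classes:
                self.current_field = "returns"

        def handle_endtag(self, tag):
            # Simple approach: stop collecting at the first closing container tag.
            if tag in ("div", "section"):
                self.current_field = None

        def handle_data(self, data):
            if self.current_field and data.strip():
                self.policies[self.current_field] += data.strip() + " "

    # Hypothetical product URL; in practice these come from the product listings.
    product_url = "https://www.marksandspencer.com/p/example-product"
    headers = {"User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64)"}
    response = requests.get(product_url, headers=headers, timeout=10)
    response.raise_for_status()

    parser = PolicyParser()
    parser.feed(response.text)
    print("Delivery:", parser.policies["delivery"].strip())
    print("Returns:", parser.policies["returns"].strip())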
