  • Scrape product availability from Otto Germany using Python

    Posted by Flora Abdias on 12/13/2024 at 9:26 am

    Otto is one of the most popular e-commerce platforms in Germany, known for its wide variety of products ranging from fashion to electronics. Scraping product availability from Otto involves using Python with the requests and BeautifulSoup libraries to fetch and parse the HTML content of a product page. Product availability is typically displayed as a label near the “Add to Cart” button, indicating whether the product is “In Stock,” “Out of Stock,” or available for delivery within a specific timeframe.
    The first step is to inspect the page with browser developer tools and identify the exact HTML tags and classes that hold the availability information. The script then sends an HTTP request to the product page, parses the HTML response, and locates the availability text. It reports a non-200 status code and prints a message when the availability element is missing, which can also happen when the information is loaded dynamically by JavaScript. Below is a basic implementation for scraping product availability from Otto Germany; the URL and the availability-status class are placeholders to replace with the values found during inspection:

    import requests
    from bs4 import BeautifulSoup

    # URL of the Otto product page (placeholder; replace with a real product URL)
    url = "https://www.otto.de/product-page"

    # Headers to mimic a browser request
    headers = {
        "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/91.0.4472.124 Safari/537.36"
    }

    # Fetch the page content
    response = requests.get(url, headers=headers)

    if response.status_code == 200:
        soup = BeautifulSoup(response.content, "html.parser")
        # Extract product availability (the class name is a placeholder;
        # confirm the real tag and class in the browser developer tools)
        availability = soup.find("div", class_="availability-status")
        if availability:
            print("Product Availability:", availability.text.strip())
        else:
            print("Availability information not found.")
    else:
        print(f"Failed to fetch the page. Status code: {response.status_code}")
    
  • 4 Replies
  • Lencho Coenraad

    Member
    12/14/2024 at 9:54 am

    The script could benefit from error handling for network issues such as timeouts or invalid responses. Adding retry logic with a short delay between attempts would ensure the script remains robust in case of temporary server issues.
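
    A minimal sketch of that retry idea, assuming a hypothetical helper named fetch_with_retry and arbitrary defaults (three attempts, a 10-second timeout, a 2-second delay); none of these values come from the original script. It retries on network errors and 5xx responses, and gives up immediately on other non-200 status codes:

```python
import time

import requests


def fetch_with_retry(url, headers=None, attempts=3, delay=2.0, timeout=10, session=None):
    """Return a successful Response, or None once all attempts have failed.

    Hypothetical helper: the name and default values are illustrative assumptions.
    """
    # Allow a requests.Session (or a test double) to be passed in.
    get = session.get if session is not None else requests.get
    for attempt in range(1, attempts + 1):
        try:
            response = get(url, headers=headers, timeout=timeout)
            if response.status_code == 200:
                return response
            if response.status_code < 500:
                # Client errors (e.g. 404) will not be fixed by retrying.
                print(f"Giving up: status code {response.status_code}")
                return None
            print(f"Attempt {attempt}: server error {response.status_code}")
        except requests.exceptions.RequestException as exc:
            # Covers timeouts, connection errors, and other transport failures.
            print(f"Attempt {attempt}: request failed ({exc})")
        if attempt < attempts:
            time.sleep(delay)  # short pause before the next attempt
    return None
```

    The returned response can be passed to BeautifulSoup exactly as in the original script; a None result means the page could not be fetched.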

  • Javed Roland

    Member
    12/17/2024 at 7:48 am

    An enhancement to the script could include scraping availability for multiple products by iterating through a list of product URLs. This would make the scraper more versatile for analyzing stock levels across a category of products.
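
    One way this could look, as a rough sketch: scrape_many is a hypothetical name, and get_availability stands in for any callable that wraps the fetch-and-parse logic of the original script (takes a URL, returns the availability text or None). The one-second delay between requests is an assumption, not a documented limit:

```python
import time


def scrape_many(urls, get_availability, delay=1.0):
    """Map each product URL to its availability text (or None on failure).

    `get_availability` is a hypothetical callable wrapping the fetch-and-parse
    logic from the original script; it takes a URL and returns a string or None.
    """
    results = {}
    for index, url in enumerate(urls):
        try:
            results[url] = get_availability(url)
        except Exception as exc:
            # One failing product should not abort the whole run.
            print(f"Skipping {url}: {exc}")
            results[url] = None
        if index < len(urls) - 1:
            time.sleep(delay)  # pause between requests to avoid hammering the server
    return results
```

    Passing the fetch function in as an argument keeps the loop independent of how a single page is scraped, so the same loop works whether the page is fetched with requests or with a headless browser.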

  • Elen Honorata

    Member
    12/18/2024 at 7:22 am

    The script can be extended to save the scraped availability data into a structured format such as JSON or CSV for further analysis. This approach is particularly useful for tracking stock trends or comparing availability across products.
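
    A small sketch of the saving step, using only the standard library. The helper name save_results, the column names, and the scraped_at timestamp are assumptions for illustration; the input is assumed to be a dict mapping each product URL to its availability text:

```python
import csv
import json
from datetime import datetime, timezone


def save_results(results, csv_path, json_path):
    """Write {url: availability} pairs to a CSV file and a JSON file.

    Hypothetical helper; the column names and the timestamp field are assumptions.
    """
    # A shared timestamp makes it possible to compare runs over time.
    scraped_at = datetime.now(timezone.utc).isoformat()
    rows = [
        {"url": url, "availability": status, "scraped_at": scraped_at}
        for url, status in results.items()
    ]
    with open(csv_path, "w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=["url", "availability", "scraped_at"])
        writer.writeheader()
        writer.writerows(rows)
    with open(json_path, "w", encoding="utf-8") as f:
        # ensure_ascii=False keeps German characters such as umlauts readable.
        json.dump(rows, f, ensure_ascii=False, indent=2)
    return rows
```

    To track stock trends, each run could write to a new file, or the CSV could be opened in append mode instead of being overwritten.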

  • Subhash Siddiqa

    Member
    12/19/2024 at 5:46 am

    Adding functionality to handle dynamically loaded content would improve the accuracy of the scraper. If the availability information is rendered by JavaScript, integrating a library like Selenium or Playwright would ensure that all content is fully loaded before extraction.
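
    A hedged sketch of the Playwright variant, assuming Playwright and a Chromium build are installed (pip install playwright, then playwright install chromium). The function name fetch_rendered_text and the 10-second wait are assumptions, and the selector passed in would be whatever inspection of the real page reveals, for example the div.availability-status placeholder from the original script:

```python
def fetch_rendered_text(url, selector, timeout_ms=10_000):
    """Return the text of `selector` after JavaScript has rendered it, or None.

    Hypothetical helper; the name and default timeout are illustrative assumptions.
    """
    # Imported here so the module still loads where Playwright is not installed.
    from playwright.sync_api import TimeoutError as PlaywrightTimeoutError
    from playwright.sync_api import sync_playwright

    with sync_playwright() as p:
        browser = p.chromium.launch(headless=True)
        try:
            page = browser.new_page()
            page.goto(url, wait_until="domcontentloaded")
            try:
                # Block until the element exists in the rendered DOM.
                page.wait_for_selector(selector, timeout=timeout_ms)
            except PlaywrightTimeoutError:
                return None  # element never appeared within the timeout
            return page.inner_text(selector).strip()
        finally:
            browser.close()
```

    Launching a browser is much slower than a plain HTTP request, so it is worth checking first whether the availability text is already present in the static HTML.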
