How To Retry Failed Python Requests

Python requests are essential to many different online tasks, including web scraping. What happens when those requests fail, though? How do you handle the Python requests retry process?

In this guide, we break down the various reasons why Python requests fail and the different types of failed Python requests you might encounter. We’ll also walk you through what you can do to retry failed Python requests. Find out everything you need to know below.

What Are Python Requests?

Python Requests is an HTTP library for Python (a programming language) designed to make HTTP requests easier and more human-friendly. It allows you to send HTTP requests and receive responses in a straightforward manner, eliminating much of the complexity involved in working with HTTP in Python.

With Requests, you can perform common HTTP operations such as GET, POST, PUT, DELETE, and more. It supports various authentication methods, session management, cookies, proxies, SSL verification, and custom headers. Requests also provides convenient features for handling JSON data, file uploads, streaming responses, and timeouts.

Here’s a basic example of using Requests to make a GET request:

import requests

response = requests.get('https://api.example.com/data')
print(response.text)

This code will send a GET request to the specified URL and print the response content.
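
Many of the features mentioned above, such as custom headers, JSON handling, and timeouts, follow the same pattern. Here's a minimal sketch of a POST request; the endpoint, header values, and payload are placeholders for illustration only:

import requests

# Placeholder values for illustration only
headers = {"User-Agent": "my-app/1.0", "Accept": "application/json"}
payload = {"query": "example"}

# Send JSON data with custom headers and a 10-second timeout
response = requests.post(
    "https://api.example.com/data",
    json=payload,
    headers=headers,
    timeout=10,
)

print(response.status_code)
print(response.json())  # Parse the JSON body (assuming the server returns JSON)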

How Are Python Requests Used in Web Scraping?

In web scraping with Python, Requests acts as the foundation for fetching the web content you want to extract. Here’s how it’s typically used:

  • Sending HTTP requests: Requests allows you to send various HTTP requests (GET, POST, etc.) to a specific URL. It provides a user-friendly interface to build the request with headers, cookies, and data (if applicable).
  • Retrieving responses: Once the request is sent, Requests captures the server’s response. This response object contains the status code (success or error), headers, and, most importantly, the downloaded content (usually HTML).
  • Handling different scenarios: Requests helps manage different response codes. You can check for successful responses (200 OK) or handle errors (404 Not Found, etc.).

Here’s an example of how Python requests might be used in web scraping:

import requests

url = "https://www.example.com"  # Replace with the target URL

# Send a GET request
response = requests.get(url)

# Check for a successful response
if response.status_code == 200:
    # Access the downloaded HTML content
    html_content = response.text
    # Proceed with parsing the HTML content (using Beautiful Soup or similar)
else:
    print("Error: Could not retrieve data from", url)

Requests is generally preferred for web scraping because of its simplicity and flexibility. It offers a clean and concise way for developers to interact with web servers and supports a variety of HTTP methods and request parameters. It also provides mechanisms for error and unexpected response handling.

Reasons Why Python Requests Fail

There are many reasons why a Python request might fail during the web scraping process (or any other type of project). The following are some of the most common examples; a short sketch of catching the corresponding exceptions in code follows the list:

  • Connection errors: These errors could be due to a failing internet connection, firewalls blocking communication, or issues with the DNS server.
  • Timeouts: In this situation, which we’ll discuss in more depth later, the server might be taking too long to respond, or the request itself might be taking too long to complete.
  • HTTP error codes: In this scenario, the server might respond with an error code indicating a problem with the request itself (more on that in the next section!).
  • Server overload: This might happen when the server is overloaded and unable to handle a particular request at that moment.
  • Incorrect URL: In this case, the provided URL might be misspelled or point to a non-existent resource.
  • Missing headers: In this situation, required headers for authentication or content type might be missing.
  • Network proxy issues: If using a network proxy, configuration issues might prevent the request from reaching the server.
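
Many of these failures surface in code as exceptions raised by the requests library. The sketch below shows one way to catch the most common ones; the URL and timeout value are placeholders:

import requests

url = "https://www.example.com"  # Placeholder URL

try:
    response = requests.get(url, timeout=5)
    response.raise_for_status()  # Turn HTTP error codes into exceptions
except requests.exceptions.Timeout:
    print("The request timed out.")
except requests.exceptions.ConnectionError:
    print("A connection error occurred (DNS failure, refused connection, etc.).")
except requests.exceptions.HTTPError as e:
    print(f"The server returned an error status: {e}")
except requests.exceptions.RequestException as e:
    print(f"Some other request error occurred: {e}")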

Types of Failed Python Requests

Failed Python requests can be divided into two main categories: requests that time out and requests that return an error.

Timed-out requests

In Python’s requests library, a timeout occurs when a web request fails to receive a response from the server within a specified time limit.

Here’s a breakdown of what happens when a timeout occurs:

  • No response received: The server you’re trying to connect to hasn’t sent any data back within the set timeframe. This could be due to various reasons, such as a slow server, network issues, or a server down for maintenance.
  • Exception raised: By default, Requests does not apply any timeout, so if you don’t set one, your program may wait indefinitely for a response, potentially freezing. To prevent this, specify a timeout value (in seconds) using the timeout argument in your request methods (e.g., requests.get(url, timeout=5)). If no response is received within that time, a requests.exceptions.Timeout exception is raised (a short example follows this list).
  • Code execution halted: If the Timeout exception isn’t caught, it propagates and stops your code at the point of the request. This can be problematic if other parts of your code rely on the data from that request.
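
A minimal sketch of setting a timeout and catching the resulting exception (the URL and the 5-second limit are just placeholders):

import requests

try:
    # Give up if no response arrives within 5 seconds
    response = requests.get("https://www.example.com", timeout=5)
    print(response.status_code)
except requests.exceptions.Timeout:
    print("The request timed out; consider retrying or raising the timeout value.")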

Requests that return an error

When an issue occurs, requests will often return error responses. These responses include two critical pieces of information: a status code and an error message.

The status code acts as a concise indicator of the general problem category, while the error message provides more specific details to help diagnose the exact cause.

The following are some of the most common status codes and error messages; a short example of handling them in code follows the list:

  • 400 Bad Request: This is an HTTP status code that indicates the server could not understand the request due to invalid syntax. Common causes of a 400 Bad Request error include incorrect syntax in the request URL, missing or incorrect parameters in the request, or invalid request headers.
  • 401 Unauthorized: This is an HTTP status code that indicates the client (such as a web browser or software application) attempted to access a resource that requires authentication, but the request lacked valid credentials — or authentication credentials were provided but were incorrect or insufficient.
  • 403 Forbidden: This is an HTTP status code that indicates the server understood your request, but it will not authorize access to the requested resource.
  • 404 Not Found: This is an HTTP status code that indicates the server could not find the requested resource. The client (such as a web browser or a software application) made a request for a resource, but the server could not locate it at the specified URL.
  • 405 Method Not Allowed: This is an HTTP status code that indicates the server recognized the request method (such as GET, POST, PUT, DELETE, etc.), but the method is not allowed for the requested resource.
  • 408 Request Timeout: This is an HTTP status code that indicates that the client (such as a web browser or a software application) did not produce a complete request within the time frame expected by the server.
  • 429 Too Many Requests: This is an HTTP status code that indicates the client has sent too many requests to the server within a specific time frame, and the server is imposing rate limiting or throttling to restrict the client’s access temporarily. People often run into this code during web scraping projects.
  • 500 Internal Server Error: This is an HTTP status code that indicates an unexpected condition occurred on the server, preventing it from fulfilling the client’s request.
  • 501 Not Implemented: This is an HTTP status code that indicates the server does not support the functionality required to fulfill the client’s request.
  • 502 Bad Gateway: This is an HTTP status code that indicates that a server acting as a gateway or proxy received an invalid response from another server while attempting to fulfill the client’s request.
  • 503 Service Unavailable: This is an HTTP status code that indicates the server is temporarily unable to handle the request due to overload or maintenance.
  • 504 Gateway Timeout: This is an HTTP status code that indicates that a server acting as a gateway or proxy did not receive a timely response from a server while attempting to fulfill the client’s request.
  • 505 HTTP Version Not Supported: This is an HTTP status code that indicates that the server does not support the HTTP protocol version used in the request.
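
In code, you typically branch on response.status_code or call raise_for_status() to turn 4xx and 5xx responses into exceptions. A minimal sketch (the URL is a placeholder):

import requests

response = requests.get("https://www.example.com/some-page")

if response.status_code == 429:
    # Rate limited; the Retry-After header (if present) says how long to wait
    print("Too many requests; Retry-After:", response.headers.get("Retry-After"))
elif response.ok:  # Status code below 400
    print("Success:", len(response.content), "bytes received")
else:
    # Raises requests.exceptions.HTTPError with the status code and reason
    response.raise_for_status()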

Python Requests Retry: What to Do

There are two primary options you can choose from when creating a Python requests retry strategy. The first is a simple loop that retries at fixed intervals; the other uses increasing delays between attempts.

Python HTTP request retry with a loop

You can implement HTTP request retry logic in Python using a loop combined with exception handling. Here’s a basic example of how you can achieve this using the requests library:

import requests
import time

def make_request(url, max_retries=3, retry_delay=1):
    retries = 0
    while retries < max_retries:
        try:
            response = requests.get(url)
            # Check if the response was successful
            if response.status_code == 200:
                return response
            else:
                print(f"Request failed with status code {response.status_code}. Retrying...")
        except Exception as e:
            print(f"An error occurred: {e}. Retrying...")
        retries += 1
        time.sleep(retry_delay)
    print("Max retries exceeded. Failed to make the request.")
    return None

# Example usage
url = "https://example.com"
response = make_request(url)
if response:
    print("Request successful!")
    print(response.text)

Here’s a quick explanation of everything that’s happening in this example:

  • The make_request function takes a URL as input, along with optional parameters for maximum retries (max_retries) and delay between retries (retry_delay).
  • Inside the function, there’s a loop that retries the request until either it succeeds or the maximum number of retries is reached.
  • Inside the loop, the requests.get function is called to make the HTTP request. If the response status code is 200 (indicating success), the function returns the response.
  • If the request fails or encounters an exception, the loop retries the request after a specified delay (retry_delay) and increments the retry count (retries).
  • If the maximum number of retries is reached without success, the function prints a failure message and returns None.
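
The loop above waits the same fixed delay between attempts. The second strategy mentioned earlier, increasing delays, only requires a small change: grow the wait after each failed attempt. Here is a minimal sketch of that variant (the function name make_request_with_backoff and the base_delay parameter are just illustrative choices):

import requests
import time

def make_request_with_backoff(url, max_retries=3, base_delay=1):
    for attempt in range(max_retries):
        try:
            response = requests.get(url, timeout=10)
            if response.status_code == 200:
                return response
            print(f"Request failed with status code {response.status_code}. Retrying...")
        except requests.exceptions.RequestException as e:
            print(f"An error occurred: {e}. Retrying...")
        # Exponential backoff: wait 1s, 2s, 4s, ... between attempts
        if attempt < max_retries - 1:
            time.sleep(base_delay * (2 ** attempt))
    print("Max retries exceeded. Failed to make the request.")
    return None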

Python requests retries with HTTPAdapter

You can implement HTTP request retries using the HTTPAdapter class from the requests library along with the Retry class from the urllib3 library. Here’s how you can do it:

import requests
from requests.adapters import HTTPAdapter
from urllib3.util.retry import Retry

def make_request(url):
    session = requests.Session()
    retries = Retry(total=3, backoff_factor=0.5, status_forcelist=[500, 502, 503, 504])
    adapter = HTTPAdapter(max_retries=retries)
    session.mount('http://', adapter)
    session.mount('https://', adapter)
    try:
        response = session.get(url)
        response.raise_for_status()  # Raise an exception for non-2xx responses
        return response
    except requests.exceptions.RequestException as e:
        print(f"An error occurred: {e}")
        return None

# Example usage
url = "https://example.com"
response = make_request(url)
if response:
    print("Request successful!")
    print(response.text)

In this example, the following is happening:

  • The make_request function takes a URL as input.
  • Inside the function, a Session object is created using requests.Session(). This allows us to reuse the same TCP connection for multiple requests.
  • A Retry object is created with specific parameters such as total (maximum number of retries), backoff_factor (delay between retries), and status_forcelist (list of HTTP status codes for which retries should be attempted).
  • An HTTPAdapter object is created with the Retry object, specifying the maximum number of retries and other retry-related settings.
  • The HTTPAdapter is then mounted to both http:// and https:// prefixes in the session object using the mount() method.
  • The session’s get() method is used to make the HTTP request, and the response is returned if successful. If an exception occurs during the request, it is caught, and an error message is printed.
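
To see these retries in action, you can point the make_request function above at an endpoint that deliberately returns one of the codes in status_forcelist. The snippet below uses httpbin.org, a public HTTP testing service, purely as an illustration:

# https://httpbin.org/status/503 always responds with 503, which is in
# status_forcelist, so the adapter retries (sleeping between attempts based
# on backoff_factor) before the request ultimately fails and None is returned.
result = make_request("https://httpbin.org/status/503")
if result is None:
    print("All retries were exhausted or another error occurred.")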

Python Requests Max Retries Exceeded with URL

The error “Max retries exceeded with URL” in Python Requests indicates the library attempted to connect to a server multiple times but failed. Here are some potential causes that can contribute to this issue:

  • Network issues: Unstable internet connection, firewalls, or DNS problems can prevent reaching the server.
  • Server issues: The server might be down for maintenance, overloaded, or experiencing technical difficulties.
  • Slow response: If the server takes too long to respond, Requests might time out before a connection is established.
  • Incorrect URL: Double-check the URL for typos or errors.

The following are some tips to help you navigate these challenges:

  • Verify network connection: Ensure you have a stable internet connection. Try accessing other websites to confirm.
  • Check server status: Use online tools or try accessing the URL in a web browser to see if the server is reachable.
  • Increase timeout: You can increase the maximum time to wait for a response using the timeout parameter in the request function. An example might look like this:
    • response = requests.get(url, timeout=10)  # Wait up to 10 seconds for a response
  • Configure retries with a custom session: Create a custom Requests session and mount an HTTPAdapter configured with a Retry object to control retries and backoff intervals:
    • import requests
      from requests.adapters import HTTPAdapter
      from urllib3.util.retry import Retry

      retries = Retry(total=5, backoff_factor=1, status_forcelist=[500, 502, 503])
      adapter = HTTPAdapter(max_retries=retries)
      session = requests.Session()
      session.mount('https://', adapter)
      response = session.get(url)

  • Configure retries with retry exceptions: Wrap your request in a try-except block to catch specific exceptions and handle retries, like this:
    • try:
          response = requests.get(url)
      except requests.exceptions.ConnectionError as e:
          print(f"Connection error: {e}")  # Handle the error and retry logic here

  • Review the URL: Make sure the URL is spelled correctly and points to the intended resource.

How to Avoid 429 Errors with Proxies

In addition to using the tactics shared above, if you run into a 429 error — which is common during web scraping tasks — you may want to try incorporating proxies into your web scraping process.

Proxies act as intermediaries between clients (such as web browsers or applications) and servers on the Internet. When a client sends a request to access a web resource, it goes through the proxy server first. The proxy then forwards the request to the destination server and relays the response back to the client.

Proxies can play several different roles during the web scraping process, including the following:

  • Avoiding IP blocking: Websites often block IP addresses that send too many requests in a short period, suspecting them to be bots or scrapers. Proxies have their own IP addresses. By routing your scraping requests through a pool of proxies, you distribute the load across various IPs. This makes it seem like the requests originate from multiple users, reducing the risk of getting blocked.
  • Bypassing geo-restrictions: Some websites restrict content based on the user’s location. Proxies can be located in different countries. By using proxies from specific regions, you can access geo-restricted content that wouldn’t be available from your location.
  • Increased concurrency and speed: In some cases, proxies can improve scraping efficiency. Certain proxy providers offer features like connection pooling, which can help handle a higher volume of requests concurrently compared to a single IP.

When it comes to 429 errors, websites often implement rate limiting to prevent overloading their servers or malicious scraping. This limit is typically set on a per-IP address basis.

Because proxies act as intermediaries between your computer and the target website, each proxy server has its own unique IP address.

By rotating requests through different proxies, you can distribute your requests across various IPs, making it appear like they originate from multiple users. This approach helps you bypass the server’s rate limit imposed on a single IP.
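
Requests accepts a proxies dictionary on every call, so a basic rotation can be as simple as cycling through a list of proxy URLs. The sketch below is illustrative only; the proxy addresses are placeholders you would replace with the ones from your provider (real proxy URLs usually include credentials):

import itertools
import requests

# Placeholder proxy URLs, e.g. "http://username:password@proxy-host:port"
proxy_pool = itertools.cycle([
    "http://proxy1.example.com:8080",
    "http://proxy2.example.com:8080",
    "http://proxy3.example.com:8080",
])

url = "https://www.example.com"

for _ in range(3):
    proxy = next(proxy_pool)
    proxies = {"http": proxy, "https": proxy}
    try:
        response = requests.get(url, proxies=proxies, timeout=10)
        if response.status_code == 429:
            print(f"Rate limited via {proxy}; rotating to the next proxy...")
            continue
        response.raise_for_status()
        print(f"Success via {proxy}")
        break
    except requests.exceptions.RequestException as e:
        print(f"Request via {proxy} failed: {e}. Rotating...")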

How to Choose Proxies for Web Scraping

You can choose from a few different types of proxies for help with web scraping and other development projects. The following are the most popular options:

Data center proxies

Data center proxies rely on servers housed within large data centers – specialized facilities packed with computers and networking gear. These facilities handle storing and processing massive amounts of information.

Unlike typical IP addresses linked to internet providers (ISPs) or home devices, data center proxies don’t have such real-world connections. Since these IPs come from an actual data center, websites can more easily identify and block them.

Residential proxies

Residential proxies provide access to IP addresses registered with real Internet service providers (ISPs). These addresses belong to actual homes and mobile devices, not giant data centers.

With these proxies, websites see your requests as coming from regular users in a specific location. As a result, you can bypass the basic anti-scraping measures that target data center addresses.

Residential proxies can be useful for tasks requiring high success rates and anonymity when scraping data. They’re also handy for verifying ads and accessing geo-restricted content, making them a versatile tool for various online needs.

ISP proxies

ISP proxies, also known as residential static proxies, offer a unique blend of features.  They provide static IPs registered with internet service providers but operate from servers within data centers. This combination gives them some advantages over both data center and residential proxies.

The data center environment allows ISP proxies to be significantly faster than traditional residential proxies that rely on individual user connections. Plus, the static IPs are incredibly reliable and can stay with you for long-term use.

Obtaining these ISP-linked IPs is more challenging, though, resulting in a smaller pool of available addresses compared to residential proxies.

Despite the limited pool, ISP proxies still shine in specific situations. For example, they excel at accessing region-restricted websites and bypassing strong IP-based anti-scraping measures.  This makes them a favorite tool for SEO professionals who need to track search rankings across various locations.

Final Thoughts

Figuring out how to retry failed Python requests can be daunting. If you use the tips and suggestions shared above, though, you’ll have an easier time figuring out the errors you’re experiencing and creating a plan to fix them.

Do you want to incorporate proxies into your Python requests retry strategy? If so, Rayobyte can help. Sign up today to learn more about our highly reliable, high-performing proxies and try our service for free.
