Error Handling and Debugging

Web scraping is unpredictable: pages load slowly, layouts change, and elements go missing. Proper error handling keeps your scripts from crashing when that happens.

Selenium Try-Except Code

Try-except in Python is used to catch and handle exceptions (errors) that occur during execution. If the code in the try block runs normally, then it’s business as usual. However, if an error occurs, the except block runs instead, which prevents the entire script from crashing.

Selenium Python Try Except Code:

The example below tries to locate an element and click it. If either step fails, the except block catches the error.

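from selenium.webdriver.common.by import By

# Assumes `driver` is an already-initialized WebDriver instance.
try:
    # Locate the element by its ID and click it.
    element = driver.find_element(By.ID, 'target-element')
    element.click()
except Exception as e:
    # Any failure above lands here instead of crashing the script.
    print(f"Error encountered: {e}")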

In this case, the except block simply prints the error message and the script carries on.
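One thing the snippet above doesn’t cover is cleanup: if the script stops early, the browser can be left running. Here is a minimal sketch of one way to handle that with else and finally. The function name run_safely and the task callable are hypothetical names used for illustration; driver.quit() is the standard WebDriver call for closing the browser.

```python
def run_safely(driver, task):
    """Run task(driver), report any error, and always close the browser.

    `run_safely` and `task` are hypothetical names for this sketch.
    """
    try:
        result = task(driver)
    except Exception as e:
        # Catch-all, as in the basic example: report instead of crashing.
        print(f"Error encountered: {e}")
        return None
    else:
        # Runs only when the try block finished without an exception.
        return result
    finally:
        # Runs in every case, so the browser is never left open.
        driver.quit()
```

Here, task would be any function that takes the driver and does the actual scraping. Whether it succeeds or fails, the finally block still closes the browser.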

Using Explicit Waits to Debug Selenium

We covered explicit waits in the previous module for navigation, but they're also a handy means of Selenium error handling.

A common error you may come across is ElementNotInteractableException. In plain English, the element can’t be interacted with. This means either you’ve made an error in your code and are targeting the wrong element, or you’re simply too quick: the element isn’t in a clickable state when the code runs.

And this is where explicit waits can help. If you’re getting this error, set a wait and see if the results improve.

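from selenium.common.exceptions import TimeoutException
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

# Assumes `driver` is an already-initialized WebDriver instance.
# Wait up to 10 seconds for a condition to be met.
wait = WebDriverWait(driver, 10)
try:
    # Block until the element is visible and enabled, then click it.
    element = wait.until(EC.element_to_be_clickable((By.ID, 'target-element')))
    element.click()
except TimeoutException:
    # The element never became clickable within the wait period.
    print("Error: element was not clickable within 10 seconds")
except Exception as e:
    print(f"Error: {e}")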
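To see why a wait helps, it can be useful to picture what an explicit wait does: it re-checks a condition at short intervals until the condition passes or the time runs out. The sketch below is a simplified, Selenium-free imitation of that loop, not Selenium’s actual implementation. The name wait_until, its parameters, and the use of Python’s built-in TimeoutError are assumptions made for illustration.

```python
import time


def wait_until(condition, timeout=10.0, poll=0.5):
    """Call condition() repeatedly until it returns a truthy value.

    Simplified stand-in for an explicit wait; all names are hypothetical.
    Raises TimeoutError if the condition never passes within `timeout` seconds.
    """
    deadline = time.monotonic() + timeout
    while True:
        value = condition()
        if value:
            # Condition passed: hand back whatever it returned.
            return value
        if time.monotonic() >= deadline:
            raise TimeoutError(f"Condition not met within {timeout} seconds")
        # Not ready yet: pause briefly before checking again.
        time.sleep(poll)
```

In a real script you would keep using WebDriverWait with EC.element_to_be_clickable. The point of the sketch is only that a wait gives a slow element several chances to become ready instead of failing on the first check.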

Other Selenium Error Messages

Finally, let’s round this module off with some other error messages that you may come across.

ElementNotInteractableException: Element isn’t clickable.
- Cause: Element is hidden, overlapped, or not yet loaded.
- Solution: Use explicit waits with EC.element_to_be_clickable.
NoSuchElementException: Element not found.
- Cause: Incorrect locator or element not rendered yet.
- Solution: Verify selectors and use explicit waits like EC.presence_of_element_located.
TimeoutException: The wait condition wasn’t met before the timeout expired.
- Solution: Increase wait time or check if the element is dynamically loaded after certain actions.
StaleElementReferenceException: Element reference is no longer valid.
- Cause: Page refreshed or DOM updated.
- Solution: Re-locate the element before interacting.
SessionNotCreatedException: Browser or driver mismatch.
- Solution: Ensure compatible versions of the browser and WebDriver.
WebDriverException: Generic issues like closed browser windows.
- Solution: Check WebDriver installation and browser status.
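
For StaleElementReferenceException, re-locating the element usually means running the find-and-act step again from scratch. Below is a minimal sketch of a generic retry helper for that. The name retry, its parameters, and the default values are hypothetical; the exception types to retry on are passed in by the caller, so the helper itself doesn’t depend on Selenium.

```python
import time


def retry(action, exceptions, attempts=3, delay=0.5):
    """Call action() up to `attempts` times, retrying on the given exceptions.

    `action` should locate the element *and* act on it, so that every
    attempt works with a fresh reference. All names here are hypothetical.
    `exceptions` is an exception class or a tuple of exception classes.
    """
    for attempt in range(1, attempts + 1):
        try:
            return action()
        except exceptions:
            if attempt == attempts:
                # Out of attempts: let the caller see the original error.
                raise
            # Give the page a moment to settle before trying again.
            time.sleep(delay)
```

In a Selenium script, action could be a small function that calls driver.find_element and then click, with StaleElementReferenceException passed as exceptions. Retrying only the exceptions you name keeps unrelated bugs visible instead of silently looping over them.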

Best Practices for Handling Web Scraping Challenges

Implement rate limiting, proxy rotation, and randomized delays to reduce the chance of triggering CAPTCHAs.
Use CAPTCHA solvers where necessary.
Automate logins and manage session cookies to reduce server load and minimize the number of logins needed.
Respect the terms of service when web scraping.
Use explicit waits to close pop-up windows, and use proper locators to find in-page modals.
Use try-except blocks to prevent scripts from crashing unexpectedly.
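
As a minimal sketch of the randomized-delay idea from the first point, the helper below pauses for a random interval between requests. The name polite_pause and the default range of 2 to 5 seconds are assumptions chosen for illustration, not recommended values for any particular site.

```python
import random
import time


def polite_pause(min_seconds=2.0, max_seconds=5.0):
    """Sleep for a random duration between the two bounds and return it.

    Hypothetical helper for illustration; tune the bounds to the target site.
    """
    if min_seconds < 0 or max_seconds < min_seconds:
        raise ValueError("Expected 0 <= min_seconds <= max_seconds")
    # A different pause each time makes the request pattern less regular.
    duration = random.uniform(min_seconds, max_seconds)
    time.sleep(duration)
    return duration
```

Calling polite_pause() between page loads spreads requests out unevenly, which is gentler on the server than a fixed-interval loop.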
