-
Zyta Orla replied to the discussion What are some ways to handle redirects during scraping? in the forum General Web Scraping a year ago
What are some ways to handle redirects during scraping?
For content behind redirects, I find it helpful to pause briefly before and after each redirection. This mimics natural browsing and helps avoid detection.
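A rough sketch of that idea, following the chain one hop at a time instead of letting the HTTP client do it automatically. Here `fetch` is a hypothetical callable (not from any library) that requests a URL with automatic redirects turned off and returns the status code and `Location` header; the one-second pause is just an assumed default.

```python
import time
from urllib.parse import urljoin

REDIRECT_CODES = {301, 302, 303, 307, 308}

def follow_redirects(fetch, url, pause=1.0, max_hops=10, sleep=time.sleep):
    """Follow a redirect chain hop by hop, pausing before and after each hop.

    `fetch(url)` is an assumed helper returning (status_code, location_or_None)
    for a request made with automatic redirects disabled.
    """
    chain = [url]
    for _ in range(max_hops):
        status, location = fetch(url)
        if status not in REDIRECT_CODES or not location:
            return url, chain              # final page reached
        sleep(pause)                       # pause before following the redirect
        url = urljoin(url, location)       # Location may be a relative path
        chain.append(url)
        sleep(pause)                       # pause again before the next request
    raise RuntimeError(f"gave up after {max_hops} redirects: {chain}")
```

Returning the chain as well as the final URL makes it easy to log where each request actually ended up.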
-
Zyta Orla replied to the discussion How do I deal with rate limits on public APIs? in the forum General Web Scraping a year ago
How do I deal with rate limits on public APIs?
Exponential backoff is a good strategy for handling temporary limits. I gradually increase wait times between requests until they’re accepted again.
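For anyone who wants to see the shape of it, here is a minimal sketch. `RateLimited` is a made-up exception that your own request function would raise on something like an HTTP 429; the retry count, base delay, cap, and jitter fraction are all assumed values to tune per API.

```python
import random
import time

class RateLimited(Exception):
    """Hypothetical exception raised by the caller's request function on a rate-limit response."""

def call_with_backoff(request, max_retries=5, base_delay=1.0, max_delay=60.0,
                      jitter=0.1, sleep=time.sleep):
    """Call `request()` and retry with exponentially growing waits while it is rate limited."""
    for attempt in range(max_retries + 1):
        try:
            return request()
        except RateLimited:
            if attempt == max_retries:
                raise                                        # out of retries, surface the error
            delay = min(max_delay, base_delay * 2 ** attempt)  # 1s, 2s, 4s, ... capped
            sleep(delay + random.uniform(0, delay * jitter))   # small random jitter
```

The jitter keeps several workers from retrying at exactly the same moment. If the API sends a `Retry-After` header, honoring that value is usually a better choice than a computed delay.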
-
Zyta Orla started the discussion How can I scrape real estate data from Zillow or Realtor.com? in the forum General Web Scraping a year ago
How can I scrape real estate data from Zillow or Realtor.com?
I use Zillow’s API to get structured data on properties, prices, and neighborhood statistics for real estate analysis.
-
Norbu Nata replied to the discussion What’s the best approach to handling large datasets while scraping? in the forum General Web Scraping a year ago
What’s the best approach to handling large datasets while scraping?
If I need to process the data immediately, I set up streaming to a data pipeline with tools like Kafka. This way, I handle data in real time.
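Roughly what I mean, as a sketch only. `send` stands in for whatever publish method your producer client exposes (a Kafka producer's send call, for example), and the topic name `scraped-items` is invented for illustration.

```python
import json

def stream_records(records, send, topic="scraped-items"):
    """Serialize each record as soon as it is scraped and hand it to `send`.

    `records` can be a generator, so nothing is accumulated in memory.
    `send(topic, payload_bytes)` is an assumed stand-in for a producer client.
    """
    count = 0
    for record in records:
        payload = json.dumps(record, ensure_ascii=False).encode("utf-8")
        send(topic, payload)       # publish immediately instead of batching to disk
        count += 1
    return count                   # number of records handed to the pipeline
```

Feeding it a generator from the scraper means each item goes downstream as soon as it is parsed.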
-
Norbu Nata replied to the discussion How can I scrape data that’s only available after login? in the forum General Web Scraping a year ago
How can I scrape data that’s only available after login?
Some sites require CSRF tokens or other session-specific values. I capture and send these with each request to maintain session continuity.
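A small sketch of the capture step using only the standard library. The field names `csrf_token`, `username`, and `password` are assumptions; check the real login form in dev tools, since every site names them differently.

```python
from html.parser import HTMLParser

class CSRFTokenParser(HTMLParser):
    """Find the value of a hidden <input> whose name matches `field_name`."""

    def __init__(self, field_name="csrf_token"):
        super().__init__()
        self.field_name = field_name
        self.token = None

    def handle_starttag(self, tag, attrs):
        if tag != "input" or self.token is not None:
            return
        attrs = dict(attrs)
        if attrs.get("name") == self.field_name:
            self.token = attrs.get("value")

def extract_csrf_token(html, field_name="csrf_token"):
    parser = CSRFTokenParser(field_name)
    parser.feed(html)
    return parser.token            # None if the form has no such field

def build_login_payload(html, username, password, field_name="csrf_token"):
    """Build the form data to POST back, including the token from the login page."""
    token = extract_csrf_token(html, field_name)
    if token is None:
        raise ValueError(f"no input named {field_name!r} found")
    return {field_name: token, "username": username, "password": password}
```

The payload has to be posted with the same cookie jar that fetched the login page, otherwise the token will not match the session.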
-
Norbu Nata replied to the discussion What are some ways to handle redirects during scraping? in the forum General Web Scraping a year ago
What are some ways to handle redirects during scraping?
Analyzing the redirection chain in dev tools can reveal whether it is a trap aimed at bots. Sometimes, switching IPs can help avoid these traps.
-
Norbu Nata replied to the discussion How do I deal with rate limits on public APIs? in the forum General Web Scraping a year ago
How do I deal with rate limits on public APIs?
For batch processing, I queue requests and handle them in smaller chunks, which lets me stay within limits while gathering data continuously.
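In code, the chunking looks something like this. The batch size and pause are placeholder values; pick them from the limit the API documents. `handle` is whatever function makes the actual request.

```python
import time
from itertools import islice

def chunked(items, size):
    """Yield successive lists of at most `size` items from any iterable."""
    it = iter(items)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk

def process_in_batches(items, handle, batch_size=10, pause=1.0, sleep=time.sleep):
    """Run `handle` over items in small batches, pausing between batches."""
    results = []
    for i, batch in enumerate(chunked(items, batch_size)):
        if i:
            sleep(pause)           # wait between batches, not before the first
        results.extend(handle(item) for item in batch)
    return results
```

For example, with a limit of 60 requests per minute, `batch_size=10` and `pause=10.0` would stay under it, assuming the requests themselves take negligible time.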
-
Norbu Nata replied to the discussion How can I scrape data from complex multi-page forms? in the forum General Web Scraping a year ago
How can I scrape data from complex multi-page forms?
Checking for CSRF tokens is important, as many multi-page forms use them for security. I include the current token in the request for each step so the submission is accepted and the session carries through to the next page.