-
Thurstan Radovan replied to the discussion How can I scrape data that’s only available after login? in the forum General Web Scraping a year ago
How can I scrape data that’s only available after login?
For single-session scrapers, I simulate the login manually and extract cookies, which I then add to each request for as long as the session lasts.
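A minimal sketch of the cookie reuse described above, using only the Python standard library. It assumes the cookies were copied out of the browser as a single `name=value; name2=value2` string; the function names, cookie names, and URL are hypothetical placeholders, not taken from the original post.

```python
from http.cookies import SimpleCookie
from urllib.request import Request


def parse_cookie_header(raw: str) -> dict[str, str]:
    """Turn a 'name=value; name2=value2' string into a dict."""
    jar = SimpleCookie()
    jar.load(raw)
    return {name: morsel.value for name, morsel in jar.items()}


def build_request(url: str, cookies: dict[str, str]) -> Request:
    """Create a request that carries the saved session cookies."""
    header = "; ".join(f"{name}={value}" for name, value in cookies.items())
    return Request(url, headers={"Cookie": header})


# Hypothetical values copied from a logged-in browser session.
cookies = parse_cookie_header("sessionid=abc123; csrftoken=xyz")
request = build_request("https://example.com/account", cookies)
```

The request object can then be passed to `urllib.request.urlopen` for each page. Once the session expires, the cookies have to be extracted again.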
-
Thurstan Radovan replied to the discussion What are some ways to handle redirects during scraping? in the forum General Web Scraping a year ago
What are some ways to handle redirects during scraping?
Checking the final URL structure after each request helps me confirm that I’ve reached the correct page, especially for deep links.
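One way to sketch that check, assuming the final URL is already available after redirects (for example from `response.geturl()` with `urllib`, or `response.url` with `requests`). The helper name and the example host and path are hypothetical.

```python
from urllib.parse import urlsplit


def landed_on_expected_page(final_url: str, expected_host: str,
                            expected_path_prefix: str) -> bool:
    """Return True if the post-redirect URL has the expected host and path."""
    parts = urlsplit(final_url)
    return (parts.hostname == expected_host
            and parts.path.startswith(expected_path_prefix))


# A deep link that silently redirected to a login page fails the check,
# even though the original path appears in the query string.
ok = landed_on_expected_page(
    "https://example.com/products/42?ref=a", "example.com", "/products/")
bounced = landed_on_expected_page(
    "https://example.com/login?next=/products/42", "example.com", "/products/")
```

Comparing the parsed path rather than the raw string avoids false positives when the target path is only echoed back as a query parameter.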
-
Thurstan Radovan replied to the discussion How do I deal with rate limits on public APIs? in the forum General Web Scraping a year ago
How do I deal with rate limits on public APIs?
Scheduling API requests with a cron job during off-peak hours spreads them out and reduces competition for shared resources, which helps me stay under rate limits.
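As a rough sketch, a script started by cron could also guard itself with an off-peak check, so that a manual run at the wrong time does nothing. The window of 01:00 to 05:00 is an assumed default and the function name is hypothetical; the real quiet hours depend on the API.

```python
from datetime import datetime


def is_off_peak(now: datetime, start_hour: int = 1, end_hour: int = 5) -> bool:
    """True when now.hour falls in [start_hour, end_hour).

    Windows that wrap past midnight (e.g. 22 to 4) are supported.
    """
    if start_hour <= end_hour:
        return start_hour <= now.hour < end_hour
    return now.hour >= start_hour or now.hour < end_hour


# Inside the scheduled script: only send requests inside the window.
should_run = is_off_peak(datetime.now())
```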
-
Thurstan Radovan replied to the discussion What’s the best way to handle date-based scraping for historical data? in the forum General Web Scraping a year ago
What’s the best way to handle date-based scraping for historical data?
For sites with complex date filtering, I set up a schedule to scrape in batches. This approach reduces load on both my system and the target site.
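The batching can be sketched as a generator that splits a historical range into fixed-size windows, each of which becomes one scheduled scrape. The function name and the batch size are assumptions for illustration.

```python
from datetime import date, timedelta
from typing import Iterator


def date_batches(start: date, end: date,
                 days_per_batch: int) -> Iterator[tuple[date, date]]:
    """Yield inclusive (batch_start, batch_end) windows covering start..end."""
    if days_per_batch < 1:
        raise ValueError("days_per_batch must be at least 1")
    current = start
    while current <= end:
        # The last window is clipped so it never runs past `end`.
        batch_end = min(current + timedelta(days=days_per_batch - 1), end)
        yield current, batch_end
        current = batch_end + timedelta(days=1)


# Ten days in batches of four: two full windows and one short one.
batches = list(date_batches(date(2023, 1, 1), date(2023, 1, 10), 4))
```

Each `(batch_start, batch_end)` pair can be fed into the site's date filter, with a pause between batches to keep the load low on both sides.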
-
Thurstan Radovan replied to the discussion How can I scrape data from complex multi-page forms? in the forum General Web Scraping a year ago
How can I scrape data from complex multi-page forms?
If possible, I simulate real user actions, like waiting for page transitions, to avoid detection on heavily monitored sites.
-
Thurstan Radovan started the discussion How can I gather SEO data by scraping Google Search results? in the forum General Web Scraping a year ago
How can I gather SEO data by scraping Google Search results?
I automate searches using rotating proxies and capture SERP results for target keywords, focusing on the top few pages only to minimize server load.
-
Zyta Orla replied to the discussion How can I handle pagination when scraping JavaScript-heavy sites? in the forum General Web Scraping a year ago
How can I handle pagination when scraping JavaScript-heavy sites?
In my experience, Playwright works better than Selenium for JavaScript-heavy pages: it waits for elements automatically, so complex interactions run faster and fail less often.
-
Zyta Orla replied to the discussion How can I scrape data that’s only available after login? in the forum General Web Scraping a year ago
How can I scrape data that’s only available after login?
Using a combination of browser automation and API requests after login speeds up scraping, as I don’t need the browser for every request.
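A sketch of the hand-off from browser to plain HTTP requests. It assumes the browser cookies arrive as a list of dicts with `name`, `value`, and `domain` keys, which is the shape Playwright's `BrowserContext.cookies()` returns; the helper name and sample data are hypothetical, and the domain matching is simplified.

```python
def cookie_header_for(browser_cookies: list[dict], host: str) -> str:
    """Build a Cookie header from browser cookies that apply to `host`.

    Simplified matching: a cookie applies if the host equals its domain
    or is a subdomain of it. Path, expiry, and secure flags are ignored.
    """
    matching = []
    for cookie in browser_cookies:
        domain = cookie["domain"].lstrip(".")
        if host == domain or host.endswith("." + domain):
            matching.append(f"{cookie['name']}={cookie['value']}")
    return "; ".join(matching)


# Hypothetical cookies exported from the browser after login.
exported = [
    {"name": "sid", "value": "abc", "domain": ".example.com", "path": "/"},
    {"name": "track", "value": "1", "domain": "ads.other.net", "path": "/"},
]
header = cookie_header_for(exported, "api.example.com")
```

The resulting string goes into the `Cookie` header of the API requests, so the browser is only needed again when the session has to be refreshed.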