-
Yannig Avicenna replied to the discussion How can I scrape websites with infinite scroll without losing data? in the forum General Web Scraping a year ago
How can I scrape websites with infinite scroll without losing data?
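If you’re using Python, Selenium can handle the scrolling for you: run a loop that scrolls to the bottom of the page (for example with execute_script and window.scrollTo) and checks whether new data has loaded, stopping once the page height no longer changes. I wrap it in a try/except block in case loading fails.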
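Here is a minimal sketch of that scroll loop. It assumes a Selenium WebDriver instance (or anything else exposing `execute_script`), and it treats a change in page height as the sign that new data has loaded. The function name `scroll_to_bottom` and the `pause` and `max_rounds` values are illustrative, not something from the original post.

```python
import time


def scroll_to_bottom(driver, pause=1.0, max_rounds=50):
    """Scroll until the page height stops growing; return the number of scrolls."""
    get_height = "return document.body.scrollHeight;"
    last_height = driver.execute_script(get_height)
    rounds = 0
    while rounds < max_rounds:
        try:
            driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")
            rounds += 1
            time.sleep(pause)  # give the page time to load the next batch
            new_height = driver.execute_script(get_height)
        except Exception as exc:  # e.g. Selenium's WebDriverException
            print(f"Scrolling stopped early: {exc}")
            break
        if new_height == last_height:
            break  # nothing new appeared, so assume we reached the end
        last_height = new_height
    return rounds
```

With a real browser you would create the driver with `webdriver.Chrome()`, call `driver.get(url)`, then call `scroll_to_bottom(driver)` before reading the page source. The `max_rounds` cap is there so a feed that never ends cannot keep the loop running forever.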
-
Yannig Avicenna replied to the discussion How can I scrape mobile-only websites from a desktop environment? in the forum General Web Scraping a year ago
How can I scrape mobile-only websites from a desktop environment?
Sometimes I use proxies with mobile IPs. Combined with a mobile user-agent, this tends to give the best results because the request looks to the target server like it comes from a real mobile device.
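A rough sketch of combining the two, using only the standard library. The user-agent string is one example of an iPhone Safari header, and the proxy address in the usage comment is a placeholder; `build_mobile_request` is a hypothetical helper name.

```python
import urllib.request

# Example iPhone Safari user-agent; swap in whichever device you want to mimic.
MOBILE_UA = (
    "Mozilla/5.0 (iPhone; CPU iPhone OS 16_0 like Mac OS X) "
    "AppleWebKit/605.1.15 (KHTML, like Gecko) Version/16.0 Mobile/15E148 Safari/604.1"
)


def build_mobile_request(url, proxy_url):
    """Return (opener, request) that go through proxy_url with a mobile user-agent."""
    proxy = urllib.request.ProxyHandler({"http": proxy_url, "https": proxy_url})
    opener = urllib.request.build_opener(proxy)
    request = urllib.request.Request(url, headers={"User-Agent": MOBILE_UA})
    return opener, request


# Usage (placeholder proxy address, replace with a real mobile proxy):
#   opener, request = build_mobile_request(
#       "https://example.com/", "http://mobile-proxy.example:8000")
#   with opener.open(request, timeout=15) as response:
#       html = response.read().decode("utf-8", errors="replace")
```

Some sites also look at viewport size and touch support, so a header alone may not be enough. In that case a browser driven with mobile emulation is the usual next step.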
-
Yannig Avicenna replied to the discussion How do I identify hidden APIs that might be easier to scrape? in the forum General Web Scraping a year ago
How do I identify hidden APIs that might be easier to scrape?
I also inspect any XHR or Fetch requests in the dev tools. They often lead to JSON endpoints, which are much simpler to parse than HTML.
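Once you have copied an endpoint URL out of the Network tab, parsing it is usually just a few lines. This is a sketch with the standard library. The `items` key and the field names in the sample are made up for illustration, since every site shapes its JSON differently.

```python
import json
import urllib.request


def fetch_json(url, timeout=15):
    """GET a JSON endpoint found in the Network tab and decode the body."""
    request = urllib.request.Request(url, headers={"Accept": "application/json"})
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


def extract_fields(payload, list_key, fields):
    """Pull the selected fields from each record in payload[list_key]."""
    return [{f: item.get(f) for f in fields} for item in payload.get(list_key, [])]


# Hypothetical response body, shaped like what a listing endpoint might return.
sample = json.loads(
    '{"items": [{"id": 1, "name": "Widget", "price": 9.99, "sku": "W-1"}]}'
)
rows = extract_fields(sample, "items", ["name", "price"])
print(rows)
```

In practice you would pass the result of `fetch_json(endpoint_url)` to `extract_fields` instead of the hard-coded sample. Some endpoints also expect the same cookies or headers the browser sent, which you can copy from the request details in dev tools.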
-
Yannig Avicenna started the discussion How can I scrape product prices accurately without triggering anti-bot measures? in the forum General Web Scraping a year ago
How can I scrape product prices accurately without triggering anti-bot measures?
Rotating proxies so that requests come from varying IPs helps my scraper avoid detection, especially on sites with strict anti-bot protections around price data.
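A simple way to rotate is round-robin over a list. The sketch below uses the standard library. The proxy addresses are placeholders and `ProxyRotator` is a hypothetical name, so treat it as a starting point and not a complete anti-detection setup.

```python
import itertools
import urllib.request

# Placeholder proxy addresses; replace with proxies you actually have access to.
PROXIES = [
    "http://proxy-a.example:8000",
    "http://proxy-b.example:8000",
    "http://proxy-c.example:8000",
]


class ProxyRotator:
    """Hand out proxies round-robin so consecutive requests use different IPs."""

    def __init__(self, proxies):
        if not proxies:
            raise ValueError("at least one proxy is required")
        self._cycle = itertools.cycle(proxies)

    def next_opener(self):
        """Return (proxy_url, opener) for the next proxy in the rotation."""
        proxy_url = next(self._cycle)
        handler = urllib.request.ProxyHandler({"http": proxy_url, "https": proxy_url})
        return proxy_url, urllib.request.build_opener(handler)
```

Each call to `next_opener()` gives you an opener bound to the next proxy, so you can call `opener.open(url, timeout=15)` for one product page and move on. Adding a random delay between requests is a common complement to rotation.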
-
Daniel Teuku replied to the discussion Is Python really the best language for web scraping, or should I consider altern in the forum General Web Scraping a year ago
Is Python really the best language for web scraping, or should I consider altern
Python’s ease of use and broad library support make it accessible, especially for beginners. However, if performance is a priority, C# with HtmlAgilityPack is a strong contender, especially if you’re targeting Windows environments.
-
Daniel Teuku replied to the discussion How do I identify hidden APIs that might be easier to scrape? in the forum General Web Scraping a year ago
How do I identify hidden APIs that might be easier to scrape?
If the site is built with React or Vue, there’s a good chance data is loaded via API calls. I look for these in the page’s JavaScript code or in the dev tools.
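If you want to search the page's JavaScript without reading it by hand, a regex pass over the downloaded script text can surface candidates. This sketch assumes the endpoints contain `/api` or `/graphql`, which is only a guess at common naming, and `find_candidate_endpoints` is a hypothetical helper.

```python
import re

# Matches quoted strings that look like API paths or URLs, e.g. "/api/v1/items".
# The "/api" and "/graphql" markers are assumptions about how endpoints are named.
ENDPOINT_RE = re.compile(
    r"""["'`]((?:https?://[^"'`\s]+)?/(?:api|graphql)[^"'`\s]*)["'`]"""
)


def find_candidate_endpoints(source_text):
    """Return unique endpoint-like strings in the order they first appear."""
    seen = []
    for match in ENDPOINT_RE.finditer(source_text):
        if match.group(1) not in seen:
            seen.append(match.group(1))
    return seen


# Made-up snippet standing in for a bundled script file.
sample_js = """
fetch("/api/v1/products?page=1");
axios.get('https://shop.example/api/cart');
fetch("/api/v1/products?page=1");
const logo = "/static/logo.png";
"""
print(find_candidate_endpoints(sample_js))
```

Minified bundles often build URLs from pieces, so this will miss some endpoints. The Network tab in dev tools remains the more reliable check.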
-
Daniel Teuku replied to the discussion What are the best methods for scraping data from dynamically-loaded websites? in the forum General Web Scraping a year ago
What are the best methods for scraping data from dynamically-loaded websites?
If the data loads incrementally (as with infinite scroll), I use JavaScript listeners in the page (a MutationObserver is one option) to detect when new content is added, then scrape the page once all content has loaded.
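One way to do this from Python is to inject a MutationObserver and wait until the page goes quiet. This is my own sketch of the idea, not necessarily what the poster runs. It assumes a Selenium-style driver with `execute_script`, and the `window.__mutationCount` counter and `wait_until_quiet` name are invented for the example.

```python
import time

# Installs an observer once and counts DOM changes in a page-level variable.
OBSERVER_JS = """
if (window.__mutationCount === undefined) {
  window.__mutationCount = 0;
  new MutationObserver(function (records) {
    window.__mutationCount += records.length;
  }).observe(document.body, {childList: true, subtree: true});
}
"""
COUNT_JS = "return window.__mutationCount;"


def wait_until_quiet(driver, quiet_seconds=2.0, poll=0.5, timeout=30.0):
    """Return True once no DOM changes were seen for quiet_seconds, else False."""
    driver.execute_script(OBSERVER_JS)
    deadline = time.monotonic() + timeout
    last_count = driver.execute_script(COUNT_JS)
    last_change = time.monotonic()
    while time.monotonic() < deadline:
        time.sleep(poll)
        count = driver.execute_script(COUNT_JS)
        if count != last_count:
            # New nodes arrived, so restart the quiet timer.
            last_count, last_change = count, time.monotonic()
        elif time.monotonic() - last_change >= quiet_seconds:
            return True
    return False
```

Call it after each scroll or click that triggers loading, and read the page source only when it returns True. A False result means the page was still changing when the timeout hit, so the data may be incomplete.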
-
Daniel Teuku replied to the discussion What’s the best way to handle CAPTCHAs while scraping? in the forum General Web Scraping a year ago
What’s the best way to handle CAPTCHAs while scraping?
Some scrapers use “CAPTCHA banks,” where several accounts share CAPTCHA solutions across sessions, reducing individual load.