-
Aravinda Govind replied to the discussion Legal considerations and ethics of web scraping: What are the boundaries? in the forum General Web Scraping a year ago
Legal considerations and ethics of web scraping: What are the boundaries?
It’s a gray area: many people scrape, but you could still get blocked or face legal action if the site owner notices.
-
Aravinda Govind replied to the discussion How do you handle pagination when scraping websites? in the forum General Web Scraping a year ago
How do you handle pagination when scraping websites?
If you’re using Scrapy, a CrawlSpider with link-extraction rules can follow pagination links that match your rules, which cuts down on custom code. The crawl ends on its own once no further next link is found.
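For anyone not on Scrapy, the same "follow the next link until there isn't one" loop can be sketched with the standard library alone. This is only an illustration, not Scrapy's own code: it assumes the site marks its pagination link as `<a rel="next">`, and `fetch` is a hypothetical function you supply that takes a URL and returns the page's HTML.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class NextLinkParser(HTMLParser):
    """Records the href of the first <a rel="next"> on a page (assumed markup)."""

    def __init__(self):
        super().__init__()
        self.next_href = None

    def handle_starttag(self, tag, attrs):
        attr_map = dict(attrs)
        rel_values = (attr_map.get("rel") or "").split()
        if tag == "a" and "next" in rel_values and self.next_href is None:
            self.next_href = attr_map.get("href")


def crawl_pages(start_url, fetch, max_pages=100):
    """Yield (url, html) per page, following rel="next" until it is absent.

    `fetch` is a caller-supplied function url -> HTML string (hypothetical).
    """
    url, seen = start_url, set()
    # Stop on: no next link, a link back to a visited page, or the page cap.
    while url and url not in seen and len(seen) < max_pages:
        seen.add(url)
        html = fetch(url)
        yield url, html
        parser = NextLinkParser()
        parser.feed(html)
        # Resolve relative hrefs against the current page's URL.
        url = urljoin(url, parser.next_href) if parser.next_href else None
```

The `seen` set and `max_pages` cap are there because some sites link the last page back to itself, which would otherwise loop forever. In Scrapy itself the equivalent step is usually a `response.follow(...)` call on the extracted link.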
-
Aravinda Govind started the discussion How can I scrape data from web apps built with React or Vue.js? in the forum General Web Scraping a year ago
How can I scrape data from web apps built with React or Vue.js?
Use Selenium or Playwright to wait for React or Vue components to fully load before scraping the rendered data.
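A rough sketch of the Playwright route, assuming the data you want lives in elements matched by a CSS selector you already know (the selector and URL are placeholders, and both function names are made up for this example). The helper takes any object that behaves like a Playwright sync `Page`, so the waiting logic is easy to try out separately from the browser launch.

```python
def extract_rendered_text(page, url, selector, timeout_ms=10_000):
    """Open `url`, wait until `selector` appears, and return the matching texts.

    `page` is expected to behave like a Playwright sync `Page`.
    """
    page.goto(url)
    # Block until the framework has rendered at least one matching element.
    page.wait_for_selector(selector, timeout=timeout_ms)
    return page.locator(selector).all_inner_texts()


def scrape_with_playwright(url, selector):
    """Run the helper in headless Chromium (needs the `playwright` package)."""
    from playwright.sync_api import sync_playwright  # imported lazily

    with sync_playwright() as p:
        browser = p.chromium.launch()
        try:
            return extract_rendered_text(browser.new_page(), url, selector)
        finally:
            browser.close()
```

Waiting on a specific selector tends to be more reliable than a fixed sleep, since React and Vue components can finish rendering at very different times depending on the API calls behind them.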
-
Raja Lakeshia replied to the discussion How to scrape data from a website protected by Cloudflare? in the forum General Web Scraping a year ago
How to scrape data from a website protected by Cloudflare?
Rotating proxies and changing user agents can sometimes help you get past Cloudflare.
-
Raja Lakeshia replied to the discussion How do you handle pagination when scraping websites? in the forum General Web Scraping a year ago
How do you handle pagination when scraping websites?
For infinite scrolling, I use Selenium to simulate scrolling, waiting for new data to load each time. You can control the scroll rate and set timeouts to ensure all items are loaded.
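Roughly what that scroll loop can look like. This is a sketch under assumptions, not a drop-in solution: it treats "page height stopped growing" as the signal that everything has loaded, and the function name, pause length, and round cap are arbitrary choices. `driver` is anything with Selenium's `execute_script` method, for example a `webdriver.Chrome()` instance.

```python
import time


def scroll_until_stable(driver, pause_s=1.5, max_rounds=30, sleep=time.sleep):
    """Scroll to the bottom repeatedly until the page height stops growing.

    Returns the number of scrolls performed.
    """
    get_height = "return document.body.scrollHeight"
    last_height = driver.execute_script(get_height)
    for round_no in range(1, max_rounds + 1):
        driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")
        sleep(pause_s)  # give the site time to fetch and render the next batch
        new_height = driver.execute_script(get_height)
        if new_height == last_height:
            return round_no  # nothing new loaded: assume the end of the feed
        last_height = new_height
    return max_rounds  # hit the cap; the feed may still have more items
```

`pause_s` is the scroll rate and `max_rounds` acts as the overall timeout. If the site loads slowly, a short pause can make the loop stop early, so it may be worth raising the pause before trusting the result.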
-
Raja Lakeshia started the discussion How to scrape data from a website with rate limits and timeouts? in the forum General Web Scraping a year ago
How to scrape data from a website with rate limits and timeouts?
Add delays between requests to prevent reaching rate limits too quickly.
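One way to wire in those delays, as a minimal sketch. The assumptions here are mine: the site signals rate limiting with HTTP 429, and `fetch` is a hypothetical function you provide that takes a URL and returns `(status_code, body)`. The delay, jitter, and retry numbers are placeholders to tune per site.

```python
import random
import time


def fetch_politely(urls, fetch, delay_s=1.0, jitter_s=0.5, max_retries=3, sleep=time.sleep):
    """Fetch each URL in turn, pausing between requests and backing off on HTTP 429.

    `fetch` is a caller-supplied function url -> (status_code, body) (hypothetical).
    """
    results = {}
    for index, url in enumerate(urls):
        if index > 0:
            # Base delay plus random jitter so requests are not perfectly periodic.
            sleep(delay_s + random.uniform(0, jitter_s))
        for attempt in range(max_retries + 1):
            status, body = fetch(url)
            if status != 429:
                break
            if attempt < max_retries:
                # Rate limited: wait 2x, 4x, 8x ... the base delay before retrying.
                sleep(delay_s * 2 ** (attempt + 1))
        results[url] = (status, body)
    return results
```

If the last retry still comes back as 429, the function records that response rather than raising, so the caller can decide whether to give up or try again later.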