-
Bronislawa Mirela became a registered member a year ago
-
Joline Abdastartus replied to the discussion How do I handle scraping for real-time data that updates frequently? in the forum General Web Scraping a year ago
How do I handle scraping for real-time data that updates frequently?
I cache recent data locally and compare it with new scrapes to identify meaningful changes instead of saving duplicate data.
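A minimal sketch of the cache-and-compare idea, assuming each scraped record is a JSON-serializable dict with a stable key. The names `fingerprint` and `diff_scrape` are hypothetical, and the in-memory dict stands in for whatever local cache is actually used.

```python
import hashlib
import json


def fingerprint(record: dict) -> str:
    """Return a stable hash of a record; key order does not affect it."""
    payload = json.dumps(record, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def diff_scrape(cache: dict, scraped: dict) -> dict:
    """Return only new or changed records and update the cache in place.

    `cache` maps record key -> fingerprint of the last version seen.
    `scraped` maps record key -> freshly scraped record.
    """
    changed = {}
    for key, record in scraped.items():
        digest = fingerprint(record)
        if cache.get(key) != digest:
            changed[key] = record
            cache[key] = digest
    return changed


# Two consecutive scrapes of the same (made-up) items.
cache = {}
first = diff_scrape(cache, {"item-1": {"price": 10}, "item-2": {"price": 20}})
second = diff_scrape(cache, {"item-1": {"price": 10}, "item-2": {"price": 25}})
```

On the second pass only `item-2` comes back, so only the record that actually changed needs to be saved.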
-
Joline Abdastartus replied to the discussion What strategies can I use to scrape websites with limited search functionality? in the forum General Web Scraping a year ago
What strategies can I use to scrape websites with limited search functionality?
Monitoring the site’s network traffic can reveal API calls with hidden parameters. Often, tweaking these parameters yields more search results.
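As a rough illustration of parameter tweaking, the sketch below rewrites the query string of a request URL copied from the browser's network tab. The endpoint and the `limit` and `offset` parameter names are made up for the example; a real site will use its own names, and it may cap or ignore values it does not expect.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_params(url: str, **overrides) -> str:
    """Return `url` with the given query parameters added or replaced."""
    parts = urlsplit(url)
    query = dict(parse_qsl(parts.query, keep_blank_values=True))
    query.update({name: str(value) for name, value in overrides.items()})
    return urlunsplit(parts._replace(query=urlencode(query)))


# Hypothetical API call observed in the network tab.
observed = "https://example.com/api/search?q=shoes&limit=20&offset=0"

# Ask for larger pages and step through the result set.
variants = [with_params(observed, limit=100, offset=o) for o in range(0, 300, 100)]
```

Each entry in `variants` can then be fetched with whatever HTTP client the scraper already uses.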
-
Joline Abdastartus replied to the discussion How can I handle pagination when scraping JavaScript-heavy sites? in the forum General Web Scraping a year ago
How can I handle pagination when scraping JavaScript-heavy sites?
Logging each URL as I progress through pages ensures I don’t revisit pages accidentally. This is crucial for scraping sites with complex pagination structures.
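A small sketch of the URL log, assuming pages can be identified by their URL without the fragment. `VisitLog` and `crawl` are hypothetical names, and the `pages` dict simulates "next page" links instead of driving a real browser.

```python
from urllib.parse import urldefrag


class VisitLog:
    """Remember which page URLs have already been processed."""

    def __init__(self):
        self._seen = set()
        self.order = []  # URLs in the order they were first visited

    def mark(self, url: str) -> bool:
        """Record `url`; return False if it was already visited."""
        key, _fragment = urldefrag(url)  # ignore "#..." when comparing
        if key in self._seen:
            return False
        self._seen.add(key)
        self.order.append(key)
        return True


def crawl(start: str, next_links: dict) -> list:
    """Follow 'next page' links until one leads to a page already logged."""
    log = VisitLog()
    url = start
    while url is not None and log.mark(url):
        url = next_links.get(url)
    return log.order


# Simulated pagination where the last page links back to the first.
pages = {
    "https://example.com/list?page=1": "https://example.com/list?page=2",
    "https://example.com/list?page=2": "https://example.com/list?page=3",
    "https://example.com/list?page=3": "https://example.com/list?page=1#top",
}
visited = crawl("https://example.com/list?page=1", pages)
```

The loop stops as soon as a "next" link points at a page that is already in the log, which prevents endless cycles on sites whose last page links back to the first.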
-
Joline Abdastartus replied to the discussion What’s the best approach to handling large datasets while scraping? in the forum General Web Scraping a year ago
What’s the best approach to handling large datasets while scraping?
A distributed scraping setup, where multiple scrapers work on different parts of the site simultaneously, helps speed up data collection without overloading any one scraper.
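One way to split the work is to assign every URL to a worker by hashing it, so no two workers fetch the same page. The sketch below is only an approximation: threads stand in for separate scraper machines, `scrape_shard` is a placeholder that does no network I/O, and the URLs are invented.

```python
import zlib
from concurrent.futures import ThreadPoolExecutor


def shard_for(url: str, num_workers: int) -> int:
    """Map a URL to a worker index; crc32 is stable across processes."""
    return zlib.crc32(url.encode("utf-8")) % num_workers


def partition(urls, num_workers: int):
    """Split `urls` into one list per worker with no overlap."""
    shards = [[] for _ in range(num_workers)]
    for url in urls:
        shards[shard_for(url, num_workers)].append(url)
    return shards


def scrape_shard(shard):
    """Placeholder for real scraping: returns (url, length of url) pairs."""
    return [(url, len(url)) for url in shard]


urls = [f"https://example.com/category/{i}" for i in range(12)]
shards = partition(urls, 3)

# Each shard is processed concurrently by its own worker.
with ThreadPoolExecutor(max_workers=3) as pool:
    results = [row for rows in pool.map(scrape_shard, shards) for row in rows]
```

Because the shard index depends only on the URL, independent machines can compute the same assignment without coordinating with each other.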
-
Joline Abdastartus started the discussion How can I scrape promotional offers and discounts on Grab? in the forum General Web Scraping a year ago
How can I scrape promotional offers and discounts on Grab?
Grab’s app and website often feature promotional banners with special codes. Using a headless browser like Puppeteer, I capture these banners and associated offers.
-
Joline Abdastartus changed their photo a year ago
-
Joline Abdastartus became a registered member a year ago
-
Florianne Andrius replied to the discussion How can I detect and manage duplicate data in my scraped results? in the forum General Web Scraping a year ago
How can I detect and manage duplicate data in my scraped results?
Pandas’ drop_duplicates function is a quick and effective way to filter out duplicates in bulk after data collection.
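For reference, a short example of `drop_duplicates` on made-up scraped rows. The column names are assumptions; the `subset` and `keep` arguments control which columns define a duplicate and which copy survives.

```python
import pandas as pd

# Made-up scrape results: one exact repeat, one re-scrape with a new price.
df = pd.DataFrame(
    [
        {"url": "https://example.com/a", "title": "Item A", "price": 10.0},
        {"url": "https://example.com/a", "title": "Item A", "price": 10.0},
        {"url": "https://example.com/b", "title": "Item B", "price": 12.5},
        {"url": "https://example.com/b", "title": "Item B", "price": 11.0},
    ]
)

# Drop rows that are identical in every column.
exact = df.drop_duplicates()

# Treat the URL as the identity and keep the most recently scraped row.
by_url = df.drop_duplicates(subset=["url"], keep="last").reset_index(drop=True)
```

Here `exact` keeps three rows, while `by_url` keeps one row per URL and prefers the later scrape.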
-
Florianne Andrius replied to the discussion How do I handle scraping pages with endless AJAX requests? in the forum General Web Scraping a year ago
How do I handle scraping pages with endless AJAX requests?
I use Playwright or Puppeteer when AJAX loads new data as I scroll. The script simulates the user's scrolling until no new items appear, so all the data is loaded before I extract it.