
  • How can I scrape websites with infinite scroll without losing data?

    Posted by Puleng Evy on 11/14/2024 at 5:45 am

    I usually rely on Puppeteer for this. You can set it to scroll a fixed number of pixels at a time, or keep scrolling until a specific element becomes visible. Combine that with a wait after each scroll so the new data has fully loaded before you collect it.
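    Puppeteer itself is a Node.js library, but since the later replies lean toward Python, here is a minimal Python sketch of the same idea (scroll a fixed distance, wait, repeat until the page stops growing) using Playwright as a stand-in. The URL, the `.item` selector, the scroll distance, and the wait time are placeholders, not anything from the original post.

```python
from playwright.sync_api import sync_playwright

def scrape_infinite_scroll(url: str) -> list[str]:
    """Scroll a fixed number of pixels at a time until no new content appears."""
    with sync_playwright() as p:
        browser = p.chromium.launch(headless=True)
        page = browser.new_page()
        page.goto(url, wait_until="networkidle")

        items: set[str] = set()
        last_height = 0
        while True:
            # Collect whatever is currently rendered (".item" is a placeholder selector).
            for el in page.query_selector_all(".item"):
                items.add(el.inner_text())

            # Scroll down by a fixed number of pixels, then wait for lazy-loaded data.
            page.mouse.wheel(0, 1500)
            page.wait_for_timeout(2000)

            height = page.evaluate("document.body.scrollHeight")
            if height == last_height:
                break  # page height stopped growing: assume the feed is exhausted
            last_height = height

        browser.close()
        return sorted(items)

if __name__ == "__main__":
    print(scrape_infinite_scroll("https://example.com/feed"))  # placeholder URL
```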

  • 3 Replies
  • Yannig Avicenna

    Member
    11/15/2024 at 8:29 am

    If you’re using Python, Selenium makes scrolling straightforward: run a loop that scrolls down and checks whether new data has loaded. I wrap the loop in a try/except block in case loading fails.
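    For reference, a minimal sketch of that loop with Selenium; the URL and the `.item` selector are placeholders, and the try/except simply stops the loop instead of retrying.

```python
import time

from selenium import webdriver
from selenium.webdriver.common.by import By

driver = webdriver.Chrome()
driver.get("https://example.com/feed")  # placeholder URL

seen: set[str] = set()
last_height = driver.execute_script("return document.body.scrollHeight")

while True:
    try:
        # Grab whatever is currently in the DOM (".item" is a placeholder selector).
        for el in driver.find_elements(By.CSS_SELECTOR, ".item"):
            seen.add(el.text)

        # Scroll to the bottom and give the next batch time to load.
        driver.execute_script("window.scrollTo(0, document.body.scrollHeight);")
        time.sleep(2)

        new_height = driver.execute_script("return document.body.scrollHeight")
        if new_height == last_height:
            break  # nothing new loaded, so the feed is probably exhausted
        last_height = new_height
    except Exception:
        # If a scroll or lookup fails mid-run, keep what was already collected.
        break

driver.quit()
print(f"Collected {len(seen)} items")
```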

  • Raul Marduk

    Member
    11/15/2024 at 9:49 am

    Check whether the site loads new items through an AJAX request. If it does, you can find the request URL in the Network tab of your browser’s dev tools and call that endpoint directly. That way you get the data as JSON without scrolling at all.
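    As a sketch of that approach: once you have copied the request URL and its query parameters from the Network tab, you can page through the endpoint directly with `requests`. The URL, parameter names, and response shape below are hypothetical; use whatever the site actually sends.

```python
import requests

# Hypothetical endpoint and parameters: copy the real ones from the request
# your browser makes when the page loads more items.
API_URL = "https://example.com/api/items"

session = requests.Session()
session.headers.update({"User-Agent": "Mozilla/5.0"})  # mimic a normal browser

items = []
page = 1
while True:
    resp = session.get(API_URL, params={"page": page, "per_page": 50}, timeout=10)
    resp.raise_for_status()
    batch = resp.json()
    if not batch:
        break  # an empty page means we've gone past the last item
    items.extend(batch)
    page += 1

print(f"Fetched {len(items)} items as JSON, no scrolling needed")
```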

  • Ravi Ernestas

    Member
    11/16/2024 at 4:55 am

    I’ve also written scripts that detect the ‘load more’ button, which some sites use instead of infinite scrolling. Simulating clicks on this button in a loop allows you to retrieve all content without scrolling.
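    A minimal Selenium sketch of that click-the-button loop; the button selector, item selector, and URL are placeholders for whatever the target site actually uses.

```python
import time

from selenium import webdriver
from selenium.webdriver.common.by import By
from selenium.common.exceptions import (
    ElementClickInterceptedException,
    NoSuchElementException,
)

driver = webdriver.Chrome()
driver.get("https://example.com/articles")  # placeholder URL

while True:
    try:
        # "button.load-more" is a placeholder; inspect the real button's selector.
        button = driver.find_element(By.CSS_SELECTOR, "button.load-more")
        driver.execute_script("arguments[0].scrollIntoView();", button)
        button.click()
        time.sleep(2)  # give the next batch time to render
    except (NoSuchElementException, ElementClickInterceptedException):
        break  # button gone or unclickable: all content should be loaded

items = [el.text for el in driver.find_elements(By.CSS_SELECTOR, ".item")]
driver.quit()
print(f"Collected {len(items)} items")
```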
