
  • How do I identify hidden APIs that might be easier to scrape?

    Posted by Thibaut Ron on 11/13/2024 at 1:57 pm

    I start by opening the network tab in dev tools and navigating through the site. Often, you’ll find AJAX requests that fetch the data directly as JSON.

  • 5 Replies
  • Khloe Walther

    Member
    11/15/2024 at 7:49 am

    Many sites use GraphQL APIs, so I look in the network tab for POST requests whose JSON body contains a query field, often sent to a path such as /graphql.

  • Aridai Farzona

    Member
    11/15/2024 at 8:05 am

    GraphQL can be easier to scrape because you can specify exactly what data you want.
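
To make the GraphQL point in the two replies above concrete, here is a minimal sketch of building such a POST request with Python's standard library. The endpoint URL, the products field, and its name and price subfields are hypothetical; a real site's schema and path have to be read from the requests seen in the network tab. The request is only built here, not sent.

```python
import json
import urllib.request


def build_graphql_request(url, query, variables=None):
    """Build (but do not send) a POST request carrying a GraphQL query."""
    payload = {"query": query, "variables": variables or {}}
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical schema: only the two fields we want are requested.
QUERY = """
query Products($first: Int!) {
  products(first: $first) {
    name
    price
  }
}
"""

# Hypothetical endpoint, for illustration only.
request = build_graphql_request(
    "https://example.com/graphql", QUERY, {"first": 10}
)

# To actually send it:
#   with urllib.request.urlopen(request) as response:
#       data = json.load(response)
```

Copying the query text straight from a request captured in the browser, then trimming the field list, is usually a safer starting point than writing a query from scratch.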

  • Daniel Teuku

    Member
    11/15/2024 at 8:19 am

    If the site is built with React or Vue, there’s a good chance data is loaded via API calls. I look for these in the page’s JavaScript code or in the dev tools.

  • Yannig Avicenna

    Member
    11/15/2024 at 8:28 am

    I also inspect any XHR or Fetch requests in the dev tools. They often lead to JSON endpoints, which are much simpler to parse than HTML.
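
One way to go through those requests in bulk, rather than clicking each one, is to export the network log as a HAR file (Chrome and Firefox both offer this in the network tab) and filter it for JSON responses. The sketch below assumes the standard HAR layout (log, entries, request, response, content, mimeType); the sample data and URLs are made up, and load_har is a hypothetical helper name.

```python
import json


def load_har(path):
    """Load a HAR file exported from the browser's network tab."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)


def find_json_endpoints(har):
    """Return unique (method, url) pairs for entries whose response is JSON."""
    endpoints = []
    for entry in har.get("log", {}).get("entries", []):
        content = entry.get("response", {}).get("content", {})
        mime = content.get("mimeType", "")
        if "json" in mime.lower():
            request = entry.get("request", {})
            pair = (request.get("method", ""), request.get("url", ""))
            if pair not in endpoints:
                endpoints.append(pair)
    return endpoints


# Tiny hand-written HAR fragment with hypothetical URLs.
sample_har = {
    "log": {
        "entries": [
            {
                "request": {"method": "GET", "url": "https://example.com/"},
                "response": {"content": {"mimeType": "text/html"}},
            },
            {
                "request": {
                    "method": "GET",
                    "url": "https://example.com/api/items?page=1",
                },
                "response": {
                    "content": {"mimeType": "application/json; charset=utf-8"}
                },
            },
        ]
    }
}

for method, url in find_json_endpoints(sample_har):
    print(method, url)
```

Filtering on the response MIME type is a heuristic: some sites serve JSON as text/plain, so it can be worth loosening the check if nothing turns up.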

  • Straton Owain

    Member
    11/15/2024 at 9:34 am

    Sometimes, the API endpoint is hinted at in the page’s HTML source. A quick search for URLs or ‘endpoint’ keywords can reveal hidden paths.
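
The search described above can be roughly automated with a regular expression over the page source. This is a heuristic sketch, not a reliable detector: the keyword list, the pattern, and the sample HTML are all assumptions chosen for illustration, and real pages may need different keywords.

```python
import re

# Quoted absolute, protocol-relative, or root-relative URLs.
QUOTED_URL = re.compile(r"""["'`]((?:https?:)?/[^"'`\s<>]+)["'`]""")

# Substrings that often show up in API paths (assumed, adjust per site).
KEYWORDS = ("/api/", "/graphql", "endpoint", ".json")


def find_endpoint_hints(html):
    """Return quoted URLs in the source that contain an API-like keyword."""
    hints = []
    for match in QUOTED_URL.finditer(html):
        candidate = match.group(1)
        lowered = candidate.lower()
        if any(keyword in lowered for keyword in KEYWORDS):
            if candidate not in hints:
                hints.append(candidate)
    return hints


# Made-up page source with hypothetical paths.
sample_html = """
<html><body>
<a href="/about">About</a>
<script>
  const config = { searchPath: "/api/v2/search", cdn: "https://cdn.example.com/app.js" };
  fetch('https://example.com/graphql');
</script>
</body></html>
"""

for hint in find_endpoint_hints(sample_html):
    print(hint)
```

The same function can be run over downloaded JavaScript bundles, since endpoint paths are often defined there rather than in the HTML itself.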
