
  • How do I identify hidden APIs that might be easier to scrape?

    Posted by Thibaut Ron on 11/13/2024 at 1:57 pm

    I start by opening the network tab in dev tools and navigating through the site. Often, you’ll find JSON or AJAX requests that fetch the data directly.
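
    One way to make the network-tab pass repeatable is to export the session as a HAR file from dev tools and filter it offline. The sketch below assumes the standard HAR layout (`log.entries[].request` and `response.content.mimeType`); the sample data and URLs are made up for illustration.

```python
def json_requests(har: dict) -> list[tuple[str, str]]:
    """Return (method, url) pairs for HAR entries whose response is JSON."""
    found = []
    for entry in har.get("log", {}).get("entries", []):
        mime = entry.get("response", {}).get("content", {}).get("mimeType", "")
        if "json" in mime.lower():
            request = entry["request"]
            found.append((request["method"], request["url"]))
    return found


# Tiny hand-written HAR fragment (hypothetical URLs) standing in for a real export.
sample_har = {
    "log": {
        "entries": [
            {
                "request": {"method": "GET", "url": "https://example.com/"},
                "response": {"content": {"mimeType": "text/html"}},
            },
            {
                "request": {"method": "GET", "url": "https://example.com/api/products?page=1"},
                "response": {"content": {"mimeType": "application/json; charset=utf-8"}},
            },
        ]
    }
}

for method, url in json_requests(sample_har):
    print(method, url)
```

    With a real export, load the file with `json.load(open("session.har", encoding="utf-8"))` and pass the result in.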

  • 5 Replies
  • Khloe Walther

    Member
    11/15/2024 at 7:49 am

    Many sites use GraphQL APIs, so I look for POST requests with query bodies in the network tab.
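
    A rough sketch of that check, run against a HAR export of the network tab: it keeps POST requests whose JSON body carries a `query` string. The HAR field names follow the standard format; the sample entry and its URL are hypothetical.

```python
import json


def graphql_requests(har: dict) -> list[dict]:
    """Find POST requests in a HAR export whose JSON body has a 'query' field."""
    hits = []
    for entry in har.get("log", {}).get("entries", []):
        request = entry.get("request", {})
        if request.get("method") != "POST":
            continue
        text = request.get("postData", {}).get("text", "")
        try:
            body = json.loads(text)
        except (ValueError, TypeError):
            continue  # not a JSON body, so not a typical GraphQL call
        # Some clients batch several operations into one JSON array.
        operations = body if isinstance(body, list) else [body]
        for op in operations:
            if isinstance(op, dict) and isinstance(op.get("query"), str):
                hits.append({
                    "url": request.get("url"),
                    "operationName": op.get("operationName"),
                    "query": op["query"],
                })
    return hits


# Hypothetical entry, shaped like what dev tools would export.
sample_har = {
    "log": {
        "entries": [
            {
                "request": {
                    "method": "POST",
                    "url": "https://example.com/graphql",
                    "postData": {
                        "mimeType": "application/json",
                        "text": json.dumps({
                            "operationName": "Products",
                            "query": "query Products { products { name } }",
                        }),
                    },
                }
            }
        ]
    }
}

for hit in graphql_requests(sample_har):
    print(hit["url"], hit["operationName"])
```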

  • Aridai Farzona

    Member
    11/15/2024 at 8:05 am

    GraphQL can be easier to scrape because you can specify exactly what data you want.
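
    To illustrate the "exactly what data you want" part, here is a minimal sketch that builds a GraphQL POST asking for only two fields. The endpoint, the `products` field, and the `name`/`price` selection are all assumptions; a real schema has to be read off the queries seen in the network tab.

```python
import json
import urllib.request


def build_graphql_request(endpoint: str, query: str, variables: dict) -> urllib.request.Request:
    """Build (but do not send) a JSON-encoded GraphQL POST request."""
    payload = json.dumps({"query": query, "variables": variables}).encode("utf-8")
    return urllib.request.Request(
        endpoint,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical query: only the fields listed here come back in the response.
QUERY = """
query Products($first: Int!) {
  products(first: $first) {
    name
    price
  }
}
"""

req = build_graphql_request("https://example.com/graphql", QUERY, {"first": 20})
print(req.get_method(), req.full_url)

# Sending it would look like this (left commented out, since the endpoint is made up):
# with urllib.request.urlopen(req, timeout=10) as resp:
#     data = json.load(resp)
```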

  • Daniel Teuku

    Member
    11/15/2024 at 8:19 am

    If the site is built with React or Vue, there’s a good chance data is loaded via API calls. I look for these in the page’s JavaScript code or in the dev tools.
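
    For the JavaScript side, a crude regex pass over a downloaded bundle can surface literal URLs passed to `fetch` or `axios`. This is only a sketch: it assumes the URL appears as a plain string literal, so it will miss URLs assembled from variables, which is common in minified code. The sample snippet is invented.

```python
import re

# Matches fetch("...") and axios("...") / axios.get("...") etc. with a literal first argument.
CALL_RE = re.compile(
    r"""(?:fetch|axios(?:\.(?:get|post|put|patch|delete))?)\s*\(\s*(["'`])(?P<url>[^"'`]+)\1"""
)


def find_api_calls(js_source: str) -> list[str]:
    """Return the distinct URL literals passed to fetch/axios, in order of appearance."""
    urls = [match.group("url") for match in CALL_RE.finditer(js_source)]
    return list(dict.fromkeys(urls))  # de-duplicate while keeping order


# Invented snippet standing in for a real bundle.
sample_js = """
fetch("/api/v2/products?page=1").then(r => r.json());
axios.get('/api/v2/reviews', {params: {id: 7}});
const label = "not a call";
"""

print(find_api_calls(sample_js))
```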

  • Yannig Avicenna

    Member
    11/15/2024 at 8:28 am

    I also inspect any XHR or Fetch requests in the dev tools. They often lead to JSON endpoints, which are much simpler to parse than HTML.
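
    As a sketch of why a JSON endpoint is simpler to work with: once the request is replayed, the data is one `json.load` away, with no selectors involved. The `items`/`name` shape and the User-Agent string below are assumptions for illustration, and some sites will also expect cookies or extra headers copied from the original request.

```python
import json
import urllib.request


def fetch_json(url: str, timeout: float = 10.0):
    """Replay a GET request found in dev tools and decode the JSON body."""
    req = urllib.request.Request(
        url,
        headers={
            "Accept": "application/json",
            "User-Agent": "Mozilla/5.0 (compatible; example-scraper)",
        },
    )
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.load(resp)


def product_names(payload: dict) -> list[str]:
    """Pull one field out of the decoded payload (hypothetical response shape)."""
    return [item["name"] for item in payload.get("items", [])]


# Offline stand-in for what fetch_json() might return.
sample = json.loads(
    '{"items": [{"name": "Widget", "price": 9.5}, {"name": "Gadget", "price": 12}]}'
)
print(product_names(sample))
```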

  • Straton Owain

    Member
    11/15/2024 at 9:34 am

    Sometimes, the API endpoint is hinted at in the page’s HTML source. A quick search for URLs or ‘endpoint’ keywords can reveal hidden paths.
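
    A quick version of that search, as a sketch: keep only the lines of the page source that mention a keyword, then pull out any quoted URL or path on those lines. The keyword list is a guess, and on minified single-line HTML this will over-match, so treat the output as leads to check in dev tools. The sample markup is invented.

```python
import re

KEYWORDS = ("api", "endpoint", "graphql", ".json")

# A quoted absolute URL or root-relative path.
QUOTED_URL_RE = re.compile(r"""["']((?:https?:)?/[^"'\s<>]+)["']""")


def endpoint_hints(html: str) -> list[str]:
    """Return quoted URLs/paths found on lines that mention an API-ish keyword."""
    hints = []
    for line in html.splitlines():
        if not any(keyword in line.lower() for keyword in KEYWORDS):
            continue
        hints.extend(match.group(1) for match in QUOTED_URL_RE.finditer(line))
    return list(dict.fromkeys(hints))  # de-duplicate while keeping order


# Invented page source.
sample_html = """
<link rel="stylesheet" href="/static/site.css">
<div id="app" data-endpoint="/internal/search"></div>
<script>window.CONFIG = {"apiBase": "https://example.com/api/v1"};</script>
"""

print(endpoint_hints(sample_html))
```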
