-
Leonzio Jonatan replied to the discussion What are the differences between wget and curl for web scraping? in the forum General Web Scraping a year ago
What are the differences between wget and curl for web scraping?
If I need to integrate HTTP requests into a program, I use libcurl through its language bindings, such as PycURL in Python or the cURL extension in PHP. This allows for more advanced and automated workflows than calling the command-line tools.
-
Leonzio Jonatan replied to the discussion How to scrape movie details from Viooz.ac using JavaScript and Puppeteer? in the forum General Web Scraping a year ago
How to scrape movie details from Viooz.ac using JavaScript and Puppeteer?
To avoid server blocks, I implement randomized delays between requests and rotate proxies when scraping repeatedly.
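A minimal sketch of that idea, written in Python rather than the JavaScript the thread asks about; the same pattern carries over to a Puppeteer script. The proxy addresses, the `plan_requests` helper, and the delay bounds are all hypothetical placeholders, and the code only plans the schedule without sending any requests.

```python
import itertools
import random

# Hypothetical proxy URLs; substitute proxies you actually control or rent.
PROXIES = [
    "http://proxy-a.example:8080",
    "http://proxy-b.example:8080",
    "http://proxy-c.example:8080",
]


def random_delay(min_s=1.0, max_s=4.0):
    """Return a delay in seconds drawn uniformly from [min_s, max_s]."""
    return random.uniform(min_s, max_s)


def plan_requests(urls, proxies, min_s=1.0, max_s=4.0):
    """Pair each URL with the next proxy in rotation and a random delay.

    Yields (url, proxy, delay) tuples; the caller sleeps for `delay`
    seconds and then sends the request through `proxy`.
    """
    rotation = itertools.cycle(proxies)  # round-robin over the proxy list
    for url in urls:
        yield url, next(rotation), random_delay(min_s, max_s)
```

A caller would loop over `plan_requests(urls, PROXIES)`, call `time.sleep(delay)`, and then issue the request through the given proxy with whatever HTTP client or browser automation tool it uses.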
-
Leonzio Jonatan started the discussion How to extract images from a website during scraping? in the forum General Web Scraping a year ago
How to extract images from a website during scraping?
Extracting images from a website involves identifying the HTML tags where the image URLs are stored. Most images are found in img elements with src attributes that point to the image file. Using Python’s BeautifulSoup, you can easily extract these URLs for static pages. For dynamic sites, tools like Puppeteer or Selenium can help load all images…
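For the static-page case, here is a rough sketch of collecting `src` values from `img` tags. To keep it dependency-free it uses Python's standard-library `html.parser` instead of BeautifulSoup; the `ImageCollector` class and `extract_image_urls` function are illustrative names, not part of any library. Relative paths are resolved against an assumed base URL.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class ImageCollector(HTMLParser):
    """Collect absolute image URLs from the src attribute of img tags."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.image_urls = []

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        src = dict(attrs).get("src")
        if src:  # skip img tags with a missing or empty src
            # Resolve relative paths such as "/a.png" against the page URL.
            self.image_urls.append(urljoin(self.base_url, src))


def extract_image_urls(html, base_url):
    """Return the image URLs found in an HTML string, in document order."""
    collector = ImageCollector(base_url)
    collector.feed(html)
    return collector.image_urls
```

This only sees images present in the HTML as delivered; images injected by JavaScript or stored in lazy-loading attributes would need a browser-driven tool, as the post notes.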
-
Roi Garrett replied to the discussion How do websites prevent web scraping, and how can you handle these barriers? in the forum General Web Scraping a year ago
How do websites prevent web scraping, and how can you handle these barriers?
I’ve found that rotating IP addresses is one of the most effective ways to handle rate limiting. Commercial proxy services make this easier, but they add cost.
-
Roi Garrett replied to the discussion How to scrape movie ratings and reviews from RottenTomatoes.com using Python? in the forum General Web Scraping a year ago
How to scrape movie ratings and reviews from RottenTomatoes.com using Python?
To enhance the scraper, you can add functionality to handle pagination. Audience reviews on RottenTomatoes are often split across multiple pages, and fetching all reviews requires following the “Next” button. Using Selenium, you can simulate clicking the button and scrape additional pages until all reviews are collected. Adding delays between…
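The pagination loop itself can be sketched independently of Selenium. In the sketch below, `fetch_page` is a hypothetical callable standing in for "load the page, scrape its reviews, and find the Next button"; it returns the reviews plus a cursor for the next page, or `None` when there is no Next button. The `max_pages` cap and the optional `pause` hook are assumptions added for safety, not something the original post specifies.

```python
def collect_all_reviews(fetch_page, max_pages=50, pause=None):
    """Follow "next" cursors until none remain, gathering all reviews.

    fetch_page(cursor) must return (reviews, next_cursor), where
    next_cursor is None on the last page. The first call receives None.
    """
    reviews = []
    cursor = None
    for _ in range(max_pages):  # hard cap so a broken site cannot loop forever
        page_reviews, cursor = fetch_page(cursor)
        reviews.extend(page_reviews)
        if cursor is None:  # no "Next" button: this was the last page
            break
        if pause is not None:
            pause()  # e.g. a function that sleeps between page loads
    return reviews


# Stand-in data so the loop can be exercised without a browser.
FAKE_PAGES = {
    None: (["Great film"], "p2"),
    "p2": (["Too long"], "p3"),
    "p3": (["Loved it"], None),
}


def fake_fetch(cursor):
    """Return canned (reviews, next_cursor) pairs from FAKE_PAGES."""
    return FAKE_PAGES[cursor]
```

With Selenium, `fetch_page` would wrap the real work of reading the review elements and clicking the button; keeping that behind one function makes the stopping condition easy to check on its own.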
-
Roi Garrett replied to the discussion Compare PHP and Node.js for scraping hotel details on Booking.com UAE in the forum General Web Scraping a year ago
Compare PHP and Node.js for scraping hotel details on Booking.com UAE
PHP is simple to set up and works well for parsing static HTML with its built-in DOMDocument. However, DOMDocument does not execute JavaScript, so dynamically loaded content requires additional tools or API integration.