Unblock YesMovies.org Using Proxies and Automate Scraping Movies Using Java and MySQL
Setting Up Java for Web Scraping
Before scraping movie data from YesMovies.org, install the required Java dependencies:
<dependencies> <dependency> <groupId>org.jsoup</groupId> <artifactId>jsoup</artifactId> <version>1.15.3</version> </dependency> <dependency> <groupId>org.apache.httpcomponents.client5</groupId> <artifactId>httpclient5</artifactId> <version>5.2</version> </dependency> </dependencies>
Configuring a Proxy in Java
To unblock YesMovies.org, configure Java to use a proxy:
import java.io.IOException; import java.net.*; import org.jsoup.Jsoup; import org.jsoup.nodes.Document; public class YesMoviesUnblocker { public static void main(String[] args) { String proxyHost = "your.proxy.server"; int proxyPort = 8080; try { Proxy proxy = new Proxy(Proxy.Type.HTTP, new InetSocketAddress(proxyHost, proxyPort)); URL url = new URL("https://yesmovies.org"); HttpURLConnection connection = (HttpURLConnection) url.openConnection(proxy); connection.setRequestMethod("GET"); connection.setRequestProperty("User-Agent", "Mozilla/5.0"); Document doc = Jsoup.parse(connection.getInputStream(), "UTF-8", url.toString()); System.out.println(doc.title()); } catch (IOException e) { e.printStackTrace(); } } }
Scraping Movie Information from YesMovies
Once unblocked, extract movie details such as titles, ratings, and descriptions using Java and JSoup.
import org.jsoup.Jsoup; import org.jsoup.nodes.Document; import org.jsoup.nodes.Element; import org.jsoup.select.Elements; import java.io.IOException; public class YesMoviesScraper { public static void main(String[] args) { String url = "https://yesmovies.org/movies"; try { Document doc = Jsoup.connect(url).userAgent("Mozilla/5.0").get(); Elements movies = doc.select(".movie-item"); for (Element movie : movies) { String title = movie.select(".movie-title").text(); String rating = movie.select(".rating").text(); String description = movie.select(".description").text(); System.out.println("Title: " + title); System.out.println("Rating: " + rating); System.out.println("Description: " + description); System.out.println("-----------------------------"); } } catch (IOException e) { e.printStackTrace(); } } }
Storing Scraped Movie Data in MySQL
To store the extracted movie data, set up a MySQL database.
Creating the MySQL Database and Table
CREATE DATABASE MovieScraperDB; USE MovieScraperDB; CREATE TABLE movies ( id INT AUTO_INCREMENT PRIMARY KEY, title VARCHAR(255), rating VARCHAR(10), description TEXT, timestamp TIMESTAMP DEFAULT CURRENT_TIMESTAMP );
Inserting Movie Data into MySQL
import java.sql.Connection; import java.sql.DriverManager; import java.sql.PreparedStatement; import org.jsoup.nodes.Document; import org.jsoup.nodes.Element; import org.jsoup.select.Elements; public class MovieDatabaseHandler { private static final String DB_URL = "jdbc:mysql://localhost:3306/MovieScraperDB"; private static final String USER = "root"; private static final String PASSWORD = "password"; public static void saveMovies(Document doc) { String sql = "INSERT INTO movies (title, rating, description) VALUES (?, ?, ?)"; try (Connection conn = DriverManager.getConnection(DB_URL, USER, PASSWORD); PreparedStatement stmt = conn.prepareStatement(sql)) { Elements movies = doc.select(".movie-item"); for (Element movie : movies) { String title = movie.select(".movie-title").text(); String rating = movie.select(".rating").text(); String description = movie.select(".description").text(); stmt.setString(1, title); stmt.setString(2, rating); stmt.setString(3, description); stmt.executeUpdate(); } System.out.println("Movie data stored successfully."); } catch (Exception e) { e.printStackTrace(); } } }
Handling Anti-Scraping Measures
YesMovies.org may implement anti-scraping measures, so follow best practices to avoid detection:
- Use Rotating Proxies: Rotate IP addresses to prevent getting blocked.
- Set User-Agent Headers: Mimic a real browser.
- Introduce Random Delays: Avoid sending requests too quickly to prevent rate-limiting.
- Use Headless Browsing: If needed, Selenium can be used for JavaScript-heavy pages.
Conclusion
Unblocking YesMovies.org using proxies allows access to its content from restricted regions. By automating the data extraction process with Java and JSoup, users can scrape movie details efficiently. Storing the extracted data in MySQL ensures easy retrieval and analysis. Implementing best practices such as rotating proxies, request throttling, and setting user-agent headers helps avoid detection while scraping YesMovies. Whether for research, database creation, or personal use, automated scraping can make movie data collection seamless and efficient.