Compare using Python and Go to scrape hotel prices from Traveloka Indonesia
How does scraping hotel prices from Traveloka, a popular travel booking website in Indonesia, differ when using Python versus Go? Does Python’s BeautifulSoup library provide enough flexibility to handle the site’s structure, or does Go’s Colly library offer better performance for large-scale scraping tasks? How do both languages handle dynamic content, such as discounts or region-specific pricing, which are common on travel websites?
Below are two potential implementations (one in Python and one in Go) to scrape hotel prices from a Traveloka page. Which approach better handles the complexities of dynamic content and ensures accurate data extraction?

Python Implementation:

import requests
from bs4 import BeautifulSoup

# URL of the Traveloka hotel page
url = "https://www.traveloka.com/en-id/hotel/product-page"

# Headers to mimic a browser request
headers = {
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/91.0.4472.124 Safari/537.36"
}

# Fetch the page content
response = requests.get(url, headers=headers)

if response.status_code == 200:
    soup = BeautifulSoup(response.content, "html.parser")

    # Extract hotel prices
    prices = soup.find_all("div", class_="hotel-price")
    for idx, price in enumerate(prices, 1):
        print(f"Hotel {idx} Price:", price.text.strip())
else:
    print(f"Failed to fetch the page. Status code: {response.status_code}")

Go Implementation:

package main

import (
	"fmt"
	"log"

	"github.com/gocolly/colly"
)

func main() {
	// Create a new Colly collector
	c := colly.NewCollector()

	// Scrape hotel prices
	c.OnHTML(".hotel-price", func(e *colly.HTMLElement) {
		price := e.Text
		fmt.Println("Hotel Price:", price)
	})

	// Handle errors
	c.OnError(func(_ *colly.Response, err error) {
		log.Println("Error occurred:", err)
	})

	// Visit the Traveloka hotel page
	err := c.Visit("https://www.traveloka.com/en-id/hotel/product-page")
	if err != nil {
		log.Fatalf("Failed to visit website: %v", err)
	}
}
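
On the "accurate data extraction" part of the question: both snippets print the raw element text, so either one would still need a step that turns that text into a number before prices can be compared. Here is a rough sketch of that step in Python. It assumes the text looks something like "Rp 1.250.000", with "." or "," as the thousands separator. That format is my assumption, not something confirmed from the site, and `parse_rupiah` is a hypothetical helper name.

```python
import re
from typing import Optional


def parse_rupiah(text: str) -> Optional[int]:
    """Return the first price found in `text` as whole rupiah, or None.

    Assumed input format (not verified against the real page): an optional
    "Rp" prefix followed by digits grouped with "." or ",".
    """
    match = re.search(r"(?:Rp\.?\s*)?(\d{1,3}(?:[.,]\d{3})+|\d+)", text)
    if match is None:
        return None
    # Drop the thousands separators before converting to an integer.
    return int(re.sub(r"[.,]", "", match.group(1)))


if __name__ == "__main__":
    for raw in ["Rp 1.250.000", "  Rp 850.000 / night ", "Sold out"]:
        print(repr(raw), "->", parse_rupiah(raw))
```

It takes the first number it sees, so text such as "3 nights Rp 500.000" would be misread; a real scraper would probably need a stricter pattern. The same logic ports to Go with the `regexp` and `strconv` packages.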