Scraping Central Thailand requires attention to both static and dynamic content, especially for prices and product availability. Scrapy parses server-rendered HTML well, but it does not execute JavaScript, so on pages that load prices or stock status dynamically you should first confirm that the elements you target are present in the raw response. Prices and availability can also need extra logic to account for out-of-stock items or sale prices. It’s also important to handle pagination so that every product in a category is scraped.
import scrapy


class CentralPriceScraper(scrapy.Spider):
    name = 'central_price_scraper'
    start_urls = ['https://www.central.co.th/en/shop/']

    def parse(self, response):
        # Each product card is expected to be a div with class "product-item"
        for product in response.xpath('//div[@class="product-item"]'):
            # default='' keeps .strip() from raising when an element is missing
            title = product.xpath('.//h2[@class="product-title"]/text()').get(default='')
            price = product.xpath('.//span[@class="price"]/text()').get(default='')
            availability = product.xpath('.//div[@class="availability-status"]/text()').get(default='')
            yield {
                'product': title.strip(),
                'price': price.strip(),
                'availability': availability.strip(),
            }

        # Follow pagination until there is no "next page" link
        next_page = response.xpath('//a[contains(@class, "next-page")]/@href').get()
        if next_page:
            yield response.follow(next_page, self.parse)
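
The extra logic for sale prices and out-of-stock items mentioned above could look something like the sketch below. It is a minimal illustration, not the site's actual format: the helper names `parse_price` and `is_in_stock`, the `OUT_OF_STOCK_MARKERS` phrases, and the assumption that prices appear as text like "฿1,290.00" are all hypothetical and would need to be checked against the real page.

```python
import re
from decimal import Decimal
from typing import Optional

# Hypothetical phrases; adjust to whatever the site actually displays.
OUT_OF_STOCK_MARKERS = ("out of stock", "sold out", "unavailable")

# Matches the first number in a string, allowing thousands separators.
PRICE_RE = re.compile(r"\d[\d,]*(?:\.\d+)?")


def parse_price(text: Optional[str]) -> Optional[Decimal]:
    """Return the first numeric amount found in text, or None if absent."""
    if not text:
        return None
    match = PRICE_RE.search(text)
    if match is None:
        return None
    return Decimal(match.group().replace(",", ""))


def is_in_stock(text: Optional[str]) -> Optional[bool]:
    """Return False if the text contains an out-of-stock marker,
    True for any other non-empty text, and None if the text is missing."""
    if not text or not text.strip():
        return None
    lowered = text.strip().lower()
    return not any(marker in lowered for marker in OUT_OF_STOCK_MARKERS)
```

Inside the spider, the raw strings returned by `.get()` could be passed through these helpers before yielding, for example `'price': parse_price(price)`. If a product card shows both an original and a discounted price, each would need its own selector, which depends on the page's markup.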