To scrape product availability and prices from Central Thailand’s e-commerce site, Scrapy is a good fit, particularly when the pages are structured HTML. With Scrapy’s XPath selectors you can navigate each page and extract product titles, prices, and availability status. One potential challenge is that the site may load some content with JavaScript. Scrapy does not execute JavaScript on its own, so that content will be missing from the response unless you request the underlying data endpoint directly or add a rendering tool such as scrapy-playwright or Splash. You may also need to set Scrapy’s DOWNLOAD_DELAY so that requests are spaced out and less likely to trigger rate limiting by the server.
import scrapy


class CentralScraperSpider(scrapy.Spider):
    name = 'central_scraper'
    start_urls = ['https://www.central.co.th/en/product-category']

    def parse(self, response):
        # The class names below are illustrative; check them against the live markup.
        for product in response.xpath('//div[@class="product-item"]'):
            # .get() returns None when nothing matches, so supply a default
            # to keep .strip() from raising AttributeError.
            title = product.xpath('.//div[@class="product-name"]/text()').get(default='')
            price = product.xpath('.//span[@class="price"]/text()').get(default='')
            availability = product.xpath('.//span[@class="in-stock"]/text()').get(default='')
            yield {
                'product': title.strip(),
                'price': price.strip(),
                'availability': availability.strip(),
            }

        # Handle pagination: follow the "next" link until there is none
        next_page = response.xpath('//a[contains(@class, "pagination-next")]/@href').get()
        if next_page:
            yield response.follow(next_page, self.parse)
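
The download-delay advice above can be sketched as a settings dictionary. This is a minimal sketch, not a tuned configuration: the name POLITE_SETTINGS and every numeric value are assumptions chosen for illustration, while the keys are standard Scrapy setting names.

```python
# Hypothetical throttling settings; the values are illustrative assumptions
# and should be adjusted to what the target site tolerates.
POLITE_SETTINGS = {
    # Base wait, in seconds, between requests to the same site.
    "DOWNLOAD_DELAY": 2.0,
    # Vary each wait between 0.5x and 1.5x of DOWNLOAD_DELAY.
    "RANDOMIZE_DOWNLOAD_DELAY": True,
    # Limit parallel requests to a single domain.
    "CONCURRENT_REQUESTS_PER_DOMAIN": 1,
    # Let the AutoThrottle extension adapt the delay to server latency.
    "AUTOTHROTTLE_ENABLED": True,
    "AUTOTHROTTLE_START_DELAY": 2.0,
    "AUTOTHROTTLE_MAX_DELAY": 30.0,
    # Respect the site's robots.txt rules.
    "ROBOTSTXT_OBEY": True,
}
```

Assigning this dictionary to the spider's custom_settings class attribute (custom_settings = POLITE_SETTINGS) applies it to that spider only. The same keys can instead go in the project's settings.py to apply to every spider.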