

Robert Yehoyaqim
-
Robert Yehoyaqim replied to the discussion How do I extract text from images or infographics? in the forum General Web Scraping 10 months ago
How do I extract text from images or infographics?
I use layout analysis tools to detect text regions, which allows me to extract text while ignoring non-text elements.
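As a minimal sketch of the filtering step, assuming a layout analysis tool has already returned labelled regions: the `Region` class, its `kind` labels, and the coordinates are hypothetical stand-ins for whatever the chosen tool actually outputs.

```python
from dataclasses import dataclass


@dataclass
class Region:
    """One detected block on the image (hypothetical structure)."""
    kind: str       # assumed labels, e.g. "text", "figure", "table"
    x: int          # left edge in pixels
    y: int          # top edge in pixels
    text: str = ""  # recognized text, empty for non-text regions


def text_in_reading_order(regions):
    """Keep only text regions and join them top-to-bottom, left-to-right."""
    text_regions = [r for r in regions if r.kind == "text"]
    text_regions.sort(key=lambda r: (r.y, r.x))
    return "\n".join(r.text for r in text_regions)
```

Sorting by position is a simplification; multi-column infographics usually need a smarter reading order.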
-
Robert Yehoyaqim replied to the discussion What’s the most efficient way to handle scraped data in multiple languages? in the forum General Web Scraping 10 months ago
What’s the most efficient way to handle scraped data in multiple languages?
Combining a translation step with an NLP library such as spaCy lets me analyze multilingual data without extensive preprocessing.
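One way to set this up is to route each record to a language-specific pipeline. The sketch below only does the routing, and it assumes every scraped record already carries a `lang` code (a hypothetical field; language detection is not shown). The model names are spaCy's published small pipelines, with the multilingual `xx_ent_wiki_sm` as a fallback.

```python
from collections import defaultdict

# spaCy pipeline names per language code.
MODEL_BY_LANG = {
    "en": "en_core_web_sm",
    "de": "de_core_news_sm",
    "fr": "fr_core_news_sm",
}
FALLBACK_MODEL = "xx_ent_wiki_sm"  # multilingual pipeline for everything else


def group_by_model(records):
    """Group record texts under the pipeline name that should process them."""
    grouped = defaultdict(list)
    for record in records:
        # Records without a known "lang" value go to the fallback pipeline.
        model = MODEL_BY_LANG.get(record.get("lang"), FALLBACK_MODEL)
        grouped[model].append(record["text"])
    return dict(grouped)
```

Each group can then be processed in one batch with `spacy.load(model_name)` and `nlp.pipe(texts)`, provided the models are installed.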
-
Robert Yehoyaqim replied to the discussion How can I scrape JavaScript-based content without headless browsers? in the forum General Web Scraping 10 months ago
How can I scrape JavaScript-based content without headless browsers?
Another workaround is to use Requests to call the JSON endpoints behind the page's AJAX requests directly, provided the data is accessible without rendering.
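A minimal sketch of that approach, assuming the endpoint has been found in the browser's network tab. The URL, the query parameters, and the `products`/`title` payload shape are hypothetical; `session` is expected to be a `requests.Session`.

```python
def fetch_json(session, url, params=None):
    """Call a JSON endpoint directly and return the decoded payload."""
    response = session.get(
        url,
        params=params,
        headers={"Accept": "application/json"},
        timeout=10,
    )
    response.raise_for_status()  # fail loudly on 4xx/5xx responses
    return response.json()


def product_titles(payload):
    """Pull titles out of an assumed {"products": [{"title": ...}]} payload."""
    return [item["title"] for item in payload.get("products", [])]


# Usage (hypothetical URL):
#   import requests
#   payload = fetch_json(requests.Session(), "https://example.com/api/items", {"page": 1})
#   print(product_titles(payload))
```

Passing the session in keeps cookies and headers consistent across calls and makes the function easy to test without network access.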
-
Robert Yehoyaqim started the discussion How to handle site formatting differences when scraping multiple Shopify stores? in the forum General Web Scraping 10 months ago
How to handle site formatting differences when scraping multiple Shopify stores?
Shopify stores use different themes, so I adjust the CSS selectors or XPath expressions for each store's layout to capture data accurately.
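One way to organize this is a per-store selector table, so the extraction code stays the same and only the configuration changes. In the sketch below the store names and XPath expressions are made up, and the standard library's `xml.etree.ElementTree` stands in for a real HTML parser. It only handles well-formed markup, so live pages would likely need something like lxml or BeautifulSoup.

```python
import xml.etree.ElementTree as ET

# Hypothetical per-store XPath expressions; each real store needs its own.
STORE_SELECTORS = {
    "store-a.example": {
        "title": ".//h1[@class='product-title']",
        "price": ".//span[@class='price']",
    },
    "store-b.example": {
        "title": ".//h2[@class='product__name']",
        "price": ".//div[@class='product__price']",
    },
}


def extract_product(store, markup):
    """Apply the store's selectors to well-formed markup; missing fields become None."""
    root = ET.fromstring(markup)
    result = {}
    for field, xpath in STORE_SELECTORS[store].items():
        text = root.findtext(xpath)  # text of the first match, or None
        result[field] = text.strip() if text is not None else None
    return result
```

Returning `None` for a missing field, rather than raising, makes it easier to spot when a store has changed its theme.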
-
Robert Yehoyaqim became a registered member 10 months ago