Once the library is installed, you can start parsing HTML and extracting data from it. The first step is to find and select the element you want within the larger DOM.
With Simple HTML DOM, you can select elements in several ways, most commonly by tag name, class, or ID.
// Select every <p> element by tag name
foreach ($html->find('p') as $paragraph) {
    echo $paragraph->plaintext . "<br>";
}
This retrieves all <p> elements and outputs their text.
// Select every element with the class "example-class"
$elements = $html->find('.example-class');
foreach ($elements as $element) {
    echo $element->plaintext . "<br>";
}
Use a period (.) before the class name to target classes.
$element = $html->find('#unique-id', 0);
echo $element ? $element->plaintext : "Element not found.";
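Use a hash (#) before the ID to target IDs. Passing 0 as the second argument to find() returns the first matching element (or null if nothing matches) instead of an array.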
Now that we've found the element, we can extract the data we want to scrape. An element's text is available through its plaintext property, and HTML attributes such as href, src, and alt can be read as properties of the element.
// Page title
$title = $html->find('title', 0);
echo $title ? $title->plaintext : "No title found.";

// Link text and URLs
foreach ($html->find('a') as $link) {
    echo 'Link text: ' . $link->plaintext . "<br>";
    echo 'URL: ' . $link->href . "<br><br>";
}

// Image sources and alt text
foreach ($html->find('img') as $image) {
    echo 'Image source: ' . $image->src . "<br>";
    echo 'Alt text: ' . $image->alt . "<br><br>";
}
Knowing how to explore and extract HTML DOM element properties is great, but let’s put it into a web scraping context. When scraping and parsing the DOM, you’ll often be looking for something specific.
How this works will always vary depending on the website you are scraping, but let’s use prices as an example.
<div class="product">
    <span class="price">$19.99</span>
</div>
$price = $html->find('.price', 0);
echo $price ? 'Price: ' . $price->plaintext : "Price not found.";
This locates the first element with the class "price" and outputs its text content.
foreach ($html->find('.price') as $price) {
    echo 'Price: ' . trim($price->plaintext) . "<br>";
}
This retrieves and prints all price elements from a page, useful for product listings.
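Scraped prices arrive as text such as "$19.99", so they usually need converting before you can sort or compare them. Here is a minimal sketch of that step. The parsePrice() helper is hypothetical and not part of Simple HTML DOM, and it assumes a comma is a thousands separator and a period is the decimal point, which will not hold for every locale.

```php
<?php
// Hypothetical helper (not part of Simple HTML DOM): convert scraped price
// text such as "$19.99" into a float, or null if no number can be read.
function parsePrice(string $text): ?float
{
    // Keep only digits, periods, and commas.
    $cleaned = preg_replace('/[^0-9.,]/', '', trim($text));

    // Assumption: "," is a thousands separator and "." is the decimal point.
    $cleaned = str_replace(',', '', $cleaned);

    // Reject empty or malformed results such as "" or "1.2.3".
    return is_numeric($cleaned) ? (float) $cleaned : null;
}
```

Inside the loop above you could call parsePrice($price->plaintext) and skip any element for which it returns null.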
Sometimes, prices may be inside nested elements:
<div class="product-price">
    <span class="amount">$24.99</span>
</div>
$price = $html->find('.product-price .amount', 0);
echo $price ? 'Price: ' . $price->plaintext : "Price not found.";
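A nested (descendant) selector like this matches only .amount elements inside .product-price, so it avoids picking up unrelated elements elsewhere on the page that happen to share the same class.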
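On a listing page you often want each price paired with the product it belongs to. One way to do this, sketched below, is to call find() on each product element so the search stays inside that element. The sample markup, the "name" class, and the include path are assumptions made for illustration; adjust them to your page and installation.

```php
<?php
// Assumption: simple_html_dom.php sits next to this script; adjust the path
// (or load the library through Composer) to match your installation.
require_once __DIR__ . '/simple_html_dom.php';

// Hypothetical sample markup; the "name" class is made up for illustration.
$markup = <<<'HTML'
<div class="product">
  <h2 class="name">Mug</h2>
  <div class="product-price"><span class="amount">$24.99</span></div>
</div>
<div class="product">
  <h2 class="name">Poster</h2>
  <div class="product-price"><span class="amount">$9.50</span></div>
</div>
HTML;

$html = str_get_html($markup);

$rows = [];
foreach ($html->find('.product') as $product) {
    // Calling find() on an element searches only inside that element.
    $name   = $product->find('.name', 0);
    $amount = $product->find('.product-price .amount', 0);

    $rows[] = [
        'name'  => $name ? trim($name->plaintext) : null,
        'price' => $amount ? trim($amount->plaintext) : null,
    ];
}

foreach ($rows as $row) {
    echo ($row['name'] ?? 'Unknown') . ': ' . ($row['price'] ?? 'Price not found.') . "<br>";
}
```

Collecting the results in an array before printing keeps the extraction separate from the output, which makes it easier to save the data to a file or database later.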