DOMDocument - markhowellsmead/helpers GitHub Wiki
Append HTML string as node to existing document
Since 9th January 2025, the appendHTML method no longer converts the incoming HTML to UTF-8. It is assumed that the string already uses this encoding.
<?php
namespace SayHello\Theme\Package;
use DOMDocument as GlobalDOMDocument;
use DOMNode;
/**
 * DomDocument stuff
 *
 * @author Say Hello GmbH
 */
class DomDocument
{
    /**
     * Helper function which appends an HTML string
     * to the parent as one or more child nodes.
     *
     * @param DOMNode $parent
     * @param string $html
     * @return void
     */
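    public function appendHTML(DOMNode $parent, string $html): void
    {
        $document = new GlobalDOMDocument();
        $document->loadHTML($html);

        // loadHTML() wraps the fragment in html and body elements
        $body = $document->getElementsByTagName('body')->item(0);
        if ($body === null) {
            return;
        }

        foreach ($body->childNodes as $node) {
            // Import a copy into the parent's document before appending it
            $node = $parent->ownerDocument->importNode($node, true);
            $parent->appendChild($node);
        }
    }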
}
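As a rough usage sketch: the function below is a hypothetical standalone copy of the appendHTML logic (the name append_html_fragment, the target markup and the sample paragraphs are all assumptions made for illustration), so that the example runs without the theme's class.

```php
<?php

// Hypothetical standalone copy of the appendHTML logic, so that this sketch runs on its own.
function append_html_fragment(DOMNode $parent, string $html): void
{
    $fragment = new DOMDocument();
    $fragment->loadHTML($html);

    // loadHTML() wraps the fragment in html and body elements
    $body = $fragment->getElementsByTagName('body')->item(0);
    if ($body === null) {
        return;
    }

    foreach ($body->childNodes as $node) {
        // importNode() creates a copy which belongs to the parent's document
        $parent->appendChild($parent->ownerDocument->importNode($node, true));
    }
}

// Assumed example markup: an empty container which receives two paragraphs
$document = new DOMDocument();
$document->loadHTML('<div id="target"></div>');

$target = $document->getElementsByTagName('div')->item(0);
append_html_fragment($target, '<p>First</p><p>Second</p>');

$count = $target->childNodes->length;
echo $document->saveHTML($target), PHP_EOL;
```

The sample strings are plain ASCII on purpose, because the method no longer converts the encoding itself.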
Revising the content of a block's HTML
This version is from 9th January 2025. I added LIBXML_HTML_NOIMPLIED | LIBXML_HTML_NODEFDTD so that the DOMDocument's HTML doesn't contain artificially added html and body tags or a default doctype, which makes parsing and returning the HTML a bit cleaner.
When using these flags, $document->documentElement refers to the root element of the parsed HTML rather than to an html element, so make sure that any node search which starts from it is still correct.
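The effect of the two flags on the root element can be checked with a small sketch like the following. The figure markup is an assumed example of what an image block might render, not the output of a real site.

```php
<?php

// Assumed sample markup with a single top-level element
$html = '<figure class="wp-block-image"><img src="webcam.jpg" alt=""></figure>';

// Collect parser warnings (e.g. for HTML5 tags) internally instead of emitting them
$previous = libxml_use_internal_errors(true);

// Without the flags, the parser wraps the fragment in html and body elements
$wrapped = new DOMDocument();
$wrapped->loadHTML($html);
$wrappedRoot = $wrapped->documentElement->nodeName;

// With the flags, the fragment's own root element becomes the document element
$bare = new DOMDocument();
$bare->loadHTML($html, LIBXML_HTML_NOIMPLIED | LIBXML_HTML_NODEFDTD);
$bareRoot = $bare->documentElement->nodeName;

libxml_clear_errors();
libxml_use_internal_errors($previous);

echo $wrappedRoot, ' / ', $bareRoot, PHP_EOL;
```

A search such as $document->documentElement->firstChild therefore lands on different nodes depending on whether the flags are set.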
When returning the HTML in the event that it has been changed, I remove the prefix '<?xml encoding="UTF-8">' (which I add so that the parser keeps the correct encoding) from the returned string. This isn't strictly necessary, but it keeps the source code clean.
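A minimal sketch of the prefix round trip, assuming a short sample paragraph with non-ASCII characters (the text and variable names are made up for illustration):

```php
<?php

$prefix = '<?xml encoding="UTF-8">';
$html = '<p>Grüße aus Zürich</p>'; // assumed sample text, saved as UTF-8

$document = new DOMDocument();
$document->loadHTML($prefix . $html, LIBXML_HTML_NOIMPLIED | LIBXML_HTML_NODEFDTD);

// With the prefix in place, the text is read as UTF-8
$text = $document->getElementsByTagName('p')->item(0)->textContent;

// The prefix is part of the serialised document, so strip it before returning the markup
$clean = str_replace($prefix, '', $document->saveHTML());

echo $clean;
```

Depending on the libxml2 version, the serialised string may contain the special characters either as UTF-8 or as HTML entities; both render identically in the browser.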
Encoding
Finally, I removed the former step which converted the HTML to UTF-8. My projects always use UTF-8, so the function simply returns the HTML untouched if the string uses a different encoding.
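The encoding check can be tried out in isolation with a sketch like this one. The helper name is_utf8_html is hypothetical, and the mbstring extension is assumed to be available.

```php
<?php

// Hypothetical helper: true only for non-empty strings which are valid UTF-8
function is_utf8_html(string $html): bool
{
    // The third argument enables strict detection, so invalid byte sequences yield false
    return $html !== '' && mb_detect_encoding($html, 'UTF-8', true) === 'UTF-8';
}

$utf8 = "<p>Z\u{00FC}rich</p>"; // "ü" as the two-byte UTF-8 sequence C3 BC
$latin1 = "<p>Z\xFCrich</p>";   // "ü" as the single ISO-8859-1 byte FC

var_dump(is_utf8_html($utf8), is_utf8_html($latin1));
```

mb_check_encoding($html, 'UTF-8') would be an alternative way to express the same validity check.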
<?php
namespace PT\MustUse\Blocks\CoreImage;
use DOMDocument;
use DOMXPath;
class Block
{
    public function run()
    {
        add_filter('render_block_core/image', [$this, 'render'], 10, 2);
    }
    public function render($html, $block)
    {
        // Only handle non-empty markup which is valid UTF-8
        if (empty($html) || !mb_detect_encoding($html, 'UTF-8', true)) {
            return $html;
        }

        // Only handle image blocks which use the "is-style-webcam" class
        if (strpos($block['attrs']['className'] ?? '', 'is-style-webcam') === false) {
            return $html;
        }

        // Collect parser warnings internally instead of emitting them
        $previous = libxml_use_internal_errors(true);

        $document = new DOMDocument();
        $document->loadHTML('<?xml encoding="UTF-8">' . $html, LIBXML_HTML_NOIMPLIED | LIBXML_HTML_NODEFDTD);

        $xpath = new DOMXPath($document);

        foreach ($xpath->query('//img') as $node) {
            // Append a random "force" parameter to the image URL
            $src = $node->getAttribute('src');
            $separator = parse_url($src, PHP_URL_QUERY) ? '&' : '?';
            $node->setAttribute('src', $src . $separator . 'force=' . rand(1, 1000000));
        }

        libxml_clear_errors();
        libxml_use_internal_errors($previous);

        // Remove the encoding prefix again so that the returned markup stays clean
        return str_replace('<?xml encoding="UTF-8">', '', $document->saveHTML());
    }
}
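The way the render method extends the image URL can be sketched separately. The helper name add_force_parameter and the example.com URLs are assumptions for illustration; a fixed value replaces rand() so that the result is predictable.

```php
<?php

// Hypothetical helper which mirrors the URL handling in render()
function add_force_parameter(string $src, int $value): string
{
    // Use "&" if the URL already has a query string, otherwise start one with "?"
    $separator = parse_url($src, PHP_URL_QUERY) ? '&' : '?';

    return $src . $separator . 'force=' . $value;
}

$plain = add_force_parameter('https://example.com/webcam.jpg', 42);
$withQuery = add_force_parameter('https://example.com/webcam.jpg?size=large', 42);

echo $plain, PHP_EOL, $withQuery, PHP_EOL;
```

When the value is set through setAttribute and serialised with saveHTML, the ampersand is escaped in the attribute, so the raw "&" is appropriate at this point.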