Whitespace only Text Nodes - xspec/xspec GitHub Wiki
XSpec handles whitespace-only text nodes in a way similar to XSLT.
- In XSLT, when deep-equal() compares nodes, whitespace-only text nodes are not ignored.
- Likewise, when XSpec compares the actual result and the expected result, whitespace-only text nodes are not ignored.
- XSLT ignores whitespace-only text nodes written directly in a stylesheet.
- Likewise, XSpec ignores whitespace-only text nodes written directly in an XSpec document.
- In XSLT, doc() and document() keep whitespace-only text nodes intact.
- Likewise, in XSpec, @href keeps whitespace-only text nodes intact.
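The first point can be checked with a small standalone stylesheet. The following is a minimal sketch, not part of XSpec: it assumes an XSLT 3.0 processor that supports the default initial template, and the variable names and the result element are made up for illustration.

```xml
<xsl:stylesheet version="3.0"
   xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

   <!-- Entry point for XSLT 3.0 processors that support xsl:initial-template -->
   <xsl:template name="xsl:initial-template">
      <!-- Same element structure; only the second string contains line breaks and indentation -->
      <xsl:variable name="compact"
         select="parse-xml('&lt;body>&lt;p>abc&lt;/p>&lt;/body>')" />
      <xsl:variable name="indented"
         select="parse-xml('&lt;body>&#10;  &lt;p>abc&lt;/p>&#10;&lt;/body>')" />

      <!-- deep-equal() takes the whitespace-only text nodes into account,
           so the comparison is expected to be false -->
      <result>
         <xsl:value-of select="deep-equal($compact, $indented)" />
      </result>
   </xsl:template>
</xsl:stylesheet>
```

The strings are parsed at run time with parse-xml() rather than written as literal result elements, because whitespace-only text nodes written directly in the stylesheet would be ignored.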
When XSpec compares the actual result and the expected result, all nodes are considered significant. Whitespace-only text nodes are not ignored.
tested.xsl
Note that the constructed body element will have no whitespace-only text nodes (unless it is serialized with indentation and then reloaded).
<xsl:stylesheet exclude-result-prefixes="#all" version="3.0"
   xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
   <xsl:template as="element(body)" name="construct-body">
      <body>
         <p>abc</p>
      </body>
   </xsl:template>
</xsl:stylesheet>
expected.xml
Note that this body element was serialized with indentation.
<?xml version="1.0" encoding="UTF-8"?>
<body>
  <p>abc</p>
</body>
test.xspec
<x:description stylesheet="tested.xsl" xmlns:x="http://www.jenitennison.com/xslt/xspec">
   <x:scenario label="Calling body constructor template">
      <x:call template="construct-body" />
      <x:expect label="Expect body" href="expected.xml" select="body" />
   </x:scenario>
</x:description>
The result of this x:expect is Failure, because of differences between the actual result (<body><p>abc</p></body>) and the expected result (<body>
  <p>abc</p>
</body>
).
Whitespace-only text nodes in the comparison report are represented as grey \t, \n, \r and ␣ characters.
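For contrast, here is a sketch of the same scenario with the expected result embedded in the XSpec document instead of loaded through @href. It reuses tested.xsl from above. Because XSpec ignores whitespace-only text nodes written directly in an XSpec document, the indentation inside x:expect should not take part in the comparison, and this variant would be expected to succeed.

```xml
<x:description stylesheet="tested.xsl" xmlns:x="http://www.jenitennison.com/xslt/xspec">
   <x:scenario label="Calling body constructor template">
      <x:call template="construct-body" />
      <!-- The indentation below is whitespace-only text written directly in the XSpec
           document, so the expected result becomes <body><p>abc</p></body> -->
      <x:expect label="Expect body">
         <body>
            <p>abc</p>
         </body>
      </x:expect>
   </x:scenario>
</x:description>
```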
XSpec discards whitespace-only text nodes when loading embedded XML.
In this example XSpec
<x:param name="span">
<span>	

 </span>
</x:param>
$span is not
  <span>	

 </span>

but <span/>.
A whitespace-only text node in embedded XML is kept intact only when one of the following conditions is met:
- Its nearest ancestor element with @xml:space has @xml:space="preserve". For example, <x:context xml:space="preserve"><span>	

 </span></x:context>
- Its parent element name is specified in /x:description/@preserve-space. For example, <x:description preserve-space="code pre"> ... <x:param> <pre>	

 </pre> </x:param>
- Its parent element is x:text. For example, <x:expect label="Expects a whitespace-only text node"> <x:text>	

 </x:text> </x:expect>
XSpec keeps whitespace-only text nodes intact when loading external XML. For example,
XSpec
<x:param name="href" href="body.xml" />
<x:param name="doc" select="doc('.../body.xml')" />
body.xml
<?xml version="1.0" encoding="UTF-8"?>
<body>
  <p>abc</p>
</body>
$href and $doc are not <body><p>abc</p></body> but <body>
  <p>abc</p>
</body>
.
x:*/@xml:space and /x:description/@preserve-space have no effect on external XML. For example, this XSpec has no effect on whitespace-only text nodes in body.xml:
<x:description preserve-space="p">
...
<x:param name="href" href="body.xml" xml:space="default" />
<x:param name="doc" select="doc('.../body.xml')" xml:space="default" />
...
You may want to remove some or all of the whitespace-only text nodes when they are in external XML. There is more than one way to do it.
You can write a test helper function and use it in @select. For example, see tutorial/helper/ws-only-text/.
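As a rough idea of what such a helper might look like, here is a minimal sketch. It is not the code from the tutorial: the file name test-helper.xsl, the urn:example:test-helper namespace and the my:strip-ws function are all hypothetical, and an XSLT 3.0 processor is assumed.

```xml
<xsl:stylesheet version="3.0"
   xmlns:my="urn:example:test-helper"
   xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

   <!-- Hypothetical mode: copy everything that no other template in this mode matches -->
   <xsl:mode name="my:strip-ws" on-no-match="shallow-copy" />

   <!-- Drop text nodes that consist only of whitespace -->
   <xsl:template match="text()[not(normalize-space())]" mode="my:strip-ws" />

   <!-- Hypothetical helper: returns a copy of the document without whitespace-only text nodes -->
   <xsl:function name="my:strip-ws" as="document-node()">
      <xsl:param name="doc" as="document-node()" />
      <xsl:apply-templates select="$doc" mode="my:strip-ws" />
   </xsl:function>
</xsl:stylesheet>
```

Assuming your XSpec version supports importing helpers with x:helper, and that @select is evaluated with the document loaded by @href as its context, the function could then be used along these lines:

```xml
<x:description stylesheet="tested.xsl"
   xmlns:my="urn:example:test-helper"
   xmlns:x="http://www.jenitennison.com/xslt/xspec">
   <!-- test-helper.xsl is the hypothetical helper stylesheet sketched above -->
   <x:helper stylesheet="test-helper.xsl" />
   <x:scenario label="Calling body constructor template">
      <x:call template="construct-body" />
      <x:expect label="Expect body" href="expected.xml" select="my:strip-ws(.)/body" />
   </x:scenario>
</x:description>
```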
You can use a Saxon-specific URI query parameter, strip-space. For example,
<x:param href="space.xml?strip-space=yes" />
By default, Saxon-specific URI query parameters, including strip-space, are not recognized. To enable them, you need to turn on a Saxon-specific configuration option, RECOGNIZE_URI_QUERY_PARAMETERS. To set Saxon-specific configuration options, you can use the SAXON_CUSTOM_OPTIONS environment variable (for the command line) or the saxon.custom.options property (for Ant). For example,
- Command line
  Linux/macOS
  export SAXON_CUSTOM_OPTIONS=--recognize-uri-query-parameters:true
  Windows
  set SAXON_CUSTOM_OPTIONS=--recognize-uri-query-parameters:true
- Ant
  ant ... -Dsaxon.custom.options=--recognize-uri-query-parameters:true ...
Unfortunately, the RECOGNIZE_URI_QUERY_PARAMETERS option (--recognize-uri-query-parameters command line parameter) does not work side by side with XML Catalog. When XML Catalog support is enabled, Saxon does not recognize the strip-space query parameter.