XML - ReFreezed/LuaWebGen GitHub Wiki
Note: The documentation has moved to the LuaWebGen website. Information here may be out of date!
[v1.2]
The XML module, available through the xml global, handles XML data parsing and contains XML- and HTML-related functionality.
Note: The API is very similar to Penlight's. Most functions have new names, but the Penlight names also work.
Accessing XML data through the data object (or calling xml.parseXml()) will get you an XML node back (or specifically, an element). A node can be two things: XML tags become elements (represented by tables) while all other data become text nodes (represented by strings).
Elements are sometimes also called documents in this documentation and other places, especially when referring to the root element in a node tree.
Elements always have a tag field and an attr field (for attributes).
They are also arrays containing child nodes.
element = {
	tag  = tagName,
	attr = {
		[name1]=value1, [name2]=value2, ...
	},
	[1]=childNode1, [2]=childNode2, ...
}
A similar format is used in other libraries too. LuaExpat calls it LOM.
The following XML...
<animal type="dog" name="Puddles">
	<hobbies>Biting &amp; eating</hobbies>
	<!-- Comments are ignored. -->
	How did this <![CDATA[ get here? ]]>
</animal>
...results in this table:
document = {
	tag  = "animal",
	attr = {
		["name"] = "Puddles",
		["type"] = "dog",
	},
	[1] = "\n\t",
	[2] = {
		tag  = "hobbies",
		attr = {},
		[1]  = "Biting & eating",
	},
	[3] = "\n\t\n\tHow did this  get here? \n",
}
Notice how all whitespace is preserved, and that CDATA sections become text.
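Because elements are plain tables and text nodes are plain strings, a node tree can be inspected with ordinary Lua. The sketch below writes the table above out by hand (so it runs without LuaWebGen) and counts the two kinds of nodes. `countNodes` is a hypothetical helper for illustration, not part of the module.

```lua
-- The document table from above, written out by hand.
local document = {
	tag  = "animal",
	attr = {name="Puddles", type="dog"},
	"\n\t",
	{tag="hobbies", attr={}, "Biting & eating"},
	"\n\t\n\tHow did this  get here? \n",
}

-- Hypothetical helper: count the elements and text nodes in a tree.
local function countNodes(node, counts)
	counts = counts or {elements=0, texts=0}
	if type(node) == "string" then
		counts.texts = counts.texts + 1
	else
		counts.elements = counts.elements + 1
		for _, child in ipairs(node) do
			countNodes(child, counts)
		end
	end
	return counts
end

local counts = countNodes(document)
print(counts.elements, counts.texts) -- 2 elements, 3 text nodes
```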
Note: All functions can be called as methods on elements (i.e. xml.toXml(element) is the same as element:toXml()).
- addChild
- clone
- compare
- contentsToHtml
- contentsToXml
- eachChild
- eachChildElement
- eachMatchingChildElement
- element
- encodeMoreEntities
- encodeRequiredEntities
- filter
- findAllElementsByName
- getAttributes
- getChildByName
- getFirstElement
- getHtmlText
- getText
- getTextOfDirectChildren
- isElement
- isText
- makeElementConstructors
- mapElements
- match
- newElement
- parseHtml
- parseXml
- removeWhitespaceNodes
- setAttribute
- substitute
- toHtml
- toPrettyXml
- toXml
- updateAttributes
- walk
xml.addChild( element, childNode )
Add a child node to an element.
Penlight alias:
Element:add_direct_child()
nodeClone = xml.clone( node [, textSubstitutionCallback ] )
Clones a node and its children.
If the textSubstitutionCallback argument is given, it should be a function with this signature:
text = textSubstitutionCallback( text, kind, parentElement )
This function is called for every text node, tag name and attribute in the node tree.
It can modify the values of these things for the clone by returning a modified string.
kind will be "*TEXT" for text nodes, "*TAG" for tag names, and the attribute name for attributes.
parentElement will be nil if the initial node argument is a text node.
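As an illustration of the callback signature, here is a minimal sketch of a callback that upper-cases text nodes and leaves tag names and attribute values unchanged. The name `shout` is made up for this example. The callback is an ordinary function, so it can be tried on its own; the xml.clone() call is left as a comment because it needs LuaWebGen.

```lua
-- Example callback: upper-case text nodes, keep everything else as-is.
local function shout(text, kind, parentElement)
	if kind == "*TEXT" then
		return text:upper()
	end
	return text -- "*TAG" or an attribute name: return the value unchanged.
end

-- Trying the callback directly with the three kinds of input:
print(shout("hello", "*TEXT", nil)) -- HELLO
print(shout("p",     "*TAG",  nil)) -- p
print(shout("dog",   "type",  nil)) -- dog

-- Inside LuaWebGen it would be passed along like this:
-- local loudCopy = xml.clone(document, shout)
```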
nodesLookEqual = xml.compare( value1, value2 )
Returns true if the values are two nodes that look equal, false otherwise. Returns false if either value is not a node.
htmlString = xml.contentsToHtml( node )
[v1.3] Convert the child nodes of a node into an HTML string.
Also see xml.toHtml().
xmlString = xml.contentsToXml( node )
[v1.3] Convert the child nodes of a node into an XML string.
Also see xml.toXml().
for childNode in xml.eachChild( element )
Iterate over child nodes.
Penlight alias:
Element:children()
for childElement in xml.eachChildElement( element )
Iterate over child elements (skipping over text nodes).
Penlight alias:
Element:childtags()
for childElement in xml.eachMatchingChildElement( element, tag )
Iterate over child elements that have the given tag name.
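The three iterators differ only in which children they yield. As a rough picture of the filtering involved, the following plain-Lua stand-in is written purely from the description above; it is not LuaWebGen's implementation, and `eachMatchingChildElementSketch` is a hypothetical name.

```lua
-- Hypothetical stand-in: yield only child elements with the given tag.
local function eachMatchingChildElementSketch(element, tag)
	local i = 0
	return function()
		while true do
			i = i + 1
			local child = element[i]
			if child == nil then return nil end -- End of the child array.
			if type(child) == "table" and child.tag == tag then
				return child
			end
		end
	end
end

local list = {
	tag = "list", attr = {},
	"\n",
	{tag="item", attr={}, "one"},
	{tag="note", attr={}, "skip me"},
	{tag="item", attr={}, "two"},
}

local found = {}
for item in eachMatchingChildElementSketch(list, "item") do
	found[#found+1] = item[1]
end
print(table.concat(found, ", ")) -- one, two
```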
element = xml.element( tag [, childNode ] )
element = xml.element( tag, attributesAndChildNodes )
Convenience function for creating a new element. The second argument, if given, can be either a node to put in the element as its first child, or a combination of an array of child elements and a table of attributes. Examples:
local person = xml.element("person")
local month = xml.element("month", "April")
local planet = xml.element("planet", xml.element("moon"))
local chicken = xml.element("chicken", {
	age = "3",
	id  = "942-8483",
	xml.element("egg"),
	xml.element("egg"),
})
Also see xml.newElement().
Penlight alias:
xml.elem()
encodedString = xml.encodeRequiredEntities( string )
[v1.3]
Encode &, <, >, " and ' characters into XML/HTML entities (&amp; etc.).
This is the same function as entities().
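To make the effect concrete, here is a plain-Lua sketch of this kind of encoding. The exact entities LuaWebGen emits for quotes are not stated here, so `&quot;` and `&#39;` are assumptions, and `encodeSketch` is a hypothetical name.

```lua
-- Assumed replacements; the entities LuaWebGen actually emits may differ.
local ENTITIES = {
	["&"] = "&amp;",
	["<"] = "&lt;",
	[">"] = "&gt;",
	['"'] = "&quot;",
	["'"] = "&#39;",
}

local function encodeSketch(s)
	-- gsub returns two values; the parentheses keep only the string.
	return (s:gsub("[&<>\"']", ENTITIES))
end

print(encodeSketch('<a href="x">Tom & Jerry</a>'))
-- &lt;a href=&quot;x&quot;&gt;Tom &amp; Jerry&lt;/a&gt;
```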
html = xml.encodeMoreEntities( string )
[v1.3]
Encode &, <, >, " and ' characters into HTML entities (&amp; etc.).
Also encodes some additional spaces and invisible characters, like the non-breaking space (U+00A0) and the invisible times character (U+2062).
xml.filter( element [, textSubstitutionCallback ] )
Clone an element and its children. This is an alias for xml.clone().
elements = xml.findAllElementsByName( element, tag [, doNotRecurse=false ] )
Get all child elements that have the given tag, optionally non-recursively.
Penlight alias:
Element:get_elements_with_name()
attributes = xml.getAttributes( element )
Get the attributes table for an element (i.e. element.attr).
Note that the actual table is returned - not a copy of it!
Note: You can use xml.setAttribute() or xml.updateAttributes() for updating attributes.
Penlight alias:
Element:get_attribs()
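The consequence of getting the actual table rather than a copy can be shown with a hand-written element, without LuaWebGen. In this sketch element.attr stands in for the return value of xml.getAttributes(element), as the description above says they are the same table.

```lua
local element = {tag="img", attr={src="cat.png"}}

-- Same table: changes made through the reference show up on the element.
local attributes = element.attr
attributes.alt = "A cat"
print(element.attr.alt) -- A cat

-- Make a shallow copy when independent edits are wanted.
local copy = {}
for name, value in pairs(element.attr) do
	copy[name] = value
end
copy.src = "dog.png"
print(element.attr.src) -- cat.png
```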
childElement = xml.getChildByName( element, tag )
Get the first child element with a given tag name. Returns nil if none exist.
Penlight alias:
Element:child_with_name()
childElement = xml.getFirstElement( element )
Get the first child element. Returns nil if none exist.
Penlight alias:
Element:first_childtag()
text = xml.getHtmlText( element )
[v1.3]
Get the full text value of an element (i.e. the concatenation of all child text nodes, recursively).
Unlike xml.getText(), this function is aware of HTML-specific properties, e.g. that the alt attribute of <img> tags can be used as a textual replacement for the image.
text = xml.getText( element )
Get the full text value of an element (i.e. the concatenation of all child text nodes, recursively).
Also see xml.getHtmlText().
text = xml.getTextOfDirectChildren( element )
Get the full text value of an element's direct children (i.e. the concatenation of all child text nodes, non-recursively).
(In most cases you probably want to use xml.getText() or xml.getHtmlText() instead of this function.)
Penlight alias:
Element:get_text()
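The difference between the recursive and non-recursive variants is easiest to see side by side. The two functions below are plain-Lua stand-ins written from the descriptions above (hypothetical names, not LuaWebGen's code).

```lua
-- Stand-in for the non-recursive variant: only direct text children.
local function textOfDirectChildrenSketch(element)
	local parts = {}
	for _, child in ipairs(element) do
		if type(child) == "string" then
			parts[#parts+1] = child
		end
	end
	return table.concat(parts)
end

-- Stand-in for the recursive variant: text from the whole subtree.
local function fullTextSketch(node)
	if type(node) == "string" then return node end
	local parts = {}
	for _, child in ipairs(node) do
		parts[#parts+1] = fullTextSketch(child)
	end
	return table.concat(parts)
end

local p = {tag="p", attr={}, "Hello ", {tag="b", attr={}, "big"}, " world"}
print(fullTextSketch(p))             -- Hello big world
print(textOfDirectChildrenSketch(p)) -- Hello  world (the <b> text is skipped)
```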
bool = xml.isElement( value )
Check if a value is an element.
Penlight alias:
xml.is_tag()
bool = xml.isText( value )
Check if a value is a text node. (Any string value will make the function return true.)
constructor1, constructor2, ... = xml.makeElementConstructors( tags )
constructor1, constructor2, ... = xml.makeElementConstructors "tag1,tag2,..."
Given a list of tag names, return a number of element constructors. The argument can either be an array of tag names, or a string with comma-separated tags.
A constructor creates a new element with the respective tag name every time it's called. It's a function with this signature:
element = constructor( [ childNode ] )
element = constructor( attributesAndChildNodes )
The argument, if given, can be either a node to put in the element as its first child, or a combination of an array of child elements and a table of attributes (same as the argument for xml.element()).
Example:
local bowl,fruit = xml.makeElementConstructors "bowl,fruit"
local document = bowl{ size="small", fruit"Apple", fruit"Orange" }
print(document) -- <bowl size="small"><fruit>Apple</fruit><fruit>Orange</fruit></bowl>
Penlight alias:
xml.tags()
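The compact look of the example comes from Lua's call syntax: f"text" and f{...} are ordinary calls with the parentheses left out. The sketch below imitates a constructor in plain Lua to show how a mixed table can be split into attributes and children. `makeConstructorSketch` is hypothetical, and telling an element apart from an attributes table by its tag field is an assumption of this sketch, not necessarily how LuaWebGen does it.

```lua
-- Hypothetical stand-in for a constructor from xml.makeElementConstructors().
local function makeConstructorSketch(tag)
	return function(arg)
		local element = {tag=tag, attr={}}
		if type(arg) == "table" and arg.tag == nil then
			-- Mixed table: string keys are attributes, array items are children.
			for key, value in pairs(arg) do
				if type(key) == "string" then element.attr[key] = value end
			end
			for _, child in ipairs(arg) do
				element[#element+1] = child
			end
		elseif arg ~= nil then
			element[1] = arg -- A single node (text or element).
		end
		return element
	end
end

local bowl, fruit = makeConstructorSketch("bowl"), makeConstructorSketch("fruit")

local document = bowl{ size="small", fruit"Apple", fruit"Orange" }
print(document.tag, document.attr.size, #document, document[2][1])
-- bowl  small  2  Orange
```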
element = xml.mapElements( element, callback )
replacementNode = callback( childElement )
Visit and call a function on all child elements of an element (non-recursively), possibly modifying the document. Returning a node from the callback replaces the current element, while returning nil removes it.
Penlight alias:
Element:maptags()
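Note that a callback which wants to keep an element has to return it. The following plain-Lua stand-in, written only from the description above (`mapElementsSketch` is a hypothetical name), shows the three possible outcomes: keep, replace, and remove.

```lua
-- Hypothetical stand-in for xml.mapElements(); not LuaWebGen's code.
local function mapElementsSketch(element, callback)
	local i = 1
	while i <= #element do
		local child = element[i]
		if type(child) == "table" then
			local replacement = callback(child)
			if replacement == nil then
				table.remove(element, i) -- Removed: the next child moves to i.
			else
				element[i] = replacement
				i = i + 1
			end
		else
			i = i + 1 -- Text nodes are not visited.
		end
	end
	return element
end

local list = {tag="ul", attr={},
	{tag="li", attr={}, "keep"},
	{tag="li", attr={hidden="hidden"}, "drop"},
	{tag="hr", attr={}},
}

mapElementsSketch(list, function(child)
	if child.attr.hidden then return nil end   -- Remove hidden items.
	if child.tag == "hr" then return "---" end -- Replace with a text node.
	return child                               -- Keep everything else.
end)

print(#list, list[1][1], list[2]) -- 2  keep  ---
```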
matches = xml.match( document, xmlStringPattern )
matches = xml.match( document, elementPattern )
Find things in a document by supplying a pattern. This is the opposite function of xml.substitute(). See the Penlight manual on the subject for more info (look for the sections describing templates). Returns nil and a message on error.
element = xml.newElement( tag [, attributes ] )
Create a new element, optionally initialized with a given attributes table. Examples:
local person = xml.newElement("person")
local chicken = xml.newElement("chicken", {age="3", id="942-8483"})
Also see xml.element().
Penlight alias:
xml.new()
element = xml.parseHtml( xmlString [, filePathForErrorMessages ] )
Parse a string containing HTML markup. Returns nil and a message on error. Example:
local document = xml.parseHtml("<!DOCTYPE html>\n<html><head><script> var result = 1 & 3; </script></head></html>")
print(document[1][1].tag) -- script
element = xml.parseXml( xmlString [, filePathForErrorMessages ] )
Parse a string containing XML markup. Returns nil and a message on error. Example:
local document = xml.parseXml("<foo><bar/></foo>")
print(document[1].tag) -- bar
xml.removeWhitespaceNodes( document )
Recursively remove all text nodes that don't contain any non-whitespace characters from the document.
print(document:toXml())
--[[ Output:
<horses>
	<horse>
		<name> Glitter </name>
	</horse>
	<horse>
		<name>Rush </name>
	</horse>
</horses>
]]
document:removeWhitespaceNodes()
print(document:toXml())
--[[ Output:
<horses><horse><name> Glitter </name></horse><horse><name>Rush </name></horse></horses>
]]
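Text nodes that contain other characters besides whitespace, like " Glitter ", are kept untouched; only whitespace-only nodes go away. A plain-Lua sketch of that rule, written from the description above (`removeWhitespaceNodesSketch` is a hypothetical name, not the module's code):

```lua
-- Hypothetical stand-in for xml.removeWhitespaceNodes().
local function removeWhitespaceNodesSketch(element)
	-- Iterate backwards so removals do not shift unvisited children.
	for i = #element, 1, -1 do
		local child = element[i]
		if type(child) == "string" then
			if not child:find("%S") then -- No non-whitespace character found.
				table.remove(element, i)
			end
		else
			removeWhitespaceNodesSketch(child)
		end
	end
end

local horses = {tag="horses", attr={},
	"\n\t",
	{tag="horse", attr={}, "\n\t\t", {tag="name", attr={}, " Glitter "}, "\n\t"},
	"\n",
}
removeWhitespaceNodesSketch(horses)

-- One <horse> left in <horses>, one <name> left in <horse>, text unchanged.
print(#horses, #horses[1], "[" .. horses[1][1][1] .. "]") -- 1  1  [ Glitter ]
```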
xml.setAttribute( element, attributeName, attributeValue )
xml.setAttribute( element, attributeName, nil )
Add a new attribute, or update the value of an existing one. Specify a nil value to remove the attribute.
Penlight alias:
Element:set_attrib()
newDocument = xml.substitute( xmlString, data )
newDocument = xml.substitute( document, data )
Create a substituted copy of a document. This is the opposite function of xml.match(). See the Penlight manual on the subject for more info (look for the sections describing templates). Returns nil and a message on error.
Penlight alias:
Element:subst()
htmlString = xml.toHtml( node [, preface=false ] )
Convert a node into an HTML string.
preface, if given, can either be a boolean that says whether a standard <!DOCTYPE html> string should be prepended, or be a string containing the given preface that should be added.
Example:
local document = xml.parseHtml('<html x = "y" ><body><input type=text disabled></body></html>')
print(document:toHtml())
--[[ Output:
<html x="y"><body><input type="text" disabled></body></html>
]]
xmlString = xml.toPrettyXml( node [, initIndent="", indent=noIndent, attrIndent=noIndent, preface=false ] )
Convert a node into an XML string with some "pretty" modifications.
(Generally, you probably want to use xml.toXml() instead of this function.)
initIndent will be prepended to each line.
Specifying indent puts each tag on a new line.
Specifying attrIndent puts each attribute on a new line.
preface, if given, can either be a boolean that says whether a standard <?xml...?> string should be prepended, or be a string containing the given preface that should be added.
Examples:
local document = xml.parseXml('<foo x="y"><bar/></foo>')
print(document:toPrettyXml("", " "))
--[[ Output:
<foo x="y">
 <bar/>
</foo>
]]
print(document:toPrettyXml("", " ", " ", '<?xml version="1.0"?>'))
--[[ Output:
<?xml version="1.0"?>
<foo
 x="y"
>
 <bar/>
</foo>
]]
This function is used when calling tostring(element).
Also see xml.toXml().
Penlight alias:
xml.tostring()
xmlString = xml.toXml( node [, preface=false ] )
Convert a node into an XML string.
preface, if given, can either be a boolean that says whether a standard <?xml...?> string should be prepended, or be a string containing the given preface that should be added.
Examples:
local document = xml.parseXml('<foo x = "y" ><bar /></foo>')
print(document:toXml())
--[[ Output:
<foo x="y"><bar/></foo>
]]
print(document:toXml('<?xml version="1.0"?>'))
--[[ Output:
<?xml version="1.0"?>
<foo x="y"><bar/></foo>
]]
Also see xml.toPrettyXml().
xml.updateAttributes( element, attributes )
Add new attributes, or update the values of existing ones.
Penlight alias:
Element:set_attribs()
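Since attributes live in the plain element.attr table, the effect of xml.setAttribute() and xml.updateAttributes() can be pictured with direct table operations. This is a sketch of the presumed effect on a hand-written element, not the functions' actual code.

```lua
local element = {tag="a", attr={href="/old"}}

-- Presumed effect of xml.setAttribute(): a plain assignment,
-- where assigning nil removes the attribute.
element.attr.href  = "/new"
element.attr.title = "Home"
element.attr.title = nil

-- Presumed effect of xml.updateAttributes(): merge a table of attributes.
for name, value in pairs({class="nav", href="/"}) do
	element.attr[name] = value
end

print(element.attr.href, element.attr.class, element.attr.title) -- /  nav  nil
```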
xml.walk( document, depthFirst, callback )
traversalAction = callback( tag, element )
traversalAction = "stop"|"ignorechildren"|nil
Have a function recursively be called on every element in a document (including itself and excluding text nodes).
If depthFirst is true then child elements are visited before parent elements.
Return "stop" from the callback to stop the traversal completely, return "ignorechildren" to make the traversal skip all children (unless depthFirst is true, in which case it does nothing), or return nil (or nothing) to continue the traversal.
Example:
document:walk(false, function(tag, el)
	if tag == "dog" then
		local dogName = (el.attr.name or "something")
		printf("Found doggo called %s!", dogName)
	end
end)
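To show how depthFirst and the return values affect the visiting order, here is a plain-Lua stand-in for the traversal, written from the description above. `walkSketch` is a hypothetical name, and the boolean it returns (whether the traversal was stopped) is an internal detail of this sketch, not documented behavior of xml.walk().

```lua
-- Hypothetical stand-in for xml.walk(); returns true if traversal was stopped.
local function walkSketch(element, depthFirst, callback)
	if not depthFirst then
		local action = callback(element.tag, element)
		if action == "stop" then return true end
		if action == "ignorechildren" then return false end
	end
	for _, child in ipairs(element) do
		if type(child) == "table" and walkSketch(child, depthFirst, callback) then
			return true
		end
	end
	if depthFirst and callback(element.tag, element) == "stop" then
		return true
	end
	return false
end

local zoo = {tag="zoo", attr={},
	{tag="cage", attr={}, {tag="dog", attr={name="Puddles"}}},
	{tag="dog", attr={name="Rex"}},
}

-- Parents first; the contents of <cage> are skipped.
local parentsFirst = {}
walkSketch(zoo, false, function(tag, el)
	parentsFirst[#parentsFirst+1] = tag
	if tag == "cage" then return "ignorechildren" end
end)
print(table.concat(parentsFirst, " ")) -- zoo cage dog

-- Children first.
local childrenFirst = {}
walkSketch(zoo, true, function(tag, el)
	childrenFirst[#childrenFirst+1] = tag
end)
print(table.concat(childrenFirst, " ")) -- dog cage dog zoo
```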