16. XML - RobertMakyla/scalaWiki GitHub Wiki
KEY POINTS:
- XML literals like <like>this</like> are of type NodeSeq.
- You can embed Scala code inside XML literals (a bit like JSP).
- The 'child' property of a 'Node' yields the child nodes.
- The 'attributes' property of a 'Node' yields a 'MetaData' object containing the node attributes.
- The \ and \\ operators carry out XPath-like matches.
- You can match node patterns with XML literals in case clauses.
- Use the 'RuleTransformer' with 'RewriteRule' instances to transform descendants of a node.
- The 'XML' object interfaces with Java XML methods for loading and saving.
- The 'ConstructingParser' is an alternate parser that preserves comments and 'CDATA' sections.
XML Literals
val doc = <html><body>hello xml support</body></html> // --> scala.xml.Elem
val items = <li>Fred</li><li>Wilma</li> // --> scala.xml.NodeBuffer (converts implicitly to NodeSeq)
Caution:
val (x, y) = (1, 2) // defining x and y
x < y // OK
x <y // Error—unclosed XML literal
XML Nodes
Seq[Node] <-- NodeSeq <-- Node <-- Elem
                               <-- Text
Iterable[MetaData] <-- MetaData
val elem = <a href="http://scala-lang.org">The <em>Scala</em> language</a> // scala.xml.Elem
elem.child // Seq[Node]: "The ", <em>Scala</em>, " language"
// all 3 are 'Nodes': two Text and one Elem
elem.child(0) // Node: The
elem.child(1) // Node: <em>Scala</em>
elem.child(2) // Node: language
elem.child(0).child // empty (a Text node has no children)
elem.child(1).child // one child: the Text node Scala
elem.child(2).child // empty
Since NodeSeq is a subtype of Seq[Node], I can use it like any other sequence.
Node itself extends NodeSeq, so each Elem/Text can also be treated as a one-element sequence of Nodes.
for(i <- elem.child) println(i)
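A minimal sketch of walking the children and telling Text nodes from Elem nodes. It assumes Scala 2 syntax with the scala-xml module on the classpath; the object name ChildWalk and the helper describe are made up for illustration.

```scala
import scala.xml.{Elem, Node, Text}

object ChildWalk {
  val elem: Elem = <a href="http://scala-lang.org">The <em>Scala</em> language</a>

  // describe is a hypothetical helper: it labels a node by its runtime type.
  def describe(n: Node): String = n match {
    case t: Text => "Text(" + t.text + ")"
    case e: Elem => "Elem(" + e.label + ")"
    case other   => "Other(" + other.getClass.getSimpleName + ")"
  }

  def main(args: Array[String]): Unit =
    // Only the direct children are visited, not deeper descendants.
    elem.child.map(describe).foreach(println)
}
```

Whitespace inside the literal is kept, so the two Text children are "The " and " language".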
Element Attributes
val elem = <a href="http://scala.com">The Scala</a> // 1 child -> NodeSeq(The Scala)
val url = elem.attributes("href") // 1 attribute -> NodeSeq(http://scala.com)
elem.attributes // MetaData
elem.attributes("href").text // String: http://scala.com
elem.attributes("NotExisting") // null: apply returns null for a missing key
elem.attributes("NotExisting").text // NullPointerException, so prefer get for optional attributes
elem.attributes.get("href") // Option[Seq[Node]] = Some(http://scala.com)
elem.attributes.get("NotSureAtt") // Option[Seq[Node]] = None
elem.attributes.get("NotSureAtt").getOrElse( scala.xml.Text("") )
for (attr <- elem.attributes)
  println("key=" + attr.key + ", value=" + attr.value.text)
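The Option-based lookup can be wrapped in a small helper so that a missing attribute never yields null. This is only a sketch, assuming Scala 2 with scala-xml on the classpath; AttrLookup, attrOrElse and the extra title attribute are hypothetical names added for the example.

```scala
import scala.xml.Elem

object AttrLookup {
  val link: Elem = <a href="http://scala.com" title="Scala">The Scala</a>

  // attrOrElse is a hypothetical helper: attribute text if present, otherwise the default.
  def attrOrElse(e: Elem, name: String, default: String): String =
    e.attributes.get(name)              // Option[Seq[Node]]
      .map(_.map(_.text).mkString)      // join the text of the value nodes
      .getOrElse(default)

  def main(args: Array[String]): Unit = {
    println(attrOrElse(link, "href", ""))
    println(attrOrElse(link, "NotSureAtt", "n/a"))
  }
}
```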
Embedded Expressions
val myItem = <li>{ scala.xml.Text("my item") }</li> // Elem
is the same as
val myItem = <li>my item</li> // Elem
<ul>{for (i <- 1 to 2) yield <li> {i} </li>}</ul> // Elem <ul><li> 1 </li><li> 2 </li></ul>
Escape char
<ul>{ "hello" }</ul> // Elem: <ul>hello</ul>
<ul>{{ "hello" }}</ul> // Elem: <ul>{ "hello" }</ul>
Expressions in Attributes
def addMyPrefix(s:String) = "http://" + s
<img src={addMyPrefix("scala.org")}/> // Elem <img src="http://scala.org"></img>
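Embedded expressions in element content and in attributes can be combined to render a collection. The following is a sketch under assumptions: Scala 2 with scala-xml available, and sample data (sites) plus the names LinkList and render invented for the example.

```scala
import scala.xml.Elem

object LinkList {
  // Hypothetical sample data: (label, host) pairs.
  val sites: Seq[(String, String)] =
    Seq("Scala" -> "scala-lang.org", "Docs" -> "docs.scala-lang.org")

  def addMyPrefix(s: String): String = "http://" + s

  // One <li> per pair: the host goes through an attribute expression,
  // the label through a content expression.
  def render(items: Seq[(String, String)]): Elem =
    <ul>{ for ((label, host) <- items) yield <li><a href={ addMyPrefix(host) }>{ label }</a></li> }</ul>

  def main(args: Array[String]): Unit =
    println(render(sites))
}
```

Strings placed in an attribute expression are escaped by the library, so the value does not need manual quoting.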
Uncommon Node Types
val js = <script><![CDATA[if (temp < 0) alert("Cold!")]]></script>
// Elem <script>if (temp < 0) alert("Cold!")</script>
scala.xml.Group instead of a sequence
val g1 = <xml:group><li>Item 1</li><li>Item 2</li></xml:group>
val g2 = scala.xml.Group( Seq( <li>Item 1</li>, <li>Item 2</li> ) )
val g3 = scala.xml.Group( <li>Item 1</li> <li>Item 2</li> )
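A short sketch of how a Group differs from a wrapping element when iterated, assuming Scala 2 with scala-xml on the classpath. GroupDemo and its value names are made up for illustration.

```scala
import scala.xml.{Elem, Group}

object GroupDemo {
  val grouped: Group = Group(Seq(<li>Item 1</li>, <li>Item 2</li>))
  val wrapped: Elem  = <ol><li>Item 1</li><li>Item 2</li></ol>

  // Iterating a Group visits its member nodes;
  // iterating an Elem visits only the Elem itself.
  val groupedLabels: Seq[String] = grouped.map(_.label)
  val wrappedLabels: Seq[String] = wrapped.map(_.label)

  def main(args: Array[String]): Unit = {
    println(groupedLabels)
    println(wrappedLabels)
  }
}
```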
XPath-like Expressions (always with BACKSLASHES \ or \\ )
val list = <dl><one>Java</one><d_two>Gosling</d_two><one>Scala</one><dd>Odersky</dd></dl>
The \ operator locates direct children
list \ "one" // --> NodeSeq(<one>Java</one>, <one>Scala</one>)
Wildcard "_" matches any element, but only at one level
doc \ "body" \ "_" \ "li"
The \\ operator locates descendants at any depth
doc \\ "img"
A string starting with @ locates attributes
img \ "@alt" // --> value of the alt attribute of img node
doc \\ "@alt" // --> all alt attributes of any elements inside doc.
for (n <- doc \\ "img") println( n ) // processing all img from doc
Note: doc \\ "img" \ "@src" will not work if the document contains more than one img element.
To extract attributes from multiple nodes, use:
doc \\ "img" \\ "@src"
test:
<a><b><c><d><img src="aaa"/></d></c></b><img src="bbb"/></a> \\ "img" \\ "@src"
// --> NodeSeq(aaa, bbb)
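The operators above can be tried on a small self-contained document. This is a sketch only, assuming Scala 2 with scala-xml on the classpath; the document content and the names ImgSources, sources and bodyChildLabels are invented for the example.

```scala
import scala.xml.Elem

object ImgSources {
  val doc: Elem =
    <html><body>
      <div><img src="a.png" alt="first"/></div>
      <p><img src="b.png" alt="second"/></p>
    </body></html>

  // \\ "img" finds img elements at any depth; \\ "@src" then collects
  // the src attribute of each of them.
  val sources: Seq[String] = (doc \\ "img" \\ "@src").map(_.text)

  // \ "_" keeps only element children, so whitespace Text nodes are skipped.
  val bodyChildLabels: Seq[String] = (doc \ "body" \ "_").map(_.label)

  def main(args: Array[String]): Unit = {
    println(sources)
    println(bodyChildLabels)
  }
}
```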
Pattern Matching
node match {
  case <li/> => ... // node is an li element with any attributes and no children
  case <li>{_}</li> => ... // li with exactly one child
  case <li>{details}</li> => details.text // li with exactly one child, bound to 'details' (unreachable after the {_} case, so use one or the other)
  case <li>{_*}</li> => ... // li with any number of children
}
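A runnable variant of the match above, as a sketch assuming Scala 2 with scala-xml on the classpath. LiMatch and describe are hypothetical names, and the result strings are only for demonstration.

```scala
import scala.xml.Node

object LiMatch {
  // Cases are ordered from most to least specific; attributes are ignored by these patterns.
  def describe(node: Node): String = node match {
    case <li/>            => "empty li"
    case <li>{child}</li> => "li with one child: " + child.text
    case <li>{_*}</li>    => "li with several children"
    case _                => "not an li"
  }

  def main(args: Array[String]): Unit = {
    println(describe(<li/>))
    println(describe(<li>Fred</li>))
    println(describe(<li>Fred <b>Flintstone</b></li>))
    println(describe(<p>text</p>))
  }
}
```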
TODO: continue from section 16.9 when more XML is needed.