16. XML - RobertMakyla/scalaWiki GitHub Wiki

KEY POINTS: XML literals `<like>this</like>` are of type NodeSeq. You can embed Scala code inside XML literals (a bit like JSP). The 'child' property of a 'Node' yields the child nodes. The 'attributes' property of a 'Node' yields a 'MetaData' object containing the node attributes. The \ and \\ operators carry out XPath-like matches. You can match node patterns with XML literals in case clauses. Use the 'RuleTransformer' with 'RewriteRule' instances to transform descendants of a node. The 'XML' object interfaces with Java XML methods for loading and saving. The 'ConstructingParser' is an alternate parser that preserves comments and 'CDATA' sections.

XML Literals

     val doc = <html><body>hello xml support</body></html>        // --> scala.xml.Elem
     val items = <li>Fred</li><li>Wilma</li>                      // --> scala.xml.NodeBuffer (a Seq[Node]; converts implicitly to NodeSeq)

Caution:
     val (x, y) = (1, 2)   // defining x and y
     x < y                 // OK
     x <y                  // Error—unclosed XML literal

XML Nodes

    Seq[Node] <-- NodeSeq <-- Node <-- Elem
                                   <-- Text

    Iterable[MetaData] <-- MetaData


     val elem = <a href="http://scala-lang.org">The <em>Scala</em> language</a>    // scala.xml.Elem

     elem.child        // Seq[Node]: The , <em>Scala</em>,  language
                       // all 3 are 'Nodes': two Text and one Elem

     elem.child(0)    // Node The
     elem.child(1)    // Node <em>Scala</em>
     elem.child(2)    // Node language

     elem.child(0).child   // empty: a Text node has no children
     elem.child(1).child   // one Text child: Scala
     elem.child(2).child   // empty

 Since NodeSeq is a subtype of Seq[Node], I can use it like any other sequence.
 Each Elem or Text is itself a one-element sequence of Nodes, because Node extends NodeSeq.

     for(i <- elem.child) println(i)
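
 A minimal sketch of the point above: since 'child' is an ordinary sequence, it can be mapped and pattern matched like any other. The object name ChildNodesSketch and the helper describe are hypothetical, not part of scala.xml; the sketch assumes the scala-xml library is on the classpath.

```scala
import scala.xml.{Elem, Node, Text}

object ChildNodesSketch {
  val elem: Elem = <a href="http://scala-lang.org">The <em>Scala</em> language</a>

  // Hypothetical helper: label each child by the kind of node it is.
  def describe(n: Node): String = n match {
    case t: Text => "Text(" + t.text + ")"
    case e: Elem => "Elem(" + e.label + ")"
    case other   => "Other(" + other.label + ")"
  }

  // Ordinary Seq operations work on the children.
  val kinds: Seq[String] = elem.child.map(describe)
}
```

 Here ChildNodesSketch.kinds has three entries: the two Text children around one Elem child.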

Element Attributes

     val elem = <a href="http://scala.com">The Scala</a>        // 1 child     -> NodeSeq(The Scala)
     val url = elem.attributes("href")                          // 1 attribute -> Seq[Node] (http://scala.com)

     elem.attributes                           //  MetaData

     elem.attributes("href").text              //  String: http://scala.com
     elem.attributes("NotExisting")            //  null (no such attribute)

     elem.attributes("NotExisting").text       //  NullPointerException

     elem.attributes.get("href")               // Option[Seq[Node]] = Some(http://scala.com)
     elem.attributes.get("NotSureAtt")         // Option[Seq[Node]] = None

     elem.attributes.get("NotSureAtt").getOrElse( scala.xml.Text("") )


     for (attr <- elem.attributes)
         println("key=" + attr.key + ", value=" + attr.value.text)
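
 The null-safe lookup above can be wrapped in a small helper. This is only a sketch: attrOrElse and AttributeLookupSketch are hypothetical names, and the only library call relied on is MetaData.get, which returns an Option[Seq[Node]].

```scala
import scala.xml.Elem

object AttributeLookupSketch {
  val elem: Elem = <a href="http://scala.com">The Scala</a>

  // Hypothetical helper: the attribute's text, or a default when it is absent.
  // Going through Option avoids the null returned by attributes(key).
  def attrOrElse(e: Elem, key: String, default: String): String =
    e.attributes.get(key) match {
      case Some(nodes) => nodes.map(_.text).mkString
      case None        => default
    }

  val href: String    = attrOrElse(elem, "href", "")
  val missing: String = attrOrElse(elem, "NotSureAtt", "n/a")
}
```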

Embedded Expressions

     val myItem = <li>{ scala.xml.Text("my item") }</li>    // Elem

 is the same as

     val myItem = <li>my item</li>                          // Elem


     <ul>{for (i <- 1 to 2) yield <li> {i} </li>}</ul>        // Elem <ul><li> 1 </li><li> 2 </li></ul>

Escape char

     <ul>{ "hello" }</ul>                                   // Elem:  <ul>hello</ul>
     <ul>{{ "hello" }}</ul>                                 // Elem:  <ul>{ "hello" }</ul>

Expressions in Attributes

     def addMyPrefix(s:String) = "http://" + s

     <img src={addMyPrefix("scala.org")}/>          // Elem    <img src="http://scala.org"></img>
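
 Embedded expressions in element bodies and in attributes can be combined. The sketch below builds a small link list from sample data; the links value and the object name EmbeddedExpressionsSketch are made up for illustration, and addMyPrefix is the helper defined above.

```scala
import scala.xml.Elem

object EmbeddedExpressionsSketch {
  // Hypothetical sample data: host -> link title.
  val links: Seq[(String, String)] =
    Seq("scala-lang.org" -> "Scala", "docs.scala-lang.org" -> "Docs")

  def addMyPrefix(s: String): String = "http://" + s

  // One expression generates the <li> nodes, another fills the href
  // attribute, and a third supplies the link text.
  val menu: Elem =
    <ul>{
      for ((host, title) <- links)
        yield <li><a href={addMyPrefix(host)}>{title}</a></li>
    }</ul>
}
```

 Note that an attribute expression is written without quotes: href={...}, not href="{...}".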

Uncommon Node Types

     val js = <script><![CDATA[if (temp < 0) alert("Cold!")]]></script>

         // Elem    <script>if (temp &lt; 0) alert(&quot;Cold!&quot;)</script>

 Use scala.xml.Group to wrap several nodes in a single node instead of a sequence

     val g1 = <xml:group><li>Item 1</li><li>Item 2</li></xml:group>
     val g2 = scala.xml.Group( Seq( <li>Item 1</li>, <li>Item 2</li> ) )
     val g3 = scala.xml.Group(      <li>Item 1</li>  <li>Item 2</li>   )

XPath-like Expressions (always with BACKSLASHES \ or \\ )

     val list = <dl><one>Java</one><d_two>Gosling</d_two><one>Scala</one><dd>Odersky</dd></dl>

 The \ operator locates direct children (immediate descendants)

     list \ "one"            // -->  NodeSeq(<one>Java</one>, <one>Scala</one>)

 The wildcard "_" matches any element, but only at a single level

     doc \ "body" \ "_" \ "li"

 The \\ operator locates descendants at any level

     doc \\ "img"

 A string starting with @ locates an attribute

     img \ "@alt"        // -->  value of the alt attribute of img node

     doc \\ "@alt"      // -->  all alt attributes of any elements inside doc.

     for (n <- doc \\ "img") println( n )    // processing all img from doc

 Note: doc \\ "img" \ "@src" will not work if the document contains more than one img element.
 To extract attributes from multiple nodes, use:

     doc \\ "img" \\ "@src"

 test:

   <a><b><c><d><img src="aaa"/></d></c></b><img src="bbb"/></a> \\ "img" \\ "@src"
   // --> NodeSeq(aaa, bbb)
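
 The snippets above use doc and img without defining them. A self-contained sketch, assuming a small made-up document (the markup and the object name XPathLikeSketch are hypothetical), shows the operators side by side:

```scala
import scala.xml.{Elem, NodeSeq}

object XPathLikeSketch {
  // Hypothetical document, used only for illustration.
  val doc: Elem =
    <html>
      <body>
        <ul><li>Fred</li><li>Wilma</li></ul>
        <p><img src="a.png" alt="first"/></p>
        <img src="b.png" alt="second"/>
      </body>
    </html>

  // "_" skips exactly one level: any child element of body, then its li children.
  val items: NodeSeq = doc \ "body" \ "_" \ "li"

  // \\ searches at any depth.
  val images: NodeSeq = doc \\ "img"
  val alts: NodeSeq   = doc \\ "@alt"

  // With several img elements, use \\ for the attribute as well.
  val srcs: NodeSeq = doc \\ "img" \\ "@src"

  // A single \ on a multi-node result finds no attributes.
  val noSrcs: NodeSeq = doc \\ "img" \ "@src"
}
```

 Calling .text on each node of a result (for example srcs.map(_.text)) gives the plain string values.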

Pattern Matching

     node match {
         case <li/> => ...                          // li element with any attributes and no child nodes
         case <li>{_}</li> => ...                   // li element with exactly 1 child node
         case <li>{details}</li> => details.text    // li element with exactly 1 child, bound to 'details'
         case <li>{_*}</li> => ...                  // li element with any number (0-n) of child nodes
     }
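
 A runnable sketch of the same patterns, with the elided right-hand sides filled in by placeholder strings. The function classify and the object name NodePatternSketch are hypothetical; case order matters, because the {_*} pattern would also match the earlier cases.

```scala
import scala.xml.Node

object NodePatternSketch {
  // Hypothetical helper: describe a node by which XML pattern it matches.
  def classify(node: Node): String = node match {
    case <li/>              => "empty li"
    case <li>{child}</li>   => "li with one child: " + child.text
    case <li>{_*}</li>      => "li with several children"
    case _                  => "not an li"
  }
}
```

 For example, classify(<li>Fred</li>) takes the second case, while classify(<li>An <em>important</em> item</li>) has three child nodes and falls through to the third.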

Continue from section 16.9 when more XML is needed.
