docs.xml - jgrey4296/jgrey4296.github.io GitHub Wiki
- https://www.w3.org/TR/rdf-syntax-grammar/
- https://www.w3.org/TR/xml-entity-names/
- https://www.w3.org/TR/xml-names/
- https://www.w3.org/TR/xmlbase/
- https://www.w3.org/TR/xml/
- https://www.w3.org/TR/xquery-31/
- https://www.w3.org/TR/xpath-31/
trang [input.xmls] output.xsd
# Extracts between table and footer
xidel -s --output-format=xml --xpath "//table/following-sibling::*[//*[@id='printfooter']/preceding::node()]" `?`
xmllint 2>&1
xmlstarlet 2>&1
XMLStarlet Toolkit: Display element structure of XML document
Usage: xml el [<options>] <xml-file>
where
  <xml-file> - input XML document file name (stdin is used if missing)
  <options> is one of:
  -a    - show attributes as well
  -v    - show attributes and their values
  -u    - print out sorted unique lines
  -d<n> - print out sorted unique lines up to depth <n>
XMLStarlet is a command line toolkit to query/edit/check/transform XML documents (for more information see http://xmlstar.sourceforge.net/)
Usage: xml sel <global-options> {<template>} [ <xml-file> ... ]
where
  <global-options> - global options for selecting
  <xml-file> - input XML document file name/uri (stdin is used if missing)
  <template> - template for querying XML document with following syntax:

<global-options> are:
  -Q or --quiet             - do not write anything to standard output.
  -C or --comp              - display generated XSLT
  -R or --root              - print root element <xsl-select>
  -T or --text              - output is text (default is XML)
  -I or --indent            - indent output
  -D or --xml-decl          - do not omit xml declaration line
  -B or --noblanks          - remove insignificant spaces from XML tree
  -E or --encode <encoding> - output in the given encoding (utf-8, unicode...)
  -N <name>=<value>         - predefine namespaces (name without 'xmlns:')
                              ex: xsql=urn:oracle-xsql
                              Multiple -N options are allowed.
  --net                     - allow fetch DTDs or entities over network
  --help                    - display help

Syntax for templates: -t|--template <options>
where <options>
  -c or --copy-of <xpath>   - print copy of XPATH expression
  -v or --value-of <xpath>  - print value of XPATH expression
  -o or --output <string>   - output string literal
  -n or --nl                - print new line
  -f or --inp-name          - print input file name (or URL)
  -m or --match <xpath>     - match XPATH expression
  --var <name> <value> --break or
  --var <name>=<value>      - declare a variable (referenced by $name)
  -i or --if <test-xpath>   - check condition <xsl:if test="test-xpath">
  --elif <test-xpath>       - check condition if previous conditions failed
  --else                    - check if previous conditions failed
  -e or --elem <name>       - print out element <xsl:element name="name">
  -a or --attr <name>       - add attribute <xsl:attribute name="name">
  -b or --break             - break nesting
  -s or --sort op xpath     - sort in order (used after -m) where
  op is X:Y:Z,
      X is A - for order="ascending"
      X is D - for order="descending"
      Y is N - for data-type="numeric"
      Y is T - for data-type="text"
      Z is U - for case-order="upper-first"
      Z is L - for case-order="lower-first"
There can be multiple –match, –copy-of, –value-of, etc options in a single template. The effect of applying command line templates can be illustrated with the following XSLT analogue
xml sel -t -c "xpath0" -m "xpath1" -m "xpath2" -v "xpath3" \
        -t -m "xpath4" -c "xpath5"
XMLStarlet Toolkit: Edit XML document(s)
Usage: xml ed <global-options> {<action>} [ <xml-file-or-uri> ... ]
where
  <global-options> - global options for editing
  <xml-file-or-uri> - input XML document file name/uri (stdin otherwise)

<global-options> are:
  -P, or -S           - preserve whitespace nodes.
     (or --pf, --ps)    Note that space between attributes is not preserved
  -O (or --omit-decl) - omit XML declaration (<?xml ...?>)
  -L (or --inplace)   - edit file inplace
  -N <name>=<value>   - predefine namespaces (name without 'xmlns:')
                        ex: xsql=urn:oracle-xsql
                        Multiple -N options are allowed.
                        -N options must be last global options.
  --net               - allow network access
  --help or -h        - display help

where <action>
  -d or --delete <xpath>
  --var <name> <xpath>
  -i or --insert <xpath> -t (--type) elem|text|attr -n <name> [-v (--value) <value>]
  -a or --append <xpath> -t (--type) elem|text|attr -n <name> [-v (--value) <value>]
  -s or --subnode <xpath> -t (--type) elem|text|attr -n <name> [-v (--value) <value>]
  -m or --move <xpath1> <xpath2>
  -r or --rename <xpath1> -v <new-name>
  -u or --update <xpath> -v (--value) <value>
                         -x (--expr) <xpath>
xml tr
XMLStarlet Toolkit: Transform XML document(s) using XSLT
Usage: xml tr [<options>] <xsl-file> {-p|-s <name>=<value>} [<xml-file>...]
where
  <xsl-file>      - main XSLT stylesheet for transformation
  <xml-file>      - input XML document file/URL (stdin is used if missing)
  <name>=<value>  - name and value of the parameter passed to XSLT processor
  -p              - parameter is XPATH expression ("'string'" to quote string)
  -s              - parameter is a string literal
<options> are:
  --help or -h    - display help message
  --omit-decl     - omit xml declaration <?xml version="1.0"?>
  --embed or -E   - allow applying embedded stylesheet
  --show-ext      - show list of extensions
  --val           - allow validate against DTDs or schemas
  --net           - allow fetch DTDs or entities over network
  --xinclude      - do XInclude processing on document input
  --maxdepth val  - increase the maximum depth
  --html          - input document(s) is(are) in HTML format
Current implementation uses libxslt from GNOME codebase as XSLT processor (see http://xmlsoft.org/ for more details)
xml val
XMLStarlet Toolkit: Validate XML document(s)
Usage: xml val <options> [ <xml-file-or-uri> ... ]
where <options>
  -w or --well-formed        - validate well-formedness only (default)
  -d or --dtd <dtd-file>     - validate against DTD
  --net                      - allow network access
  -s or --xsd <xsd-file>     - validate against XSD schema
  -E or --embed              - validate using embedded DTD
  -r or --relaxng <rng-file> - validate against Relax-NG schema
  -e or --err                - print verbose error messages on stderr
  -S or --stop               - stop on first error
  -b or --list-bad           - list only files which do not validate
  -g or --list-good          - list only files which validate
  -q or --quiet              - do not list files (return result code only)

NOTE: XML Schemas are not fully supported yet due to its
      incomplete support in libxml2 (see http://xmlsoft.org)
XMLStarlet Toolkit: Format XML document
Usage: xml fo [<options>] <xml-file>
where <options> are
  -n or --noindent            - do not indent
  -t or --indent-tab          - indent output with tabulation
  -s or --indent-spaces <num> - indent output with <num> spaces
  -o or --omit-decl           - omit xml declaration <?xml version="1.0"?>
  -R or --recover             - try to recover what is parsable
  -D or --dropdtd             - remove the DOCTYPE of the input docs
  -C or --nocdata             - replace cdata section with text nodes
  -N or --nsclean             - remove redundant namespace declarations
  -e or --encode <encoding>   - output in the given encoding (utf-8, unicode...)
  -H or --html                - input is HTML
  -h or --help                - print help
XMLStarlet Toolkit: XML canonicalization
Usage: xml c14n <mode> <xml-file> [<xpath-file>] [<inclusive-ns-list>]
where
  <xml-file>   - input XML document file name (stdin is used if '-')
  <xpath-file> - XML file containing XPath expression for
                 c14n XML canonicalization
    Example:
    <?xml version="1.0"?>
    <XPath xmlns:n0="http://a.example.com" xmlns:n1="http://b.example">
    (//. | //@* | //namespace::*)[ancestor-or-self::n1:elem1]
    </XPath>

  <inclusive-ns-list> - the list of inclusive namespace prefixes
                        (only for exclusive canonicalization)
    Example: 'n1 n2'

<mode> is one of following:
  --with-comments         XML file canonicalization w comments (default)
  --without-comments      XML file canonicalization w/o comments
  --exc-with-comments     Exclusive XML file canonicalization w comments
  --exc-without-comments  Exclusive XML file canonicalization w/o comments
# feed strings in
# query: -I (indent output), -t (template), -f (print file name), -n (new line), -m (match xpath) //Trait, -c (copy-of xpath) ., -n (new line), -b (break nesting), then the input file
xml sel -I -t -f -n -m //Trait -c . -n -b ./facade_messy.xml
# delete a path:
xml ed -d "//div[@id='toc']" ? > mod-`?`
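The delete action above removes every node matched by the XPath. A rough Python analogue, using only the standard library, is sketched below. The function name `delete_by_attr` and the sample markup are made up for illustration, and the sketch only handles the "element with a given attribute value" case rather than arbitrary XPath.

```python
import xml.etree.ElementTree as ET

def delete_by_attr(root, tag, attr, value):
    """Remove every <tag attr="value"> below root; return how many were removed."""
    removed = 0
    # ElementTree elements do not know their parent, so walk every
    # element and inspect its direct children instead.
    for parent in list(root.iter()):
        for child in list(parent):
            if child.tag == tag and child.get(attr) == value:
                parent.remove(child)
                removed += 1
    return removed

# Hypothetical sample input, standing in for the page being edited.
doc = ET.fromstring(
    "<html><body><div id='toc'><p>contents</p></div><div id='main'>text</div></body></html>"
)
count = delete_by_attr(doc, "div", "id", "toc")
print(count, ET.tostring(doc, encoding="unicode"))
# prints: 1 <html><body><div id="main">text</div></body></html>
```

Two limits of this sketch: the root element itself can never be removed, and `remove` also drops any tail text that followed the deleted element.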
xml val -e -d ./test.dtd ./mytest.xml
xml val -e -s ./test.xsd ./mytest.xml
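For comparison, the default mode of `xml val` (well-formedness only) can be approximated with the Python standard library. This is a minimal sketch: `is_well_formed` is a hypothetical helper name, and the DTD and XSD validation shown in the two commands above is not covered, since the standard library parser does not validate against schemas.

```python
import xml.etree.ElementTree as ET

def is_well_formed(text):
    """Return (True, None) if text parses as XML, else (False, parser message)."""
    try:
        ET.fromstring(text)
    except ET.ParseError as err:
        return False, str(err)
    return True, None

print(is_well_formed("<a><b/></a>"))      # prints: (True, None)
print(is_well_formed("<a><b></a>")[0])    # prints: False
```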
- https://www.systutorials.com/docs/linux/man/1-xml_split/
- https://www.systutorials.com/docs/linux/man/1-xml_grep/
xml_grep
xml_grep --pretty_print indented --cond //www --cond [@mdate=~/2020-..-../] ./dblp-2023-10-01-07.xml 2>/dev/null
xml_grep --nb_results 4 --pretty_print indented --cond "//www/author[@string=~ /^Sebastian/]" ./dblp-2023-10-01-06.xml
xml_grep --nb_results 4 --pretty_print indented --root //www --root //inproceedings --cond pages ./dblp-2023-10-01-06.xml > results.xml
xml_grep --nb_results 4 --pretty_print indented --root "/*/*" --cond "author[string()=~/^Seb/]" ./dblp-2023-10-01-06.xml > results.xml
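The same kind of filtering (pick certain record elements, test a child's text against a regex, stop after a fixed number of hits) can be sketched in Python with a streaming parser. Everything named here is illustrative: `grep_records`, the sample data, and the choice to return the `key` attribute are assumptions, not part of xml_grep. A real dblp dump may also rely on entities declared in its DTD, which this parser may not resolve without extra handling.

```python
import io
import re
import xml.etree.ElementTree as ET

def grep_records(source, tags, author_pattern, limit):
    """Stream through source and return the key attribute of up to `limit`
    records whose tag is in `tags` and that have a matching <author> child."""
    pattern = re.compile(author_pattern)
    found = []
    for _event, elem in ET.iterparse(source, events=("end",)):
        if elem.tag not in tags:
            continue  # leave child elements intact until their record ends
        authors = [a.text or "" for a in elem.findall("author")]
        if any(pattern.search(a) for a in authors):
            found.append(elem.get("key"))
            if len(found) >= limit:
                break
        elem.clear()  # free the finished record to keep memory use flat
    return found

# Hypothetical sample standing in for a large bibliography file.
SAMPLE = b"""<dblp>
<www key="a/1"><author>Sebastian Example</author><title>Home Page</title></www>
<article key="b/2"><author>Sebastian Other</author></article>
<inproceedings key="c/3"><author>Ann Author</author><pages>1-2</pages></inproceedings>
<www key="d/4"><author>Seb Short</author></www>
</dblp>"""

keys = grep_records(io.BytesIO(SAMPLE), {"www", "inproceedings"}, r"^Seb", limit=4)
print(keys)  # prints: ['a/1', 'd/4']
```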
xsdata samples/order.xsd --output plantuml --package uml_gen
- https://developer.mozilla.org/en-US/docs/Web/XPath
- https://www.benibela.de/documentation/internettools/xpath-functions.html
- https://stackoverflow.com/questions/22214071/xpath-how-to-select-a-range-of-nodes-in-the-node-set
- https://towardsdatascience.com/xpath-for-python-89f4423415e0
# display the last names of all people in the doc
//person/@last-name
# get the 2nd person node
/people/person[2]
# get all the person nodes that have addresses in denver
//person[address/@city='denver']
# get all the addresses that have "south" in the street name
//address[contains(@street, 'south')]
# reject certain nodes:
//(* except script)
//*[not(self::script)]
# Extract an attribute value:
//a/extract(@href, '.*')
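Several of the expressions above can be tried from Python's standard library, with the caveat that `xml.etree.ElementTree` implements only a subset of XPath. The sketch below uses a made-up sample document; where ElementTree lacks a feature (attribute nodes as results, `contains()`, `not(self::...)`), the equivalent filtering is done in plain Python instead.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document matching the element names used above.
SAMPLE = """<people>
  <person last-name="Ng">
    <address city="denver" street="12 south main st"/>
  </person>
  <person last-name="Ito">
    <address city="boulder" street="4 north pearl st"/>
  </person>
  <person last-name="Diaz">
    <address city="denver" street="9 east colfax ave"/>
  </person>
</people>"""

root = ET.fromstring(SAMPLE)  # root is the <people> element

# //person/@last-name : attribute nodes cannot be selected directly,
# so select the elements and read the attribute in Python.
last_names = [p.get("last-name") for p in root.iter("person")]

# /people/person[2] : positional predicates are 1-based, as in XPath.
second = root.find("person[2]")

# //person[address/@city='denver'] : select the matching addresses,
# then step up to their parent elements.
in_denver = root.findall(".//address[@city='denver']/..")

# //address[contains(@street, 'south')] : contains() is unsupported here.
south = [a for a in root.iter("address") if "south" in a.get("street", "")]

# //*[not(self::script)] : filter on the tag name in Python.
not_script = [e for e in root.iter() if e.tag != "script"]

print(last_names)                                  # prints: ['Ng', 'Ito', 'Diaz']
print(second.get("last-name"))                     # prints: Ito
print([p.get("last-name") for p in in_denver])     # prints: ['Ng', 'Diaz']
print([a.get("street") for a in south])            # prints: ['12 south main st']
```

For full XPath 2.0 or later syntax such as `except`, a dedicated tool like the xidel command shown earlier is the better fit.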
# Axes
ancestor
ancestor-or-self
attribute / @
child
descendant
descendant-or-self
following
following-sibling
parent
preceding
preceding-sibling
self / .
def insert_xml(xpath, name, val=None) -> list:
    """ insert an element before the xpath element """
    val_cmd = ["-v", val] if val is not None else []
    return ["-i", xpath, "-t", "elem", "-n", name] + val_cmd

def sub_xml(xpath, name, val=None) -> list:
    """ insert an element within the xpath element """
    val_cmd = ["-v", val] if val is not None else []
    return ["-s", xpath, "-t", "elem", "-n", name] + val_cmd

def attr_xml(xpath, name, val) -> list:
    """ set the attribute of an xpath element """
    return ["-i", xpath, "-t", "attr", "-n", name, "-v", val]

def val_xml(xpath, val) -> list:
    """ add a text node within the xpath element """
    return ["-s", xpath, "-t", "text", "-n", "null", "-v", val]

def record_xml(xpath, name, val) -> list:
    """ add a named sub-element within the xpath element, then give it text """
    return sub_xml(xpath, name) + val_xml(f"{xpath}/{name}", val)
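Each helper above returns one `xml ed` action as a list of arguments, so a full command is just those lists concatenated between `xml ed` and the file name. A minimal sketch of that assembly step follows. `build_ed_command`, the `binary` parameter, and `people.xml` are hypothetical names introduced here; the action lists are written out by hand so the block runs on its own, but they have the same shape as what `sub_xml` and `val_xml` return.

```python
import shlex

def build_ed_command(actions, xml_file, binary="xml"):
    """Flatten a sequence of action lists into one `xml ed` argv list."""
    cmd = [binary, "ed"]
    for action in actions:
        cmd.extend(action)
    cmd.append(xml_file)
    return cmd

# Hand-written action lists in the same shape the helpers produce:
# add a <person> element under /people, then give it a text node.
actions = [
    ["-s", "/people", "-t", "elem", "-n", "person"],
    ["-s", "/people/person", "-t", "text", "-n", "null", "-v", "Ada"],
]

cmd = build_ed_command(actions, "people.xml")
print(shlex.join(cmd))
# prints: xml ed -s /people -t elem -n person -s /people/person -t text -n null -v Ada people.xml
```

The resulting list can be handed to `subprocess.run(cmd)` without going through a shell, which avoids quoting problems in XPath expressions. On systems where the tool is installed as `xmlstarlet` rather than `xml`, pass `binary="xmlstarlet"`.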