Architecture - usnistgov/xslt-blender GitHub Wiki
XSLT Blender Architecture - diagram
Here's a copy of the diagram in its current form; see the latest in the repo under docs/system-architecture-diagram.svg. (When the image is merged and that link is stable, please remove this copy and call it directly.)
Detailed description
The diagram shows a large box labeled "Web browser instance", with nested boxes showing its components. A large box inside is labeled "XSLT Blender Runtime". Sitting on its top edge is a smaller box with dotted edges labeled "Interactive Display". On the boundary between these two levels is a round-cornered box labeled "Application JS/CSS", which in turn contains another rounded box labeled "Application HTML", a squared box to the left labeled "HTML Result", and, above these, an area labeled "Browser DOM". Into the "Application JS/CSS" box, a directed path indicated by an arrow heads in from "Interactive Display", down to an adjoining box, "XSLT Blender JS". From there the path exits and descends to a new box (to the right side inside "XSLT Blender Runtime"), labeled "XML input". The "XML input" box is grey, with thicker edges and square corners. Adjoining it on the left is a large pink box (regular edges, square corners) labeled "XSLT Engine: browser component". It contains five white boxes, with the labels "XSLT Stylesheet (functional mapping)", for the largest, and (for several smaller boxes) "XSLT Modules and Resources". Out the top on the left of the big pink "Engine" box, the arrowed path proceeds again up the left side back into the lower left side of the "Application JS/CSS" box at the top, into its contained "HTML Result" box. Directly above this, from the "Browser DOM" box, the arrow then exits to go back up into "Interactive Display" and complete the loop. This clockwise path through the diagram indicates the data flow, each component producing results from its inputs and passing those results to the next one.
Notes on the diagram
- The diagram is generic, representing a simple, idealized XSLT-based application. Particular applications may differ from it in detail.
- The arrowed path represents information flow.
- Green boxes represent HTML. The pink box is a browser component. The grey box represents the XML source for the transformation that drives the application.
- Boxes with rounded corners are specific to an application, developed by the application developer and delivered with it. These include "Application HTML", "Application JS/CSS", and boxes representing XSLT stylesheets, components or resources. All other boxes have square corners, indicating resources either delivered with the framework or provided by the end user. Some of these boxes have dotted edges, representing dynamic components generated at runtime: "Interactive Display", "Browser DOM" and "HTML Result". Other resources (with solid edges) are considered to be static.
Process insulation and the browser runtime
Keeping the entire runtime in the browser insulates these applications from the rest of your system. They run locally, but the system treats them as resources acquired from the open Internet, which makes them subject to the controls browsers impose on what kinds of resources a requesting party can freely acquire and use (so-called CORS policies, for cross-origin resource sharing, the general name for this family of security controls). These controls prevent your system resources from being hijacked and misused for unauthorized purposes, and they help to prevent data exposure and data theft, making both harder to cause by mistake and easier to detect and prevent. In short, to the extent the browser prevents applications from reading data off your system unless you specifically load that data, it also prevents these applications from doing so.
Keeping the entire runtime in the browser also keeps the data in your local system. Data supporting these operations can be loaded for processing, but it is never written out or passed to a server. From inside, the XSLT has access to functionality for retrieving data (implemented by browsers as an HTTP GET) but not for posting it. The function that provides this access from within XSLT, document(), is fairly easy to find, examine and verify. Web protocols other than http and https are likely not to be supported (and can be examined when seen). Without extension, XSLT itself is limited in functionality to reading and manipulating data; extensions are also easy to detect and examine.
Testing and assessing an XSLT Blender application thus entails (1) ensuring that the infrastructure is safe and dependable, and (2) confirming that the clear and demonstrable functionality of the XSLT, judged both by examination and by testing (including unit testing), is innocuous from a data security perspective. Since great things can be accomplished in XSLT simply by mapping, renaming and elaborating data structures from XML input, leveraging the latent semantics of its native format, even very simple transformations can support very useful operations.
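The "easy to find, examine and verify" claim about document() and extensions can be illustrated with a rough textual scan of a stylesheet. This is a minimal sketch, not part of XSLT Blender: the names `auditStylesheet` and `AuditFinding` are hypothetical, the scan uses regular expressions rather than a parser, and it covers only two checks (calls to document() and namespace declarations other than the XSLT namespace). A flagged namespace is not necessarily an extension; it may simply be the application's own vocabulary, so every finding still needs a human reader.

```typescript
// Rough audit helper for an XSLT 1.0 stylesheet held as text.
// All names are hypothetical; this is an illustration, not XSLT Blender code.

interface AuditFinding {
  kind: "document-call" | "namespace-to-review";
  detail: string;
}

const XSLT_NS = "http://www.w3.org/1999/XSL/Transform";

function auditStylesheet(xsltSource: string): AuditFinding[] {
  const findings: AuditFinding[] = [];
  let match: RegExpExecArray | null;

  // 1. Calls to document(), the retrieval function discussed above.
  //    The argument is captured up to the first ")", so nested calls are truncated.
  const documentCall = /\bdocument\s*\(([^)]*)\)/g;
  while ((match = documentCall.exec(xsltSource)) !== null) {
    findings.push({ kind: "document-call", detail: match[1].trim() });
  }

  // 2. Prefixed namespace declarations other than the XSLT namespace itself.
  //    Extension elements and functions must live in such a namespace.
  const namespaceDecl = /xmlns:([\w.-]+)\s*=\s*["']([^"']*)["']/g;
  while ((match = namespaceDecl.exec(xsltSource)) !== null) {
    if (match[2] !== XSLT_NS) {
      findings.push({ kind: "namespace-to-review", detail: `${match[1]}=${match[2]}` });
    }
  }
  return findings;
}

// Usage with a made-up stylesheet fragment.
const sample = `<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:ext="urn:example:ext">
  <xsl:variable name="lookup" select="document('lookup.xml')"/>
</xsl:stylesheet>`;

const findings = auditStylesheet(sample);
// findings[0] is { kind: "document-call", detail: "'lookup.xml'" }
// findings[1] is { kind: "namespace-to-review", detail: "ext=urn:example:ext" }
```

A textual scan like this can report matches inside comments or string literals, so it is best treated as a list of places to look rather than a verdict.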
Static vs dynamic components
In this illustration there are static and dynamic components. All of these components comprise technical infrastructure implemented in software, which together in the application support data processing operations as directed by a user, in conformance with relevant specifications.
Static components are written, developed, and deployed to the runtime, and serve effectively as static resources.
Dynamic components are those that must change, that cannot be static if the application is to do anything at all. Even a simple application that does nothing but present a simple view of a simple data structure will have a dynamic component, to the extent that the view must be produced fresh for each new source data set.
XSLT Blender depends on the availability of an XSLT Engine as a static component. An application will then be provided with its logical specifications in the form of transformation code (XSLT). This code base can be small because it works at a high level of abstraction. Since it too is static, it can be examined, evaluated and traced.
In contrast, the only dynamic elements are the XML input (often provided by a user or loaded at runtime; otherwise the entire application could be cached) and those components directly derived from it, namely the nominal transformation results (taking the form of HTML) and the DOM rendering of those results as they are spliced into a page view or other memory structure.
Other static components besides the XSLT Engine (provided today with a browser) and the stylesheet (for an application) include the following:
- Application JS/CSS - whatever script or CSS provides support for the particular project or demo
- Application HTML - framing HTML for the page view and scripting
- XSLT Blender JS - glue functionality for the application to use XSLT APIs (this component is maintained in TypeScript and compiled into JavaScript)
Since the last of these is a library used across all applications, developing a particular application entails only:
- A hosting page (one or more)
- Some CSS for the application and maybe some light JavaScript for UI support
- An XSLT or set of XSLTs executing a mapping from an XML vocabulary into HTML, optimally tagged for the application
- If needed, additional resources (encoded in XML)
- Implicitly, an XML vocabulary stipulating boundaries around the scope of application
For the last of these, applications here either state the vocabulary targeted, or support a 'mini' vocabulary designed as part of the application.
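The flow through these components (XML input, stylesheet, HTML result, page) can be sketched as a single glue function. This is an illustration only, not the actual XSLT Blender JS library: the function `blend` and the interfaces around it are hypothetical names. The method names and argument orders (`parseFromString`, `importStylesheet`, `setParameter`, `transformToFragment`, `replaceChildren`) are those of the standard browser `DOMParser`, `XSLTProcessor` and `Element` interfaces; the sketch declares only the parts it uses, so it can also be exercised outside a browser with stand-in objects.

```typescript
// Only the parts of the browser APIs that this sketch touches.
// Interface names are hypothetical; method names match the browser's own.
interface XmlParser {
  parseFromString(text: string, mimeType: string): unknown;
}
interface Transformer {
  importStylesheet(stylesheet: unknown): void;
  setParameter(namespaceURI: string | null, localName: string, value: unknown): void;
  transformToFragment(source: unknown, owner: unknown): unknown;
}
interface PageSlot {
  replaceChildren(...nodes: unknown[]): void;
}
interface Engine {
  parser: XmlParser;        // in a browser: new DOMParser()
  transformer: Transformer; // in a browser: new XSLTProcessor()
  owner: unknown;           // in a browser: document
}

// Static inputs: the stylesheet text. Dynamic inputs: the XML text and parameters.
function blend(
  engine: Engine,
  xmlText: string,
  xsltText: string,
  target: PageSlot,
  params: { [name: string]: string } = {}
): void {
  // Parse both the source document and the stylesheet into DOM documents.
  const source = engine.parser.parseFromString(xmlText, "application/xml");
  const stylesheet = engine.parser.parseFromString(xsltText, "application/xml");

  // Hand the stylesheet to the browser's XSLT engine and set any parameters.
  engine.transformer.importStylesheet(stylesheet);
  for (const name of Object.keys(params)) {
    engine.transformer.setParameter(null, name, params[name]);
  }

  // Run the transformation and splice the HTML result into the page.
  const result = engine.transformer.transformToFragment(source, engine.owner);
  target.replaceChildren(result);
}

// In a browser, a call might look like this (element id is made up):
//   blend(
//     { parser: new DOMParser(), transformer: new XSLTProcessor(), owner: document },
//     xmlText, xsltText, document.getElementById("result")!
//   );
```

Nothing in this path writes data out: the XML text comes in, and the only destination for the result is the page itself.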
Deployment; caching; air-gapping
Layered systems
XSLT 1.0 and advantages of a 4th-gen domain-specific language for transformations, combined with declarative design
- interactivity and page behavior/display can be pushed out to JavaScript/CSS
- where it can be simpler due to XSLT's capability to express relations, providing pages with 'smart tagging'
- transformations can be focused on what they do well, which nothing else does as well (tree casting / filtering / ornamentation)
XSLT 1.0 as side-effect free, functional language
In an XSLT Blender application, XML (or HTML) given as input is passed to the browser's DOM parser, producing a DOM document instance that can be made accessible to an XSLT process. As a 'pull' processor, an XSLT engine works by starting at the root of the tree of objects (which may be delivered by parsing a file, or otherwise read or provided) and building a result structure as instructed by matching (XSLT) templates. These templates can accomplish all of the basic functional operations recognizable (from other contexts) as map, filter, and reduce, separately or in combination, over branches of the tree at any level of hierarchy. Depending on the transformation, the entire source tree may be processed and reflected (echoed, copied or rewritten) to the output, or only parts of it. This model allows implementations to offer radically reworked variants and expressions of their inputs while avoiding performance and code maintenance issues as data sets grow in size and complexity. (The run times of many or most transformations will scale linearly with the size of their inputs, not exponentially.)
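For readers who know map, filter, and reduce from other contexts, the template-driven traversal described above can be mimicked in a few lines of TypeScript. This is an analogy under simplifying assumptions, not how an XSLT engine is implemented: the `XmlNode` shape, the `templates` table and the `catalog`/`item` vocabulary are all invented for illustration, and text is not escaped as a real serializer would do.

```typescript
// A toy element tree; attribute and text handling are deliberately simplified.
interface XmlNode {
  name: string;
  attrs: { [name: string]: string };
  children: XmlNode[];
  text?: string;
}

// A "template" produces output for one node and may recurse through `apply`.
type Template = (node: XmlNode, apply: (nodes: XmlNode[]) => string) => string;

// Stand-in for XSLT's built-in rules: emit text, otherwise descend into children.
const builtIn: Template = (node, apply) =>
  node.text !== undefined ? node.text : apply(node.children);

// One entry per element name, playing the role of match="catalog", match="item".
const templates: { [name: string]: Template | undefined } = {
  catalog: (node, apply) => {
    // filter: roughly select="item[not(@status='draft')]"
    const published = node.children.filter((c) => c.attrs["status"] !== "draft");
    // reduce: roughly sum(item[not(@status='draft')]/@price)
    const total = published.reduce((sum, c) => sum + Number(c.attrs["price"]), 0);
    return `<ul>${apply(published)}</ul><p>Total: ${total}</p>`;
  },
  // map: each item becomes a list entry
  item: (node) => `<li>${node.text ?? ""}</li>`,
};

// Walk the selected nodes in order, dispatching each to its matching template.
function applyTemplates(nodes: XmlNode[]): string {
  return nodes.map((n) => (templates[n.name] ?? builtIn)(n, applyTemplates)).join("");
}

const catalog: XmlNode = {
  name: "catalog",
  attrs: {},
  children: [
    { name: "item", attrs: { status: "final", price: "5" }, children: [], text: "Pen" },
    { name: "item", attrs: { status: "draft", price: "7" }, children: [], text: "Ink" },
    { name: "item", attrs: { status: "final", price: "3" }, children: [], text: "Pad" },
  ],
};

const html = applyTemplates([catalog]);
// html === "<ul><li>Pen</li><li>Pad</li></ul><p>Total: 8</p>"
```

Each node is visited once and no template changes anything outside its own return value, which is the property that keeps such traversals predictable as inputs grow.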
Part of how XSLT effects this is by being side-effect free, in the sense that all operations are defined to operate independently of any system state. In addition to its nominal inputs (XML documents), an XSLT execution may accept parameters, sets of defined values supplied by the application runtime; apart from these, an XSLT 1.0 transformation must work entirely independently of external systems or data. Consequently (and among other consequences), processors are deterministic and testable against a single specification. The XSLT 1.0 Recommendation played this role admirably until superseded by later versions of the language, and still applies usefully to XSLT 1.0 implementations such as those still available in browsers.
Together with its small and highly declarative instruction set, this makes XSLT easy to read and understand once its basic processing model is understood.
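Because a side-effect-free transformation depends only on its input document and parameters, it can be checked with plain input/output fixtures, and repeating a run should never change the answer. The harness below is a minimal sketch of that idea. `Transform` is a hypothetical stand-in for whatever wrapper runs a stylesheet and serializes its result; the example uses a trivial string function in its place, not a real XSLT run.

```typescript
// Hypothetical wrapper type: XML text and parameters in, serialized result out.
type Transform = (xml: string, params: { [name: string]: string }) => string;

interface Fixture {
  name: string;
  xml: string;
  params: { [name: string]: string };
  expected: string;
}

// Run every fixture several times; a deterministic transform must give the
// expected result on every run. Returns a list of failure messages.
function checkFixtures(transform: Transform, fixtures: Fixture[], repeats = 3): string[] {
  const failures: string[] = [];
  for (const fixture of fixtures) {
    for (let run = 1; run <= repeats; run++) {
      const actual = transform(fixture.xml, fixture.params);
      if (actual !== fixture.expected) {
        failures.push(`${fixture.name} (run ${run}): expected ${fixture.expected}, got ${actual}`);
        break; // one report per fixture is enough
      }
    }
  }
  return failures;
}

// Usage with a trivial stand-in transform.
const upper: Transform = (xml) => xml.toUpperCase();
const failures = checkFixtures(upper, [
  { name: "shout", xml: "<a/>", params: {}, expected: "<A/>" },
]);
// failures is an empty array here
```

A transform that consulted hidden state (a counter, the clock, a network response) would show up as a failure on a later run even if its first run matched.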
Code transparency and auditing
See also Assessment