Export Framework Overview - quality-manager/onboarding GitHub Wiki
Export Framework Overview
About
The Export Framework (previously known as "Print Framework") is the infrastructure around submitting, queuing, executing, and cleaning up export jobs (PDF/CSV jobs).
Core Classes
Over time the class structure of the framework has become pretty muddy. It wasn't a great design to begin with and it has degraded significantly since then.
PrintService
This is the entry point for some (but not all) requests from the web client to the server related to export jobs. It is through the PrintService that the following requests can be made.
- Submit a new export job
- Download the file created by an export job
PrintService has two major dependencies. It leverages the PrintCommandFactory for creating PrintJob instances and it uses the PrintJobService to create/read/update those jobs in the repository.
PrintJobRestService
This is not shown in the diagram above, but for requests that are more concerned with managing export jobs, the relevant entry point is the PrintJobRestService. Those requests include:
- Cancel an export job
- Delete an export job
- Inspect the current progress of an export job
PrintEngineQueueMonitor
The PrintEngineQueueMonitor is an Async Task that periodically polls the repository to see what export jobs are waiting to be run. The specifics of the order in which it executes jobs are discussed below, but for now you can assume that jobs are taken one at a time from a sort of queue.
Actually, there are two queues: an initialization queue and an execution queue. This task takes some number of jobs off the end of each queue and hands them to the PrintEngine for evaluation.
PrintEngine
The PrintEngine is the object that knows how to create a PrintCommand from a job and schedules it for execution.
Each job given by the PrintEngineQueueMonitor is evaluated and the appropriate PrintCommand is constructed for that job. The command is submitted to a ThreadPoolExecutor which will run the command on a separate thread at some point in the future. At any given time only a limited number of commands can be executing simultaneously. This limit is imposed via a thread pool which is created and managed by the PrintEngine.
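The concurrency limit described above can be illustrated with a minimal sketch. It assumes a fixed-size pool backed by an unbounded work queue; the class name BoundedEngineSketch, its methods, and the pool configuration are hypothetical and are not taken from the real PrintEngine.

```java
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical stand-in for the scheduling half of PrintEngine.
class BoundedEngineSketch {
    private final ThreadPoolExecutor pool;

    BoundedEngineSketch(int maxConcurrent) {
        // core == max, so at most maxConcurrent commands run at once;
        // everything else waits in the executor's work queue.
        pool = new ThreadPoolExecutor(maxConcurrent, maxConcurrent,
                0L, TimeUnit.MILLISECONDS, new LinkedBlockingQueue<>());
    }

    // Hands a command to the pool; it runs when a thread becomes free.
    void schedule(Runnable command) {
        pool.execute(command);
    }

    // Stops accepting work and waits for scheduled commands to finish.
    boolean shutdownAndWait(long timeoutSeconds) {
        pool.shutdown();
        try {
            return pool.awaitTermination(timeoutSeconds, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    // Runs `commands` short tasks and reports the highest number that
    // were observed executing at the same moment.
    static int peakConcurrency(int maxConcurrent, int commands) {
        BoundedEngineSketch engine = new BoundedEngineSketch(maxConcurrent);
        AtomicInteger running = new AtomicInteger();
        AtomicInteger peak = new AtomicInteger();
        for (int i = 0; i < commands; i++) {
            engine.schedule(() -> {
                int now = running.incrementAndGet();
                peak.accumulateAndGet(now, Math::max);
                try {
                    Thread.sleep(20);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
                running.decrementAndGet();
            });
        }
        engine.shutdownAndWait(10);
        return peak.get();
    }
}
```

Calling BoundedEngineSketch.peakConcurrency(2, 6) never reports more than 2, no matter how many commands are scheduled.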
PrintCommand
When a thread from the thread pool becomes available the ThreadPoolExecutor will use it to execute the PrintCommand that was previously scheduled by PrintEngine.
PrintCommands have two phases: an Initialization phase, where the command creates and submits any export jobs that may be dependencies of this job (e.g. a comprehensive PDF depends on each relevant summary PDF), and an Execution phase, where the requested document is actually created.
PrintCommand follows a template pattern whereby the overall structure of how export commands work is implemented directly in the abstract PrintCommand but the details of some abstract methods are fleshed out in the concrete extension classes.
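As a rough illustration of the template pattern and the two phases, here is a minimal sketch. Every name in it (ExportCommandSketch, ComprehensivePdfSketch, dependencies, render) is hypothetical; the real PrintCommand has its own abstract methods, which are not shown in this page.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.List;

// Hypothetical abstract command: the phase structure lives here,
// the details live in subclasses.
abstract class ExportCommandSketch {
    enum Phase { NEW, INITIALIZED, EXECUTED }

    private Phase phase = Phase.NEW;

    // Initialization phase: returns the jobs this job depends on so that
    // the caller can submit them.
    final List<String> initialize() {
        if (phase != Phase.NEW) {
            throw new IllegalStateException("already initialized");
        }
        List<String> deps = dependencies();
        phase = Phase.INITIALIZED;
        return deps;
    }

    // Execution phase: produces the bytes of the requested document.
    final byte[] execute() {
        if (phase != Phase.INITIALIZED) {
            throw new IllegalStateException("initialize() must run first");
        }
        byte[] document = render();
        phase = Phase.EXECUTED;
        return document;
    }

    // Steps that concrete commands fill in.
    protected abstract List<String> dependencies();
    protected abstract byte[] render();
}

// Hypothetical concrete command: a comprehensive document that depends
// on two summary documents.
class ComprehensivePdfSketch extends ExportCommandSketch {
    @Override
    protected List<String> dependencies() {
        return Arrays.asList("summary-1", "summary-2");
    }

    @Override
    protected byte[] render() {
        // Placeholder bytes; a real command would produce a PDF here.
        return "comprehensive document".getBytes(StandardCharsets.UTF_8);
    }
}
```

The final methods fix the order of the phases, while subclasses only decide what the dependencies are and how the document is rendered.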
DocumentRenderingService
If the export job in question is a request for a PDF document, then you can be pretty sure that the corresponding PrintCommand will eventually call the DocumentRenderingService. This service is the entry point for the com.ibm.rqm.render.fop plug-in where all the Apache FOP related logic resides.
In addition to handling all the initial setup of the Apache FOP, XML4J, and XSLT4J libraries, this service uses AbstractPrinter to get the actual bytes of the PDF that was requested.
AbstractPrinter
This class is responsible for managing all the libraries and APIs that we use to create PDFs. It gets or creates the basic XML representation of the resource that was requested, uses XML4J to parse that XML, and uses XSLT4J to chain together and execute our custom XSLT stylesheets, producing an XSL-FO document describing the PDF to be created. Finally, it uses Apache FOP to convert that XSL-FO to PDF bytes.
PrintJobReaper
This class is responsible for finding and removing from the repository any export jobs that have expired. It is an Async Task that periodically asks the PrintJobService for a list of all export jobs that have expired, then instructs the PrintJobService to delete them.
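One pass of that find-then-delete cycle might look like the sketch below. It is only an illustration: the in-memory map stands in for the repository, the expiry timestamp per job is an assumption, and none of the names (ReaperSketch, save, findExpired, reapOnce) come from the real PrintJobReaper or PrintJobService.

```java
import java.time.Instant;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical reaper working against an in-memory "repository".
class ReaperSketch {
    // job id -> moment at which the job expires (assumed data model)
    private final Map<String, Instant> expiryByJobId = new ConcurrentHashMap<>();

    void save(String jobId, Instant expiresAt) {
        expiryByJobId.put(jobId, expiresAt);
    }

    // Step 1: ask for every job whose expiry is at or before `now`.
    List<String> findExpired(Instant now) {
        List<String> expired = new ArrayList<>();
        for (Map.Entry<String, Instant> entry : expiryByJobId.entrySet()) {
            if (!entry.getValue().isAfter(now)) {
                expired.add(entry.getKey());
            }
        }
        return expired;
    }

    // Step 2: delete what step 1 found; returns how many jobs were removed.
    // A periodic task would call this on a fixed schedule.
    int reapOnce(Instant now) {
        List<String> expired = findExpired(now);
        for (String jobId : expired) {
            expiryByJobId.remove(jobId);
        }
        return expired.size();
    }

    int remaining() {
        return expiryByJobId.size();
    }
}
```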
Execution Diagram
This diagram is an overly simplistic visualization of the process described above. It shows how these core classes interact with one another to queue, schedule, initialize, execute, and ultimately remove an export job.
NOTE: The different colors represent different threads. Time starts at the top of the diagram and moves down except that the first arrow for any particular color could start at any time. In the case of the red and green threads, more than one can be running at any time.
Job Priority
One of the more interesting things that the Export Framework has to do is prioritize jobs.
The Problem
This is not simple. The problem falls directly out of the requirement that a limited number of jobs can be running simultaneously and that some jobs are long running while others are very short. Consider a scenario where there are only two threads in the thread pool that governs export job execution. Now, imagine that someone submits a job that will take one hour to complete. During the initialization phase, this job is broken into many thousands of smaller "child" jobs and each is queued up for execution.
Now imagine that moments after initialization is completed, another user submits a job that would take a matter of milliseconds to complete. If the queue is FIFO in nature, this produces a terrible result. The first user's job will still take an hour to complete, but the second user's job will now take 1 hour plus a few ms! That's six orders of magnitude longer.
On the other hand, imagine if we could allow the quick little job to sneak in and complete while we paused the large job. In this case the first user's job would take an hour plus a few ms (they wouldn't even notice the difference) and the second user's job would take just a few ms (as expected). This is a much better outcome.
Now, the problem with this strategy is that it assumes there is a way to know how long a job will take to complete before you complete it. In general, this is not true. There is no good way to predict what is a large job and what is a small one. Even if we could, where would you draw the line? No, we need a system that allows child jobs to be interleaved on the fly in a way that allows quick jobs to complete quickly even though we don't know which jobs are the quick ones.
The Solution
The Export Framework resolves this conundrum with a little math: it assigns a priority to each child job when the job is created, then uses those priorities to decide which job to pull next for execution.
The formula used to calculate a priority is:
f(x, y) = round(y / (1 + (y-1)/x))
where x is the position of the job within its own job tree and y is the maximum allowed priority.
The current implementation of this logic uses a maximum priority of 10, giving a graph that looks something like this.
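To make the shape of the formula concrete, here is a minimal sketch that evaluates it literally. The class and method names are hypothetical, positions are assumed to start at 1, and exact ties are rounded up by Math.round, which may differ from how the real implementation rounds.

```java
// Hypothetical helper that evaluates f(x, y) = round(y / (1 + (y-1)/x)).
class PriorityFormulaSketch {
    // Maximum priority quoted in the text for the current implementation.
    static final int MAX_PRIORITY = 10;

    // position: 1-based position of the job within its own job tree (x)
    // max:      maximum allowed priority (y)
    static int priority(int position, int max) {
        double value = max / (1.0 + (max - 1) / (double) position);
        return (int) Math.round(value);
    }

    public static void main(String[] args) {
        // Print a few sample points of the curve.
        int[] samples = {1, 2, 4, 5, 9, 21, 81, 1000};
        for (int x : samples) {
            System.out.println("x=" + x + " -> " + priority(x, MAX_PRIORITY));
        }
    }
}
```

With y = 10, positions 1, 2, 4, 5, 9, 21, and 81 map to 1, 2, 3, 4, 5, 7, and 9. The value rises quickly for early positions and flattens toward the maximum of 10 for late ones.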
An Example
Let's look at an example to help clear this up. Imagine that two export jobs are submitted. For simplicity, let's call them the "red job" and the "blue job". Assume that both red and blue are submitted and complete their initialization (so that all child jobs are also queued up) before any child job is executed.
As each child job is queued, the Export Framework would have assigned a priority to it based on the formula defined above. Using that same formula, we can represent these two jobs as two trees of child jobs like below.
NOTE: In this case we are assuming that jobs were submitted in a pre-order fashion (root then leftmost to rightmost).
Now let's see how the Export Framework would order the execution of these jobs taking into account both their priority and when they were submitted (assume that all red jobs were submitted before blue jobs of the same priority).
This trick gives us the result we desire. Both jobs are being worked in parallel, but the order in which they are worked favors high priority child jobs. Jobs that are rather small will consist of only high priority jobs and will be completed while the larger job is effectively paused.
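The interleaving can be reproduced with a small sketch. It rests on several assumptions that are not confirmed by this page: child jobs are numbered 1..n within their tree, numerically lower formula values are served first (an inference from the fact that a small job only has early positions, which get the lowest values), and ties are broken by submission order. The job sizes (five red children, two blue children) and all names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

// Hypothetical model of the execution queue ordering.
class ExportQueueSketch {
    static final class ChildJob {
        final String name;
        final int priority;   // value produced by the formula
        final long sequence;  // submission order, used as tie-breaker

        ChildJob(String name, int priority, long sequence) {
            this.name = name;
            this.priority = priority;
            this.sequence = sequence;
        }
    }

    // Same formula as in the text, with a maximum priority of `max`.
    static int priority(int position, int max) {
        return (int) Math.round(max / (1.0 + (max - 1) / (double) position));
    }

    // Queues all red children, then all blue children, and returns the
    // order in which they would be pulled for execution.
    static List<String> executionOrder(int redChildren, int blueChildren) {
        // Assumption: lower value first, then earlier submission first.
        Comparator<ChildJob> order = Comparator
                .comparingInt((ChildJob j) -> j.priority)
                .thenComparingLong(j -> j.sequence);
        PriorityQueue<ChildJob> queue = new PriorityQueue<>(order);
        long sequence = 0;
        for (int x = 1; x <= redChildren; x++) {
            queue.add(new ChildJob("red-" + x, priority(x, 10), sequence++));
        }
        for (int x = 1; x <= blueChildren; x++) {
            queue.add(new ChildJob("blue-" + x, priority(x, 10), sequence++));
        }
        List<String> result = new ArrayList<>();
        while (!queue.isEmpty()) {
            result.add(queue.poll().name);
        }
        return result;
    }
}
```

Under these assumptions, executionOrder(5, 2) yields red-1, blue-1, red-2, blue-2, red-3, red-4, red-5: the small blue job is finished after four steps even though every red child was queued before it.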
It isn't a perfect system. There are special cases that can be dreamed up to expose its flaws, but in practice the Export Framework has been using this mechanism for nearly 8 years (and counting) and we have not seen a real world need for a more bullet-proof system yet.