Export Framework Overview - quality-manager/onboarding GitHub Wiki

Export Framework Overview

About

The Export Framework (previously known as "Print Framework") is the infrastructure around submitting, queuing, executing, and cleaning up export jobs (PDF/CSV jobs).

Core Classes

Over time the class structure of the framework has become pretty muddy. It wasn't a great design to begin with and it has degraded significantly since then.

Core Classes and Relationships

PrintService

This is the entry point for some (but not all) requests from the web client to the server related to export jobs. The following requests go through the PrintService:

  • Submit a new export job
  • Download the file created by an export job

PrintService has two major dependencies. It leverages the PrintCommandFactory for creating PrintJob instances and it uses the PrintJobService to create/read/update those jobs in the repository.

PrintJobRestService

This is not shown in the diagram above, but for requests that are more concerned with managing export jobs, the relevant entry point is the PrintJobRestService. It handles the following requests:

  • Cancel an export job
  • Delete an export job
  • Inspect the current progress of an export job

PrintEngineQueueMonitor

The PrintEngineQueueMonitor is an Async Task that periodically polls the repository to see what export jobs are waiting to be run. The specifics of the order in which it executes jobs are discussed below, but for now you can assume that jobs are taken one at a time from a sort of queue.

Actually, there are two queues: an initialization queue and an execution queue. This task takes some number of jobs off the end of each queue and hands them to the PrintEngine for evaluation.
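
The polling pass described above can be sketched roughly as follows. This is only an illustration: the class and method names, the batch size, the use of in-memory deques in place of the repository, and the choice to serve the initialization queue first are all assumptions, not the real PrintEngineQueueMonitor.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Deque;
import java.util.List;

// Hypothetical stand-in for a single polling pass of the queue monitor.
class QueueMonitorSketch {

    // Takes up to batchSize job ids from each queue and returns them in the
    // order they would be handed to the engine (initialization queue first,
    // which is an assumption of this sketch).
    static List<String> pollOnce(Deque<String> initQueue, Deque<String> execQueue, int batchSize) {
        List<String> handedToEngine = new ArrayList<>();
        drain(initQueue, batchSize, handedToEngine);
        drain(execQueue, batchSize, handedToEngine);
        return handedToEngine;
    }

    // Moves at most max entries from the queue into out.
    private static void drain(Deque<String> queue, int max, List<String> out) {
        for (int i = 0; i < max && !queue.isEmpty(); i++) {
            out.add(queue.pollFirst());
        }
    }

    public static void main(String[] args) {
        Deque<String> init = new ArrayDeque<>(Arrays.asList("job-A", "job-B", "job-C"));
        Deque<String> exec = new ArrayDeque<>(Arrays.asList("job-X"));
        System.out.println("handed to engine: " + pollOnce(init, exec, 2));
        System.out.println("still waiting for initialization: " + init);
    }
}
```

Capping the batch per pass keeps a single poll from flooding the engine; jobs that are left over simply wait for the next pass.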

PrintEngine

The PrintEngine is the object that knows how to create a PrintCommand from a job and schedule it for execution.

Each job given by the PrintEngineQueueMonitor is evaluated and the appropriate PrintCommand is constructed for that job. The command is submitted to a ThreadPoolExecutor which will run the command on a separate thread at some point in the future. At any given time only a limited number of commands can be executing simultaneously. This limit is imposed via a thread pool which is created and managed by the PrintEngine.
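
To make the concurrency limit concrete, here is a minimal sketch that uses a fixed-size `java.util.concurrent.ThreadPoolExecutor`. The class name, pool size, and the sleeping "command" are hypothetical; the sketch only shows that submitting more commands than there are threads leaves the extra commands waiting in the executor's queue.

```java
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical illustration of a bounded pool; not the real PrintEngine.
class BoundedEngineSketch {

    // Submits `jobs` short commands to a pool of `poolSize` threads and
    // returns the highest number of commands seen running at the same time.
    static int runAndMeasurePeak(int poolSize, int jobs) {
        ThreadPoolExecutor pool = new ThreadPoolExecutor(
                poolSize, poolSize, 0L, TimeUnit.MILLISECONDS,
                new LinkedBlockingQueue<Runnable>());
        AtomicInteger running = new AtomicInteger();
        AtomicInteger peak = new AtomicInteger();
        for (int i = 0; i < jobs; i++) {
            pool.execute(() -> {
                int now = running.incrementAndGet();
                peak.accumulateAndGet(now, Math::max);
                try {
                    Thread.sleep(20); // stands in for the work of a command
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
                running.decrementAndGet();
            });
        }
        pool.shutdown();
        try {
            pool.awaitTermination(10, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return peak.get();
    }

    public static void main(String[] args) {
        System.out.println("peak concurrent commands: " + runAndMeasurePeak(2, 8));
    }
}
```

Because the core and maximum pool sizes are equal and the work queue is unbounded, the executor never starts more than `poolSize` threads, so the peak can never exceed that value.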

PrintCommand

When a thread from the thread pool becomes available the ThreadPoolExecutor will use it to execute the PrintCommand that was previously scheduled by PrintEngine.

PrintCommands have two phases: an Initialization phase, where the command creates and submits any export jobs that may be dependencies of this job (e.g. a comprehensive PDF depends on each relevant summary PDF), and an Execution phase, where the requested document is actually created.

PrintCommand follows a template pattern whereby the overall structure of how export commands work is implemented directly in the abstract PrintCommand but the details of some abstract methods are fleshed out in the concrete extension classes.
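
A minimal sketch of that template pattern, assuming one hook per phase. The class names, method names, and log entries are hypothetical and are not taken from the real PrintCommand; the point is only that the abstract base class fixes the sequence of steps while subclasses supply the details.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical base class: the overall structure of each phase lives here.
abstract class PrintCommandSketch {
    final List<String> log = new ArrayList<>();

    // Initialization phase: fixed skeleton, subclass decides the dependencies.
    final void initialize() {
        log.add("init:start");
        submitDependencies();
        log.add("init:done");
    }

    // Execution phase: fixed skeleton, subclass decides how to render.
    final void execute() {
        log.add("exec:start");
        byte[] document = render();
        log.add("exec:done:" + document.length);
    }

    protected abstract void submitDependencies();

    protected abstract byte[] render();
}

// Hypothetical concrete command that has no child jobs and emits a tiny CSV.
class CsvCommandSketch extends PrintCommandSketch {
    @Override
    protected void submitDependencies() {
        log.add("deps:none");
    }

    @Override
    protected byte[] render() {
        return "id,name\n1,Login\n".getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        CsvCommandSketch command = new CsvCommandSketch();
        command.initialize();
        command.execute();
        System.out.println(command.log);
    }
}
```

Marking the phase methods `final` is one way to keep subclasses from changing the overall flow; whether the real class does this is not stated here.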

DocumentRenderingService

If the export job in question is a request for a PDF document, then you can be pretty sure that the corresponding PrintCommand will eventually call the DocumentRenderingService. This service is the entry point for the com.ibm.rqm.render.fop plug-in where all the Apache FOP related logic resides.

In addition to handling all the initial setup of the Apache FOP, XML4J, and XSLT4J libraries, this service uses AbstractPrinter to get the actual bytes of the PDF that was requested.

AbstractPrinter

This class is responsible for managing all the libraries and APIs that we use to create PDFs. It gets or creates the basic XML representation of the resource that was requested, uses XML4J to parse that XML, and uses XSLT4J to chain together and execute our custom XSLT stylesheets, producing an XSL-FO document describing the PDF to be created. Finally, it uses Apache FOP to convert that XSL-FO to PDF bytes.
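
The stylesheet-chaining step can be illustrated with the JDK's built-in JAXP transformer standing in for XSLT4J. Everything in this sketch is an assumption made for illustration: the class name, the two toy stylesheets, and the input XML are invented, the resulting XSL-FO fragment is deliberately incomplete (no page masters), and the final Apache FOP step is left out.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical two-stage stylesheet chain: resource XML -> intermediate XML -> XSL-FO.
class StylesheetChainSketch {
    static final String XSL_NS = "xmlns:xsl='http://www.w3.org/1999/XSL/Transform'";
    static final String FO_NS = "xmlns:fo='http://www.w3.org/1999/XSL/Format'";

    // Stage 1 (invented): normalize the resource into a generic document shape.
    static final String NORMALIZE =
            "<xsl:stylesheet version='1.0' " + XSL_NS + ">"
          + "<xsl:template match='/testcase'>"
          + "<doc><title><xsl:value-of select='@name'/></title></doc>"
          + "</xsl:template></xsl:stylesheet>";

    // Stage 2 (invented): turn the generic document into an XSL-FO fragment.
    static final String TO_FO =
            "<xsl:stylesheet version='1.0' " + XSL_NS + " " + FO_NS + ">"
          + "<xsl:template match='/doc'>"
          + "<fo:root><fo:block><xsl:value-of select='title'/></fo:block></fo:root>"
          + "</xsl:template></xsl:stylesheet>";

    // Applies one stylesheet to one XML string and returns the result as a string.
    static String apply(String xml, String stylesheet) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                    .newTransformer(new StreamSource(new StringReader(stylesheet)));
            StringWriter out = new StringWriter();
            transformer.transform(new StreamSource(new StringReader(xml)), new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException(e);
        }
    }

    // Chains the stylesheets: the output of stage 1 is the input of stage 2.
    static String toXslFo(String resourceXml) {
        return apply(apply(resourceXml, NORMALIZE), TO_FO);
    }

    public static void main(String[] args) {
        System.out.println(toXslFo("<testcase name='Login'/>"));
    }
}
```

Passing strings between stages keeps the sketch short; a production pipeline would more likely stream one transformation into the next rather than serialize and reparse.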

PrintJobReaper

This class is responsible for finding and removing from the repository any export jobs that have expired. It is an Async Task that periodically asks the PrintJobService for a list of all export jobs that have expired, then instructs the PrintJobService to delete them.
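
A rough sketch of one reaping pass, assuming the repository can be modeled as a map from job id to expiry time. The class name, the map, and the 100 ms delay are hypothetical; the real reaper goes through the PrintJobService, and the Async Task mechanism is approximated here with a `ScheduledExecutorService`.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Hypothetical stand-in for the reaper; the map plays the role of the repository.
class ReaperSketch {

    // Removes every job whose expiry time is at or before nowMillis and
    // returns the ids that were removed.
    static List<String> reap(Map<String, Long> expiryByJobId, long nowMillis) {
        List<String> removed = new ArrayList<>();
        Iterator<Map.Entry<String, Long>> it = expiryByJobId.entrySet().iterator();
        while (it.hasNext()) {
            Map.Entry<String, Long> entry = it.next();
            if (entry.getValue() <= nowMillis) {
                removed.add(entry.getKey());
                it.remove();
            }
        }
        return removed;
    }

    public static void main(String[] args) throws InterruptedException {
        Map<String, Long> repository = new ConcurrentHashMap<>();
        repository.put("expired-job", System.currentTimeMillis() - 1);
        repository.put("fresh-job", System.currentTimeMillis() + 60_000);

        // Run a reaping pass periodically, the way an async task would.
        ScheduledExecutorService timer = Executors.newSingleThreadScheduledExecutor();
        timer.scheduleWithFixedDelay(
                () -> System.out.println("reaped " + reap(repository, System.currentTimeMillis())),
                0, 100, TimeUnit.MILLISECONDS);
        Thread.sleep(250);
        timer.shutdownNow();
        System.out.println("remaining " + repository.keySet());
    }
}
```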

Execution Diagram

This diagram is an overly simplistic visualization of the process described above. It shows how these core classes interact with one another to queue, schedule, initialize, execute, and ultimately remove an export job.

Export Framework Execution Diagram

NOTE: The different colors represent different threads. Time starts at the top of the diagram and moves down except that the first arrow for any particular color could start at any time. In the case of the red and green threads, more than one can be running at any time.

Job Priority

One of the more interesting things that the Export Framework has to do is prioritize jobs.

The Problem

This is not simple. The problem falls directly out of the requirement that a limited number of jobs can be running simultaneously and that some jobs are long running while others are very short. Consider a scenario where there are only two threads in the thread pool that governs export job execution. Now, imagine that someone submits a job that will take one hour to complete. During the initialization phase, this job is broken into many thousands of smaller "child" jobs and each is queued up for execution.

Now imagine that moments after initialization is completed, another user submits a job that would take a matter of milliseconds to complete. If the queue is FIFO in nature, this produces a terrible result. The first user's job will still take an hour to complete, but the second user's job will now take 1 hour plus a few ms! That's six orders of magnitude longer.

On the other hand, imagine if we could allow the quick little job to sneak in and complete while we paused the large job. In this case the first user's job would take an hour plus a few ms (they wouldn't even notice the difference) and the second user's job would take just a few ms (as expected). This is a much better outcome.

Now, the problem with this strategy is that it assumes there is a way to know how long a job will take to complete before you complete it. In general, this is not true. There is no good way to predict what is a large job and what is a small one. Even if we could, where would you draw the line? No, we need a system that allows child jobs to be interleaved on the fly in a way that allows quick jobs to complete quickly even though we don't know which jobs are the quick ones.

The Solution

The Export Framework resolves this conundrum with a little math: it assigns a priority to each child job when the job is created, then uses those priorities to decide which job to pull next for execution.

The formula used to calculate a priority is:

f(x, y) = round(y / (1 + (y-1)/x))

where x is the position of the job within its own job tree and y is the maximum allowed priority.

The current implementation of this logic uses a maximum priority of 10, giving a graph that looks something like this.

Priority Distribution Graph
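
For readers who prefer code to a graph, the formula can be written as a small Java method. This is a sketch, not the framework's implementation: the class and method names are hypothetical, and `Math.round` (which rounds halves up) is assumed for `round`, so values that land exactly on .5 may differ from the real code.

```java
// Hypothetical helper that evaluates f(x, y) = round(y / (1 + (y-1)/x)).
class PrioritySketch {

    // position is x (1-based position of the job within its job tree),
    // maxPriority is y (the maximum allowed priority).
    static int priority(int position, int maxPriority) {
        double raw = maxPriority / (1.0 + (maxPriority - 1.0) / position);
        return (int) Math.round(raw);
    }

    public static void main(String[] args) {
        int[] positions = {1, 2, 5, 9, 50, 200, 1000};
        for (int x : positions) {
            System.out.println("x=" + x + " -> priority " + priority(x, 10));
        }
    }
}
```

Two properties follow directly from the formula: the first job in a tree (x = 1) always gets y / y = 1, and as x grows the value climbs toward y without ever exceeding it. With y = 10, for example, x = 9 gives 10 / (1 + 9/9) = 5.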

An Example

Let's look at an example to help clear this up. Imagine that two export jobs are submitted. For simplicity, let's call them the "red job" and the "blue job". Assume that both red and blue are submitted and complete their initialization (so that all child jobs are also queued up) before any child job is executed.

As each child job is queued, the Export Framework would have assigned a priority to it based on the formula defined above. Using that same formula, we can represent these two jobs as two trees of child jobs like below.

Job Trees

NOTE: In this case we are assuming that jobs were submitted in a pre-order fashion (root then leftmost to rightmost).

Now let's see how the Export Framework would order the execution of these jobs taking into account both their priority and when they were submitted (assume that all red jobs were submitted before blue jobs of the same priority).

Job Queue

This trick gives us the result we desire. Both jobs are worked on in parallel, but the order in which their child jobs run favors high priority child jobs. A rather small job will consist of only high priority child jobs and will be completed while the larger job is effectively paused.
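
The interleaving can be simulated with a `java.util.PriorityQueue`. The sketch below rests on several assumptions that the text does not confirm: each job tree is flattened into a simple list of child positions, a numerically lower value is treated as the higher priority (the formula gives the first child a value of 1), ties are broken by submission order, and all names are hypothetical.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

// Hypothetical simulation of how two jobs' children would be ordered.
class InterleaveSketch {

    static final class Child {
        final String job;
        final int position;
        final int priority;
        final long submittedSeq;

        Child(String job, int position, int priority, long submittedSeq) {
            this.job = job;
            this.position = position;
            this.priority = priority;
            this.submittedSeq = submittedSeq;
        }
    }

    // Same formula as in the text, with Math.round assumed for round().
    static int priority(int position, int maxPriority) {
        return (int) Math.round(maxPriority / (1.0 + (maxPriority - 1.0) / position));
    }

    // Queues all red children, then all blue children, and returns the order
    // in which they would be pulled: lowest priority value first (assumed),
    // earlier submission first within the same priority.
    static List<String> executionOrder(int redChildren, int blueChildren) {
        PriorityQueue<Child> queue = new PriorityQueue<>(
                Comparator.comparingInt((Child c) -> c.priority)
                          .thenComparingLong(c -> c.submittedSeq));
        long seq = 0;
        for (int x = 1; x <= redChildren; x++) {
            queue.add(new Child("red", x, priority(x, 10), seq++));
        }
        for (int x = 1; x <= blueChildren; x++) {
            queue.add(new Child("blue", x, priority(x, 10), seq++));
        }
        List<String> order = new ArrayList<>();
        while (!queue.isEmpty()) {
            Child next = queue.poll();
            order.add(next.job + "#" + next.position);
        }
        return order;
    }

    public static void main(String[] args) {
        List<String> order = executionOrder(1000, 3);
        System.out.println("first ten pulls: " + order.subList(0, 10));
        System.out.println("last pull: " + order.get(order.size() - 1));
    }
}
```

In this sketch, a blue job with three children finishes within the first seven pulls even though a thousand red children were queued ahead of it, while the red job still finishes last, only three pulls later than it would have on its own.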

It isn't a perfect system. There are special cases that can be dreamed up to expose its flaws, but in practice the Export Framework has been using this mechanism for nearly 8 years (and counting) and we have not seen a real world need for a more bullet-proof system yet.