Instruction Optimization - rail5/polonius GitHub Wiki
Instruction optimizations are not yet implemented, but are being worked on.
The Basic Idea
Suppose that polonius-editor is told to execute the following two instructions:
REMOVE 0 2
INSERT 0 abc
These instructions are redundant -- we're deleting 3 characters, and then inserting 3 characters to the same position. You can reduce those two instructions to a single REPLACE 0 abc
At first, that might not sound like much. But if we're editing a 2TB file, for example, removing/inserting at the beginning could take a pretty long time, whereas replaces happen nearly-instantaneously. Needless to say, there are many more cases like this one where the instructions could be optimized.
So, what we would like the program to be able to do is optimize (or simplify) the instructions given to it.
The basic idea of the rest of this document is that we want to try to represent a sequence of instructions as a math expression, which we can then simplify by applying some basic theorems.
So here, we'll define a kind of "field" with our own objects and operations. Although, I don't believe what we define here can really be called a "field," since it doesn't satisfy all of the "field axioms." It works well enough anyway.
The Math: Blocks and their Operations
Objects in this "field" are called "blocks," and can represent parts of the file we're editing. For example:
$$\displaystyle \bigg[ \begin{matrix} 0 \ 1 \ 2 \\a \ b \ c \end{matrix} \bigg]$$
This block can represent a file with only 3 characters: a, b, and c (in that order)
The top-row should be thought of as a position, and its corresponding element in the bottom-row is the datawhich is stored at that position. It might be helpful to think of these blocks as arrays of "key-value" pairs.
Picture a file that looks like this:
$$\displaystyle \bigg[ \begin{matrix} 0 \ 1 \ 2 \ 3 \ 4 \ 5 \\ a \ b \ c \ d \ e \ f \end{matrix} \bigg]$$
If you opened up this file on your computer and looked at it, it would say: abcdef
If we want to make it just say def, we can subtract the abc block from it:
$$\displaystyle \bigg[ \begin{matrix} 0 \ 1 \ 2 \ 3 \ 4 \ 5 \\ a \ b \ c \ d \ e \ f \end{matrix} \bigg] - \bigg[ \begin{matrix} 0 \ 1 \ 2 \\ a \ b \ c \end{matrix} \bigg] \ = \ \bigg[ \begin{matrix} 0 \ 1 \ 2 \\ d \ e \ f \end{matrix} \bigg]$$
If instead, you wanted to make it say abcdefxyz, you could add an xyz block to it:
$$\displaystyle \bigg[ \begin{matrix} 0 \ 1 \ 2 \ 3 \ 4 \ 5 \\ a \ b \ c \ d \ e \ f \end{matrix} \bigg] + \bigg[ \begin{matrix} 6 \ 7 \ 8 \\ x \ y \ z \end{matrix} \bigg] \ = \ \bigg[ \begin{matrix} 0 \ 1 \ 2 \ 3 \ 4 \ 5 \ 6 \ 7 \ 8 \\ a \ b \ c \ d \ e \ f \ x \ y \ z \end{matrix} \bigg]$$
Let's say instead we wanted to change the abc into hij. We would multiply it by an hij block:
$$\displaystyle \bigg[ \begin{matrix} 0 \ 1 \ 2 \ 3 \ 4 \ 5 \\ a \ b \ c \ d \ e \ f \end{matrix} \bigg] \bullet \bigg[ \begin{matrix} 0 \ 1 \ 2 \\ h \ i \ j \end{matrix} \bigg] \ = \ \bigg[ \begin{matrix} 0 \ 1 \ 2 \ 3 \ 4 \ 5 \\ h \ i \ j \ d \ e \ f \end{matrix} \bigg]$$
It's necessary to define a few arbitrary rules before we continue:
All of these objects have exactly 2 rows.
For each key, there must be a corresponding value.
The keys must be unique and in order (lowest-to-highest)
The keys do not necessarily have to be sequential (you can have "1 - 3 - 5")
The keys do not necessarily have to start from zero (you can start at anywhere $>= 0$)
The lowest possible key is $0$
This block makes up a single mathematical object. Call it $x$
Let's also define a function $mag(x)$ (meaning the "magnitude" of $x$), which gives the number of keys (or the width) of the object. In Polonius, this is called the Block Size.
First, we check which keys are shared in common between the two blocks, then we remove those key/value pairs from the left-hand object, and shift the remaining values leftward. It's kind of related to the $xor$ operation -- if a key appears in both blocks, it gets removed.
In this case, both of them have key #1. So, key #1 is removed from $x$, and all values paired with keys $> 1$ are left-shifted (#2 becomes the new #1, etc)
Note that:
$x - x = 0$ (subtracting a block from itself yields an empty block)
In fact, if $mag(x) = mag(y)$, then $x - y = 0$
Subtraction is NOT commutative ($x - y \ne y - x$)
Subtraction is NOT associative ($x - (y - z) \ne (x - y) - z$)
First, we look for keys that the two blocks share in common. In this case, they both have a key #1
Then, we replace the value in the left-hand block with the value in the right-hand block. In this case, the value under $1$ (which originally was $11$) was replaced with that in $y$, and became $3$.
Note that:
Multiplication is NOT commutative ($x \bullet y \ne y \bullet x$)
Multiplication is NOT associative ($x \bullet (y \bullet z) \ne (x \bullet y) \bullet z$)
Multiplication is NOT distributive ($x \bullet (y + z) \ne xy + xz$)
Multiplication is idempotent ($x \bullet x = x$)
$mag(x \bullet y) = mag(x)$
The Logic: Theorems and Uses
It's important to remember that, although we're calling these operations addition, subtraction and multiplication, they really just represent the INSERT, REMOVE, and REPLACE operations from Polonius. Defining them as a kind of arithmetic just helps us to work some things out logically.
Theorem #0
$$\displaystyle x + \bigg[ \begin{matrix} 0 \ 1 \ 2 \\ a \ b \ c \end{matrix} \bigg] + \bigg[ \begin{matrix} 3 \ 4 \ 5 \\ d \ e \ f \end{matrix} \bigg] \ = \ x + \bigg[ \begin{matrix} 0 \ 1 \ 2 \ 3 \ 4 \ 5 \\ a \ b \ c \ d \ e \ f \end{matrix} \bigg]$$$$\displaystyle x + \bigg[ \begin{matrix} 0 \ 1 \ 2 \\ d \ e \ f \end{matrix} \bigg] + \bigg[ \begin{matrix} 0 \ 1 \ 2 \\ a \ b \ c \end{matrix} \bigg] = x + \bigg[ \begin{matrix} 0 \ 1 \ 2 \ 3 \ 4 \ 5 \\ a \ b \ c \ d \ e \ f \end{matrix} \bigg]$$$$\displaystyle x + \bigg[ \begin{matrix} 3 \ 4 \ 5 \\ d \ e \ f \end{matrix} \bigg] + \bigg[ \begin{matrix} 0 \ 1 \ 2 \\ a \ b \ c \end{matrix} \bigg] = x + \bigg[ \begin{matrix} 0 \ 1 \ 2 \ 6 \ 7 \ 8 \\ a \ b \ c \ d \ e \ f \end{matrix} \bigg]$$
$$\displaystyle x \bullet \bigg[ \begin{matrix} 0 \ 1 \ 2 \\ a \ b \ c \end{matrix} \bigg] \bullet \bigg[ \begin{matrix} 3 \ 4 \ 5 \\ d \ e \ f \end{matrix} \bigg] \ = \ x \bullet \bigg[ \begin{matrix} 0 \ 1 \ 2 \ 3 \ 4 \ 5 \\ a \ b \ c \ d \ e \ f \end{matrix} \bigg]$$$$\displaystyle x \bullet \bigg[ \begin{matrix} 0 \ 1 \ 2 \\ a \ b \ c \end{matrix} \bigg] \bullet \bigg[ \begin{matrix} 0 \ 1 \ 2 \\ d \ e \ f \end{matrix} \bigg] \ = \ x \bullet \bigg[ \begin{matrix} 0 \ 1 \ 2 \\ d \ e \ f \end{matrix} \bigg]$$
'Replace' instructions can be combined.
Of asequenceof 'replace' instructions to the same position, only thelast oneis significant.
Theorem #3
$$\displaystyle x - \bigg[ \begin{matrix} 0 \ 1 \ 2 \\ ? \ ? \ ? \end{matrix} \bigg] + \bigg[ \begin{matrix} 0 \ 1 \ 2 \\ a \ b \ c \end{matrix} \bigg] \ = \ x \bullet \bigg[ \begin{matrix} 0 \ 1 \ 2 \\ a \ b \ c \end{matrix} \bigg]$$
This theorem is one of the most important for simplifying expressions in Polonius, since we want to prefer REPLACE operations over REMOVES or INSERTS.
In plain English: removing some characters, followed by inserting new characters into the same position, isexactly equivalentto a single replace operation at that position.
Theorem #4
$$\displaystyle x + \bigg[ \begin{matrix} 0 \ 1 \ 2 \\ a \ b \ c \end{matrix} \bigg] - \bigg[ \begin{matrix} 0 \ 1 \ 2 \\ ? \ ? \ ? \end{matrix} \bigg] \ = \ x$$
Inserting some characters, followed by removing those same characters, does not change the input.
Theorem #5
$$\displaystyle x + \bigg[ \begin{matrix} 0 \ 1 \ 2 \\ a \ b \ c \end{matrix} \bigg] \bullet \bigg[ \begin{matrix} 0 \ 1 \ 2 \\ d \ e \ f \end{matrix} \bigg] \ = \ x + \bigg[ \begin{matrix} 0 \ 1 \ 2 \\ d \ e \ f \end{matrix} \bigg]$$
Inserting some characters, and then replacing some or all of what we just inserted, can be simplified into asingleinsert operation.
Replacing some characters, and then removing those same characters, is equivalent to simply removing those characters.
Uses
We don't need to have any idea about the contents of the file we're editing in order to apply any of these theorems. They're all written in terms of $x$ -- any file will do, as long as the instruction sequence is valid for that file (ie, we're not trying to write to position #10 of a 9-byte file, etc. But if the original sequence is valid for our file, all simplified sequences will also be valid for it)
In application of any of these theorems, we need to watch for left- and right- shifts in the data. For example, consider the following sequence of instructions:
A Simple Optimization Example
$$\displaystyle + \bigg[ \begin{matrix} 0 \ 1 \ 2 \\ a \ b \ c \end{matrix} \bigg] + \bigg[ \begin{matrix} 17 \ 18 \ 19 \\ d \ \ e \ \ f \end{matrix} \bigg] - \bigg[ \begin{matrix} 5 \ 6 \ 7 \\ ? \ ? \ ? \end{matrix} \bigg] \bullet \bigg[ \begin{matrix} 14 \ 15 \\ g \ \ h \end{matrix} \bigg] - \bigg[ \begin{matrix} 0 \ 1 \ 2 \\ ? \ ? \ ? \end{matrix} \bigg]$$
We might notice the complementary pair of instructions at the beginning and end:
And we might try to simplify the expression like this, using theorem #4:
$$\displaystyle + \bigg[ \begin{matrix} 17 \ 18 \ 19 \\ d \ \ e \ \ f \end{matrix} \bigg] - \bigg[ \begin{matrix} 5 \ 6 \ 7 \\ ? \ ? \ ? \end{matrix} \bigg] \bullet \bigg[ \begin{matrix} 14 \ 15 \\ g \ \ h \end{matrix} \bigg]$$
But this would be wrong! After we inserted the 3 characters "abc" to position 0, the rest of the file was right-shifted by 3 places (the length of the insert). So, if we're deleting that instruction, we have to subtract 3 from every position given between the redundant insert/remove pair.
$$\displaystyle + \bigg[ \begin{matrix} 14 \ 15 \ 16 \\ d \ \ e \ \ f \end{matrix} \bigg] - \bigg[ \begin{matrix} 2 \ 3 \ 4 \\ ? \ ? \ ? \end{matrix} \bigg] \bullet \bigg[ \begin{matrix} 11 \ 12 \\ g \ \ h \end{matrix} \bigg]$$
It would also be easy to miss the fact that we can apply theorem #5 here as well -- the addition and the multiplication are also a redundant pair, but it's less obvious because there is a subtraction operation in-between them. This subtraction removes 3 characters and left-shifts all of the data to the right of it. Therefore, when we're replacing at position 11, that's really the same position that was earlier called 14.
Imagine a neighborhood of houses all on a conveyer belt. Whenever a new family moves in, or an old family moves out, the conveyer belt moves left or right and shifts the entire neighborhood. The only operation that does not move the houses is if one family moves out & another family moves in to the same house at the same time.
That's what this process is like. The file is the neighborhood, the "positions" or "places" where the characters go are the houses, and the characters themselves are the occupants of the houses. A single block, written like this:
$$\bigg[ \begin{matrix} 0 \ 1 \ 2 \\ a \ b \ c \end{matrix} \bigg]$$
Is only a snapshot of the file at one moment. The instant that we apply an operation (addition or subtraction, an insert or a remove), the conveyer belt shifts, and many of the old houses have new addresses.
A visual representation of an INSERT operation
A More Involved Optimization Example
Consider the following expression:
$$\displaystyle x - \bigg[ \begin{matrix} 2 \ 3 \ 4 \\ ? \ ? \ ? \end{matrix} \bigg] - \bigg[ \begin{matrix} 2 \ 3 \ 4 \\ ? \ ? \ ? \end{matrix} \bigg] + \bigg[ \begin{matrix} 2 \ 3 \ 4 \\ d \ e \ f \end{matrix} \bigg] + \bigg[ \begin{matrix} 2 \ 3 \ 4 \\ a \ b \ c \end{matrix} \bigg] \bullet \bigg[ \begin{matrix} 8 \ 9 \ 10 \\ g \ h \ f \end{matrix} \bigg] \bullet \bigg[ \begin{matrix} 8 \ 9 \ 10 \\ h \ i \ j \end{matrix} \bigg] \bullet \bigg[ \begin{matrix} 5 \ 6 \ 7 \\ k \ l \ m \end{matrix} \bigg] - \bigg[ \begin{matrix} 2 \ 3 \ 4 \\ ? \ ? \ ? \end{matrix} \bigg] \bullet \bigg[ \begin{matrix} 11 \ 12 \ 13 \\ x \ \ y \ \ z \end{matrix} \bigg] - \bigg[ \begin{matrix} 11 \ 12 \ 13 \\ ? \ \ ? \ \ ? \end{matrix} \bigg]$$
First we can apply Theorem #1 to combine the first two subtractions:
$$\displaystyle x - \bigg[ \begin{matrix} 2 \ 3 \ 4 \ 5 \ 6 \ 7 \\ ? \ ? \ ? \ ? \ ? \ ? \end{matrix} \bigg] + \bigg[ \begin{matrix} 2 \ 3 \ 4 \\ d \ e \ f \end{matrix} \bigg] + \bigg[ \begin{matrix} 2 \ 3 \ 4 \\ a \ b \ c \end{matrix} \bigg] \bullet \bigg[ \begin{matrix} 8 \ 9 \ 10 \\ g \ h \ f \end{matrix} \bigg] \bullet \bigg[ \begin{matrix} 8 \ 9 \ 10 \\ h \ i \ j \end{matrix} \bigg] \bullet \bigg[ \begin{matrix} 5 \ 6 \ 7 \\ k \ l \ m \end{matrix} \bigg] - \bigg[ \begin{matrix} 2 \ 3 \ 4 \\ ? \ ? \ ? \end{matrix} \bigg] \bullet \bigg[ \begin{matrix} 11 \ 12 \ 13 \\ x \ \ y \ \ z \end{matrix} \bigg] - \bigg[ \begin{matrix} 11 \ 12 \ 13 \\ ? \ \ ? \ \ ? \end{matrix} \bigg]$$
Next, we can apply Theorem #0 to combine those two additions:
$$\displaystyle x - \bigg[ \begin{matrix} 2 \ 3 \ 4 \ 5 \ 6 \ 7 \\ ? \ ? \ ? \ ? \ ? \ ? \end{matrix} \bigg] + \bigg[ \begin{matrix} 2 \ 3 \ 4 \ 5 \ 6 \ 7 \\ a \ b \ c \ d \ e \ f \end{matrix} \bigg] \bullet \bigg[ \begin{matrix} 8 \ 9 \ 10 \\ g \ h \ f \end{matrix} \bigg] \bullet \bigg[ \begin{matrix} 8 \ 9 \ 10 \\ h \ i \ j \end{matrix} \bigg] \bullet \bigg[ \begin{matrix} 5 \ 6 \ 7 \\ k \ l \ m \end{matrix} \bigg] - \bigg[ \begin{matrix} 2 \ 3 \ 4 \\ ? \ ? \ ? \end{matrix} \bigg] \bullet \bigg[ \begin{matrix} 11 \ 12 \ 13 \\ x \ \ y \ \ z \end{matrix} \bigg] - \bigg[ \begin{matrix} 11 \ 12 \ 13 \\ ? \ \ ? \ \ ? \end{matrix} \bigg]$$
We also have a redundant insert/replace pair:
$$\displaystyle \cdot \cdot \cdot + \bigg[ \begin{matrix} 2 \ 3 \ 4 \ 5 \ 6 \ 7 \\ a \ b \ c \ d \ e \ f \end{matrix} \bigg] \cdot \cdot \cdot \cdot \bullet \bigg[ \begin{matrix} 5 \ 6 \ 7 \\ k \ l \ m \end{matrix} \bigg]$$
We can remove using Theorem #5, giving us:
$$\displaystyle x - \bigg[ \begin{matrix} 2 \ 3 \ 4 \ 5 \ 6 \ 7 \\ ? \ ? \ ? \ ? \ ? \ ? \end{matrix} \bigg] + \bigg[ \begin{matrix} 2 \ 3 \ 4 \ 5 \ 6 \ 7 \\ a \ b \ c \ k \ l \ m \end{matrix} \bigg] \bullet \bigg[ \begin{matrix} 8 \ 9 \ 10 \\ g \ h \ f \end{matrix} \bigg] \bullet \bigg[ \begin{matrix} 8 \ 9 \ 10 \\ h \ i \ j \end{matrix} \bigg] - \bigg[ \begin{matrix} 2 \ 3 \ 4 \\ ? \ ? \ ? \end{matrix} \bigg] \bullet \bigg[ \begin{matrix} 11 \ 12 \ 13 \\ x \ \ y \ \ z \end{matrix} \bigg] - \bigg[ \begin{matrix} 11 \ 12 \ 13 \\ ? \ \ ? \ \ ? \end{matrix} \bigg]$$
While we're at it, we can apply Theorem #2 to simplify those two multiplications:
Here finally is the simplest possible form of this instruction sequence. In Polonius's terms, this means that the following two sets of instructions will always do exactly the same thing to a given file:
It might seem crazy (and I certainly hope it does!) that we can algorithmically go from 10 instructions all the way down to only 2 instructions and have them be exactly equivalent for all files, but this is a natural consequence of the theorems listed above.
In fact, as a direct consequence of Theorems #0, #1, and #2, it should be theoretically possible to reduce any sequence of instructions, no matter how long (100 instructions, 1 million instructions, etc) down to a maximum of 3 instructions. At absolute most: one INSERT, one REMOVE, and one REPLACE.
In terms of time complexity, that means we're going from $O(kN)$ complexity to only $O(k)$ complexity (with $N$ referring to the number of inputted instructions, not to the size of the file). To explain:
Each time Polonius executes an INSERT or REMOVE instruction, it must traverse the file piece-by-piece. At its core, Polonius can only really do replaces, and inserts/removes directly at the end of a file. To insert to the beginning of a file, it has to add some blank space to the end, and chunk-by-chunk move pieces over to the new end until it hits the location of the insert. A similar process is necessary for removes. Replace instructions, for the purpose of this argument, we'll say take 0 time (really only 1 or 2 microseconds generally).
Let's call the time that it takes to traverse a given file $T$. Let's say the file is pretty big, and so traversing it takes a full second. $T = 1$ second.
If we have 100 different INSERT/REMOVE instructions to execute, this will take $100T$ -- or 100 seconds, almost two full minutes before our changes are saved.
But after we simplify that expression to be only 2 instructions (possibly plus one REPLACE instruction, which takes 0 time), it will take only $2T$ -- just 2 seconds.
Further, while 1,000 different such instructions would take $1000T$ and a billion would take $1000000000T$, their respective simplified versions will always take only $2T$. That's a pretty big deal.
Of course, in reality, it takes some time to apply the theorems as well; it takes a little bit of time to simplify the instruction sequence. But for very large files (the kind that Polonius is designed to work with), it's almost always worth it. Besides this, the idea is to have the interactive UI optimize instruction sequences while it's building them, that is, while you're still typing into the file, before you ever hit "save." This way, by the time the UI sends its instructions to polonius-editor, they're already optimized. Another part of the idea is to give polonius-editor a -n or --no-optimize option, just in case.
Even in the extremely unlikely case that the delay we add in trying to optimize the instructions is somehow equal to the time we save from having those optimized instructions, it's still a benefit in that it reduces the amount of time which is spent actually making changes to the file. Because Polonius edits files in-place, if for some reason it's interrupted before it can finish its job, the files will be corrupted (kind of half-edited). Reducing that window is obviously a good idea.