Rootish Array Stack - WilfullMurder/DataStructures-Java GitHub Wiki

Rootish Array Stack

Introduction:

One of the drawbacks of the other Array-Based Lists is that, because they store their data in one or two arrays and they avoid resizing these arrays too often, the arrays frequently are not very full. For example, immediately after a resize() operation on an ArrayStack, the backing array a is only half full. Even worse, there are times when only one third of a contains data.

The Rootish Array Stack data structure addresses the problem of wasted space. By storing n elements using O(√n) arrays. In these arrays, at most O(√n) array locations are empty at any time. All remaining array locations are used to store data. Therefore, these data structures waste at most O(√n) space when storing n elements.

A Rootish Array Stack stores its elements in a list of r arrays called blocks that are numbered 0,1,...,r-1 (see Fig.1). Block b contains b+1 elements. Therefore, all r blocks contain a total of

1+2+3+...+r = r(r+1)/2

elements. The above formula can be obtained as shown in Fig.2.

Fig.1: A sequence of add(i,x) and remove(i) operations on a RootishArrayStack.

Arrows denote elements being copied.

Fig.2: The number of white squares is 1+2+3+...+r.

The number of shaded squares is the same.

Together the white and shaded squares make a rectangle of r(r+1) squares.

As we might expect, the elements of the list are laid out in order within the blocks. The list element with index 0 is stored in block 0, elements with list indices 1 and 2 are stored in block 1, elements with list indices 3,4, and 5 are stored in block 2, and so on. The main problem we must address is determining, given an index i, which block contains i as well as the index corresponding to i within that block.

Determining the index of i:

Determining the index of i within its block is fairly simple. If index i is in block b, then the number of elements in blocks 0,...,b-1 is b(b+1)/2. Therefore, i is stored at location

j = i-b(b+1)/2

within block b. Somewhat more challenging is the problem of determining the value of b. The number of elements that have indices less than or equal to i is i+1. On the other hand, the number of elements in blocks 0,...,b is (b+1)(b+2)/2. Therefore, b is the smallest integer such that

(b+1)(b+2)/2 ≥ i+1.

As we have two binomials, we can factor this equation using the F.O.I.L method to produce the quadratic equation. Firstly, rewriting the equation as:

b2 + 3b - 2i ≥ 0.

Before sorting into the corresponding quadratic equation b2 + 3b - 2i = 0. Like all quadratics, this equation has two solutions: b = (-3 + √(9+8i)/2 and b = (-3 - √(9+8i)/2. The second solution will always return a negative and so, has no sense in our application. Therefore, we obtain the solution b = (-3 + √(9+8i)/2. In general, this solution is not an integer, but going back to our inequality, we want the smallest integer b such that b ≥ (-3 + √(9+8i)/2. This is simply

b = ⌈(-3 + √(9+8i)/2⌉.

With this out of the way, the get(i) and set(i,x) methods are straightforward.

Getting & Setting:

First, we compute the appropriate block b and the appropriate index j within the block and then perform the appropriate operation. If we use any of the other Array-Based List data structures then get(i) and set(i,x) will each run in constant time.

Adding & Growing:

The add(i,x) method should look familiar. First, we check to see if our chosen data structure is full, by checking if the number of blocks, r, is such that r(r+1)/2 = n. If so, we call grow() to add another block. With this done, we shift elements with indices i,...,n-1 to the right by one position to make room for the new element with index i.

The grow method simply adds another block to the list of blocks.

Ignoring the cost of the grow() operation, the cost of an add(i,x) operation is dominated by the cost of shifting and is therefore O(1+n-i), just like an ArrayStack.

Removing & Shrinking:

The remove(i) operation is similar to add(i,x). It shifts the elements with indices i+1,...,n left by one position and then, if there is more than one empty block, it calls the shrink() method to remove all but one of the unused blocks.

Once again, ignoring the cost of the shrink() operation, the cost of a remove(i) operation is dominated by the cost of shifting and is therefore O(n-i).

Analysis of Growing & Shrinking:

Unlike an ArrayStack.resize() operation, grow() and shrink() do not copy any data. They only allocate or free an array of size r. In some environments, this takes only constant time, while in others, it may require time proportional to r.

We can see that, immediately after a call to grow() or shrink(), the situation is clear. The final block is completely empty, and all other blocks are completely full. Another call to grow() or shrink() will not happen until at least r-1 elements have been added or removed. Therefore, even if grow() or shrink() take O(r) time, this cost can be amortized over at least r-1 add(i,x) or remove(i) operations, so that the amortized cost of grow() or shrink() is O(1) per operation.

Space usage:

Next, we analyse the amount of extra space used by a RootishArrayStack. In particular, we want to count any space used that is not an array element currently used to hold a list element (wasted space).

The remove operation ensures that a RootishArrayStack never has more than two blocks that are not completely full. The number of blocks, r, used that store n elements therefore statisfies

(r-2)(r-1)≤n.

Again, using the quadratic equation on this gives

r ≤ (3+ √(1+4n))/2 = O(√n).

The last two blocks have sizes r and r-1, so the space wasted by those two blocks is at most 2r-1 = O(√n). If we store the blocks in an ArrayStack (for example), then the amount of space wasted by the List that stores those r blocks is also O(r)=O(√n). The other space needed for storing n and other accounting information is O(1). Therefore, the total amount of wasted space in a RootishArrayStack is O(√n).

Next we argue that this space usage is optimal for any data structure that starts out empty and can support the addition of one item at a time. More precisely, we can show that, at some point during the addition of n items, the data structure is wasting an amount of space at least √n (though possibly only for a moment).

Suppose we start with an empty data structure and we add n items one at a time. At the end of the process, all n items are stored in the structure and distributed among a collection of r memory blocks. If r ≥ √n, then the data structure must be using r pointers (or references) to keep track of these r blocks and these pointers are wasted space. On the other hand, if r < √n then, by the Pigeonhole principle, some block must have a size of at least n/r > √n. If we consider the moment that this block was first allocated we see that, immediately after it was allocated, this block was empty, and therefore wasting √n space. So, at some point in time during the insertion of n elements, the data structure was wasting √n space.

Summary:

The following theorem summarizes our discussion of the RootishArrayStack data structure:

Theorem RST.1:

A RootishArrayStack implements the List interface. Ignoring the cost of calls to grow() shrink(), a RootishArrayStack supports the operations

  • get(i) and set(i,x) in O(1) time per operation
  • add(i,x) and remove(i) in O(1+n-i) time per operation.
Furthermore, beginning with an empty RootishArrayStack, any sequence of m add(i,x) and remove(i) operations results in a total of O(m) time spent during all calls to grow() and shrink().

The space (measured in words used by a RootishArrayStack that stores n elements is n+O(√n).

⚠️ **GitHub.com Fallback** ⚠️