Skiplists ‐ an analysis - WilfullMurder/DataStructures-Java GitHub Wiki
This section analyses the expected height, size, and length of the search path in a skiplist. It requires an understanding of basic probability. This is because, several proofs are based on the basic observation about coin tosses:
Let T be the number of times a fair coin is tossed up to and including the first time the coin turns up heads. Then E[T]=2.
Suppose we stop tossing the coin the first time it comes up heads. We can define the indicator variable
Which means,
The proceeding lemmata show that skiplists have linear size:
The expected number of nodes in a skiplist containing n elements, not including occurences of the sentinel, is 2n.
The probability that any specific element, x, is included in list Lr is 1/2r, therefore, the expected number of nodes in Lr is n/2r (See Mathematical background in order to see how this is derived using indicator variables and linearity of expectation). So, the total expected number of nodes in all lists is
The expected height of a skiplist containing n elements is at most logn+2.
For each r∈{1,2,3,...,∞}, we can define the indicator random variable
So, we have
The expected number of nodes in a skiplist containing n elements, including all occurences of the sentinel, is 2n+O(logn).
By lemma Skiplists.3, the expected number of nodes, not including the sentinel, is 2n. The number of occurences of the sentinel is equal to the height, h, of the skiplist so, by lemma Skiplists.4 the expected number of occurences of the sentinel is at most logn+2 = O(logn).
The expected length of a search path in a skiplist is at most 2logn+O(1).
Considering the reverse search path for a node, x, would be the most simple way to envision this lemma. The path starts at the predecessor of x in L0. At any point in time, if the path can go up a level, then it does. If the path cannot go up a level then it goes left. It should be obvious that this is identical to the search path for x, but in reverse.
The number of nodes that the reverse search path visits at a particular level, r, is related to the familiar experiment: Tosss a coin. If the coin shows heads, then move up and stop. Else, move left and repeat. The number of coin tosses before the heads represents the number of steps to the left that a reverse search path takes at a given level. It should be noted that this might overcount the number of steps to the left, since the experiment should end either at the first heads or when the search path reaches the sentinel, whichever comes first. This is not a problem since the lemma is only stating an upper bound. However, lemma Sjiplists.2 tells us that the expected number of coin tosses before the first heads is 1.
Let Sr denote the number of steps the forward search path takes at level r that go to the right. We have just argued that E[Sr]≤1. Additionally, Sr≤|Lr|, since we cannot take more steps in Lr than the length of Lr, so E[Sr]≤E[|Lr|]=n/2r.
Knowing this, we can now finish the proof of lemma Skiplists.4. Let S be the length of the search path for some node, u, in a skiplist, and let h be the height of the skiplist. Then
A Skiplist containing n elements has expected size O(n) and the expected length of the search path for any particular element is at most 2logn+O(1).