Skiplists ‐ an analysis - WilfullMurder/DataStructures-Java GitHub Wiki

Skiplists ‐ an analysis

Introduction:

This section analyses the expected height, size, and length of the search path in a skiplist. It requires an understanding of basic probability. This is because, several proofs are based on the basic observation about coin tosses:

Lemma Skiplists.2

Let T be the number of times a fair coin is tossed up to and including the first time the coin turns up heads. Then E[T]=2.

Proof:

Suppose we stop tossing the coin the first time it comes up heads. We can define the indicator variable

skiplistAnalysisLemma-2

It should be noted that I𝒾 if and only if the first 𝒾-1 coin tosses are tails, so E[I𝒾] = Pr{I𝒾=1} = 1/2𝒾-1. Regard that, T, the total number of coin tosses can be written as:

skiplistAnalysisDefinition-of-T

Which means,

skiplistAnalysisLemma-2-E T

The proceeding lemmata show that skiplists have linear size:

Lemma Skiplists.3

The expected number of nodes in a skiplist containing n elements, not including occurences of the sentinel, is 2n.

Proof:

The probability that any specific element, x, is included in list Lr is 1/2r, therefore, the expected number of nodes in Lr is n/2r (See Mathematical background in order to see how this is derived using indicator variables and linearity of expectation). So, the total expected number of nodes in all lists is

skiplistAnalysisLemma-3

Lemma Skiplists.4

The expected height of a skiplist containing n elements is at most logn+2.

Proof:

For each r∈{1,2,3,...,∞}, we can define the indicator random variable

skiplistAnalysisLemma-4

The height, h, of the skiplist can then be given by

skiplistAnalysisLemma-3-height

It should be noted that Ir is never more than the length, |Ir|, of Ir, so E[Ir]≤[|Ir|] = n/2r.

So, we have

skiplistAnalysisLemma-3-expected-nodes

Lemma Skiplists.5

The expected number of nodes in a skiplist containing n elements, including all occurences of the sentinel, is 2n+O(logn).

Proof:

By lemma Skiplists.3, the expected number of nodes, not including the sentinel, is 2n. The number of occurences of the sentinel is equal to the height, h, of the skiplist so, by lemma Skiplists.4 the expected number of occurences of the sentinel is at most logn+2 = O(logn).

Lemma Skiplists.6

The expected length of a search path in a skiplist is at most 2logn+O(1).

Proof:

Considering the reverse search path for a node, x, would be the most simple way to envision this lemma. The path starts at the predecessor of x in L0. At any point in time, if the path can go up a level, then it does. If the path cannot go up a level then it goes left. It should be obvious that this is identical to the search path for x, but in reverse.

The number of nodes that the reverse search path visits at a particular level, r, is related to the familiar experiment: Tosss a coin. If the coin shows heads, then move up and stop. Else, move left and repeat. The number of coin tosses before the heads represents the number of steps to the left that a reverse search path takes at a given level. It should be noted that this might overcount the number of steps to the left, since the experiment should end either at the first heads or when the search path reaches the sentinel, whichever comes first. This is not a problem since the lemma is only stating an upper bound. However, lemma Sjiplists.2 tells us that the expected number of coin tosses before the first heads is 1.

Let Sr denote the number of steps the forward search path takes at level r that go to the right. We have just argued that E[Sr]≤1. Additionally, Sr≤|Lr|, since we cannot take more steps in Lr than the length of Lr, so E[Sr]≤E[|Lr|]=n/2r.

Knowing this, we can now finish the proof of lemma Skiplists.4. Let S be the length of the search path for some node, u, in a skiplist, and let h be the height of the skiplist. Then

skiplistAnalysisLemma-6-height

The following theorem summarizes the reults in this section:

Theorem Skiplists.1:

A Skiplist containing n elements has expected size O(n) and the expected length of the search path for any particular element is at most 2logn+O(1).

⚠️ **GitHub.com Fallback** ⚠️