Mathematical Background
This section is a brief review of some of the mathematical notation and tools used throughout the analysis, including logarithms, big-Oh notation, and probability theory. It is not intended to be an introduction to mathematics for computing. If you feel you are lost or missing the background (or need a refresher), we encourage you to check out the Resources & References section and read (and do exercises from) the appropriate sections of the (free) textbooks on mathematics for computing.
The expression a^b denotes the number a raised to the power of b. If b is a positive integer, then this is equivalent to a multiplied by itself b - 1 times.
In this study the expression log_a(c) denotes the base-a logarithm of c; that is, the unique value b that satisfies a^b = c. Most of the logarithms used throughout the study are base 2 (binary logarithms), so the base is omitted and log(c) is shorthand for log_2(c).
A handy (if informal) way to think of logarithms is that log_a(c) is the number of times we have to divide c by a before the result is less than or equal to 1. As an example, when we do a binary search, each comparison reduces the number of possible answers by a factor of 2. This is repeated until there is at most one possible answer, so the number of comparisons done by a binary search that starts with n + 1 possible answers is ⌈log_2(n + 1)⌉.
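As a small illustrative sketch (not code from this repository; the class and variable names are ours), the following binary search counts how many times it halves its candidate range. For an array of n elements, the count never exceeds ⌈log_2(n + 1)⌉.

```java
// Illustrative sketch: binary search over a sorted array, counting how many
// times the candidate range is halved before the search terminates.
public class BinarySearchDemo {
    // Returns the index of key in a, or -1 if absent; prints the halving count.
    static int binarySearch(int[] a, int key) {
        int lo = 0, hi = a.length - 1;
        int halvings = 0;
        while (lo <= hi) {
            int mid = lo + (hi - lo) / 2;
            halvings++; // one more comparison / halving of the range
            if (a[mid] == key) return mid;
            else if (a[mid] < key) lo = mid + 1;
            else hi = mid - 1;
        }
        System.out.println("halvings = " + halvings); // at most ceil(log2(n+1))
        return -1;
    }

    public static void main(String[] args) {
        int[] a = new int[1_000_000];
        for (int i = 0; i < a.length; i++) a[i] = 2 * i; // sorted input
        binarySearch(a, -1); // worst case (absent key): about 20 halvings for n = 10^6
    }
}
```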
Another logarithm that pops up regularly throughout the study is the natural logarithm. For this we use the common notation ln(k) to denote log_e(k), where e (Euler's number) is given by

e = lim_(n→∞) (1 + 1/n)^n ≈ 2.71828.

Two common manipulations of logarithms are removing one from an exponent,

a^(log_a(b)) = b,

and changing the base of a logarithm,

log_b(k) = log_a(k) / log_a(b).

The factorial function is used in a couple of places in this study. For a non-negative integer n, the notation n! (read "n factorial") is defined to mean n! = 1 * 2 * 3 * ... * n. Factorials show up because n! counts the number of distinct permutations, i.e., orderings, of n distinct elements. For the special case n = 0, 0! is defined as 1.
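Java's Math class has no base-2 logarithm, so the change-of-base identity above is handy in practice. Below is a minimal sketch (class and method names are ours, not part of the repository) of a log_2 helper alongside a small factorial:

```java
// Illustrative helpers; not repository code.
public class MathBackground {
    // Base-2 logarithm via the change-of-base identity: log2(x) = ln(x) / ln(2).
    static double log2(double x) {
        return Math.log(x) / Math.log(2);
    }

    // n! for small non-negative n; 0! is defined as 1. Overflows long for n > 20.
    static long factorial(int n) {
        long result = 1;
        for (int i = 2; i <= n; i++) result *= i;
        return result;
    }

    public static void main(String[] args) {
        System.out.println(log2(1024));   // ≈ 10.0 (up to floating-point rounding)
        System.out.println(factorial(0)); // 1
        System.out.println(factorial(5)); // 120
    }
}
```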
The quantity n! can be approximated using Stirling's Approximation:

n! ≈ √(2πn) * (n/e)^n,

with the relative error shrinking as n grows.
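To get a feel for how tight the approximation is, here is a small illustrative check (class name is ours, not part of the repository) that prints the ratio of n! to its Stirling estimate for a few values of n:

```java
// Quick, illustrative sanity check: compare n! with Stirling's approximation.
public class StirlingDemo {
    public static void main(String[] args) {
        for (int n = 1; n <= 20; n += 5) {
            double exact = 1;
            for (int i = 2; i <= n; i++) exact *= i; // n! as a double
            double stirling = Math.sqrt(2 * Math.PI * n) * Math.pow(n / Math.E, n);
            System.out.printf("n=%2d  exact=%.4e  stirling=%.4e  ratio=%.4f%n",
                              n, exact, stirling, exact / stirling);
        }
    }
}
```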
Related to the factorial function are the binomial coefficients. For a non-negative integer n and an integer k ∈ {0,...,n}, the notation C(n, k) (read "n choose k") denotes:

C(n, k) = n! / (k! (n - k)!),

which counts the number of subsets of size k of an n-element set.
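Computing C(n, k) directly from the factorial formula overflows quickly, so a common alternative is Pascal's rule, C(n, k) = C(n - 1, k - 1) + C(n - 1, k). The sketch below (names are ours, not repository code) builds one row of Pascal's triangle:

```java
// Illustrative sketch: n-choose-k via Pascal's rule, avoiding the factorial
// overflow that n!/(k!(n-k)!) would hit for even modest n.
public class BinomialDemo {
    static long choose(int n, int k) {
        if (k < 0 || k > n) return 0;
        long[] row = new long[k + 1];
        row[0] = 1;
        for (int i = 1; i <= n; i++) {
            // Update right-to-left so each entry still holds the previous row's value.
            for (int j = Math.min(i, k); j >= 1; j--) {
                row[j] += row[j - 1];
            }
        }
        return row[k];
    }

    public static void main(String[] args) {
        System.out.println(choose(5, 2));  // 10
        System.out.println(choose(52, 5)); // 2598960
    }
}
```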
When analyzing data structures in this study, we want to talk about the running times of various operations. The exact running times will, of course, vary from computer to computer and even from run to run on an individual computer. When we talk about the running time of an operation we are referring to the number of computer instructions performed during the operation. Even for simple code, this quantity can be difficult to compute exactly. Therefore, instead of analyzing running times exactly, we use the so-called big-Oh notation: for a function f(n), O(f(n)) denotes the set of functions

O(f(n)) = { g(n) : there exist constants c > 0 and n0 such that g(n) ≤ c * f(n) for all n ≥ n0 }.
We generally use asymptotic notation to simplify functions. For example, in place of 5n log(n) + 8n - 200 we can simply write O(n log(n)), which we can justify as follows:
5n log(n) + 8n - 200 ≤ 5n log(n) + 8n
≤ 5n log(n) + 8n log(n)   for n ≥ 2 (so that log(n) ≥ 1)
≤ 13n log(n).
This demonstrates that the function f(n) = 5n log(n) + 8n - 200 is in the set O(n log(n)), using the constants c = 13 and n0 = 2.
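The derivation above is the proof; purely as an illustration, the sketch below (names are ours, not repository code) evaluates both sides of the inequality for a range of n to show that the chosen constants hold:

```java
// Illustrative numeric check of the constants c = 13 and n0 = 2 chosen above.
public class BigOhConstantsCheck {
    public static void main(String[] args) {
        boolean holds = true;
        for (int n = 2; n <= 1_000_000; n++) {
            double log2n = Math.log(n) / Math.log(2);
            double f = 5 * n * log2n + 8 * n - 200;   // f(n) = 5n log n + 8n - 200
            double bound = 13 * n * log2n;            // c * n log n with c = 13
            if (f > bound) {
                holds = false;
                System.out.println("violated at n = " + n);
            }
        }
        System.out.println(holds
            ? "5n log n + 8n - 200 <= 13 n log n for all 2 <= n <= 10^6"
            : "bound violated");
    }
}
```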
A number of useful shortcuts can be applied when using asymptotic notation:
- O(n^c1) ⊂ O(n^c2), for any c1 < c2.
- For any constants a, b, c > 0, O(a) ⊂ O(log(n)) ⊂ O(n^b) ⊂ O(c^n).
We will sometimes write running times such as T(n) = 2 log(n) + O(1). The expression O(1) in itself raises another issue: since there is no variable in the expression, it may not be clear which variable is getting arbitrarily large. Without context, there is no way to tell. In the example above, since the only other variable in the equation is n, the expression should be read as T(n) = 2 log(n) + O(f(n)), where f(n) = 1.
Big-Oh notation has been around for a while, having been used by the number theorist Paul Bachmann as early as 1894, and it is very useful for describing the running times of computer algorithms. Consider the following code:
```java
for (int i = 0; i < n; i++) {
    a[i] = i;
}
```
One execution of this loop involves:
- a. 1 assignment (int i = 0),
- b. n + 1 comparisons (i < n),
- c. n increments (i++),
- d. n array offset calculations (a[i]), and
- e. n indirect assignments (a[i] = i).
If we write a, b, c, d, and e for the cost of each of these kinds of operations, the total running time is T(n) = a + b(n + 1) + cn + dn + en, which big-Oh notation lets us simplify to T(n) = O(n). So, despite being more compact, it gives us nearly as much information. Because the exact running time depends on the constants a, b, c, d, and e, in general it won't be possible to compare two running times and know which is faster without knowing the values of these constants. Even if we make the effort to determine these constants (through timing tests, for example), our conclusion will only be valid for the machine running the tests.
Big-Oh notation allows us to reason at a much higher level, in turn making it possible to analyse more complicated functions. For example, if two algorithms have the same big-Oh running time, then we will not know which is faster and there might not be an obvious winner. One could be faster on one machine, and the other could be faster on another machine. Yet, if the two algorithms have demonstrably different big-Oh running times, then we can be certain that the one with the smaller running time will be faster for large enough values of n.
We can see a clear comparison of Big-Oh running times with the graph illustrated in Fig.1.
In a few cases, we use asymptotic notation on functions with more than one variable. Although there is no real standard for this, it is common enough in computer science, and for our purposes the following definition is sufficient:

O(f(n1,...,nk)) = { g(n1,...,nk) : there exist constants c, z > 0 such that g(n1,...,nk) ≤ c * f(n1,...,nk) for all n1,...,nk such that g(n1,...,nk) ≥ z }.
Some of the data structures presented in this study are randomised; they make random choices that are independent of the data being stored in them or the operations being performed on them. Because of this, performing the same operations more than once using these structures could result in different running times. When analysing these data structures we are interested in their average or expected running times.
Formally, the running time of an operation on a randomised data structure is a random variable, and we want to study its expected value. For a discrete random variable X taking on values in some countable universe U, the expected value of X, denoted by E[X], is given by the formula

E[X] = Σ_(x∈U) x * Pr{X = x},

where Pr{X = x} denotes the probability that X takes the value x.
One of the most important properties of expected values is linearity of expectation. For any two random variables X and Y, E[X + Y] = E[X] + E[Y]. More generally, for any random variables X1,...,Xk,

E[X1 + ... + Xk] = E[X1] + ... + E[Xk].
A lovely trick that we use repeatedly is defining indicator random variables. These binary variables are useful when we want to count something. For example, suppose we toss a fair coin k times and we want to know the expected number of times it comes up heads. Intuition tells us the answer is k/2, but if we try to prove it directly from the definition of expected value, we get

E[X] = Σ_(i=0..k) i * Pr{X = i}
     = Σ_(i=0..k) i * C(k, i) / 2^k
     = k * Σ_(i=1..k) C(k - 1, i - 1) / 2^k
     = k/2.

This requires that we know enough to calculate that Pr{X = i} = C(k, i) / 2^k, and that we know the binomial identities i * C(k, i) = k * C(k - 1, i - 1) and Σ_(i=0..k-1) C(k - 1, i) = 2^(k-1).
Using indicator variables and linearity of expectation simplifies things considerably. For each i ∈ {1,...,k}, define the indicator random variable

I_i = 1 if the i-th coin toss comes up heads, and I_i = 0 otherwise.

Then E[I_i] = (1/2) * 1 + (1/2) * 0 = 1/2. Now X = I_1 + ... + I_k, so by linearity of expectation,

E[X] = E[I_1 + ... + I_k] = E[I_1] + ... + E[I_k] = k/2.
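As a purely illustrative sketch (class and variable names are ours, not repository code), the simulation below tosses a fair coin k times per trial and averages the number of heads over many trials; the empirical mean of X = I_1 + ... + I_k lands close to k/2, as linearity of expectation predicts.

```java
import java.util.Random;

// Illustrative simulation: X = I_1 + ... + I_k, where I_i indicates that the
// i-th toss is heads; the empirical mean of X should be close to k/2.
public class IndicatorDemo {
    public static void main(String[] args) {
        Random rng = new Random(42); // fixed seed for a reproducible run
        int k = 100, trials = 100_000;
        long totalHeads = 0;
        for (int t = 0; t < trials; t++) {
            for (int i = 0; i < k; i++) {
                if (rng.nextBoolean()) totalHeads++; // I_i = 1 when the toss is heads
            }
        }
        double empiricalMean = (double) totalHeads / trials;
        System.out.println("empirical E[X] ≈ " + empiricalMean + ", expected k/2 = " + (k / 2.0));
    }
}
```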