Interference: Where the magic happens! - RPIQuantumComputing/QuantumCircuits GitHub Wiki
I am going to introduce a decently contrived problem in hopes of explaining interfence through example as exactly how to interfere is a matter of algorithm design as opposed to understanding of the architecture.
Consider the following: I am an evil man who have made a function that either spits out $0$ or $1$ in an equal ratio or spits out only one of those values all the time, i.e. the function is either "balanced" or it is "constant". Call such a function $f$. We call such given functions "oracles" as we could really care less about what that function actually is as much as we care about what it does in the fullest generality.
Classically, you might say "oh, try a few values, if you get a different value than any time before, you know it must be balance, when assuming the function $f$ either strictly balanced or constant." How many times must you do this to be absolutely certain? Well, if you check less than half plus one of the solutions, maybe all the other values are the opposite to what you seen if you only seen one value at the current point in time. Although, the probability of that being the case is very low. But, we don't care about "almost" correct algorithms-we care about correct ones and, to be correct, one must check $O(2^{n})$ values! Could we do better on a quantum computer?
First, can $f$ be implemented on a quantum computer? Of course, this may seem silly, but I have left something special about quantum computing unexplained-all of our computations must be reversible. The reason is that all operations must perserve our probability, i.e. be unitary transformations, and all unitary transformations imply a square matrix and invertibility meaning that the idea of "deleting" information is simply impossible as is the concept of a one-way computation on a quantum computer.
Suppose you give me a classical function $f$ on $n$-bits. I am going to simply add another qubit and use my first $n$-qubits to pass on the input value and "output" in some manner the evaluation to the last qubit. Trivially, knowing the original input in the output means I can recover the first $n$-qubit's state. Think, if I couldn't obtain the input from the output, the function could not possibly be invertible. Now, if I run my function $U_{f}$, to denote the unitary version of $f$, there must exist a $U_{f^{-}}$, its inverse, for it to be invertible, that could leave the output $n$-qubits, the same as the input $n$-qubits, unchanged and could take my output and give me the input. Suppose I applied the XOR operation on the last qubit with $f(x)$'s value. Now, $U_{f}$ applied twice gives me the input for the first $n$-qubits and the last qubit has f(x) XOR f(x) XOR x. The XOR has the feature that if two arguments are the same the output is zero, thus the above reduces to 0 XOR x which equals x when considering XOR's definition. Thus, we found $U_{f^{-}}$, it was $U_{f}$, meaning our operation is invertible. To prove the operation is unitary, we must also show that f(x) XOR x is a unitary transformation. XOR classically can be implemented with the CNOT gate, a unitary gate. Okay, that was a cop-out, but you can prove to yourself the classical XOR output is the control qubit in CNOT and the other two values remain unchange in CNOT.
All of that was to prove that $f(x)$ could exist on a quantum computing. But, such an observation cannot be taken for granted, we are literally reinventing the wheel here.
Suppose I initialize the first $n$-qubits with $0$ and apply the hadamard gate to each individually, $H\ket{0} = \frac{1}{\sqrt{2}}(\ket{0} + \ket{1})$. Thus, $\psi = \frac{1}{\sqrt{2^{n}}} (\bigotimes)^{n} (\ket{0} + \ket{1})$, when noting an application of $H$ on one qubit is independent of that to the other and considering the permutation, found through a mathmatical operation called the tensor product of the individual states being applied n times, $\bigotimes$. Next, suppose the next qubit is initalized to $\ket{1}$ when $H$ is applied, one gets $H\ket{1} = \frac{1}{\sqrt{2}}(\ket{0} - \ket{1})$, by definition of $H$. Thus, the total state is just the product of the first $n$-qubits with the last qubit: $\ket{\psi} = \frac{1}{\sqrt{2^{n + 1}}}\sum_{x \in {0, 1}^{n}} \ket{x} (\ket{0} - \ket{1})$. When applying my function, I get $U_{f}\ket{\psi} = \frac{1}{\sqrt{2^{n + 1}}}\sum_{x \in {0, 1}^{n}} U_{f}(\ket{x} (\ket{0} - \ket{1}))$ which simplifies to $\ket{\psi} = \frac{1}{\sqrt{2^{n + 1}}}\sum_{x \in {0, 1}^{n}} \ket{x} (\ket{0 \oplus f(x)} - \ket{1 \oplus f(x)})$.
If $f(x) = 1$, the state becomes $-\ket{0} + \ket{1}$ whereas if $f(x) = 0$, the state becomes $\ket{0} - \ket{1}$. Thus, $(-1)^{f(x)}(\ket{0} - \ket{1}) = \ket{f(x)} - \ket{1 \oplus f(0)}$. Using the identity simplifies our expression to the following:
$\ket{\psi} = \frac{1}{\sqrt{2^{n + 1}}} \sum_{x \in {0, 1}^{n}} (-1)^{f(x)}\ket{x}(\ket{0} - \ket{1})$. Now, let us apply the hadamard gates to the first $n$ and measure the last one. The last one gives us $\ket{0}$ with 50% probability and $\ket{1}$ with 50% probability.
Now, $\ket{\psi} = \frac{1}{\sqrt{2^{n + 1}}} \sum_{x \in {0, 1}^{n}} (-1)^{f(x)}\ket{x}$, which is easier to work with.
To continue, an ancillary proof will be made. It will be shown that $H_{n}\ket{a} = \frac{1}{2^{n/2}}\sum_{x} (-1)^{a \cdot x}\ket{x}$. For $\ket{0}$, $\frac{1}{\sqrt{2}}(\ket{0} + \ket{1})$. For $\ket{1}$, $\frac{1}{\sqrt{2}}(\ket{0} - \ket{1})$. Both are consistent with the formula, as $(-1)^{0} = 1$ and $(-1)^{1} = -1$. Thus, the identity has been proven for $n = 1$.
Suppose $H_{n - 1}\ket{a_{1}...a_{n - 1}} = \frac{1}{2^{n/2}}\sum_{x} (-1)^{a_{1..n-1} \cdot x_{1..n-1}}\ket{x_{1}...x_{n-1}}$. As each index represents an independent basis element, one can represent $H_{n} = H_{n-1}\ket{x_{1}...x_{n-1}} \bigotimes H\ket{x_{n}}$.
$H_{n}\frac{1}{2^{n/2}}\sum_{x} (-1)^{a_{1..n-1} \cdot x_{1..n-1}}\ket{x_{1}...x_{n-1}} \bigotimes H\frac{1}{\sqrt{2}}(-1)^{a_{n}x_{n}}\ket{x_{n}}$.
Either $a_{n} = 0$ or $a_{n} = 1$, thus the base case applies and by inductive hypothesis, $\ket{a_{1}...a_{n-1}} \bigotimes \ket{a_{n}}$ which gives $\ket{a_{1}...a_{n}}$. Thus, the inductive step has been proven. Thus, $H_{n}\ket{a_{1}...a_{n}} = \frac{1}{2^{n/2}}\sum_{x} (-1)^{a_{1..n} \cdot x_{1..n}}\ket{x_{1}...x_{n}}$ for all $n$. Now, all states are linear combination of the basis states, thus this proof generalizes to all possible inputs.
With the following proof, $\ket{\psi} = \frac{1}{2^{n}} \sum_{x \in {0, 1}^{n}} \sum_{y \in {0, 1}^{n}} (-1)^{f(x)} (-1)^{\bra{x}\ket{y}} \ket{y}$.
The probability of measuring $\ket{0}^{n}$ is $|\frac{1}{2^{n}}\sum_{x \in {0, 1}^{n}} (-1)^{f(x)}|^2$. If $f(x)$ is constant, it sums to one and, if $f(x)$ is balanced, the $\ket{0}$ and $\ket{1}$ cases cancel making the probability zero. Exactly what we want! And, we only needed one call to $U_{f}$! Thus, we could that a quantum computer could do this in $O(1)$ while a classicaly computer would take $O(2^{n})$.
Okay, that was just math. How would we motivate such an algorithm? Well, the oracle multiplies by $-1$ to the states with $\ket{1}$ and acts as identity with states with $\ket{0}$. $H^{n}$ on zero produces a superposition with no negative phase between its states. As $H^{n}$ is its own inverse, it makes sense that any $-1$ value in any of the positions would give us zero. If all the values are $\ket{1}$, the negative can be factored out as global phase, which is not measurable and, thus, does not effect the result of the computation. If all the values are $\ket{0}$, no negative phase exists making it return all zeros. But, that extra qubit essentially "kicked" its negative phase to the states without any relative phase, which is rather odd weird feature. Our application of $U_{f}$ effected our input? How, our output is literally just our input! Well, it must be measured to be our input, but we cannot measure this "quantum" phase. Now, the structure of the operator, our found identity, made this possible. If a relative phase exists between any of our superpositioned states, it must be the case that we yield a different answer from $H^{n}$ as it definitionally, when considering it as a fourier transform, takes different pure "phase" states, which be denoted as a $0$ or $1$ for having it or not having it thus forming an equal basis for our system. In fact, we could measure in the $H$ basis, how weird is that? Anyway, hopefully, you see how important this concept of phase is, it can fundamentally change our probability of measurement! In fact, thinking of our -1s and 1s as cancelling and the implication of me saying "phase" encourages one to call this an "interference" effect.
Quantum mechanically, the analogy is even stronger literally corresponding to interference of light! But, I will leave that to physics textbooks. Right now, appreciate this weird effect's power. It turned global information, information of our function, about our function into local information, the one state we measured. Also, note, we only needed one bit of information and it took one query. We didn't somehow get more information that we could by evaluating a function; we simply changed the information given to us. In essence, this is all we could ask for by a quantum computer, at least if we deterministically want an answer. If you are interested, you can look at QRAM and other probabilistic data recovery to see how we can kind of cheat this a bit.