Discrete Fourier Transform to Fast Fourier Transform - ECE-180D-WS-2023/Knowledge-Base-Wiki GitHub Wiki
(Updated by Alexander Zhang: expanded introduction; added application, conclusion, and reference sections)
Introduction
Did you know that the Fast Fourier Transform algorithm could have stopped the nuclear arms race if it had been (re)discovered earlier? Regardless of this missed opportunity, the algorithm is still arguably one of the most powerful tools ever invented. Every form of digital signal processing you can think of, including Wi-Fi, 5G, sonar, radar, LiDAR, the music you listen to, and the last YouTube video you watched, relies on the FFT. In other words, the FFT is the strongest bridge between the time and frequency domains that we have.
To gain a deeper appreciation for the Fast Fourier Transform (FFT), it is important to understand its advantages over other signal transforms. The Fourier Transform (FT) provides a continuous signal in both the time and frequency domains, while the Discrete-Time Fourier Transform (DTFT) offers a discrete signal in the time domain but is continuous in the frequency domain. Unfortunately, the limitation of these transforms is that they require infinite data points to perform an accurate analysis, which is impossible for our computers to handle. However, this issue can be resolved by implementing the Discrete Fourier Transform (DFT), which provides a discrete signal in both domains. Despite this, the DFT can still be computationally demanding due to the large number of calculations required. This is where the FFT shines, as it significantly reduces the number of computations needed to perform the same analysis, making it a more efficient choice for signal processing.
How does Discrete Fourier Transform (DFT) work?
To understand the importance of the FFT, let's first examine the number of calculations required to perform a DFT.
Suppose we have a discrete signal x[n] of length N in the time domain. Performing the DFT on x[n] will give us its frequency domain representation X[k] of the same length, where k is an integer. The DFT of x[n] is given by
$$X[k] = \sum_{n=0}^{N-1} x[n]e^{-jk\omega_0n}$$
where $x[n]$, $X[k]$ are of length $N$ and $\omega_0 = 2\pi/N$
$\omega_0$ is the fundamental frequency of the signal, so each $k\omega_0$ corresponds to a different frequency. To analyze each sample at each frequency, each $x[n]$ is multiplied by the frequency term $e^{-jk\omega_0n}$. The resulting summation indicates how much each frequency appears in the original signal^1.
For a better understanding, let's look at an example. Suppose we have $x[n] = u[n] - u[n-10]$, where $u[n]$ is the discrete unit step^2
figure 1: $x[n] = u[n] - u[n-10]$
The graph shows that these sample points form a constant line, which we can intuitively interpret as a signal containing only the zero frequency (DC). Let's verify that the DFT agrees with this intuition. $$X[k] = \sum_{n=0}^{9}x[n]e^{-jk\frac{2\pi}{10}n} = \sum_{n=0}^{9} (u[n] - u[n-10])e^{-jk\frac{2\pi}{10}n} = \sum_{n=0}^{9}(e^{-jk\frac{2\pi}{10}})^n = \frac{1-e^{-jk\frac{2\pi}{10}(10)}}{1-e^{-jk\frac{2\pi}{10}}}$$
$$X[0] = \lim_{k\to 0} \frac{1-e^{-j2\pi k}}{1-e^{-jk\frac{2\pi}{10}}} = 10$$
$$X[1] = X[2] = X[3] = \dots = X[9] = 0$$ The result confirms our intuition: $X[0] = 10$ indicates that all of the signal's content lies in the zero-frequency (DC) bin.
Let's consider the complexity of the DFT. In the example we just explored, there are 10 distinct frequencies. The expression $$\sum_{n=0}^{9}x[n]e^{-jk\frac{2\pi}{10}n}$$ shows that each frequency bin requires multiplying all 10 samples, resulting in 10 x 10 = 100 complex multiplications. Therefore, the complexity of the DFT is $O(N^2)$. Processing one thousand samples would require one million operations. In the early days of the nuclear arms race, the DFT was used for detecting underground nuclear tests; reportedly, processing a single day's worth of recordings would have taken three years.
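The nested-loop structure behind the $O(N^2)$ count can be sketched directly. This is a minimal NumPy illustration (the function name `naive_dft` is ours, not a standard API), applied to the rectangular pulse from the example above:

```python
import numpy as np

def naive_dft(x):
    """Direct DFT: X[k] = sum_n x[n] * exp(-j*k*(2*pi/N)*n)."""
    N = len(x)
    n = np.arange(N)
    X = np.zeros(N, dtype=complex)
    for k in range(N):                                      # N frequency bins...
        X[k] = np.sum(x * np.exp(-2j * np.pi * k * n / N))  # ...N products each
    return X                                                # N*N multiplications total

# The rectangular pulse from the example: x[n] = u[n] - u[n-10]
x = np.ones(10)
X = naive_dft(x)
# X[0] is 10 and every other bin is numerically zero, matching the derivation
```

Note that the loop over `k` wraps a length-$N$ inner product, which is exactly where the $N \times N$ operation count comes from.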
How fast is the FAST Fourier Transform?
Suppose we have a signal $x[n]$ of length $2^p$. According to the DFT equation, we need to multiply our time domain signal $x[n]$ with a complex exponential $e^{-jk\omega_0n}$, which can be split into a sine and cosine wave at different frequencies depending on the value of $k$. The FFT algorithm reduces the number of computations by eliminating all the redundant operations. This is done by observing the cosine and sine behaviors at even and odd indexes separately.
Consider a signal of length $N=8$. According to our DFT equations, each sample in the signal will be multiplied by a complex exponential (consisting of a cosine and sine wave) at $k=0, 1,...,7$. For the purpose of this discussion, let's focus only on even indexes and the cosine portion of the complex exponential. By examining Figures 3 and 4, we can see that if we compare the first half with the second half of cosine indexes (e.g., $k=0$ with $k=4$, $k=1$ with $k=5$, etc.), there is overlapping at every location where there is a sample. This implies that each sample is being multiplied by the same value twice. Therefore, multiplication with the second half frequencies of the cosine wave ($k=4, 5,...,7$) is unnecessary. A similar pattern applies to odd indexes, except that the sign is flipped. A simple sign change requires zero floating-point operations (flops). This initial computation round alone reduces the calculation by half^3.
figure 2: $x[n] = \cos(k\frac{2\pi}{N}n), N = 8$
figure 3: $\cos(k\frac{2\pi}{8}n), k = 0, k = 4$
figure 4: $\cos(k\frac{2\pi}{8}n), k = 1, k = 5$
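The overlap described above can be checked numerically. This short sketch (our own verification, assuming $N = 8$ as in the figures) confirms that for every even sample index the cosines at $k$ and $k + N/2$ coincide, while at odd indices they differ only in sign:

```python
import numpy as np

# Compare cos(k * 2*pi/N * n) against cos((k + N/2) * 2*pi/N * n) for N = 8
N = 8
n = np.arange(N)
for k in range(N // 2):
    c_lo = np.cos(2 * np.pi * k * n / N)             # first-half frequency
    c_hi = np.cos(2 * np.pi * (k + N // 2) * n / N)  # second-half frequency
    assert np.allclose(c_lo[0::2], c_hi[0::2])   # even samples: identical
    assert np.allclose(c_lo[1::2], -c_hi[1::2])  # odd samples: sign flipped
```

This is the identity $\cos(\theta + \pi n) = (-1)^n \cos(\theta)$ in action: at even $n$ the factor is $+1$, at odd $n$ it is $-1$, so the second-half multiplications are redundant up to a free sign change.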
To see the real impact of the FFT algorithm, we repeat this splitting process until we are left with 2-point DFTs. Exactly how many times can we split the indices before reaching a 2-point DFT? Since $N = 2^p$, we have $p = \log_2 N$ splits. Therefore, the complexity is reduced to $O(N\log_2 N)$. Using the DFT, computing the Fourier transform of one thousand samples would require 1,000,000 complex multiplications, but the FFT needs only around 9,966. Figure 5 shows the computing time of the DFT versus the FFT for various signal lengths. Notice that the greater the value of $N$, the more the FFT outperforms the DFT. If $N$ is not a power of 2, we can still use the FFT algorithm by appending zero-valued samples to the end of the signal until its length becomes a power of 2. This technique, called zero-padding, does not add any new frequency content to the signal, although it does change the spacing of the DFT's frequency bins.
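The repeated even/odd split is easiest to see in code. Below is a textbook-style recursive radix-2 Cooley-Tukey sketch (the function name `fft_radix2` is ours), checked against NumPy's `np.fft.fft`:

```python
import numpy as np

def fft_radix2(x):
    """Recursive radix-2 Cooley-Tukey FFT; len(x) must be a power of 2."""
    N = len(x)
    if N == 1:
        return np.asarray(x, dtype=complex)
    even = fft_radix2(x[0::2])   # DFT of even-indexed samples (length N/2)
    odd = fft_radix2(x[1::2])    # DFT of odd-indexed samples (length N/2)
    tw = np.exp(-2j * np.pi * np.arange(N // 2) / N)  # "twiddle" factors
    # Butterfly: each half-size result is reused twice, once with a sign flip,
    # which is exactly the even/odd redundancy described above
    return np.concatenate([even + tw * odd, even - tw * odd])

x = np.random.rand(16)
# fft_radix2(x) agrees with np.fft.fft(x) to floating-point precision
```

Each level of recursion does $O(N)$ work and there are $\log_2 N$ levels, giving the $O(N\log_2 N)$ total.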
figure 5: DFT vs FFT Performance
The FFT algorithm has transformed digital signal processing by greatly reducing the computational complexity required for computing a signal's Fourier transform. This efficiency provides us with more time and computing power to tackle complex problems. As Gilbert Strang famously stated, "FFT is the most important numerical algorithm of our lifetime," highlighting the significant impact of this algorithm.
Application of Fast Fourier Transform
FFT plays a crucial role in audio compression by converting time-domain signals into the frequency domain, and this section explores that application. First, some background on audio quality: the quality of a captured audio signal depends on its sampling rate and bit depth. The sampling rate indicates how many times the audio waveform is sampled per second, while the bit depth represents the resolution of each sample. Higher sampling rates and bit depths yield better audio quality but require more storage space. FFT-based audio compression strikes a balance between audio quality and storage needs. The FFT reduces the complexity of the Discrete Fourier Transform (DFT) from $O(N^2)$ to $O(N\log_2 N)$, significantly speeding up the process for large $N$. The Cooley-Tukey algorithm, the best-known FFT algorithm and one used in MP3 file compression, recursively divides the DFT into smaller DFTs of even- and odd-indexed samples.
Before delving into the audio compression process, it's essential to understand the importance of selecting a window function. Applied to each frame of the audio signal, the window function minimizes spectral leakage and enhances frequency resolution. Mathematically, the window function $w[n]$ multiplies the audio signal frame $x[n]$ sample by sample, producing the windowed frame $$x_w[n] = x[n] \cdot w[n], \quad n = 0, 1, \dots, N-1.$$ The choice of window function significantly impacts the quality of the frequency-domain representation. Common choices include the rectangular, Hamming, and Hann (often called Hanning) windows; each trades off main-lobe width (frequency resolution) against sidelobe level (spectral leakage), ultimately impacting overall compression performance.
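The leakage reduction can be seen concretely. In this sketch of ours, a test tone is deliberately placed between DFT bins (the worst case for leakage), and a Hamming window is used as the example choice:

```python
import numpy as np

N = 512
n = np.arange(N)
frame = np.sin(2 * np.pi * 10.5 * n / N)  # tone between bins -> worst-case leakage

w = np.hamming(N)        # tapers smoothly toward the frame edges
x_w = frame * w          # x_w[n] = x[n] * w[n], element-wise

# Compare spectral magnitude far away from the tone's 10.5-bin location:
rect_leak = np.abs(np.fft.fft(frame))[50:200].sum()   # rectangular (no window)
hamm_leak = np.abs(np.fft.fft(x_w))[50:200].sum()     # Hamming-windowed
# hamm_leak comes out far smaller: the window suppressed the leakage
```

The rectangular window's abrupt frame edges spread the tone's energy across distant bins, while the Hamming taper confines it near the tone's true frequency.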
Here is a general overview of the FFT algorithm's application in audio compression:
- Capture the audio signal: Record or obtain an audio signal, typically in the form of a digital waveform.
- Divide the signal into frames: Segment the audio waveform into small, overlapping frames of equal length.
- Apply the window function: Multiply each frame by a window function to minimize spectral leakage and improve frequency resolution in the transformed signal.
- Perform FFT: Compute an FFT for each windowed frame, converting the time-domain signal to the frequency domain and resulting in a complex-valued spectrum for each frame.
- Quantization and encoding: Quantize the frequency-domain data by selecting only the most significant frequency components and discarding the less important ones. This step compresses the data. Then, encode the remaining frequency components using an efficient encoding scheme.
- Store or transmit the compressed data: Save the compressed frequency-domain data, along with the necessary metadata for decompression.
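The steps above can be condensed into a toy sketch. This is a deliberately simplified illustration with invented names (`compress_frames`, `decompress_frame`); real codecs such as MP3 use perceptual models and an MDCT rather than this bare top-K coefficient selection:

```python
import numpy as np

def compress_frames(signal, frame_len=512, hop=256, keep=32):
    """Toy FFT-based compressor: per windowed frame, keep only the `keep`
    strongest frequency components and discard the rest (lossy)."""
    w = np.hamming(frame_len)                                 # window each frame
    frames = []
    for start in range(0, len(signal) - frame_len + 1, hop):  # overlapping frames
        X = np.fft.rfft(signal[start:start + frame_len] * w)  # to frequency domain
        top = np.argsort(np.abs(X))[-keep:]                   # select significant bins
        frames.append((top, X[top]))                          # store sparse spectrum
    return frames

def decompress_frame(idx, vals, frame_len=512):
    """Rebuild one (windowed) frame from its retained components."""
    X = np.zeros(frame_len // 2 + 1, dtype=complex)
    X[idx] = vals
    return np.fft.irfft(X, frame_len)

sr = 8000
t = np.arange(sr) / sr
sig = np.sin(2 * np.pi * 440 * t)            # one second of a 440 Hz tone
frames = compress_frames(sig)
recon = decompress_frame(*frames[0])         # close to the first windowed frame
```

Because the tone's energy concentrates in a few bins after windowing, keeping 32 of the 257 `rfft` bins reconstructs the frame almost perfectly, while a noisier signal would lose more detail; that is the lossy trade-off in miniature.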
Comparing FFT-based audio compression with other techniques, each has its strengths and weaknesses. Uncompressed formats such as Pulse Code Modulation (PCM) and lossless codecs such as FLAC preserve the full quality of the original audio but require more storage space and bandwidth. Lossy codecs such as MP3 and AAC, which build on Fourier-type transforms, are more efficient in terms of storage and transmission but compromise audio quality to varying degrees. FFT-based lossy compression offers a balance, providing sufficient audio quality for most purposes while significantly reducing data size.
Conclusion
Overall, the Fast Fourier Transform (FFT) is a critical component of modern digital signal processing that has revolutionized numerous sectors, including telecommunications, audio technology, and defense systems. By providing an efficient way to transform time-domain signals into the frequency domain, it paved the way for highly efficient audio compression, where its ability to balance audio quality against storage needs has made it a backbone technology.

From facilitating online streaming services to playing a key role in the 5G revolution, the FFT's impact is wide-ranging and profound. While the FFT has its limitations and is not always the optimal choice, its versatility and computational efficiency make it a robust solution across a broad spectrum of applications, and as our understanding and technology advance, we will likely continue to find new and innovative uses for it.

Despite its inception over half a century ago, the FFT continues to have a substantial impact on our lives, often in ways that are not immediately obvious, and its importance will only grow as we move further into the digital age. It stands as a testament to the transformative power of mathematical algorithms in shaping the world we live in today and the future.