The Ultimate Guide to Mechanics of the FFT Algorithm
with Applications in Signal Processing
Contents
- Introduction
- What is the Fourier Transform?
2.1 Time Domain
2.2 Frequency Domain
2.3 The Fourier Transform: A Mathematical Perspective
- The Limitation of the Traditional Discrete Fourier Transformation Calculation
- The FFT Algorithm
- The Matrix View
- The Bottom Line
Note: The entire code file used in this article is available at the following repository: https://github.com/namanlab/FFT_Algorithm_Code
Introduction
The very foundations of the world around us, from the behavior of quantum particles to the movement of large celestial bodies, are arguably governed by algorithms. As the silent architects of our digital cosmos, they are like the gods of the modern era, sculpting the contours of our technological reality. Algorithms are omnipotent, commanding the intricacies of reality with unmatched precision. They are omnipresent, quietly shaping our experiences, guiding our technology, and influencing the very fabric of our interconnected world. Finally, they exhibit an omniscient prowess, helping mankind decipher complex patterns and navigate the vast realms of data buried within nature.
An algorithm isn't just a method or technique for doing something; it's about doing it efficiently, in a way that saves us time and space, the two constraints that essentially gave rise to the entire field of their study. In this article, we will explore one of the most brilliant algorithms of the past century: the Fast Fourier Transform (FFT) algorithm. The FFT helped us solve one of the biggest challenges in audio signal processing, namely computing the discrete Fourier transform of a signal in a way that is not only time efficient but also extremely beautiful. I hope that by the end of this article, you will be able to appreciate the sheer elegance of this revolutionary algorithm. Let's begin!
What is the Fourier Transform?
Note: If you’re already familiar with the concept of a Fourier transform, you may skip this section and proceed to the next one, where I’ll talk about the need for the FFT algorithm and how it works.
The Fourier transform is essentially a mathematical technique that allows us to convert a signal from its time domain to its frequency domain. But what do we even mean by the time and frequency domains? To understand this, let us first think about a more fundamental question: what is a signal?
In the simplest sense, a signal is just a variation in a physical quantity. This physical quantity could be anything measurable: speed, voltage, current, pressure, energy, you name it. Signals can be broadly categorized into two domains: time domain and frequency domain.
Time Domain
In the time domain, a signal is represented as a function of time. In other words, we can plot the signal's behavior with respect to time and observe how it changes over a specific time interval.
For example, if you are measuring the voltage across a resistor in an electrical circuit, the signal in the time domain would show you how the voltage varies at different points in time. Similarly, the time domain representation of a sound just shows how the amplitude of the sound wave (which is just the extent of the air pressure variation) varies over time. The signal could be represented mathematically as a function:
x(t) = 2t² + 2t − 1
This is the continuous representation of the signal x as a direct function of time, t (in seconds). However, for most practical applications, we don't know the true functional form of the signal. All we have is a discrete sample of the signal (at different points of time) that could be represented as a simple vector, for instance (the values here are illustrative):
x = (1, 2, 0, 5, 9, 2, 0, 4)
The vector just shows the value of x at 8 different (equally spaced) points of time. The spacing between the time intervals is called the time period, T, of the signal. So, if the signal was sampled at intervals of 2 seconds each, the time period would be T = 2 seconds.
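To make the sampling step concrete, here is a minimal sketch (the spacing and number of samples are illustrative, not from the original figures) that samples the quadratic signal from above at 8 equally spaced points:

import numpy as np

# The continuous signal from the example above
def x(t):
    return 2 * t**2 + 2 * t - 1

T = 2                          # time period (sampling interval) in seconds
t_samples = np.arange(8) * T   # 8 equally spaced sample times: 0, 2, ..., 14
x_samples = x(t_samples)       # the discrete vector representation of the signal
print(x_samples)               # [ -1  11  39  83 143 219 311 419]

In practice, of course, we go the other way around: a sensor hands us the discrete vector, and the underlying continuous function stays unknown.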
Frequency Domain
In the frequency domain, a signal is represented as a function of frequency. Instead of analyzing how a signal changes over time, the focus is on its frequency components or the different frequencies present in the signal. This may be a bit more difficult to understand, so let’s spend some more time talking about this with an example of sound waves. Imagine you’re listening to a piece of music on the radio. In the time domain, you experience the music unfolding over time — how the melody progresses, the rhythm of the beats, and the duration of each note.
Now, let's switch to the frequency domain. Think of the frequency domain as if you're looking at the music from a different perspective. Instead of focusing on how the music evolves over time, you're interested in the individual tones or pitches that make up the overall sound. Imagine you could isolate the specific musical notes, such as the deep bass, the mid-range tones, and the high-pitched elements. How cool would that be? Think about what constitutes the music: the individual instruments and the singers. Each instrument and voice present in the music has a unique signature in the frequency domain. The bass guitar might dominate the lower frequencies, the vocals may cover a broad range, and the cymbals and hi-hats contribute to the higher frequencies. That's where the frequency domain steps in as a superhero of sorts; it allows you to break down the complex mixture of sounds into its constituent parts. In essence, it provides a different viewpoint, focusing on the building blocks of a signal's sound rather than its progression over time.
In its frequency domain, the signal could be represented as a continuous function like y(f) = 2f² + 3 or, akin to the time domain, as a simple vector such as y = (1, 2, …).
The vector just shows the amplitude (the extent of the presence) of the different frequency components. The first element (1) could represent the amplitude of the lowest frequency component (say 1 Hz; hertz is the unit of frequency). Likewise, the second element (2) could represent the amplitude of the next frequency component, and so on.
The Fourier Transform: A Mathematical Perspective
Now that we have some idea of how a signal can be represented, visualize the Fourier transform as a magical lens that allows you to switch your view between the two representations of a signal. It acts as a bridge between the time and frequency domains, allowing us to analyze and understand signals from both perspectives.
Now, let us analyze what I just said using some math. The Fourier transform is a function that takes a signal in its time domain as input and decomposes it into a sum of sine and cosine waves of varying frequencies, each with its own amplitude and phase. The resulting representation is nothing but the frequency domain (or what we also call the spectrum) of the signal. Mathematically, the Fourier transform of a continuous signal in its time domain x(t) is defined as follows:
y(f) = ∫_{−∞}^{+∞} x(t) · exp{−2πi·ft} dt
where i = √(−1) is the imaginary unit. Yes, the Fourier transform yields a complex output, one that includes both a magnitude and a phase. Nevertheless, in many practical scenarios, our focus is primarily on the magnitude of the transform, and we often disregard the accompanying phase. Given that digitally processed signals are discrete, we can establish the discrete Fourier transform (DFT) as the analogous counterpart:
y(f) = Σ_j x(jT) · exp{−2πi·f·jT} (summed over all integer sample indices j)
Here we have simply replaced the integral with a sum, as we may only have discrete time samples and the true functional form of the signal may be unknown to us. Instead of an infinite number of samples, suppose we have a finite number of samples, call it N: the number of time samples, or the length of the vector representing the signal. Then we get the so-called short-term Fourier transform of the signal:
y(f) = Σ_{j=0}^{N−1} x(jT) · exp{−2πi·f·jT}
where T is the time period. The above function can be computed for any value of f, and its magnitude shows us the extent (the power) to which that particular frequency component is present in the signal. For instance, take the 8-sample vector x = (1, 2, 0, 5, 9, 2, 0, 4) (the same one we will use later in this article) and compute the Fourier transform at f = 0.5 and f = 0.25.
Suppose the values of x are measured in intervals of T = 1 second each. Then,
y(0.5) = Σ_{j=0}^{7} x_j · exp{−πij} = Σ_{j=0}^{7} x_j · (−1)^j = 1 − 2 + 0 − 5 + 9 − 2 + 0 − 4 = −3
y(0.25) = Σ_{j=0}^{7} x_j · exp{−πij/2} = Σ_{j=0}^{7} x_j · (−i)^j = (1 + 9) + (−2 + 5 − 2 + 4)i = 10 + 5i
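If you'd like to double-check these two values numerically, here is a quick sketch (assuming the same vector and T = 1 as above):

import numpy as np

# Numerical check of the two hand-computed values above
x = np.array([1, 2, 0, 5, 9, 2, 0, 4])
T = 1  # seconds

def ft_at(x, f, T):
    # y(f) = sum_j x_j * exp(-2*pi*i*f*j*T)
    j = np.arange(len(x))
    return np.sum(x * np.exp(-2j * np.pi * f * j * T))

print(np.round(ft_at(x, 0.5, T), 5))   # approximately (-3+0j)
print(np.round(ft_at(x, 0.25, T), 5))  # approximately (10+5j)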
These calculations rely on some basic properties of complex numbers, most importantly Euler's identity: exp{πi} = −1. The output essentially allows us to compare the presence of different frequency components. This leads us to the next question: what values of f do we consider? Theoretically, we could obtain the value of the Fourier transform for any value of f, so it becomes imperative to find the right range of values of f for which the Fourier transform gives a good picture of the underlying signal and is also invertible, i.e., it can be used to recover the time domain. For most practical applications, we only consider frequency bins that are an integral multiple of 1/(NT), where NT is the total duration of the signal (the number of samples N times the duration of each sample, T). The reason for this is closely related to the concept of sampling and the Nyquist-Shannon sampling theorem, an idea that is not particularly relevant to this article. But if you're curious about it, feel free to refer to this page.
Before moving forward, let's take a moment to summarise everything we've covered so far: a signal is just a variation in a physical quantity that can be expressed as a function of time (time domain) or as a function of frequency (frequency domain). Both these representations are equivalent (i.e., one can give us the other), and the Fourier transform is the method for converting one representation into the other. The Fourier transform of a continuous signal is represented as a continuous function in the frequency domain. However, when we work with digital signals (discrete-time signals), we sample the continuous signal at discrete points. This gives us the following formula for computing the short-term discrete Fourier transform of a signal:
y(f) = Σ_{j=0}^{N−1} x_j · exp{−2πi·f·jT}
Since we only consider frequencies that are an integral multiple of 1/(NT), i.e., f = k/(NT), we get the following:
y(k/(NT)) = Σ_{j=0}^{N−1} x_j · exp{−2πi·jk/N}
where i is the complex number √(−1), j is the index of the sample in the signal, and k is the index of the frequency bin for which we're computing the power. Since it's cumbersome to write y(k/(NT)) again and again, we simply define a new function:
F(k) := y(k/(NT)) = Σ_{j=0}^{N−1} x_j · exp{−2πi·jk/N}
And that, my friends, is the equation for the Fourier transform that we commonly encounter in textbooks. If you're still unsure about how and why this works, here's an excellent explanation of Fourier transforms. Before we move on, here's a quick sketch of the actual frequencies (in Hz) that the bins k correspond to, for an illustrative N = 8 samples spaced T = 1 second apart (values assumed for the example):
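import numpy as np

# The frequency (in Hz) that each bin k corresponds to: k / (N*T)
N, T = 8, 1.0
freqs = np.arange(N) / (N * T)
print(freqs)  # [0. 0.125 0.25 0.375 0.5 0.625 0.75 0.875]

For the next section of this article, we will put in some real data to calculate the complete Fourier transform of a signal, write up some code, and discover the limitations of the traditional approach to computing the Fourier transform of a signal. This will eventually lead us to the core of this article: the FFT algorithm.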
The Limitation of the Traditional Discrete Fourier Transformation Calculation
Let's start this section by calculating the Fourier transform of a simple signal consisting of just 8 samples:
x = (1, 2, 0, 5, 9, 2, 0, 4)
As we can see from the formula, we don't care about the time period, i.e., the intervals at which the quantity is measured, as long as it is sampled uniformly. Now, we may proceed to find the Fourier transform by plugging the values into the formula for different values of k (ranging from k = 0 to k = N − 1 = 7). Consequently, we need to calculate the following:
F(k) = Σ_{j=0}^{7} x_j · exp{−2πi·jk/8}, for k = 0, 1, …, 7
Let us simplify the calculations by taking α = exp{−2πi/N}. Thus, all we need is:
F(0) = x₀ + x₁ + x₂ + … + x₇
F(1) = x₀ + x₁α + x₂α² + … + x₇α⁷
F(2) = x₀ + x₁α² + x₂α⁴ + … + x₇α¹⁴
and so on for all 8 values of k. This is quite a lengthy calculation. Can we do better? Of course: we can use a simple Python program to do the job for us. Here's a traditional (referred to as brute force) approach that simply goes through every element in the vector and computes the required term for all values of k from 0 to N − 1:
import numpy as np

def simple_dft(signal):
    # Get the number of samples in the signal
    N = len(signal)
    # Initialize an empty list to store the result (DFT coefficients)
    res = []
    # Iterate over each frequency bin (k)
    for k in range(N):
        # Initialize the current DFT coefficient for the given frequency bin
        cur_value = 0
        # Iterate over each sample in the signal (j)
        for j in range(N):
            # Calculate the complex exponential term and accumulate
            cur_value += signal[j] * np.exp(-2 * np.pi * 1j * j * k / N)
        # Append the result for the current frequency bin to the list
        res.append(np.round(cur_value, 5))
    # Return the list of DFT coefficients
    return res
simple_dft([1, 2, 0, 5, 9, 2, 0, 4])
# Output: [(23+0j), (-8.70711-0.70711j), (10+5j), (-7.29289-0.70711j),
# (-3-0j), (-7.29289+0.70711j), (10-5j), (-8.70711+0.70711j)]
Using this function, we easily get the 8 DFT coefficients required. We can also verify our calculations using the fft function provided by numpy:
# Compute the FFT using NumPy's fft function
a = np.fft.fft([1, 2, 0, 5, 9, 2, 0, 4])
# Compute the DFT using our simple_dft function
b = simple_dft([1, 2, 0, 5, 9, 2, 0, 4])
# Check if the results are element-wise close within a tolerance
print(np.allclose(a, b))
# Output: True
Cool, we get the correct result! But is this method of calculation really efficient? What do you think the time complexity of this function is? It involves two nested for loops, each iterating over the entire range of values from 0 to N − 1. Thus, the time complexity is of the order O(N²). This may not seem too bad, but for most practical applications, an O(N²) time complexity means painfully slow results. Let's put this into numbers.
Suppose we are computing the Fourier transform of an audio sample that's just 10 minutes long. For most traditional applications, we often sample at a rate of 22,050 Hz, i.e., we measure 22,050 samples every second (this may seem like a lot, but it really isn't; this is one of the most commonly used sampling rates for preserving audio quality). This means that we have about 10 × 60 × 22,050 = 13,230,000 samples. To take the DFT of this sample, we will therefore need at least N² = 13,230,000² ≈ 175,032,900,000,000 computations, and that's a lot! If you're using something like C++ (one of the most efficient programming languages), the most you can hope for is roughly 2 × 10⁸ computations per second. This means that calculating the DFT of a short 10-minute audio clip would take 175,032,900,000,000 / 200,000,000 ≈ 875,164.5 seconds, or about 10 days! This would make it practically impossible to compute the DFT of large signals, rendering the application of the Fourier transform quite limited. But fear not! Enter the Fast Fourier Transform (FFT), the magical algorithm that swoops in, making DFT computations lightning-fast. It reduces the time complexity of the DFT calculation from O(N²) to a mere O(N log N). For the 10-minute sample, we would now require only about 13,230,000 × ln(13,230,000) ≈ 216,945,507 floating point operations, which translates to a mere 1.08 seconds, vastly more efficient than the traditional DFT approach. Here's the same back-of-the-envelope arithmetic as a quick sketch (the ~2 × 10⁸ operations/second throughput is an assumption, and constant factors are ignored):
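import numpy as np

# Back-of-the-envelope check of the numbers above
N = 10 * 60 * 22050                 # samples in a 10-minute clip at 22,050 Hz
ops_per_sec = 2e8                   # assumed throughput: ~2 * 10^8 ops/second
print(N**2 / ops_per_sec / 86400)   # brute-force DFT: about 10.1 days
print(N * np.log(N) / ops_per_sec)  # FFT (natural log, constants ignored): about 1.08 s

This means we're not just limited to small signals: the FFT unleashes the power of Fourier transforms on massive datasets. Cool, right? But how does the algorithm even work, and what makes it so efficient? This leads us to the next section of this article: the FFT algorithm!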
The FFT Algorithm
The core idea of the FFT lies in an inherent symmetry of the Fourier transform that helps us eliminate redundant calculations. The FFT works by harnessing the symmetry of the DFT computation and feeding it into an elegant recursive divide-and-conquer scheme, effectively reducing the time complexity from O(N²) to O(N log N). But what is this symmetry we are talking about? Recall the formula for the DFT computation:
F(k) = Σ_{j=0}^{N−1} x_j · exp{−2πi·jk/N}
What happens if we use this formula for N + k instead of k? Let's see:
F(N + k) = Σ_{j=0}^{N−1} x_j · exp{−2πi·j(N + k)/N} = Σ_{j=0}^{N−1} x_j · exp{−2πij} · exp{−2πi·jk/N}
By the properties of complex numbers, exp{−2πij} = 1 for any integer j. Thus,
F(N + k) = Σ_{j=0}^{N−1} x_j · exp{−2πi·jk/N} = F(k)
In other words, the value simply repeats itself after k = N. Thus, F(1) = F(N + 1), F(2) = F(N + 2), and so on. This is the reason why we only compute the Fourier transform for k ranging from 0 to N − 1: the values simply keep repeating afterwards. More generally, by simple induction, we have that for any nonnegative integer s ∈ Z≥0,
F(k + sN) = F(k)
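A quick numerical sanity check of this periodicity (using an illustrative 8-sample signal):

import numpy as np

x = np.array([1, 2, 0, 5, 9, 2, 0, 4])
N = len(x)

def F(k):
    # DFT coefficient at bin k, evaluated directly from the definition
    j = np.arange(N)
    return np.sum(x * np.exp(-2j * np.pi * j * k / N))

print(np.isclose(F(1), F(N + 1)))      # True
print(np.isclose(F(3), F(3 + 2 * N)))  # True (s = 2)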
This is the symmetry that we are going to use to come up with the much faster (as its name says) FFT algorithm. But how do we come up with a divide-and-conquer strategy that uses this symmetry? The idea is to split the terms into odd and even ones and look at the DFT of each group separately:
F(k) = Σ_{m=0}^{N/2−1} x_{2m} · exp{−2πi·(2m)k/N} + Σ_{m=0}^{N/2−1} x_{2m+1} · exp{−2πi·(2m+1)k/N}
     = Σ_{m=0}^{N/2−1} x_{2m} · exp{−2πi·mk/(N/2)} + exp{−2πik/N} · Σ_{m=0}^{N/2−1} x_{2m+1} · exp{−2πi·mk/(N/2)}
     = F₁(k) + exp{−2πik/N} · F₂(k)
In the above formulation, we've split the DFT terms into two groups: one with even indices (j = 2m) and the other with odd indices (j = 2m + 1). As you can see, this gives us two separate DFTs, one computed only on the even terms of the signal and the other on the odd terms. But does this help us reduce the number of operations? Not yet, as we would still need to evaluate both N/2-term sums for all values of k from 0 to N − 1, i.e., still 2 × N × (N/2) operations. Or would we? Here's where we use the symmetry shown above! Suppose we evaluate the expression for some integer a that lies between 0 and N/2 − 1. Thus,
F(a) = F₁(a) + exp{−2πia/N} · F₂(a)
Using just the values of F₁(a) and F₂(a) (and the symmetry shown earlier), we can easily calculate the value of F(c) for c = a + N/2, which lies between N/2 and N − 1:
F(a + N/2) = F₁(a + N/2) + exp{−2πi(a + N/2)/N} · F₂(a + N/2)
           = F₁(a) + exp{−πi} · exp{−2πia/N} · F₂(a)
           = F₁(a) − exp{−2πia/N} · F₂(a)
where we used the fact that F₁ and F₂ repeat with period N/2, together with Euler's identity exp{−πi} = −1.
Here's the key idea! We don't need to calculate F(c) all over again, which saves us about (N/2) × N operations every time. All we need to do is calculate F₁(a) and F₂(a) for every integer a between 0 and N/2 − 1, which takes a total of 2 × (N/2) × (N/2) = N²/2 operations for the even and odd terms together. Doing this and applying the simple symmetry logic above allows us to calculate the value of F(k) for all integers k between 0 and N − 1, effectively reducing the number of operations from N² to N²/2, i.e., by a factor of two. Isn't it just amazing?
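Here's a small numerical check of this combination step (again on the illustrative 8-sample vector):

import numpy as np

# For 0 <= a < N/2:
#   F(a)       = F1(a) + exp{-2*pi*i*a/N} * F2(a)
#   F(a + N/2) = F1(a) - exp{-2*pi*i*a/N} * F2(a)
x = np.array([1, 2, 0, 5, 9, 2, 0, 4])
N = len(x)
j = np.arange(N)
m = np.arange(N // 2)

def F(k):
    return np.sum(x * np.exp(-2j * np.pi * j * k / N))

def F1(k):  # DFT of the even-indexed terms
    return np.sum(x[::2] * np.exp(-2j * np.pi * m * k / (N // 2)))

def F2(k):  # DFT of the odd-indexed terms
    return np.sum(x[1::2] * np.exp(-2j * np.pi * m * k / (N // 2)))

a = 1
mult = np.exp(-2j * np.pi * a / N)
print(np.isclose(F(a), F1(a) + mult * F2(a)))           # True
print(np.isclose(F(a + N // 2), F1(a) - mult * F2(a)))  # True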
Now, it may seem that we've only reduced the time complexity by half; isn't it still O(N²) at the end of the day? That's true only if we split the signal once. But nothing stops us from splitting it further! We can continue this chain: F₁ can itself be split into the DFTs of its even and odd terms, each of length N/4, and likewise for F₂, and so on recursively.
If we assume that N is a power of 2, we can repeat this process a total of r times such that:
N/2^r = 1, i.e., r = log₂(N)
Each level of this recursion takes O(N) time to compute, and there are r = log₂(N) levels, giving us a time complexity of O(Nr) = O(N log N) (we can ignore the base of the logarithm when describing time complexities). For those of you who'd like to see this in terms of a recurrence relation, it is given by:
T(N) = 2T(N/2) + O(N)
Here, T(N) represents the time complexity of solving a problem of size N; in the case of the FFT, N is the number of elements in the input signal, and the O(N) term represents the time required for combining the results of the smaller sub-problems.
The recurrence relation indicates that to solve a problem of size N, the FFT algorithm recursively divides the problem into two sub-problems of size N/2 (one for the odd terms and the other for the even terms), computes the solutions for these sub-problems in a total of 2T(N/2) time, and then combines the results in O(N) time. Unrolling the recurrence makes the earlier result explicit: T(N) = 2T(N/2) + cN = 4T(N/4) + 2cN = … = 2^r · T(N/2^r) + r · cN = N · T(1) + cN · log₂(N) = O(N log N).
Everything looks good in theory! But does this even work? Let’s test it out by writing a simple Python function that calculates the DFT using the FFT algorithm instead. Here’s the code:
import numpy as np

def nice_fft(signal):
    # Get the number of samples in the signal
    N = len(signal)
    # Base case: if the signal has only 1 sample, use simple_dft
    if N == 1:
        return simple_dft(signal)
    else:
        # Initialize an empty list to store the result (DFT coefficients)
        res = []
        # Separate the signal into even and odd terms
        even_terms = signal[::2]
        odd_terms = signal[1::2]
        # Recursively compute the FFT of the even and odd terms
        f1 = nice_fft(even_terms)
        f2 = nice_fft(odd_terms)
        # Combine the results using the Cooley-Tukey FFT algorithm
        for k in range(N):
            # Calculate the complex exponential (twiddle) term
            mult = np.exp(-2 * np.pi * 1j * k / N)
            # Determine the index into the even and odd sub-results
            index = k % (N // 2)
            # Combine the results for the current frequency bin
            dft_value = f1[index] + mult * f2[index]
            # Append the result for the current frequency bin to the list
            res.append(np.round(dft_value, 5))
        # Return the list of DFT coefficients
        return res
nice_fft([1, 2, 0, 5, 9, 2, 0, 4])
# Output: [(23+0j), (-8.70711-0.70711j), (10+5j), (-7.29289-0.70711j),
# (-3-0j), (-7.29289+0.70711j), (10-5j), (-8.70711+0.70711j)]
And it gives the same results as before, but much faster! The above code follows a recursive approach based on the divide-and-conquer strategy we discussed. Note that, for simplicity, this code works only for signals whose length is a power of 2. For signals whose length is not a power of two, a common workaround is to zero-pad the signal up to the next power of two; keep in mind that this computes the DFT of the padded signal, a finer sampling of the same underlying spectrum, rather than the exact N-point DFT of the original. Here's a sketch of such a padding wrapper (a hypothetical helper, not from the original article):
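import numpy as np

# Hypothetical helper: zero-pad to the next power of 2, then run nice_fft
def fft_pad(signal):
    N = len(signal)
    next_pow2 = 1 if N <= 1 else 1 << (N - 1).bit_length()
    padded = np.concatenate([np.asarray(signal, dtype=float),
                             np.zeros(next_pow2 - N)])
    return nice_fft(padded)

print(fft_pad([1, 2, 0, 5, 9, 2]))  # computes an 8-point DFT of the padded signal

To test our two functions (simple_dft and nice_fft), we can generate a random array whose size is a large power of 2 and measure the time taken: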
import timeit
# Generate a random array of size 2^14 (16384)
random_array = np.random.rand(2**14)
# Measure the execution time for simple_dft
time_simple_dft = timeit.timeit(lambda: simple_dft(random_array), number=1)
# Measure the execution time for nice_fft
time_nice_fft = timeit.timeit(lambda: nice_fft(random_array), number=1)
# Print the results
print(f"Time taken for simple_dft: {time_simple_dft:.5f} seconds")
# Output: Time taken for simple_dft: 149.81244 seconds
print(f"Time taken for nice_fft: {time_nice_fft:.5f} seconds")
# Output: Time taken for nice_fft: 1.28395 seconds
That’s a massive improvement. To get a better view of the time differences, we can also make a line plot (log scale) showing the difference in the amount of time taken for arrays of different sizes:
# Define array sizes to test
array_sizes = [2**n for n in range(5, 14)]  # sizes from 2^5 to 2^13

# Measure execution time for each array size
time_simple_dft = []
time_nice_fft = []
for size in array_sizes:
    random_array = np.random.rand(size)
    time_simple_dft.append(timeit.timeit(lambda: simple_dft(random_array), number=1))
    time_nice_fft.append(timeit.timeit(lambda: nice_fft(random_array), number=1))

# Plotting
import matplotlib.pyplot as plt
plt.figure(figsize=(10, 6))
plt.plot(array_sizes, time_simple_dft, label='simple_dft')
plt.plot(array_sizes, time_nice_fft, label='nice_fft')
plt.xscale('log')  # log scale, as described above
plt.yscale('log')
plt.xlabel('Array Size')
plt.ylabel('Time (seconds)')
plt.title('Execution Time for simple_dft and nice_fft')
plt.legend()
plt.show()
Isn’t it cool how a simple idea of symmetry nested within the elegant framework of divide and conquer produced such a wonderful algorithm?
The Matrix View
In this article, we started with the formulation of the Fourier transform and simply manipulated the expression, splitting the odd and even terms, to arrive at the FFT algorithm. There's another way to view it: through the lens of matrix manipulation. The idea is to think of the Fourier transform as multiplying the input signal by a matrix (called the Fourier matrix). Recall the definition of the Fourier transform:
F(k) = Σ_{j=0}^{N−1} x_j · exp{−2πi·jk/N}
As you can see, each of the DFT coefficients is calculated by simply taking a linear combination of the signal measurements. Taking α = exp{−2πi/N}, we get:
F(k) = Σ_{j=0}^{N−1} x_j · α^{jk}, i.e., F(0) = x₀ + x₁ + … + x_{N−1}, F(1) = x₀ + x₁α + x₂α² + … + x_{N−1}α^{N−1}, and so on.
This allows us to represent the signal and the transformation using the simple notation of vectors and matrices:
(F(0), F(1), …, F(N−1))ᵀ = F · (x₀, x₁, …, x_{N−1})ᵀ
where F is the N × N matrix whose (k, j)-th entry is α^{jk}.
The whole of the DFT boils down to building that big N × N matrix F (called the Fourier matrix) and multiplying it with the input signal. Using the FFT idea, we can decompose the Fourier matrix into a product of 3 sparse matrices:
F_N = [I_{N/2}, D_{N/2}; I_{N/2}, −D_{N/2}] · [F_{N/2}, 0; 0, F_{N/2}] · P
(block notation; the rows of each block matrix are separated by semicolons).
Now it may seem overwhelming, but at its root, it's just our earlier divide and conquer expressed using matrices. I_{N/2} is just the identity matrix with N/2 rows/columns. D_{N/2} = diag(1, α, α², …, α^{N/2−1}) is the diagonal matrix of so-called twiddle factors, which correspond exactly to the multiplier term exp{−2πik/N} in our original formulation. It can be calculated easily in O(N), as it only requires us to compute the N/2 powers 1, α, α², …, α^{N/2−1}.
The F_{N/2} blocks correspond to the recursive sub-problem: the Fourier matrix of size N/2 × N/2. Finally, P is a permutation matrix (a matrix filled with 0s except for a single 1 in every row and column). The purpose of P is to segregate the even and odd terms of the input signal, bringing the even terms to the top and the odd terms to the bottom. The rest of the matrix manipulation follows as before: we can keep repeating this process, breaking the Fourier matrix F_{N/2} down further and further until we reach the base case of N = 1. As before, the time complexity remains O(N log N); it's just a more elegant way to write down the equations without the messy summations!
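To convince ourselves that this factorization really holds, here's a small sketch (my own verification, with N = 8) that builds the three sparse factors and checks their product against the full Fourier matrix:

import numpy as np

def fourier_matrix(N):
    # F[j, k] = alpha^(j*k), with alpha = exp(-2*pi*i/N)
    j, k = np.meshgrid(np.arange(N), np.arange(N), indexing='ij')
    return np.exp(-2j * np.pi * j * k / N)

N = 8
F = fourier_matrix(N)

# The three sparse factors of the one-step FFT factorization
I = np.eye(N // 2)
D = np.diag(np.exp(-2j * np.pi * np.arange(N // 2) / N))  # twiddle factors
butterfly = np.block([[I, D], [I, -D]])
two_copies = np.block([[fourier_matrix(N // 2), np.zeros((N // 2, N // 2))],
                       [np.zeros((N // 2, N // 2)), fourier_matrix(N // 2)]])
P = np.zeros((N, N))
P[np.arange(N // 2), np.arange(0, N, 2)] = 1      # even indices to the top
P[np.arange(N // 2, N), np.arange(1, N, 2)] = 1   # odd indices to the bottom

print(np.allclose(F, butterfly @ two_copies @ P))  # True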
The Bottom Line
The Fast Fourier Transform (FFT) stands as a testament to the beauty of simplicity and elegance in algorithmic design. It has revolutionized signal processing, data analysis, and various scientific disciplines. Its importance lies not only in its computational efficiency, as evidenced by the remarkable speed gains over naïve approaches, but also in its versatility, enabling breakthroughs in diverse fields such as telecommunications, image processing, and quantum computing. From audio compression algorithms to medical imaging techniques, the FFT underpins a myriad of applications that have become integral to our daily lives.
As we reflect on the journey from a simple idea to a groundbreaking algorithm, it’s awe-inspiring to appreciate how a fundamental understanding of symmetry, coupled with innovative algorithmic design, can yield solutions of profound significance. The FFT, with its elegance and efficiency, encapsulates the essence of ingenuity in computer science. So, next time you marvel at the clarity of a digital image or enjoy the fidelity of a music stream, just remember that behind these technological marvels stands the remarkable FFT — a true testament to the power of simple yet ingenious ideas.
Hope you enjoyed reading this article! In case you have any doubts or suggestions, do reply in the comment box. Please feel free to contact me via mail.
If you liked my article and want to read more of them, please follow me.
Note: All images (except for the cover image) have been made by the author.
References
- FFT lecture slides, Cornell University. https://www.cs.cornell.edu/~bindel/class/cs5220-s10/slides/FFT.pdf [Accessed 05-01-2024].
- The Fast Fourier Transform (FFT): Most Ingenious Algorithm Ever? YouTube. https://www.youtube.com/watch?v=h7apO7q16V0 [Accessed 05-01-2024].
- Shaw Talebi. The Fast-Fourier Transform (FFT). Medium. https://medium.com/swlh/the-fast-fourier-transform-fft-5e96cf637c38 [Accessed 05-01-2024].
- Jake VanderPlas. Understanding the FFT Algorithm. Pythonic Perambulations. https://jakevdp.github.io/blog/2013/08/28/understanding-the-fft/ [Accessed 05-01-2024].