<--- Return to Excel Statistics Page
The Excel Toolpak Fast Fourier Transform (FFT)
Keith Greiner
August 8, 2020
The Fast Fourier Transform (FFT) is an amazing function that converts a time series set of numbers into a frequency distribution under the assumption that the series of numbers has one or more underlying sinusoidal or near-sinusoidal components. The FFT is a revolutionary method of calculating the Discrete Fourier Transform (DFT) which has the same output as the FFT, but is inefficient, as it calculates N iterations of N variables, and therefore requires N2 loops. With a large series, the DFT can use up computer time that programmers would rather not be burdened with, while the FFT can achieve the calculation in less time. At least that's what many of the essays on the subject are seen to report. My calculations using both the FFT and the DFT for small samples of data showed little difference. With a series of 128 values, using Excel VBA, the DFT required 0.359375 seconds, while the FFT required 0.359375. Yes, the values are the same.
Output from the FFT (or the DFT) is displayed as a series of numbers representing the amplitude level of each identified frequency within the original series. Think of the FFT output as a histogram of frequencies. By setting one or more of the FFT output bins to zero, and then by performing an inverse of the function, an analyst can discover, and graph, the underlying frequency components that are contained in the original series.
The following text will describe this tool in greater detail. It is based on the Microsoft FFT function that is contained in the Excel Analysis Toolpak™.
A Little History
Fourier Analysis takes its name in honor of Jean-Baptiste Joseph Fourier (1768 – 1830) , a French mathematician who explored the phenomena of heat transfer. Of interest, here, is Fourier’s suggestion that any series of numbers can be represented by a collection (distribution) of values that represent the amplitude of periodic (sines) values, and that it can be converted from series to the distribution of sines and back to the series. Authors, Heideman, Johnson, and Burrus (1984 at https://www.cis.rit.edu/class/simg716/Gauss_History_FFT.pdf) reported that Fourier first presented his ideas on the calculation of the series to the French Academy of Sciences on December 21, 1807. It was not well received by mathematician Joseph Louis LaGrange (1736 – 1813), and was rejected for publication. Actual publication finally occurred 15 years later. Meanwhile, the three authors uncovered information that Carl Friedrich Gauss (1777 – 1855) proposed and published a series conversion algorithm in 1805. So their conclusion is that it should be called the Gausian Transform instead of the Fourier Transform. Still, it remained a daunting calculation until James Cooley (1926 – 2016) and John Tukey (1915 – 2000) proposed an algorithm in 1964. (http://www.ams.org/journals/mcom/1965-19-090/S0025-5718-1965-0178586-1/S0025-5718-1965-0178586-1.pdf). Introduction of the Cooley-Tukey calculation, and then its translation to computer code, was a major contribution to the ability to digitize a series of values into frequencies. Digitization allows audio, video, and other signals to be converted from analog to digital. As digital values, the signal can be more easily edited before it is converted back to analog. The conversion allows one to modify a photograph, filter specific frequencies from an audio signal and a host of other values. If you read the Cooley-Tukey calculation, you will see some adaptations are used between the signal proposed by Cooley and Tukey vs. what actually happens in the code. However the code does work; if you obtain the correct code.
I found that the code published in the Numerical Recipes books for Basic and C did not function correctly. Code proposed by Lee Zhen Yong (https://bruceoutdoors.wordpress.com/2017/03/23/the-programmers-guide-to-fft-part-2-fft/) did function correctly for the forward DFT, the inverse DFT, the forward FFT and the Inverse FFT. The inverse FFT, however, functioned well in C++, but not in Excel VBA. In VBA it encountered a problem with memory as it uses extensive memory for recursion. Lee Zhen Yong’s functions match the values returned by the Microsoft Excel FFT tool in the Analysis Toolpak add-in, although some near zero values had what apparently are rounding differences. Overall, they had little impact on the outcome.
Calculating an FFT in Excel Toolpak
The following text will list the steps of an FFT analysis in Excel, and then will show an example.
The example is a special case of a perfect sine wave that is sampled within a 128 point window. One hundred twenty-eight is used because it is a power of 2 (that is, 2^7), and is suitable to show a nice smooth sine wave. The steps that are presented, here, are applicable to all FFT analyses. Similarly, you will be able to see the interpretation of the output bins as they apply to the special case. You will be able to apply this knowledge to more complicated cases.
The steps of an FFT analysis in Excel are as follows:
Use the correct Excel and prepare some data:
Run Excel FFT
The output will be text, in a single column, containing two components: a real number on the left and an imaginary number on the right. The first line is a real number and the imaginary number is zero. The first ten lines of output, for the data shown above, look like this
All of the values that are output from the Fourier Analysis (FFT) function are presented as text. That’s why the values are left-justified. The number in row 2 is -64i which means that it is the imaginary component of a complex number that can also consist of a real number and a complex number presented as text. An example of a more complicated complex number will be presented later. You can extract real and imaginary numbers by using the appropriate Excel functions. To extract a real component use the =IMReal(Cell_Reference) function. You can create a column of the imaginary numbers by using the =IMAGINARY(Cell_Reference) function. In this case where we have a perfect sine wave, both real and imaginary numbers will be zero except the imaginary of -64i.
Some C++ programs will present only the real numbers as output, which will be a problem for this situation.
Now, if we multiply the original radian values of each row by 2, and then calculate sine values for each row, we have the following graph.
The FFT on the set of 128 rows of a sine wave that is 2x the original. Notice in the example shown below that the -64i is now in Row 3.
In a third test, lets add the first sine (the fundamental) and the second harmonic (sine of fundamental’s radians times 2). The value of the FFT is that it will sort out such complex waveforms.
The image of the combined waveforms looks like the following.
The FFT output looks like the following.
Note the similarities between this and the two previous FFT series’.
Now, we need to run the FFT Inverse.
To do this, open the Factor Analysis tool in Excel Toolpak. As the input, select all 128 rows of the output that has -64i in both the second and third rows.
Be sure to check the “Inverse” box, and run the analysis.
Output will, again be as a combination of real and imaginary numbers, so use the Excel =IMReal(Cell_Reference) for the column of real numbers, and =IMAGINARY(Cell_reference) for the imaginary values. Set the real and imaginary code up and copy it down to all 128 rows. Run the inverse function. Your output will be the original summed sine values in the column of real numbers and zero in the column for imaginary numbers. Graph the real numbers and you will see they are the same as the original sum of the sine values plus the second harmonic.
The Meaning of Returned FFT Values
For the following discussion, keep in mind the following definitions. The rows of values returned by an FFT function are called “frequency bins”. The bins have greater meaning when we count them in a numbering method that begins with 0 through 127. That is the same counting system used for arrays in C++ and some other computer languages. It is not the counting system you see when looking at the marginal numbers shown on an Excel sheet. Those numbers begin with 1. Also, in Excel VBA you have to define the array to be counted as 0 to 127 if that’s what you want. In this application you need to use 0 through 127, and not 1 through 128. The references below are all in the 0 to 127 method.
Each bin value returned by the FFT function represents the amplitude of frequency multiples contained in the original data. In Row 0, we see the amplitude of the original data, In Row 1, we see the fundamental of a single cycle, In Row 2, we see the first harmonic. This harmonic has two positive peaks and two negative peaks: double the fundamental. In this example, just look at the number in the 0 through 127 counting system, and you have the number of cycles that you will see if you put -64i into a row and then calculate the inverse.
If your original data have 128 data points, (as in this example) then the N value is 128. N/2 = 64.
Now return to that list of the original output. The list where all values are zero except for the -64i in row 1 number 127 (remember, we’re using a 0 to 127 numbering system). Replace the two values of -64i with the text value “0” without the quotes, and put the -64i into Rows 64 and 66 in the 0 to 127 system. Row 65 is the central point where the series folds over into a mirror image. The frequency represented by each bin number is lowest frequency (the frequency in bin 1) multiplied by the row number. Therefore, if we put the value -64i into row, bin, number 64, and then run the FFT inverse, then the returned value should have 64 cycles. Run the inverse FFT function, and graph the =IMReal(Val1) value and the =IMAGINARY(Val1) numbers. The numbers in the =IMReal(Val1) column, show 64 cycles (just as expected) superimposed onto a one cycle super-series. Isn’t that cool!!? We know it is one cycle if you follow a series of peaks in that low-frequency super-series through a high, then low, then high. It is actually a cosine series because it begins at zero then rises to a high value, then down, and back up to zero. Now, count the smaller peaks. There are 64 as predicted.
If you graph the =IMAGINARY(Val1) series, you’ll see a cosine series that also shows 64 cycles superimposed onto one cycle.
So the maximum number of cycles displayed by FFT is the window size (N = 128) divided by 2 (128/2 = 64). The frequency value of any bin, relative to the number 1 bin (in the 0 to N-1 system) is, the bin number.
Max_Cycles = N/2
Any given frequency = n, where n is any bin in the 0 to N-1 system.
Now, I know if you search the internet you will see some other ways to determine the frequency value of any bin. I’m going with what I learned from the steps described above.
Following is the imaginary component of the Inverse FFT function.
Lets say you have a window size of 1024. If N = 1024 (2^10) then the maximum number of bins is 1024/2 = 512.
The number 1 bin, in the 0 to N-1 system has one cycle and bin number 500 in the 0 to N-1 system will have 500 cycles
Sine Series’ with Noise
Not all data series’ are as neat and clean as those shown above. If we add noise, things get more complicated.
Lets take the original one-cycle graph and add noise. I do this by changing the sine function calculation (ie: line 7) from =SIN(F7) to =SIN(F7) + NORMINV(RAND(),$G$1,$G$2). In this example the value G1 is the mean and is 0. The value $G$2 is the standard deviation, and is 0.2. The complete calculation returns a normally distributed random number between -1 and 1, which is added to the sine function. Beginning with Row 7, the Excel code is copied down for the entire 128 rows. The graph of sine values looks like this.
When the noisy series is converted to frequency domain we see the following for the first 10 rows.
Where we previously saw a lot of zeros, we now have a numbers.
And that's just the beginning. The possibilities of analysis with the Excel Toolpak Fourier Analysis tool are endless. Following are three extracted cycle series taken from a 128 day series of closing prices from the New York Stock Exchange. The source is the Yahoo Finance site. The NYSE data are graphed on the right-hand scale. Here we see a fundamental, a x2 harmonic and an x3 harmonic. There are additional components that, when added in via the inverse FFT function, can recreate the original.. or at least close to the original. My analyses from a collection of NYSE data suggest that the actual frequency of the fundamental and harmonics have a random variability that adds no predictability. However, they do seem to add a new sense to a very complex series.
The following graph shows the same series of input data after the following steps of processing:
The graph follows:
And then, there's the United States Leading Indicators with 256 data points beginning on February 1, 1997. Below is a graph of filtered leading indicators with the high frequency "noise" eliminated.
And finally, below is the graph of U. S. Leading Indicators plus the filtered FFT/IFFT series plus a six degree polynomial trend line. Here, the R^2 value for the FFT/IFFT filtered series is 0.75 while the R^2 value for the polynomial trend line is only 0.32. That's good for the filtered data.