Tutorial: The fast Fourier transform

Learn how to display incoming audio data as a spectrogram by using the FFT class of the DSP module. Understand the benefits of using a Fast Fourier Transform.

Level: Intermediate

Platforms: Windows, macOS, Linux

Classes: dsp::FFT, Image, Colour, FloatVectorOperations

Download the demo project for this tutorial here: PIP | ZIP. Unzip the project and open the first header file in the Projucer.

- Note
- If your operating system requires you to request permission to access the microphone (currently iOS, Android and macOS Mojave) then you will need to set the corresponding option under the relevant exporter in the Projucer and resave the project.

If you need help with this step, see Tutorial: Projucer Part 1: Getting started with the Projucer.

When completed, the demo project will display the incoming audio data as a three-dimensional spectrogram in the time (x-axis), frequency (y-axis) and amplitude (colour) domains. The values displayed on the screen will be updated 60 times a second and the window at any time frame may look something like this:

- Note
- The code presented here is broadly similar to the
**SimpleFFTExample**from the JUCE Examples.

A time or space domain signal can be converted to the frequency domain by using a transformation formula called the Fourier transform. A common efficient implementation of this transformation function is the Fast Fourier Transform or FFT, which is included in the JUCE DSP module and which we will use in this tutorial.

The FFT allows us to decompose an audio signal into its frequencies and represent the magnitude and phase information for each of these frequencies. Using its inverse function, we can revert the signal into its original domain thus making it really useful to process individual frequency components such as for filtering.

Since this tutorial only deals with displaying the audio data without actual processing for output, we focus on the forward FFT rather than the inverse FFT.

Currently our application does not display nor process any incoming audio signals so let's start by implementing the FFT.

In the `SpectrogramComponent`

class, start by declaring an enum as a public member to define useful constants for the FFT implementation:

//...

enum

{

fftOrder = 10, // [1]

fftSize = 1 << fftOrder // [2]

};

private:

- [1]: The FFT order designates the size of the FFT window and the number of points on which it will operate corresponds to 2 to the power of the order. In this case, let's use an order of 10 which will produce an FFT with 2 ^ 10 = 1024 points.
- [2]: To calculate the corresponding FFT size, we use the left bit shift operator which produces 1024 as binary number 10000000000.

Next, declare private member variables required for the FFT implementation as shown below:

private:

dsp::FFT forwardFFT; // [3]

Image spectrogramImage;

float fifo [fftSize]; // [4]

float fftData [2 * fftSize]; // [5]

int fifoIndex = 0; // [6]

bool nextFFTBlockReady = false; // [7]

//...

- [3]: Declare a dsp::FFT object to perform the forward FFT on.
- [4]: The fifo float array of size 1024 will contain our incoming audio data in samples.
- [5]: The fftData float array of size 2048 will contain the results of our FFT calculations.
- [6]: This temporary index keeps count of the amount of samples in the fifo.
- [7]: This temporary boolean tells us whether the next FFT block is ready to be rendered.

Now let's initialise these variables in the member initialisation list of our constructor like so:

public:

SpectrogramComponent()

: forwardFFT (fftOrder),

spectrogramImage (Image::RGB, 512, 512, true)

{

//...

The FFT object has to be explicitly initialised with the correct order at this point.

In the overriden `getNextAudioBlock()`

function, we simply push all the samples contained in our current audio buffer block to the fifo to be processed at a later time:

{

{

pushNextSampleIntoFifo (channelData[i]);

}

}

To push the sample into the fifo, implement the `pushNextSampleIntoFifo()`

function as described below:

void pushNextSampleIntoFifo (float sample) noexcept

{

if (fifoIndex == fftSize) // [8]

{

if (! nextFFTBlockReady) // [9]

{

zeromem (fftData, sizeof (fftData));

memcpy (fftData, fifo, sizeof (fifo));

nextFFTBlockReady = true;

}

fifoIndex = 0;

}

fifo[fifoIndex++] = sample; // [9]

}

- [8]: If the fifo contains enough data in this case 1024 samples, we are ready to copy the data to the fftData array for it to be processed by the FFT. We also set a flag to say that the next line should now be rendered and always reset the index to 0 to start filling the fifo again.
- [9]: Every time this function gets called, a sample is stored in the fifo and the index is incremented.

The fifo data now occupies the first half of the FFT input array and is ready to be processed and displayed.

In the `drawNextLineOfSpectrogram()`

function, insert the pixel drawing implementation as explained below:

void drawNextLineOfSpectrogram()

{

auto rightHandEdge = spectrogramImage.getWidth() - 1;

auto imageHeight = spectrogramImage.getHeight();

spectrogramImage.moveImageSection (0, 0, 1, 0, rightHandEdge, imageHeight); // [1]

forwardFFT.performFrequencyOnlyForwardTransform (fftData); // [2]

for (auto y = 1; y < imageHeight; ++y) // [4]

{

}

}

- [1]: First, shuffle the image leftwards by 1 pixel using the
`moveImageSection()`

function on the Image object. Specify the image section as the whole width minus one pixel and the whole height. - [2]: Then, render the FFT data using the
`performFrequencyOnlyForwardTransform()`

function on the FFT object with the fftData array as an argument. - [3]: Find the range of values produced, so that we can scale our rendering to show up the detail clearly. We can do so using the FloatVectorOperations::findMinAndMax() function.
- [4]: Now in the for loop for every pixel in the spectrogram height, calculate the level proportionally to the sample set. To do this, we first need to skew the y-axis to use a logarithmic scale to better represent our frequencies. We can then feed this scaling factor to retrieve the correct array index and use the amplitude value to map it to a range between 0.0 .. 1.0.
- [5]: Finally set the appropriate pixel with the correct colour to display the FFT data.

As a final step, update the spectrogram using the timer callback function by calling the `drawNextLineOfSpectrogram()`

only when the next FFT block is ready, reset the flag and update the GUI using the `repaint()`

function:

void timerCallback() override

{

if (nextFFTBlockReady)

{

drawNextLineOfSpectrogram();

nextFFTBlockReady = false;

repaint();

}

}

- Exercise
- Try to increase the resolution of the FFT and change the rate at which the spectrogram updates.

- Note
- The source code for this modified version of the code can be found in the
`SimpleFFTTutorial_02.h`

file of the demo project.

In this tutorial, we have learnt how to use an FFT function to display audio data in a spectrogram. In particular, we have:

- Learnt the basics of a fast fourier transform function.
- Processed audio sample by sample using a fifo.
- Displayed the data in an Image object pixel by pixel.

void performFrequencyOnlyForwardTransform(float *inputOutputData) const noexcept

Takes an array and simply transforms it to the magnitude frequency response spectrum.

static Colour fromHSV(float hue, float saturation, float brightness, float alpha) noexcept

Creates a colour using floating point hue, saturation and brightness values, and an 8-bit alpha.

Type jlimit(Type lowerLimit, Type upperLimit, Type valueToConstrain) noexcept

Constrains a value to keep it within a given range.

void setPixelAt(int x, int y, Colour colour)

Sets the colour of one of the image's pixels.

const Type * getReadPointer(int channelNumber) const noexcept

Returns a pointer to an array of read-only samples in one of the buffer's channels.

int numSamples

The number of samples in the buffer which the callback is expected to fill with data.

int startSample

The first sample in the buffer from which the callback is expected to write data.

AudioBuffer< float > * buffer

The destination buffer to fill with audio data.

int getNumChannels() const noexcept

Returns the number of channels of audio data that this buffer contains.

JUCE_CONSTEXPR Type jmap(Type value0To1, Type targetRangeMin, Type targetRangeMax)

Remaps a normalised value (between 0 and 1) to a target range.

void moveImageSection(int destX, int destY, int sourceX, int sourceY, int width, int height)

Copies a section of the image to somewhere else within itself.

static Range< float > JUCE_CALLTYPE findMinAndMax(const float *src, int numValues) noexcept

Finds the minimum and maximum values in the given array.

void zeromem(void *memory, size_t numBytes) noexcept

Fills a block of memory with zeros.

JUCE_CONSTEXPR Type jmax(Type a, Type b)

Returns the larger of two values.