
Programming Nerd-Talk: Volume 1 - Audio Processing

Posted by Zoid on Mon, 09/02/09 - 5:36 PM

Hey everyone,

I thought we could start a new topic for this. I've been looking up some stuff in my spare hours today regarding ways to input WAV/MP3 files, etc, and be able to analyse them in realtime, recognise beats, etc.
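For the WAV side of this, here is a minimal sketch in Python of getting samples out of a file using only the standard `wave` and `struct` modules. The function name `read_wav_16bit` is made up for illustration, and the sketch assumes uncompressed 16-bit PCM; MP3 would need a separate decoder and is not covered.

```python
import struct
import wave

def read_wav_16bit(source):
    """Return (sample_rate, samples) with samples as floats in [-1.0, 1.0).

    `source` is a filename or a binary file-like object. Multi-channel audio
    is mixed down to mono by averaging the channels of each frame.
    Assumption: the data is 16-bit PCM; anything else raises ValueError.
    """
    with wave.open(source, "rb") as wav:
        if wav.getsampwidth() != 2:
            raise ValueError("this sketch only handles 16-bit PCM")
        channels = wav.getnchannels()
        sample_rate = wav.getframerate()
        raw = wav.readframes(wav.getnframes())

    # WAV stores samples as little-endian signed 16-bit integers.
    values = struct.unpack("<%dh" % (len(raw) // 2), raw)

    samples = []
    for i in range(0, len(values) - channels + 1, channels):
        frame = values[i:i + channels]
        samples.append(sum(frame) / (channels * 32768.0))
    return sample_rate, samples
```

The returned list can then be cut into fixed-size windows for whatever analysis comes next.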

This is my simplistic (having not done anything like this before) view of how it would be done:

* Assuming the music tracks wouldn't be stop/started during play, the track can be parsed while each level loads, so not all the work need be done during play.
* A common method for breaking the audio waveform down seems to be Fourier Transforms, with 'Fast Fourier Transforms' (FFTs) being... well, a faster... way of computing the same result (fast enough for realtime, I believe). A Fourier Transform basically describes the complex waveform of the actual audio as a sum of sine waves of differing frequencies and magnitudes; for a window of sampled audio, it tells you how strong each frequency is within that window.
* Analysing the FFT output allows recognition of spikes in the audio in certain frequency bands and at certain volumes.
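The steps above can be sketched roughly as follows, in Python for readability. This is only one possible approach, not something taken from the linked articles: a textbook radix-2 FFT, the energy in one frequency band of a window, and a simple "spike" test that compares each window's energy with the average of the windows before it. The names `band_energy` and `find_spikes`, and the `history` and `threshold` values, are assumptions picked for illustration and would need tuning against real music.

```python
import cmath

def fft(samples):
    """Recursive radix-2 Cooley-Tukey FFT; len(samples) must be a power of two."""
    n = len(samples)
    if n == 1:
        return [complex(samples[0])]
    even = fft(samples[0::2])
    odd = fft(samples[1::2])
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out

def band_energy(samples, sample_rate, low_hz, high_hz):
    """Sum of squared FFT magnitudes for bins between low_hz and high_hz."""
    spectrum = fft(samples)
    n = len(samples)
    bin_width = sample_rate / n  # frequency covered by each FFT bin, in Hz
    total = 0.0
    # Only bins 0..n/2 are needed; the rest mirror them for real-valued input.
    for k in range(n // 2 + 1):
        if low_hz <= k * bin_width <= high_hz:
            total += abs(spectrum[k]) ** 2
    return total

def find_spikes(energies, history=8, threshold=1.5):
    """Indices of windows whose energy exceeds threshold x the recent average.

    `energies` holds one band energy per consecutive window of audio.
    """
    spikes = []
    for i in range(history, len(energies)):
        recent = energies[i - history:i]
        average = sum(recent) / history
        if energies[i] > 0 and energies[i] > threshold * average:
            spikes.append(i)
    return spikes
```

In use, the track would be cut into windows of, say, 1024 samples, `band_energy` called on each for a bass band, and the resulting list passed to `find_spikes`. As a sanity check on the bin maths: a 400 Hz tone sampled at 6400 Hz in a 64-sample window has 100 Hz bins, so its energy lands in bin 4.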

I've added some links here for reading and discussion. I'm still looking into a good way for us to use it, but I've heard it mentioned as having been done using FMOD.

Some of the link contents are fairly simplistic, but will help you get the basic picture. There's some good stuff in the rest, and I've yet to finish reading all of it.

Fourier Transforms:

Reading/Writing WAV files:

Beat Recognition:
* (this one uses Actionscript, but still good to read).

Anyways, have a look if you like, and let me know anything else you've found. This may not even be the best way to go, but in my searching today it seemed to be a commonly-used method. Hopefully I'll be able to finish reading them all tomorrow.

I'll see if I can dig up my uni notes on Fourier Series, but there may already be functions that generate some of this, so we might not have to delve too far into the Fourier side of the maths at least.


Submitted by BattleElf on Mon, 09/02/09 - 5:42 PM Permalink

I've found out pretty much the same thing. I hadn't found any examples of actually implementing the FFT though. (I'm at work so haven't had time to go through your links).

Audiosurf parses the track and builds the level during level load too. Not sure if they use FFTs though; all I could really find mentioned was "Frequency Analysis".

If the game and the sound are paused at the same instant, I can't see losing sync being a problem.

Submitted by Zoid on Mon, 09/02/09 - 5:48 PM Permalink

Just found another similar link on GameDev, where someone commented that creating a 'beat' file for each music track would be much better (though far less preferable for our purposes).
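To make the 'beat' file idea concrete, here is a rough Python sketch. The file format (one beat time in seconds per line) and the function names are invented for illustration; they are not from the GameDev post.

```python
import bisect

def write_beat_file(stream, beat_times):
    """Write beat times (seconds) to a text stream, one per line, in order."""
    for t in sorted(beat_times):
        stream.write(f"{t:.3f}\n")

def read_beat_file(stream):
    """Read beat times back from a text stream, skipping blank lines."""
    return sorted(float(line) for line in stream if line.strip())

def beats_between(beat_times, start, end):
    """Beats with start <= t < end, e.g. those that fell inside the last frame.

    `beat_times` must be sorted, which read_beat_file guarantees.
    """
    lo = bisect.bisect_left(beat_times, start)
    hi = bisect.bisect_left(beat_times, end)
    return beat_times[lo:hi]
```

The file would be produced once per track ahead of time, by hand or by an offline analysis pass. At runtime the game only has to call `beats_between` with the previous and current playback positions each frame, which costs almost nothing.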

Let's look into it, but they seem to think that a suitably accurate algorithm is either not possible or will be too CPU-intensive anyway.

I'll see if I can get a look at FMOD tonight to see what it has to offer.

Submitted by Bittman on Mon, 09/02/09 - 6:10 PM Permalink

Good work guys. I expect me and Jarrod will use your research here for the programming considerations section of our musically inclined game proposals.

Submitted by Wednesday on Mon, 09/02/09 - 9:37 PM Permalink

You guys rock - now just gotta make sense of it all...


A man goes to knowledge as he goes to war, wide awake, with fear, with respect, and with absolute assurance. - The Teachings of Don Juan