As mentioned last week, my next project revolves around audio analysis. The first step is acquiring data, and for that I had already found the perfect Java solution: JLayer. It makes the data easy to obtain, but a sound file contains a very large amount of it. This post describes a basic architecture to tame that data and get it into a form that can be processed.
Download the NetBeans example project
There are a few ways to get the wave data out of an mp3 file using JLayer. You could use the MP3SPI driver to play data through the JavaSound API and capture the bytes. But there’s no reason to make it that complicated.
JLayer can stream data to its own AudioDevice class. This is a callback class that has hooks for opening and closing a device, which we don’t need. The important hook is the one that sends the bytes to the device. This is where you can capture the raw stream. Most audio analysis, however, doesn’t work on the raw stream directly, but first averages it to reduce the amount of data to process.
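To make the averaging step concrete, here is a minimal sketch, not the code from the example project. It assumes the decoded audio arrives as 16-bit samples and that "averaging" means the mean absolute amplitude per block; both are my assumptions for illustration, and `SampleAverager`, `averageBlocks` and `blockSize` are hypothetical names.

```java
/** Hypothetical helper: collapses a buffer of 16-bit samples into block averages. */
final class SampleAverager {
    private SampleAverager() {}

    /**
     * Averages the absolute value of each run of blockSize samples.
     * A trailing partial block is averaged over the samples it actually contains.
     * Using the absolute value is an assumption: a plain mean of signed audio
     * samples tends to hover around zero and says little about loudness.
     */
    static double[] averageBlocks(short[] samples, int blockSize) {
        if (blockSize <= 0) {
            throw new IllegalArgumentException("blockSize must be positive");
        }
        int blocks = (samples.length + blockSize - 1) / blockSize; // round up
        double[] out = new double[blocks];
        for (int b = 0; b < blocks; b++) {
            int start = b * blockSize;
            int end = Math.min(start + blockSize, samples.length);
            long sum = 0;
            for (int i = start; i < end; i++) {
                sum += Math.abs(samples[i]); // short is promoted to int, so -32768 is safe
            }
            out[b] = (double) sum / (end - start);
        }
        return out;
    }
}
```

With a block size of 1024, for example, a buffer of 1,048,576 samples shrinks to 1024 averaged values, which is a far more manageable input for later analysis.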
So that is how I ended up with my layered architecture:
- JLayer itself decodes the mp3 file and offers the byte stream to …
- … the audio device. This layer does only very basic processing. In the current example project you’ll find a device that averages the samples. I could add a Fast Fourier Transform in the future.
- The final layer holds the actual brains. I’ve called these classes “processors”. This is where the magic will happen.
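The layering above could be sketched roughly as follows. This is an illustration under assumptions, not the code from the example project: `Processor`, `AveragingDevice` and `PeakProcessor` are hypothetical names, the device here is a standalone class that only imitates the shape of a write hook rather than implementing JLayer's AudioDevice, and the decoder layer is left out entirely.

```java
/** Hypothetical top layer: receives one averaged value per block of samples. */
interface Processor {
    void process(double average);
}

/** Hypothetical middle layer: averages incoming samples and hands the result on. */
final class AveragingDevice {
    private final int blockSize;
    private final Processor processor;
    private long sum;   // running sum of absolute sample values in the current block
    private int count;  // number of samples collected in the current block

    AveragingDevice(int blockSize, Processor processor) {
        if (blockSize <= 0) {
            throw new IllegalArgumentException("blockSize must be positive");
        }
        this.blockSize = blockSize;
        this.processor = processor;
    }

    /** Called by the decoding layer with a buffer, an offset and a length. */
    void write(short[] samples, int offset, int length) {
        for (int i = offset; i < offset + length; i++) {
            sum += Math.abs(samples[i]);
            if (++count == blockSize) {
                emit();
            }
        }
    }

    /** Flushes a trailing partial block, for example when the stream ends. */
    void close() {
        if (count > 0) {
            emit();
        }
    }

    private void emit() {
        processor.process((double) sum / count);
        sum = 0;
        count = 0;
    }
}

/** Example processor: remembers the loudest block seen so far. */
final class PeakProcessor implements Processor {
    private double peak;

    @Override
    public void process(double average) {
        peak = Math.max(peak, average);
    }

    double peak() {
        return peak;
    }
}
```

Because blocks are carried across calls to `write`, the device does not care how the decoder chops up its buffers. A processor only ever sees averaged values, so swapping `PeakProcessor` for something smarter does not touch the device at all.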
That’s all there is to the current version of my project. For now it’s just a zip file, but if there is interest, I might put it on some public VCS.
The main advantages of the current architecture are:
- There’s only one dependency on an outside project: JLayer. This helps portability.
- Splitting the basic processing that every analysis project needs anyway into its own layer frees the actual analysis code from doing that routine preprocessing.