Thursday, January 11, 2007

File Type Metadata Discovery, Part 1: Audio

In a previous article, I evaluated various libraries to determine which most accurately identified a file's type. This article is part one of a series that explores how to discover metadata about a file after its type has been detected.

Audio Metadata

Based on the nature of digital audio, Wikipedia cites "sample rate, resolution and number of channels" as important audio file format parameters. This type of technical metadata is important in classifying and organizing digital assets. Determining non-technical metadata, such as title, author and date, is beyond the scope of this article, although there are many resources that address this type of metadata discovery.

Much of Java media development revolves around playing, streaming, recording and editing. However, only metadata discovery is considered in digital asset management (and in this article).

Discovering Audio Metadata with the Java Sound API

According to Sun, the Java Sound API "provides low-level support for audio operations." The javax.sound.sampled.AudioFormat and javax.sound.sampled.AudioFileFormat classes both allow access to an audio file's metadata. Using the static methods of the javax.sound.sampled.AudioSystem class, we can get an AudioFileFormat based on a File, InputStream or URL. An AudioFormat is obtained by invoking the AudioFileFormat's getFormat() method. Java Sound Resources provides an excellent example of how this is done. In summary, the following lists show what metadata is accessible through the Java Sound API.

AudioFileFormat:

- audio file type, such as WAVE or AU
- unmodifiable map of properties that specify additional informational metadata (like an author, copyright, or file duration). Properties are optional information, and file reader and file writer implementations are not required to provide or recognize properties
- size in bytes of the entire audio file (not just its audio data)
- length of the audio data contained in the file, expressed in sample frames

AudioFormat:

- number of channels
- type of encoding for sounds in this format
- frame rate in frames per second
- frame size in bytes
- sample rate
- size of a sample
- an unmodifiable map of properties

NOTE: The duration of an audio file (in seconds) can be computed by dividing the frame length by the frame rate.
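
To make the lists above concrete, here is a minimal sketch of reading these values through the Java Sound API. The class name AudioMetadataSketch and the helpers describe() and durationSeconds() are made up for illustration; only the javax.sound.sampled calls come from the API itself. The duration helper assumes the frame length and frame rate are both known, and returns -1 otherwise.

```java
import java.io.File;
import javax.sound.sampled.AudioFileFormat;
import javax.sound.sampled.AudioFormat;
import javax.sound.sampled.AudioSystem;

// Hypothetical helper class, not part of the Java Sound API.
class AudioMetadataSketch {

    // Duration in seconds = frame length / frame rate.
    // Returns -1 when either value is unknown (NOT_SPECIFIED is -1).
    static double durationSeconds(long frameLength, float frameRate) {
        if (frameLength == AudioSystem.NOT_SPECIFIED || frameRate <= 0) {
            return -1;
        }
        return frameLength / (double) frameRate;
    }

    // Collects the technical metadata exposed by AudioFileFormat and AudioFormat.
    static String describe(AudioFileFormat fileFormat) {
        AudioFormat format = fileFormat.getFormat();
        return String.join(System.lineSeparator(),
                "file type: " + fileFormat.getType(),
                "byte length: " + fileFormat.getByteLength(),
                "frame length: " + fileFormat.getFrameLength(),
                "channels: " + format.getChannels(),
                "encoding: " + format.getEncoding(),
                "frame rate: " + format.getFrameRate(),
                "frame size: " + format.getFrameSize(),
                "sample rate: " + format.getSampleRate(),
                "sample size (bits): " + format.getSampleSizeInBits(),
                "duration (s): " + durationSeconds(
                        fileFormat.getFrameLength(), format.getFrameRate()));
    }

    public static void main(String[] args) throws Exception {
        // Pass the path of an audio file as the first argument.
        AudioFileFormat fileFormat = AudioSystem.getAudioFileFormat(new File(args[0]));
        System.out.println(describe(fileFormat));
    }
}
```

Any value the file reader cannot determine is reported as AudioSystem.NOT_SPECIFIED (-1), so real code should check for that before using a number.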

The Java Sound API appears to meet our needs well. However, out of the box it only supports AIFF, AU, WAV and some MIDI-based formats! To remedy this, the API provides a service provider interface (SPI) to support more file formats. MP3, OGG, APE (Monkey's Audio), FLAC and even Speex SPIs are available. An alternative implementation of the Java Sound API, Tritonus, also provides some SPI plug-ins (the OGG SPI requires Tritonus).
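
Whether a given file can be read depends on which file readers (built-in or SPI) are on the classpath; an unrecognized format surfaces as an UnsupportedAudioFileException. The following is a rough sketch of probing for that. The class SupportedFormatSketch and its methods are hypothetical names, and the in-memory WAV of silence exists only so the example runs without a file on disk.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Optional;
import javax.sound.sampled.AudioFileFormat;
import javax.sound.sampled.AudioFormat;
import javax.sound.sampled.AudioInputStream;
import javax.sound.sampled.AudioSystem;
import javax.sound.sampled.UnsupportedAudioFileException;

// Hypothetical helper class, not part of the Java Sound API.
class SupportedFormatSketch {

    // Returns the detected file type, or empty if no installed reader
    // (built-in or SPI) recognizes the data.
    static Optional<AudioFileFormat.Type> detectType(byte[] data) {
        // ByteArrayInputStream supports mark/reset, which the readers
        // rely on while probing the header.
        try (ByteArrayInputStream in = new ByteArrayInputStream(data)) {
            return Optional.of(AudioSystem.getAudioFileFormat(in).getType());
        } catch (UnsupportedAudioFileException e) {
            return Optional.empty();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Builds a small WAV file of silence in memory (demo data only).
    static byte[] silentWav(int frames) {
        AudioFormat format = new AudioFormat(8000f, 16, 1, true, false);
        byte[] pcm = new byte[frames * format.getFrameSize()];
        AudioInputStream stream =
                new AudioInputStream(new ByteArrayInputStream(pcm), format, frames);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try {
            AudioSystem.write(stream, AudioFileFormat.Type.WAVE, out);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        System.out.println("WAV data:   " + detectType(silentWav(100)));
        System.out.println("other data: " + detectType(new byte[1024]));
    }
}
```

With an MP3 or OGG SPI jar on the classpath, the same detectType() call would be expected to recognize those formats as well, with no change to the calling code.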

It should also be noted here that an SPI implementation supplies the maps returned by the properties() methods of the AudioFileFormat and AudioFormat classes. The properties returned may include non-technical metadata, but this depends entirely on the SPI and the audio file format.
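
As a rough sketch, the two property maps can be dumped like this. The class AudioPropertiesSketch and the helper formatProperties() are hypothetical names; which keys show up (if any) depends on the reader or SPI that parsed the file, so an empty map is a normal result.

```java
import java.io.File;
import java.util.Map;
import java.util.TreeMap;
import javax.sound.sampled.AudioFileFormat;
import javax.sound.sampled.AudioSystem;

// Hypothetical helper class, not part of the Java Sound API.
class AudioPropertiesSketch {

    // Renders a properties map as "key = value" lines, sorted by key
    // so the output is stable from run to run.
    static String formatProperties(Map<String, Object> properties) {
        if (properties.isEmpty()) {
            return "(no properties reported)";
        }
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<String, Object> entry : new TreeMap<>(properties).entrySet()) {
            sb.append(entry.getKey()).append(" = ").append(entry.getValue()).append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args) throws Exception {
        // Pass the path of an audio file as the first argument.
        AudioFileFormat fileFormat = AudioSystem.getAudioFileFormat(new File(args[0]));
        System.out.println("File properties:");
        System.out.println(formatProperties(fileFormat.properties()));
        System.out.println("Format properties:");
        System.out.println(formatProperties(fileFormat.getFormat().properties()));
    }
}
```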

What about other Java media libraries?

According to Sun, the Java Media Framework (JMF) "can capture, playback, stream, and transcode multiple media formats." Comparing the JMF to Java Sound, we find that the JMF does have more codecs, but its ability to capture audio-specific metadata is limited. From what I can tell, the focus of the JMF is not on metadata. If anyone has found the case to be different, please let me know. The same goes for FMJ and Enterprise Media Beans.


The Java Sound API currently provides the best means of discovering an audio file's technical metadata.

1 comment:

Anonymous said...

good information.
but how about video files metadata?
thank you