Introduction to the Java Sound API
The Java Sound API provides support for the capture, processing, and rendering of audio and MIDI data.
Some gallant efforts to do more with audio in Java have been made, such as the Emblaze audio streaming product and Kees van den Doel's physical modeling work, which is the audio engine at the heart of the interactive Java instrument "Ripple" that I created with Mark Napier (www.potatoland.org).
All that has changed with the incorporation of the Java Sound engine in Java 2. Another step forward has been taken with the beta release of Java 2 v1.3, which exposes the Java Sound API. Java Sound can be made available to earlier Java 1.x platforms through the use of the Java Media Framework 2.0.
The Java Sound API provides support for the capture, processing, and rendering of audio and MIDI data. Java Sound, which is based on the Beatnik audio platform from Headspace, is a 16-bit, 32-channel audio rendering and MIDI-controlled sound synthesis engine. Java Sound supports a wide variety of file types, including AIFF, AU, and WAV. It can render both 8- and 16-bit audio data at sample rates from 8 kHz to 48 kHz. It also supports Type 0 and Type 1 MIDI files, RMF (Headspace's optimized audio and MIDI file format), and General MIDI soundbank formats.
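As a rough illustration of how the sample sizes, sample rates, and file types listed above appear in code, here is a minimal sketch. It assumes the javax.sound.sampled package as it ships with the JDK; the class and method names (WavRoundTrip, pcm16Mono, toWav, fromWav) are hypothetical and exist only for this example. It describes 16-bit mono audio, writes one second of silence as a WAV file in memory, and reads it back.

```java
import javax.sound.sampled.AudioFileFormat;
import javax.sound.sampled.AudioFormat;
import javax.sound.sampled.AudioInputStream;
import javax.sound.sampled.AudioSystem;
import javax.sound.sampled.UnsupportedAudioFileException;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;

// Hypothetical helper class for this article, not part of the Java Sound API.
class WavRoundTrip {
    /** Describes 16-bit signed PCM, one channel, little-endian byte order. */
    static AudioFormat pcm16Mono(float sampleRate) {
        return new AudioFormat(sampleRate, 16, 1, true, false);
    }

    /** Wraps raw PCM bytes in a WAV container held in memory. */
    static byte[] toWav(byte[] pcm, AudioFormat format) throws IOException {
        long frames = pcm.length / format.getFrameSize();
        AudioInputStream in =
                new AudioInputStream(new ByteArrayInputStream(pcm), format, frames);
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        AudioSystem.write(in, AudioFileFormat.Type.WAVE, out);
        return out.toByteArray();
    }

    /** Lets AudioSystem detect the file type and decode the header. */
    static AudioInputStream fromWav(byte[] wav)
            throws IOException, UnsupportedAudioFileException {
        return AudioSystem.getAudioInputStream(new ByteArrayInputStream(wav));
    }

    public static void main(String[] args) throws Exception {
        AudioFormat format = pcm16Mono(44100f);
        byte[] silence = new byte[44100 * format.getFrameSize()]; // one second
        AudioInputStream decoded = fromWav(toWav(silence, format));
        System.out.println(decoded.getFormat() + ", frames: " + decoded.getFrameLength());
    }
}
```

Swapping AudioFileFormat.Type.WAVE for AIFF or AU should exercise the other file types mentioned, provided a writer for that type and format is installed.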
Java Sound is designed to use only a small amount of system resources. According to Sun, "a 24-voice MIDI file uses only 20 percent of the CPU on a Pentium 90-MHz system." Because the engine can begin playback as soon as it begins to receive sample data or MIDI commands, it can be used with streaming data systems.
Java Sound provides low-level audio support for Java with a very high degree of control over audio functionality, including the ability to manipulate system resources such as digital audio and MIDI devices like soundcards. It is not meant to function as a standalone application environment, but as a set of interfaces to be used in the creation of applications such as audio editors, MIDI sequencers, content delivery systems, interactive applications such as games, and communication and telephony applications. It can also be used to add audio in the form of prompts and alerts to other types of applications not usually associated with multimedia, such as word processors, spreadsheets, text-based chats, and more.
The Java Sound API consists of four packages:
The first two packages provide interfaces supporting digital audio and MIDI sequencing and synthesis; the .spi packages provide service providers with abstract interfaces to enable the installation of custom components. This article will focus on the first package, which provides interfaces for the handling of digital audio data, also referred to as "sampled audio." These interfaces provide for the capture, processing, and output of digital audio. Java Sound is designed to support common functionality such as audio input and output and the mixing of multiple streams of audio. It allows different sorts of audio components to be installed in a system and does not assume a specific hardware configuration.
The three major components of this package are Lines, Controls, and AudioSystem.
A line is an element of the digital audio "pipeline." It can be an input or output port, a mixer, or a data path. The Line interface has the subinterfaces Port, Mixer, DataLine, and GroupLine. DataLines are further divided into SourceDataLine, TargetDataLine, and Clip. Lines can hold instances of objects that implement the Control interface, such as gain, pan, and reverb.
A Port is a line for input or output from and to an audio device such as a line input, a line output, a microphone, a speaker or a CD-ROM drive.
A Mixer is an audio device that has one or more input Lines and one or more output Lines. A Mixer can have a set of controls that are global to the mixer through the use of its own Line interface, as well as the controls that may be associated with individual Lines.
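The relationship between mixers and their lines can be explored with a short program. The sketch below assumes the javax.sound.sampled package as shipped in the JDK; LineSurvey and kindOf are hypothetical names introduced here. GroupLine is left out because the shipped package has no interface of that name. The program lists every installed mixer and classifies the lines on each side of it.

```java
import javax.sound.sampled.AudioSystem;
import javax.sound.sampled.Clip;
import javax.sound.sampled.Line;
import javax.sound.sampled.Mixer;
import javax.sound.sampled.Port;
import javax.sound.sampled.SourceDataLine;
import javax.sound.sampled.TargetDataLine;

// Hypothetical helper class for this article, not part of the Java Sound API.
class LineSurvey {
    /** Names the kind of line a Line.Info describes, most specific type first. */
    static String kindOf(Line.Info info) {
        Class<?> c = info.getLineClass();
        if (Clip.class.isAssignableFrom(c)) return "Clip";
        if (SourceDataLine.class.isAssignableFrom(c)) return "SourceDataLine";
        if (TargetDataLine.class.isAssignableFrom(c)) return "TargetDataLine";
        if (Port.class.isAssignableFrom(c)) return "Port";
        if (Mixer.class.isAssignableFrom(c)) return "Mixer";
        return "Line";
    }

    public static void main(String[] args) {
        // The array may be empty on a machine with no audio hardware.
        for (Mixer.Info mixerInfo : AudioSystem.getMixerInfo()) {
            Mixer mixer = AudioSystem.getMixer(mixerInfo);
            System.out.println(mixerInfo.getName());
            for (Line.Info info : mixer.getSourceLineInfo()) {
                System.out.println("  feeds the mixer:  " + kindOf(info));
            }
            for (Line.Info info : mixer.getTargetLineInfo()) {
                System.out.println("  fed by the mixer: " + kindOf(info));
            }
        }
    }
}
```

The output depends entirely on the installed hardware and drivers, so none is shown here.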
A DataLine adds media functionality to Line. This functionality includes transport control such as start, stop, and pause, as well as reporting functions related to the nature of the media, such as format, volume and current position in the playback stream. SourceDataLine functions as an input to a mixer and TargetDataLine serves as an output. A GroupLine specifies a number of DataLines that can be treated as a group. These lines can be operated on by sending a single message, such as "stop," to the group instead of each line individually.
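To make the streaming role of a SourceDataLine concrete, here is a minimal sketch that generates a sine tone and writes it to a line feeding the default mixer. It assumes the javax.sound.sampled package as shipped in the JDK and a machine with a working audio output; TonePlayer, tone, and play are hypothetical names, and the 44.1 kHz mono format is an arbitrary choice.

```java
import javax.sound.sampled.AudioFormat;
import javax.sound.sampled.AudioSystem;
import javax.sound.sampled.LineUnavailableException;
import javax.sound.sampled.SourceDataLine;

// Hypothetical helper class for this article, not part of the Java Sound API.
class TonePlayer {
    // 16-bit signed PCM, mono, little-endian (an assumption for this example).
    static final AudioFormat FORMAT = new AudioFormat(44100f, 16, 1, true, false);

    /** Generates a half-amplitude sine tone as raw PCM in FORMAT. */
    static byte[] tone(double hz, double seconds) {
        int frames = (int) Math.round(seconds * FORMAT.getSampleRate());
        byte[] pcm = new byte[frames * 2];
        for (int i = 0; i < frames; i++) {
            double angle = 2 * Math.PI * hz * i / FORMAT.getSampleRate();
            short sample = (short) Math.round(Math.sin(angle) * 0.5 * Short.MAX_VALUE);
            pcm[2 * i] = (byte) sample;            // low byte first
            pcm[2 * i + 1] = (byte) (sample >> 8); // then high byte
        }
        return pcm;
    }

    /** Streams the bytes to a mixer input and blocks until they have played. */
    static void play(byte[] pcm) throws LineUnavailableException {
        SourceDataLine line = AudioSystem.getSourceDataLine(FORMAT);
        try {
            line.open(FORMAT);
            line.start();                   // transport control: start
            line.write(pcm, 0, pcm.length); // blocks while the buffer is full
            line.drain();                   // wait for queued audio to finish
            line.stop();                    // transport control: stop
        } finally {
            line.close();
        }
    }

    public static void main(String[] args) throws LineUnavailableException {
        play(tone(440, 1.0));
    }
}
```

Because write accepts data in chunks, the same loop works when the bytes arrive from a network stream instead of an array. A TargetDataLine is used the other way around: the application calls read on it to capture audio.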
A Clip is a data line that is preloaded with audio data instead of having that data streamed. This allows an application to address the audio data at any point in its duration and allows for processes such as looping, rewinding, fast forwarding, and more.
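Because a Clip holds all of its data, positions inside it can be addressed by frame number. The following sketch, which assumes the javax.sound.sampled package as shipped in the JDK, converts time offsets to frames and loops a section of a clip. ClipLooper, frameAt, and loopSection are hypothetical names, and the file passed on the command line is assumed to be in a supported format and at least 1.5 seconds long.

```java
import javax.sound.sampled.AudioFormat;
import javax.sound.sampled.AudioInputStream;
import javax.sound.sampled.AudioSystem;
import javax.sound.sampled.Clip;
import javax.sound.sampled.LineEvent;
import java.io.File;
import java.util.concurrent.CountDownLatch;

// Hypothetical helper class for this article, not part of the Java Sound API.
class ClipLooper {
    /** Converts a time offset in milliseconds into a frame index. */
    static int frameAt(AudioFormat format, long millis) {
        return (int) Math.round((double) format.getFrameRate() * millis / 1000.0);
    }

    /** Plays from the start, repeating the given section `repeats` extra times. */
    static void loopSection(Clip clip, long startMs, long endMs, int repeats) {
        AudioFormat format = clip.getFormat();
        // Throws IllegalArgumentException if the points lie outside the clip.
        clip.setLoopPoints(frameAt(format, startMs), frameAt(format, endMs));
        clip.setFramePosition(0); // rewind
        clip.loop(repeats);
    }

    public static void main(String[] args) throws Exception {
        if (args.length != 1) {
            System.err.println("usage: java ClipLooper <audio file>");
            return;
        }
        try (AudioInputStream in = AudioSystem.getAudioInputStream(new File(args[0]))) {
            Clip clip = AudioSystem.getClip();
            CountDownLatch done = new CountDownLatch(1);
            clip.addLineListener(event -> {
                if (event.getType() == LineEvent.Type.STOP) done.countDown();
            });
            clip.open(in);                    // loads the whole file into the clip
            loopSection(clip, 500, 1500, 3);  // repeat 0.5 s to 1.5 s three times
            done.await();                     // playback runs on another thread
            clip.close();
        }
    }
}
```

Fast forwarding and rewinding reduce to calls to setFramePosition with a larger or smaller frame index.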
Controls can be added to DataLines and Ports. Controls are used to affect the audio signal passing through the line. There are four types of controls provided in the Java Sound API: GainControl, PanControl, ReverbControl, and SampleRateControl.
GainControl is used to manipulate an audio signal's volume. Two methods are provided. The first is muting. Muting the signal cuts the flow of sound through the Line containing the GainControl. The second is gain. Gain is the amount in decibels that is added to the original level of the audio signal. A positive amount results in an increase in volume; a negative amount results in a reduction of volume. Gain can be used for fading effects. GainControl can also be used to control the level of audio signal sent to and received from a ReverbControl.
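Since gain is expressed in decibels, it helps to see how decibels relate to amplitude. The sketch below uses the standard relation dB = 20 * log10(ratio). It assumes the javax.sound.sampled package as shipped in the JDK, where gain and mute are reached through FloatControl.Type.MASTER_GAIN and BooleanControl.Type.MUTE rather than a class named GainControl; GainHelper and its methods are hypothetical names.

```java
import javax.sound.sampled.BooleanControl;
import javax.sound.sampled.FloatControl;
import javax.sound.sampled.Line;

// Hypothetical helper class for this article, not part of the Java Sound API.
class GainHelper {
    /** Converts an amplitude ratio (1.0 = unchanged) to decibels. */
    static double toDecibels(double amplitudeRatio) {
        return 20.0 * Math.log10(amplitudeRatio);
    }

    /** Converts decibels back to an amplitude ratio. */
    static double toAmplitudeRatio(double decibels) {
        return Math.pow(10.0, decibels / 20.0);
    }

    /** Applies a gain clamped to the control's range; returns NaN if unsupported. */
    static float setGain(Line line, float decibels) {
        if (!line.isControlSupported(FloatControl.Type.MASTER_GAIN)) {
            return Float.NaN;
        }
        FloatControl gain = (FloatControl) line.getControl(FloatControl.Type.MASTER_GAIN);
        float clamped = Math.max(gain.getMinimum(), Math.min(gain.getMaximum(), decibels));
        gain.setValue(clamped);
        return clamped;
    }

    /** Mutes or unmutes the line; returns false if the line has no mute control. */
    static boolean setMute(Line line, boolean mute) {
        if (!line.isControlSupported(BooleanControl.Type.MUTE)) {
            return false;
        }
        ((BooleanControl) line.getControl(BooleanControl.Type.MUTE)).setValue(mute);
        return true;
    }

    public static void main(String[] args) {
        System.out.println("half amplitude in dB: " + toDecibels(0.5));
        System.out.println("+6 dB as a ratio:     " + toAmplitudeRatio(6.0));
    }
}
```

A fade can be built on setGain by stepping the decibel value down over time. Lines generally need to be open before they report their controls.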
PanControl is used to control the balance between the left and right channels of a stereo audio signal.
ReverbControl allows for the manipulation of the audio signal by a number of different reverberation settings. The reverberation settings made available by ReverbControl are decay time, late reflection intensity, early reflection intensity, early reflection delay, and late reflection delay. By varying these parameters, room types such as bright rooms, concert halls, and caves can be simulated.
SampleRateControl allows for the manipulation of the playback rate. For example, an audio file encoded at 48,000 Hz and rendered at 24,000 Hz will play back at half speed. This also results in a change of pitch.
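The arithmetic behind the half-speed example can be worked through in a few lines. The speed ratio is the rendered rate divided by the encoded rate, and the pitch shift in semitones is 12 * log2(ratio), so a ratio of 0.5 lowers the pitch by one octave. The sketch assumes the javax.sound.sampled package as shipped in the JDK, where the rate is adjusted through FloatControl.Type.SAMPLE_RATE; PlaybackRate and its methods are hypothetical names.

```java
import javax.sound.sampled.FloatControl;
import javax.sound.sampled.Line;

// Hypothetical helper class for this article, not part of the Java Sound API.
class PlaybackRate {
    /** Playback speed relative to normal (1.0 = original speed). */
    static double speedRatio(double encodedHz, double renderedHz) {
        return renderedHz / encodedHz;
    }

    /** Pitch change in semitones for a given speed ratio (negative = lower). */
    static double pitchShiftSemitones(double ratio) {
        return 12.0 * Math.log(ratio) / Math.log(2.0);
    }

    /** Sets the rendering rate if the line offers the control; false otherwise. */
    static boolean setRate(Line line, float renderedHz) {
        if (!line.isControlSupported(FloatControl.Type.SAMPLE_RATE)) {
            return false;
        }
        FloatControl rate = (FloatControl) line.getControl(FloatControl.Type.SAMPLE_RATE);
        rate.setValue(Math.max(rate.getMinimum(), Math.min(rate.getMaximum(), renderedHz)));
        return true;
    }

    public static void main(String[] args) {
        double ratio = speedRatio(48000, 24000);
        System.out.println("speed ratio: " + ratio);
        System.out.println("pitch shift in semitones: " + pitchShiftSemitones(ratio));
    }
}
```

Whether a given line exposes a sample rate control depends on the mixer, which is why setRate checks before using it.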
AudioSystem pulls all these resources together and manages them. It allows an application to learn what components are installed and then makes them available. Types of resources that can be obtained from AudioSystem are Mixers, Lines, files, and streams. AudioSystem also provides format conversion services that allow for conversion between various AudioFormats and between files and streams.
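The conversion services can be sketched as follows, again assuming the javax.sound.sampled package as shipped in the JDK. FormatConversion and toULaw are hypothetical names, and mu-law was picked only as a convenient target encoding. The program asks AudioSystem which file types it can write, then converts a 16-bit PCM stream to 8-bit mu-law.

```java
import javax.sound.sampled.AudioFileFormat;
import javax.sound.sampled.AudioFormat;
import javax.sound.sampled.AudioInputStream;
import javax.sound.sampled.AudioSystem;
import java.io.ByteArrayInputStream;

// Hypothetical helper class for this article, not part of the Java Sound API.
class FormatConversion {
    /** Wraps a PCM stream in a converter that yields mu-law encoded audio. */
    static AudioInputStream toULaw(AudioInputStream pcm) {
        if (!AudioSystem.isConversionSupported(AudioFormat.Encoding.ULAW, pcm.getFormat())) {
            throw new IllegalArgumentException("no converter for " + pcm.getFormat());
        }
        return AudioSystem.getAudioInputStream(AudioFormat.Encoding.ULAW, pcm);
    }

    public static void main(String[] args) {
        // Discover what the installed file writers can produce.
        for (AudioFileFormat.Type type : AudioSystem.getAudioFileTypes()) {
            System.out.println("writable file type: " + type);
        }
        // 800 frames of 16-bit mono silence at 8 kHz, held in memory.
        AudioFormat pcm = new AudioFormat(8000f, 16, 1, true, false);
        AudioInputStream source =
                new AudioInputStream(new ByteArrayInputStream(new byte[1600]), pcm, 800);
        System.out.println(pcm + " -> " + toULaw(source).getFormat());
    }
}
```

The converted stream can be handed to AudioSystem.write to move from a stream to a file, which covers the second kind of conversion mentioned above.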
The Java Sound API has been in the works for a long time. On close inspection, it appears that the wait was worth it. The API provides interfaces for most commonly used audio functions. Audio editors, interactive playback systems, MIDI sequencers, and more can be assembled fairly easily. The .spi classes provide the means to extend the API with more elaborate functionality. Further articles will cover the
, look deeper into the ReverbControl classes, and explore what is currently being done with the API in the real world.
About the author
John Maxwell Hobbs is a musician and has been working with computer multimedia for over 15 years. He is currently Head of Creative Development at Ericsson CyberLab NY. His interactive compositions "Web Phases" and "Ripple" can be found at Cinema Volta. His CDs are available through MP3.com. He is also on the board of directors of Vanguard Visions, an organization dedicated to fostering the work of artists experimenting with technology, and is a member of the Subcommittee on Digital Art of the Mayor's Council on New Media in New York City. He is the former Producing Director for The Kitchen (in NYC). John Maxwell Hobbs can be reached at: firstname.lastname@example.org.