The conventional wisdom among IT professionals is that audio is a toy, for entertainment purposes only. To date, the best use of audio in user interfaces has been in video games and CD-ROMs. But thanks to faster processors, sophisticated new algorithms, and well-thought-out new APIs, user interface designers stand poised to take advantage of perhaps the most subtly sophisticated sensing devices human beings have: our ears.
The elements of hearing
Human hearing is often overlooked by hardware and software developers, but it’s a rich and subtle sense. It is also always with us. We cannot close our ears the way we close our eyes. We even hear in our sleep, as anyone who has been awakened at 6 a.m. by a noisy garbage truck can tell you. Unlike vision, which focuses on one image at a time, hearing can “focus” on several things simultaneously: it is possible to hold a conversation while listening to music, and even to follow several conversations going on at once.
People have been trained by evolution to respond to very subtle sound cues. We can:
- hear in all directions at once.
- hear around corners.
- hear through doors.
- hear through walls.
- identify the location of the source of a sound by hearing alone.
- hear loud noises, like thunder, from miles away.
- identify the dimensions of a space by acoustic clues.
- identify movement behind us via subtle changes in the sound field.
Problems and potential for audio
Many companies currently allow no audible sound from computers in the workplace. Sound is regarded as intrusive and unnecessary. Perhaps the problem is not with sound per se, but with the way it has been used in the past. Until recently, the audio-processing capability of a standard PC was fairly low. Unpleasant bleeps, bloops and quacks were the only types of sounds that could be produced. For the most part, audio alerts have been used only to notify the user of an error, and nobody wants to share a mistake with the entire office.
Recent innovations are changing this trend. Standard-issue computers are now able to produce high-quality sound. Most systems come with at least a 16-bit sound card. Sun’s JavaSoft division, Microsoft, Intel, and Elemedia (a division of Lucent) have all released high-quality, software-based audio processing tools that can run on most standard machines. Separately, a number of very good software-based text-to-speech systems have been released for several platforms.
Other companies are beginning to take tentative steps toward incorporating sound into their interfaces. Sun has licensed the Headspace sound engine (see below) and is releasing it as part of JDK 1.2, as the basis for the Java Sound API. Sun’s Java Media Framework API, out in public beta now, has a very thorough object-oriented approach to rendering multimedia files that makes the incorporation of audio events in Java applications both simple and flexible. Microsoft’s Windows 95 and Internet Explorer 4.0 offer limited options for using sounds as feedback for certain actions. Qualcomm’s Eudora Pro allows the user to select an audio alert instead of an alert box.
Sounds can provide valuable feedback in the Web environment. Web sites often feature frustrating delays that may indicate either a process in the works or a failed action. Because responses to form submissions or clicked hyperlinks are often delayed, users click several times, unsure whether their click has “taken.” An audible “click” can reassure users that a response will be forthcoming. Added assurance would come if the browser supplied audio feedback when it opened a connection to a remote server and began downloading data. Yet another sound could signal the completion of the download. During long downloads, sound can also serve as the equivalent of the music played over the telephone to people on hold: perhaps not a major feature, but a courtesy to the user. An onClick handler can both initiate the file download and start a musical sequence or streaming audio file. When the download is complete, an onLoad handler can stop playback of the music.
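The pairing of onClick and onLoad described above can be sketched as a small piece of state logic. What follows is a minimal TypeScript sketch under stated assumptions: `HoldMusicPlayer` and `DownloadFeedback` are hypothetical names invented for illustration, not part of any browser or plug-in API, and the player object stands in for whatever audio component a page actually uses. Ignoring repeated clicks while a download is pending is an added design choice, not something the handlers do on their own.

```typescript
// Hypothetical interface: stands in for the page's real audio component.
interface HoldMusicPlayer {
  play(): void;
  stop(): void;
}

// Tracks whether a download is pending and drives the "hold music".
class DownloadFeedback {
  private player: HoldMusicPlayer;
  private waiting = false;

  constructor(player: HoldMusicPlayer) {
    this.player = player;
  }

  // Intended to be called from the link's onClick handler.
  handleClick(): void {
    if (this.waiting) {
      return; // a download is already pending; do not restart the music
    }
    this.waiting = true;
    this.player.play();
  }

  // Intended to be called from the onLoad handler when the download finishes.
  handleLoad(): void {
    if (!this.waiting) {
      return; // nothing was pending, so there is nothing to stop
    }
    this.waiting = false;
    this.player.stop();
  }

  isWaiting(): boolean {
    return this.waiting;
  }
}
```

In a page, `handleClick` and `handleLoad` would be wired to the corresponding event handlers; because the player is passed in, the same logic can be exercised without any audio hardware.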
Beatnik gives users unprecedented control over the playback itself. Users can not only control volume but also change the tempo, pitch and even the instruments used to play musical sequences. A Java applet that watches the number and speed of user clicks could use that information to create a customized soundtrack. A user who hopped around a site quickly would hear an up-tempo version of the site’s theme music, while a more leisurely surfer would get a correspondingly relaxed one.
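One way to turn click activity into a tempo is to count recent clicks and scale the result into a fixed range. The sketch below is written in TypeScript rather than as a Java applet, and it does not use the Beatnik API. The function name `tempoFromClicks`, the 30-second window, the 80 to 160 BPM range, and the 20 clicks-per-minute ceiling are all assumptions chosen for illustration.

```typescript
// All four constants are illustrative assumptions, not values from any product.
const MIN_BPM = 80;             // tempo for a visitor who is not clicking at all
const MAX_BPM = 160;            // tempo ceiling for very fast browsing
const WINDOW_MS = 30000;        // only clicks in the last 30 seconds count
const MAX_CLICKS_PER_MIN = 20;  // click rate at which the tempo tops out

// Maps a list of click timestamps (in milliseconds) to a tempo in BPM.
function tempoFromClicks(clickTimesMs: number[], nowMs: number): number {
  // Keep only clicks that fall inside the sliding window ending at nowMs.
  const recent = clickTimesMs.filter(
    (t) => t <= nowMs && nowMs - t <= WINDOW_MS
  );
  // Convert the count in the window to a per-minute rate.
  const clicksPerMin = recent.length * (60000 / WINDOW_MS);
  // Scale linearly into the tempo range, capping at the maximum.
  const ratio = Math.min(clicksPerMin / MAX_CLICKS_PER_MIN, 1);
  return Math.round(MIN_BPM + ratio * (MAX_BPM - MIN_BPM));
}
```

The linear mapping is the simplest choice; a real soundtrack would probably smooth tempo changes over time so the music does not lurch with every click.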
Sseyo is promoting what it calls “generative music” with its Koan plug-in for Windows. The Koan plug-in reads a very small file, sometimes as small as 1K, that acts as a “seed” for an ever-evolving piece of music. The “seed” file specifies things like feel, tempo, key, basic melodies and instruments used. The plug-in takes it from there and “improvises” a new composition each time. This eliminates the repetitious “loops” that can quickly become annoying. Because the files are so small, download times are very short.
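The general idea of a small parameter file driving a different piece on each playback can be illustrated with a toy example. The TypeScript sketch below is not Koan's file format or algorithm, neither of which is described here. `SeedFile`, `NoteEvent`, and `improvise` are hypothetical names, and the random walk over a scale is simply one easy way to show how a few numbers can yield varied output.

```typescript
// Hypothetical stand-in for a tiny "seed" file: a few parameters, no audio data.
interface SeedFile {
  tempoBpm: number;      // beats per minute
  rootMidiNote: number;  // key, as a MIDI note number (60 = middle C)
  scale: number[];       // semitone offsets from the root
}

interface NoteEvent {
  midiNote: number;
  durationMs: number;
}

// Produces a melody by taking a random walk up and down the seed's scale.
// With the default Math.random, every call yields a different melody.
function improvise(
  seed: SeedFile,
  noteCount: number,
  random: () => number = Math.random
): NoteEvent[] {
  const durationMs = 60000 / seed.tempoBpm; // one note per beat
  const topDegree = seed.scale.length - 1;
  const events: NoteEvent[] = [];
  let degree = 0; // start on the root of the scale
  for (let i = 0; i < noteCount; i++) {
    // Move between two steps down and two steps up, staying inside the scale.
    const step = Math.floor(random() * 5) - 2;
    degree = Math.max(0, Math.min(topDegree, degree + step));
    const offset = seed.scale[degree] ?? 0;
    events.push({ midiNote: seed.rootMidiNote + offset, durationMs });
  }
  return events;
}
```

The seed here is three fields, yet each call with the default random source produces a different sequence that still stays in the chosen key, which is the property that replaces fixed loops.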
Microsoft has an ActiveX control for Win32 and IE called Interactive Music Control that has similarities to both Beatnik and Koan. As with Beatnik, sound events contained in a single file can be scripted, and downloadable sounds are supported. As with Koan, ever-changing compositions are generated on the fly based on very simple initial parameter settings. Interactive Music Control can also combine elements of both plug-ins, so that a user’s interaction shapes the qualities of the music.
Conclusion
Sound is still a rarity on the Internet, often used for its novelty value. As technology progresses and ideas catch up with PCs’ new abilities, the power of one of our most important senses is likely to play a growing role. The role that sound already plays in computer games may be an indicator of the future: imagine Quake silent. The challenge to software developers today is to bring that quality of sound design to productivity and communications software.
John Maxwell Hobbs is a musician and has been working with computer multimedia for over fifteen years. Most recently, he headed up multimedia research and development for EarthWeb, Inc. He is also on the board of directors of Vanguard Visions, an organization dedicated to fostering the work of artists experimenting with technology. He is the former Producing Director for The Kitchen.