To write an audio application that samples, edits, or otherwise manipulates sound, the first decision you have to make is choosing which platform you want to lock yourself into. After all, even the most basic real-time audio playback functions are close to the bare metal of the operating system. If you’re going to put time and maybe money into an audio development effort, of course you want the widest swath of platforms for release. PortAudio answers the call by delivering a free, cross-platform, open-source audio I/O library. It lets you write simple audio programs in C that will compile and run on many platforms, including Windows, Mac, and Linux/Unix.
PortAudio, which provides a very simple API for recording and/or playing sound using a simple callback function, is intended to promote the exchange of audio synthesis software between developers on different platforms. It includes example programs that synthesize sine waves and pink noise, perform fuzz distortion on a guitar, list available audio devices, and much more. Carnegie Mellon University’s PortMusic project, which includes MIDI and soon will provide sound file support, recently selected PortAudio as its audio component.
Playing Musical Platforms
The PortAudio library supports an array of platforms including Windows, Linux, and Macintosh variants (see Table 1), but if you don’t have prior audio development experience you quickly will find yourself adrift in a sea of API standards. After computer audio went mainstream with the Windows 3.1 MultiMedia Extension (MME) and the ubiquitous .WAV file back in 1991, a variety of solutions followed. First came DirectSound in the Windows 95 era, which unfortunately lacked a record capability. The Windows 2000/XP generation then introduced the fastest solution for Windows users: the Windows Driver Model.
Platform Code | Minimum PortAudio Version | Description |
---|---|---|
pablio | 19.0 | PortAudio Blocking I/O (PABLIO) |
pa_asio | 18.1 | ASIO for Windows and Macintosh |
pa_beos | 18.1 | BeOS |
pa_jack | 19.0 | JACK for Linux and OSX |
pa_linux_alsa | 19.0 | Advanced Linux Sound Architecture (ALSA) |
pa_mac_sm | 18.1 | Macintosh Sound Manager for OS 8, 9, and Carbon |
pa_mac_core | 18.1 | Macintosh Core Audio for OS X |
pa_sgi | 18.1 | Silicon Graphics AL |
pa_unix_oss | 18.1 | Open Sound System (OSS) implementation for various Unix variants |
pa_win_ds | 18.1 | Windows Direct Sound |
pa_win_wdmks | 19.0 | Windows Driver Model Kernel Streaming (WDMKS) |
pa_win_wmme | 18.1 | Windows MultiMedia Extension |
Table 1. Platforms Supported by PortAudio
The latency requirements of your application should dictate your choice of API. If your sound program does not require a quick response time (close to a “live” performance), you are free to use the MME or DirectSound platform. However, if you require very low latency (below 20ms response time), you will need ASIO or WDMKS. The downside of ASIO is that it usually requires proprietary drivers, which at best must be installed by the end user and at worst are not available at all for cheaper audio hardware. (For more details, refer to the SoundCard FAQ.)
Getting Ready to Sound Off
To start programming with PortAudio, the first thing you need to do is go to www.portaudio.com and pick out a relevant distro. Because V18.1, the last official release, is nearing the three-year-old mark, you might as well start with a current V19 code snapshot. (An older precompiled DLL for PortAudio V17 also is available, but that’s all as of this writing.) Either way, it’s a matter of unpacking a ZIP file or tarball, because PortAudio is pretty much distributed in a source-only format.
As you might expect with any streaming interface, PortAudio supports two different programming models: a blocking API and a non-blocking API. The non-blocking API was developed first. The blocking API came later and is still unofficial. Although simple command-line type tools can use a blocking API with little impact, a modern GUI application would need to invoke a thread to manage blocking I/O calls. Otherwise, the app looks dead to both the OS and the end-user during I/O.
This article examines only the non-blocking API. A typical non-blocking PortAudio application requires the following steps:
- Write a callback function that PortAudio (PA) will call when audio processing is needed.
- Initialize the PA library and open a stream for audio I/O.
- Start the stream: PA now will call your callback function repeatedly in the background.
- Inside your callback, you can read audio data from the inputBuffer and/or write data to the outputBuffer.
- Stop the stream by returning a 1 from your callback or calling a stop function.
- Close the stream and terminate the library.
Hello PortAudio, A Sample Application
Although ASIO, WDMKS, and DirectSound layers are available, the sample application discussed in this section uses the Windows MME, the lowest common denominator. First, you need to build a static library out of the following modules:
- “Common” base library
- “Win” platform library (You will disable Direct Sound and ASIO for simplicity’s sake.)
- Layer-specific interface module
You do this from the DOS window by using Visual Studio C++ as follows (you may want to make this a .BAT file):
    cd pa_snapshot_v19\portaudio\pa_common
    del *.lib
    copy ..\pa_win
    cl /c /DPA_NO_DS /DPA_NO_ASIO *.c
    lib /out:portaudio.lib *.obj
    cd ..\pa_win_wmme
    cl /c pa_win_wmme.c /I../pa_common
On this foundation, you can pick out a test program and link the thing together to see how it goes:
    cd ..\pa_tests
    cl patest_saw.c /I../pa_common /link ..\pa_common\portaudio.lib ..\pa_win_wmme\pa_win_wmme.obj
What you get is about five seconds of pure sawtooth wave pleasure! But, that’s not the point. You now have a platform-independent, sound-synthesizing piece of code with which you could implement any number of effects.
PortAudio comes with about four dozen test programs. Look at the guitar fuzz distortion box simulator “pa_fuzz.c” (see below) so you can rock on like Peter Frampton and Joe Walsh. Use essentially the same build command as before:
    cd ..\pa_tests
    cl pa_fuzz.c /I../pa_common /link ..\pa_common\portaudio.lib ..\pa_win_wmme\pa_win_wmme.obj

pa_fuzz.c:

    #include <stdio.h>
    #include <math.h>
    #include "portaudio.h"
    /*
    ** Note that many of the older ISA sound cards on PCs do NOT
    ** support full duplex audio (simultaneous record and playback).
    ** And some support only full duplex at lower sample rates.
    */
    #define SAMPLE_RATE         (44100)
    #define PA_SAMPLE_TYPE      paFloat32
    #define FRAMES_PER_BUFFER   (64)

    typedef float SAMPLE;

    /* Non-linear amplifier with soft distortion curve. */
    float CubicAmplifier( float input )
    {
        float output, temp;
        if( input < 0.0 ) {
            temp = input + 1.0f;
            output = (temp * temp * temp) - 1.0f;
        } else {
            temp = input - 1.0f;
            output = (temp * temp * temp) + 1.0f;
        }
        return output;
    }
You can represent the signal in many ways with PortAudio. The most common mechanism is to use float values from -1.0 to +1.0 to represent the audio signal (paFloat32). You can also use 16-bit integers if you are more comfortable with that or some other representation. The CubicAmplifier() function simulates the distortion that an analog amplifier would produce, the mathematics of which are beyond the scope of the current discussion.
    #define FUZZ(x) CubicAmplifier(CubicAmplifier(CubicAmplifier(CubicAmplifier(x))))

    static int gNumNoInputs = 0;
    /* This routine will be called by the PortAudio engine when audio is needed.
    ** It may be called at interrupt level on some machines, so don't do anything
    ** that could mess up the system, like calling malloc() or free().
    */
    static int fuzzCallback( const void *inputBuffer, void *outputBuffer,
                             unsigned long framesPerBuffer,
                             const PaStreamCallbackTimeInfo* timeInfo,
                             PaStreamCallbackFlags statusFlags,
                             void *userData )
    {
        SAMPLE *out = (SAMPLE*)outputBuffer;
        const SAMPLE *in = (const SAMPLE*)inputBuffer;
        unsigned int i;

        if( inputBuffer == NULL ) {
            for( i=0; i<framesPerBuffer; i++ )
            {
                *out++ = 0;  /* left - silent */
                *out++ = 0;  /* right - silent */
            }
            gNumNoInputs += 1;
        } else {
            for( i=0; i<framesPerBuffer; i++ )
            {
                *out++ = FUZZ(*in++);  /* left - distorted */
                *out++ = *in++;        /* right - clean */
            }
        }

        return paContinue;
    }
The PortAudio system is designed to work in a near real-time environment, hence the use of callback functions. On each call, fuzzCallback() receives an input buffer, an output buffer, the number of frames in each buffer, timing information, buffer status flags, and a pointer to a user-defined storage area. A frame contains a complete set of samples, one per channel (in this case, two for stereo). Each buffer holds as many frames as the incoming framesPerBuffer argument specifies (which may be zero); you’ve asked for 64 frames (FRAMES_PER_BUFFER).
Although this example uses two-channel audio, you can set up any number of channels that the device supports. When there is no input, fuzzCallback() fills the output buffer with silence. If you do have input, it fuzzes the left channel (zero) and copies the input clean to the right channel (one). If your distortion were sensitive in the time domain, you could use the timeInfo struct to retrieve the following times in seconds:
- When the first sample of the input buffer was received at the audio input
- When the first sample of the output buffer will begin being played at the audio output
- When the stream callback was called
    /*************************************************************/
    int main(void)
    {
        PaStreamParameters inputP, outputP;
        PaStream *stream;
        PaError err;

        err = Pa_Initialize();
        if( err != paNoError ) goto error;

        inputP.device = Pa_GetDefaultInputDevice();   /* default input device */
        inputP.channelCount = 2;                      /* stereo input */
        inputP.sampleFormat = PA_SAMPLE_TYPE;
        inputP.suggestedLatency =
            Pa_GetDeviceInfo( inputP.device )->defaultLowInputLatency;
        inputP.hostApiSpecificStreamInfo = NULL;

        outputP.device = Pa_GetDefaultOutputDevice(); /* default output device */
        outputP.channelCount = 2;                     /* stereo output */
        outputP.sampleFormat = PA_SAMPLE_TYPE;
        outputP.suggestedLatency =
            Pa_GetDeviceInfo( outputP.device )->defaultLowOutputLatency;
        outputP.hostApiSpecificStreamInfo = NULL;

        err = Pa_OpenStream(
                  &stream,
                  &inputP,
                  &outputP,
                  SAMPLE_RATE,
                  FRAMES_PER_BUFFER,
                  0, /* paClipOff, */ /* we won't output out of range samples
                                       * so don't bother clipping them */
                  fuzzCallback,
                  NULL );
        if( err != paNoError ) goto error;
Initializing PA and opening the stream are next. Pa_Initialize() must be the first PortAudio call your application makes, just as Pa_Terminate() is the last. After that, you set up the parameters for the input and output sides of the stream. The default input device is usually Microsoft Sound Mapper, which flows from the line-in input of your soundcard (or equivalent). Other possible inputs might be your modem input, CD audio, or other sources, depending on drivers and hardware. You also could create sophisticated callback algorithms where you mix multiple channels down to one channel or vice versa.
Finally, you are ready to call Pa_OpenStream(), which gets the stream ready for immediate use. Because latency is always your enemy, opening the stream is kept separate from starting it, so the setup work is already done when you call Pa_StartStream(). The input and output sides must agree on the same sample rate (in this case, CD-quality 44,100Hz) and the same number of frames per buffer.
        err = Pa_StartStream( stream );
        if( err != paNoError ) goto error;

        printf("Hit ENTER to stop program.\n");
        getchar();
        err = Pa_CloseStream( stream );
        if( err != paNoError ) goto error;

        printf("Finished. gNumNoInputs = %d\n", gNumNoInputs );
        Pa_Terminate();
        return 0;

    error:
        Pa_Terminate();
        fprintf( stderr, "An error occurred while using the portaudio stream\n" );
        fprintf( stderr, "Error number: %d\n", err );
        fprintf( stderr, "Error message: %s\n", Pa_GetErrorText( err ) );
        return -1;
    }
At first glance, the remainder of the program may leave you scratching your head. Pa_StartStream() calls a platform-specific function to get a thread going, which begins callbacks immediately. The Win32 implementations all eventually call CreateThread(), although to me the WDMKS code seems a lot simpler than the Win MME version. There are two ways to get out of the callback loop: return a value of 1 (paComplete in V19) from the callback, or end the stream from outside with a call such as Pa_CloseStream(), as this program does.
Get Creative
Your creativity is the limit to what you can do with PortAudio: convert data streams from one format to another in real time, simulate surround sound or other sophisticated multi-channel audio, or even create performance-quality effects. Best of all, you aren’t overcommitted to any platform, which makes PortAudio my choice for open source audio projects.
About the Author
Victor Volkman has been writing for C/C++ Users Journal and other programming journals since the late 1980s. He is a graduate of Michigan Tech and a faculty advisor board member for Washtenaw Community College CIS department. Volkman is the editor of numerous books, including C/C++ Treasure Chest and is the owner of Loving Healing Press. He can help you in your quest for open source tools and libraries; just drop an e-mail to sysop@HAL9K.com.