
Add Text-To-Speech and Speech Recognition to Your Android Applications

April 29, 2011

Speech recognition and text-to-speech (TTS) services can benefit certain types of applications and certain groups of users. Speech recognition involves listening for a user's voice input, processing the recorded sound, and interpreting the results. TTS services take text string data and have the device "read" the content aloud using a "voice". Hands-free applications, such as turn-by-turn navigation utilities, routinely use both technologies. Users with special needs, such as the visually impaired, also benefit from these features.

Android speech services are available within the SDK in the android.speech package. The speech recognition classes, such as RecognizerIntent, are found in this package. The TTS features are found in the android.speech.tts sub-package.

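Applications require no special permissions to use the intent-based Android speech services described here. (Applications that call the android.speech.SpeechRecognizer class directly do need the RECORD_AUDIO permission.) Be aware, though, that speech recognition requires a data connection.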

Note: Open source code is available for this tutorial.

Implementing Android Speech Recognition

Speech or voice recognition involves recording voice input using the device's microphone. The resulting sound file is then analyzed and translated into a string. The built-in speech recognition services available in the Android SDK come in two forms: "free form", which is used for dictation, and "web search", which is used for short, command-like phrases. You can also develop your own recognition services using the classes available in the android.speech package.

Access to speech recognition is built into the default software keyboard starting in Android 2.1. Therefore, your application may already support basic voice input without any changes whatsoever. However, accessing the recognizer directly allows for more interesting spoken-word control over applications.

The simplest speech recognition case involves launching an intent with the ACTION_RECOGNIZE_SPEECH action, which is defined in the android.speech.RecognizerIntent class. This launches the built-in Android speech recorder, which prompts the user to record speech input. The resulting sound file is sent to an underlying recognition server for processing, which is why an internet connection is required. The results are then returned to the calling activity. Here's an example:

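public class SimpleSpeechActivity extends Activity {
    private static final int VOICE_RECOGNITION_REQUEST = 0x10101;
    @Override
    public void onCreate(Bundle savedInstanceState) {
        super.onCreate(savedInstanceState);
        setContentView(R.layout.main); // layout name is a placeholder; not shown in the original listing
    }
    // Called when the user clicks the Button; launches the built-in speech recorder.
    public void speakToMe(View view) {
        Intent intent = new Intent(RecognizerIntent.ACTION_RECOGNIZE_SPEECH);
        intent.putExtra(RecognizerIntent.EXTRA_LANGUAGE_MODEL, RecognizerIntent.LANGUAGE_MODEL_FREE_FORM);
        intent.putExtra(RecognizerIntent.EXTRA_PROMPT,
                "Please speak slowly and enunciate clearly.");
        startActivityForResult(intent, VOICE_RECOGNITION_REQUEST);
    }
    @Override
    protected void onActivityResult(int requestCode, int resultCode, Intent data) {
        if (requestCode == VOICE_RECOGNITION_REQUEST && resultCode == RESULT_OK) {
            ArrayList<String> matches = data.getStringArrayListExtra(RecognizerIntent.EXTRA_RESULTS);
            TextView textView = (TextView) findViewById(R.id.speech_text); // view id is a placeholder; elided in the original listing
            String firstMatch = matches.get(0);
            textView.setText(firstMatch);
        }
    }
}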

In this case, the intent is launched when the user clicks a Button control, which causes the speakToMe() method to be called. The intent is configured as follows:

  • The intent action is set to ACTION_RECOGNIZE_SPEECH in order to prompt the user to speak and submit the recorded sound for speech recognition.
  • An intent extra called EXTRA_LANGUAGE_MODEL is set to LANGUAGE_MODEL_FREE_FORM in order to perform standard speech recognition. There is also another language model especially for Web searches called LANGUAGE_MODEL_WEB_SEARCH.
  • An intent extra called EXTRA_PROMPT is set to a string to display to the user during speech input.
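Once the recognizer hands its list of candidate strings back to onActivityResult(), an application that wants spoken-word control usually has to map those strings onto actions it understands. The following is a minimal plain-Java sketch of one way to do that, not part of the original tutorial code. The class name SpokenCommandMatcher, the method findCommand(), and the command vocabulary are all hypothetical, and the sketch assumes the candidates arrive ordered from most to least likely.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

public class SpokenCommandMatcher {
    // Hypothetical command vocabulary, chosen only for illustration.
    private static final List<String> COMMANDS =
            Arrays.asList("play", "pause", "next", "previous");

    /**
     * Returns the first known command word found in the candidate list,
     * or null if none of the candidates contains one. Candidates are
     * scanned in order, on the assumption that earlier entries are the
     * more likely transcriptions.
     */
    public static String findCommand(List<String> matches) {
        if (matches == null) {
            return null;
        }
        for (String candidate : matches) {
            if (candidate == null) {
                continue;
            }
            // Normalize case and surrounding whitespace before comparing.
            String normalized = candidate.toLowerCase(Locale.US).trim();
            for (String word : normalized.split("\\s+")) {
                if (COMMANDS.contains(word)) {
                    return word;
                }
            }
        }
        return null;
    }

    public static void main(String[] args) {
        // The first candidate is a mishearing; the second contains a command.
        List<String> matches = Arrays.asList("please paws", "Please pause");
        System.out.println(findCommand(matches)); // prints "pause"
    }
}
```

In an activity, the list obtained from the result intent could be passed straight to findCommand(), with a null return treated as "not understood" so the user can be prompted to try again. Scanning every candidate rather than only the first makes the control more forgiving of misheard words.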

Android Speech Services
