January 23, 2021

Exploring the Android Speech API for Voice Recognition

  • By Chunyen Liu

This tutorial gives a brief introduction to the Android Speech API for voice recognition. Voice recognition, or Speech-to-Text (STT), is an area of computational linguistics that develops methods and technologies for automatically recognizing spoken language and converting it into text. A previous tutorial, "Adding Basic Android Text-To-Speech to Your Apps," covered the reverse direction, Text-to-Speech (TTS), and is worth checking out as well. STT has numerous practical applications: home automation, security authentication, data entry, subtitling and translation, robotics, gaming, and so forth.


The Android Speech API provides recognition control, background services, intents, and support for multiple languages. It can look like a simple addition to your app's user input, but it is a powerful feature that can make an app stand out. Imagine how helpful it can be for people who have difficulty using a keyboard, or for anyone looking to increase productivity and improve their workflow.

API Package and Device Support

Android's official Speech API is the android.speech package, available since API Level 3. Its main programming interfaces and classes are listed in the package's reference documentation.

The classes we are mainly interested in for voice recognition are SpeechRecognizer and RecognizerIntent. The most important intent is RecognizerIntent.ACTION_RECOGNIZE_SPEECH, which needs only one required extra, RecognizerIntent.EXTRA_LANGUAGE_MODEL, to start the recognition process. If you want to use a language other than the default one, you can specify it with RecognizerIntent.EXTRA_LANGUAGE.

Voice recognition may already be in use on your device. On my Nexus 6P running Android 8.0 Oreo, the option is under "Settings -> System -> Languages and input -> Advanced -> Virtual keyboard -> Google voice typing," as shown in Figures 1 and 2; your device should have something similar. This is why you can do a Web search by speaking into the microphone, or dictate text whenever an on-screen keyboard appears. The same capability is available to your own apps.

Figure 1: Virtual Keyboard

Figure 2: Google Voice Typing

Speech Data for Multiple Languages

First, we check whether the device supports STT at all by calling SpeechRecognizer.isRecognitionAvailable(). If it does, we use sendOrderedBroadcast() with the intent from RecognizerIntent.getVoiceDetailsIntent() to request the current voice data details, as demonstrated in Listing 1. In the broadcast receiver, we unpack the list stored in the result bundle under RecognizerIntent.EXTRA_SUPPORTED_LANGUAGES, as in Listing 2. The results are captured in Figure 3. The languages are identified by IETF BCP 47 (Best Current Practice 47) language tags, the format used by tags such as en-US.

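import android.app.Activity;
import android.os.Bundle;
import android.speech.RecognizerIntent;
import android.speech.SpeechRecognizer;
import android.widget.TextView;

public class ShowSupportedLanguages extends Activity {
   private TextView mTextView;

   @Override
   protected void onCreate(Bundle savedInstanceState) {
      super.onCreate(savedInstanceState);
      setContentView(R.layout.activity_main);   // layout name is assumed
      if (!SpeechRecognizer.isRecognitionAvailable(this)) {
         updateResults("\nNo voice recognition support on your device!");
      } else {
         // The receiver gets the voice details as the final broadcast result.
         LanguageDetailsReceiver ldr = new LanguageDetailsReceiver(this);
         sendOrderedBroadcast(RecognizerIntent.getVoiceDetailsIntent(this),
            null, ldr, null, Activity.RESULT_OK, null, null);
      }
   }

   void updateResults(String s) {
      mTextView = (TextView)findViewById(R.id.tvlanglist);
      mTextView.setText(s);
   }
}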

Listing 1: Display Supported Speech Languages

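import android.content.BroadcastReceiver;
import android.content.Context;
import android.content.Intent;
import android.os.Bundle;
import android.speech.RecognizerIntent;
import java.util.ArrayList;
import java.util.List;

public class LanguageDetailsReceiver extends BroadcastReceiver {
   List<String> mLanguages;
   ShowSupportedLanguages mSSL;

   public LanguageDetailsReceiver(ShowSupportedLanguages ssl) {
      mSSL = ssl;
      mLanguages = new ArrayList<String>();
   }

   @Override
   public void onReceive(Context context, Intent intent) {
      // The ordered broadcast's result extras carry the voice details.
      Bundle extras = getResultExtras(true);
      mLanguages = extras.getStringArrayList
         (RecognizerIntent.EXTRA_SUPPORTED_LANGUAGES);
      if (mLanguages == null) {
         mSSL.updateResults("No voice data found.");
      } else {
         String s = "\nList of language voice data:\n";
         for (int i = 0; i < mLanguages.size(); i++) {
            s += (mLanguages.get(i) + ", ");
         }
         s += "\n";
         mSSL.updateResults(s);
      }
   }
}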

Listing 2: Language Details Broadcast Receiver

Figure 3: Supported Speech Data
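
The tags in the returned list are plain strings, so they can be turned into readable names with the standard java.util.Locale class. The following is a minimal sketch and not part of the original listings: the class name LanguageTagNames and its methods are hypothetical, and the sample tags are illustrative rather than taken from Figure 3.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

public class LanguageTagNames {
    // Hypothetical helper: pairs a BCP 47 tag with its English display name.
    public static String describe(String tag) {
        Locale locale = Locale.forLanguageTag(tag);
        // An ill-formed tag parses to an empty language; fall back to the raw tag.
        if (locale.getLanguage().isEmpty()) {
            return tag;
        }
        return tag + " = " + locale.getDisplayName(Locale.ENGLISH);
    }

    // Describes every tag in the list, preserving the original order.
    public static List<String> describeAll(List<String> tags) {
        List<String> lines = new ArrayList<String>();
        for (String tag : tags) {
            lines.add(describe(tag));
        }
        return lines;
    }

    public static void main(String[] args) {
        // Sample tags chosen for illustration only.
        List<String> tags = Arrays.asList("en-US", "fr-FR", "ja-JP");
        for (String line : describeAll(tags)) {
            System.out.println(line);
        }
    }
}
```

In an app, the same helper could be applied to the list unpacked in Listing 2 before it is shown in the TextView.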

Basic Voice Recognition Example

We are now ready for a basic example that uses voice recognition. RecognizerIntent.ACTION_RECOGNIZE_SPEECH is the intent that defines the request. The only required extra is RecognizerIntent.EXTRA_LANGUAGE_MODEL, which we set to RecognizerIntent.LANGUAGE_MODEL_FREE_FORM. To recognize a language other than the default, supply its language tag in RecognizerIntent.EXTRA_LANGUAGE; otherwise, the recognizer uses the default locale. To make the example more interesting, we also use RecognizerIntent.EXTRA_PROMPT to display a question. Then we start the recognition intent.
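
One way to choose a value for RecognizerIntent.EXTRA_LANGUAGE is to match the user's preferred locale against the supported-language list obtained earlier. The sketch below is plain Java with no Android dependency. The class LanguagePicker, the method pick(), and the sample tags are hypothetical, and the fallback policy (exact tag first, then any tag with the same language, then nothing) is an assumption rather than something the Speech API prescribes.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

public class LanguagePicker {
    // Hypothetical helper: returns the supported tag that best fits the
    // preferred locale, or null when no tag shares its language.
    public static String pick(List<String> supported, Locale preferred) {
        String wantedTag = preferred.toLanguageTag();
        String wantedLanguage = preferred.getLanguage();
        if (wantedLanguage.isEmpty()) {
            return null;   // nothing sensible to match against
        }
        // First pass: exact tag match, ignoring case (BCP 47 tags are case-insensitive).
        for (String tag : supported) {
            if (tag.equalsIgnoreCase(wantedTag)) {
                return tag;
            }
        }
        // Second pass: first tag with the same primary language.
        for (String tag : supported) {
            if (Locale.forLanguageTag(tag).getLanguage().equals(wantedLanguage)) {
                return tag;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        // Sample list for illustration only.
        List<String> supported = Arrays.asList("en-US", "de-DE", "fr-FR");
        System.out.println(pick(supported, Locale.forLanguageTag("de-CH")));
    }
}
```

A non-null result could then be passed with intent.putExtra(RecognizerIntent.EXTRA_LANGUAGE, tag); when the result is null, leaving the extra out lets the recognizer fall back to the default locale.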

Once recognition finishes, the results are returned in the data intent under RecognizerIntent.EXTRA_RESULTS. In this example, we check whether the answer contains the substring "Amazon" (ignoring case) and display a message on screen accordingly. The code is in Listing 3.

When the app runs, it displays the question along with a microphone icon and waits for you to say something, as in Figure 4. In Figure 5, I intentionally responded with "Google", which does not contain the substring "Amazon", so the app reported the answer as incorrect.

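import android.app.Activity;
import android.content.Intent;
import android.os.Bundle;
import android.speech.RecognizerIntent;
import android.widget.TextView;
import java.util.List;

public class StartVoiceRecognition extends Activity {
   private final int REQUEST_SPEECH_RECOGNIZER = 3000;
   private TextView mTextView;
   private final String mQuestion = "Which company is the largest " +
      "online retailer on the planet?";
   private String mAnswer = "";

   @Override
   protected void onCreate(Bundle savedInstanceState) {
      super.onCreate(savedInstanceState);
      setContentView(R.layout.activity_main);   // layout name is assumed
      mTextView = (TextView)findViewById(R.id.tvstt);
      startSpeechRecognizer();
   }

   private void startSpeechRecognizer() {
      Intent intent = new Intent
         (RecognizerIntent.ACTION_RECOGNIZE_SPEECH);
      // The language model is the only required extra.
      intent.putExtra(RecognizerIntent.EXTRA_LANGUAGE_MODEL,
         RecognizerIntent.LANGUAGE_MODEL_FREE_FORM);
      intent.putExtra(RecognizerIntent.EXTRA_PROMPT, mQuestion);
      startActivityForResult(intent, REQUEST_SPEECH_RECOGNIZER);
   }

   @Override
   protected void onActivityResult(int requestCode, int resultCode,
         Intent data) {
      super.onActivityResult(requestCode, resultCode, data);

      if (requestCode == REQUEST_SPEECH_RECOGNIZER) {
         if (resultCode == RESULT_OK && data != null) {
            List<String> results = data.getStringArrayListExtra
               (RecognizerIntent.EXTRA_RESULTS);
            if (results == null || results.isEmpty()) {
               return;   // nothing was recognized
            }
            mAnswer = results.get(0);

            if (mAnswer.toUpperCase().indexOf("AMAZON") > -1) {
               mTextView.setText("\n\nQuestion: " + mQuestion +
                  "\n\nYour answer is '" + mAnswer +
                  "' and it is correct!");
            } else {
               mTextView.setText("\n\nQuestion: " + mQuestion +
                  "\n\nYour answer is '" + mAnswer +
                  "' and it is incorrect!");
            }
         }
      }
   }
}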

Listing 3: Basic Voice Recognition Example

Figure 4: Voice Recognition in Action

Figure 5: Voice Recognition Result
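
Listing 3 inspects only the first recognition result. Because the result list can hold more than one candidate transcription, a slightly more forgiving check looks for the keyword in any of them. This is a minimal sketch in plain Java: the class AnswerChecker and the method containsKeyword() are hypothetical names, and the sample candidates are made up for illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

public class AnswerChecker {
    // Hypothetical helper: true if any candidate contains the keyword,
    // compared case-insensitively and independent of the device locale.
    public static boolean containsKeyword(List<String> candidates, String keyword) {
        if (candidates == null || keyword == null || keyword.isEmpty()) {
            return false;
        }
        String needle = keyword.toUpperCase(Locale.ROOT);
        for (String candidate : candidates) {
            if (candidate != null
                    && candidate.toUpperCase(Locale.ROOT).contains(needle)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Made-up candidates standing in for a recognition result list.
        List<String> candidates = Arrays.asList("I think it is Amazon", "I think it is a Mazon");
        System.out.println(containsKeyword(candidates, "Amazon"));
    }
}
```

In onActivityResult(), the list read from RecognizerIntent.EXTRA_RESULTS could be passed to this helper in place of the single indexOf() test.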


Android makes the Speech API easy to use yet powerful enough for anyone interested in adding voice recognition to their apps. Through some basic examples, we briefly covered how to set it up, what the recognizer intents are, how to find out what your device supports, and how to provide multilingual support. Because Speech-to-Text (STT) technology is popular in many practical applications, ranging from improving personal productivity to controlling complicated robots, it will likely become more and more common in everyday software and hardware alike.

Other sample projects that integrate voice recognition are available in Google's official Android repository. It is well worth checking out these different applications; they may give you some great ideas for your own apps. Advanced developers may also find the Speech Recognition API offered by Google Cloud Platform interesting.


About the Author

Chunyen Liu is a software veteran who has worked in Taiwan and the United States. He is a published author of 40+ articles and 100+ tiny apps, a software patentee, a technical reviewer, and a winner of programming contests by ACM/IBM/SUN. He holds advanced degrees in Computer Science with 20+ graduate-level classes. On the non-technical side, he is enthusiastic about the Olympic sport of table tennis: he is a USA certified umpire, certified coach, and certified referee, and a categorized event winner at State Championships and the US Open.

This article was originally published on January 18, 2018
