Adding Basic Android Text-To-Speech to Your Apps

  • December 1, 2017
  • By Chunyen Liu

Text-to-Speech (TTS), also known as speech synthesis, is an easy yet powerful Android feature that you can use to make your apps more helpful to your users. For people who have learning disabilities or visual impairments, or who simply have many other things going on at the same time, this simple addition makes their lives a lot easier and your apps friendlier. Applications and devices that use TTS technology cover a wide range of areas: education, mobile technologies, screen readers, communications, disability support, and so forth.

The Android TTS API offers multi-language support, control of voice characteristics and features, file output, and so on. With just a few lines of code, you can make your apps reach a wider audience. More and more apps support this feature, so yours should take advantage of it too and keep up with this modern software trend.

API Package and Device Support

Android's official Text-to-Speech API, with its main programming interfaces and classes, has been available since API Level 4 and lives in the android.speech.tts package.

Before we look into this feature from a software developer's perspective, let us see what TTS functionality is already available on your device. On almost all Android devices, you can open the settings and find the option for "Text-to-speech output." My mobile phone, a Nexus 6P, is running Android 8.0 Oreo, and I find the option under "Settings -> System -> Languages and input -> Advanced -> Text-to-speech output." You should see something similar to Figure 1. When you select "Language" in Figure 1, you get a list of supported languages you can download and use, as in Figure 2, depending on what device you have. As for TTS-enabled apps, a common place to see TTS in action is an electronic book opened in Google Play Books. The book's menu has an item called "Read aloud," as in Figure 3.

Once it is selected, the book content will be read out to you; it's as simple as that. Of course, the well-known Google Translate also has multi-language TTS support and it can be used as a basic language learning tool.

Figure 1: Text-to-speech on the device

Figure 2: TTS voice data

Figure 3: TTS example on Google Play Books

TTS Data, Languages, and Locales

There are a few things to check to make sure the TTS feature will work the way we want it to. Although your device has some TTS resources pre-installed by default, the apps you are working on may still need different resources. Owing to a device's limited storage space and the size of TTS data, some resources are only made available for installation when needed. First, we should ask whether TTS data is installed at all, as shown in Listing 1. If it is not, we should start the installation process so that the missing language files can be downloaded, as in Listing 2. Once the data is there, we can implement TextToSpeech.OnInitListener to see whether the intended locale is supported and available, as in Listing 3. Display a notification to users if the specific locale is not supported. When the locale is supported and loaded, you can make the engine say a short greeting message.

Intent ttsIntent = new Intent();
ttsIntent.setAction(TextToSpeech.Engine.ACTION_CHECK_TTS_DATA);
startActivityForResult(ttsIntent, ACT_CHECK_TTS_DATA);

Listing 1: Checking TTS resources

private TextToSpeech mTTS;
protected void onActivityResult(int requestCode, int resultCode,
     Intent data) {
   if (requestCode == ACT_CHECK_TTS_DATA) {
     if (resultCode ==
         TextToSpeech.Engine.CHECK_VOICE_DATA_PASS) {
       // Data exists, so we instantiate the TTS engine
       mTTS = new TextToSpeech(this, this);
     } else {
       // Data is missing, so we start the TTS installation
       // process
       Intent installIntent = new Intent();
       installIntent.setAction(
         TextToSpeech.Engine.ACTION_INSTALL_TTS_DATA);
       startActivity(installIntent);
     }
   }
}

Listing 2: Installing TTS resources if needed

public void onInit(int status) {
   if (status == TextToSpeech.SUCCESS) {
     if (mTTS != null) {
       int result = mTTS.setLanguage(Locale.US);
       if (result == TextToSpeech.LANG_MISSING_DATA || result ==
            TextToSpeech.LANG_NOT_SUPPORTED) {
         Toast.makeText(this, "TTS language is not supported",
            Toast.LENGTH_LONG).show();
       } else {
         // Do something here
       }
      }
   } else {
     Toast.makeText(this, "TTS initialization failed",
       Toast.LENGTH_LONG).show();
   }
}

Listing 3: Implementing TextToSpeech.OnInitListener
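
When the preferred locale turns out to be unavailable, an app might try a short list of fallback locales before showing the notification. The following is a minimal sketch in plain Java so that it can run off-device. The class and method names are hypothetical, and the availability function is only a stand-in for a call such as mTTS.isLanguageAvailable(locale). The two constants mirror the values of TextToSpeech.LANG_MISSING_DATA and TextToSpeech.LANG_NOT_SUPPORTED.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;
import java.util.Optional;
import java.util.function.ToIntFunction;

// Hypothetical helper for illustration; not part of the Android API.
final class LocaleFallback {
    // Same values as the corresponding TextToSpeech constants.
    static final int LANG_MISSING_DATA = -1;
    static final int LANG_NOT_SUPPORTED = -2;

    // Returns the first locale that the engine reports as usable, if any.
    // "availability" stands in for TextToSpeech.isLanguageAvailable().
    static Optional<Locale> pickSupported(List<Locale> preferred,
            ToIntFunction<Locale> availability) {
        for (Locale locale : preferred) {
            int result = availability.applyAsInt(locale);
            if (result != LANG_MISSING_DATA
                    && result != LANG_NOT_SUPPORTED) {
                return Optional.of(locale);
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        // Fake engine (assumption): only US English is installed.
        ToIntFunction<Locale> fakeEngine =
            l -> l.equals(Locale.US) ? 1 : LANG_NOT_SUPPORTED;
        List<Locale> wanted = Arrays.asList(Locale.FRANCE, Locale.US);
        System.out.println(pickSupported(wanted, fakeEngine));
    }
}
```

In an activity, the locale returned here could then be passed to setLanguage(), and the "not supported" toast shown only when the result is empty.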

Basic TTS Example

With the general settings and validations from the previous section in place, we are ready to start the text-to-speech engine and have it read out text in a basic example. The layout contains only a text field, where you can enter whatever text you want, and a button to issue the reading request, as you can see in Figure 4.

In Listing 4, the main speech engine, mTTS, is declared as a TextToSpeech instance. We also should remember to release the engine when it is no longer in use by calling mTTS.shutdown(). In saySomething(), there are two playback queue modes. TextToSpeech.QUEUE_ADD adds the new entry at the end of the playback queue. TextToSpeech.QUEUE_FLUSH drops all entries in the playback queue and replaces them with the new entry.
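
To make the difference between the two queue modes concrete, here is a simplified plain-Java model of a playback queue. It is only a sketch of the behavior described above, not the Android implementation; the class and its methods are hypothetical, and the two constants simply reuse the values of TextToSpeech.QUEUE_FLUSH and TextToSpeech.QUEUE_ADD.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Simplified model of the playback queue; not the Android implementation.
final class PlaybackQueueModel {
    // Same values as TextToSpeech.QUEUE_FLUSH and TextToSpeech.QUEUE_ADD.
    static final int QUEUE_FLUSH = 0;
    static final int QUEUE_ADD = 1;

    private final Deque<String> pending = new ArrayDeque<>();

    void speak(String text, int queueMode) {
        if (queueMode == QUEUE_FLUSH) {
            pending.clear();   // drop every entry that is still waiting
        }
        pending.addLast(text); // the new entry always goes to the end
    }

    // Copy of the entries currently waiting, oldest first.
    List<String> snapshot() {
        return new ArrayList<>(pending);
    }

    public static void main(String[] args) {
        PlaybackQueueModel q = new PlaybackQueueModel();
        q.speak("first", QUEUE_ADD);
        q.speak("second", QUEUE_ADD);
        System.out.println(q.snapshot()); // both entries are waiting
        q.speak("urgent", QUEUE_FLUSH);
        System.out.println(q.snapshot()); // only the new entry remains
    }
}
```

The model suggests a rule of thumb: QUEUE_ADD suits messages that should all be heard in order, whereas QUEUE_FLUSH suits a message that makes the pending ones obsolete.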

Figure 4: TTS in action

public class TutorialOnTTS extends Activity implements
      TextToSpeech.OnInitListener {
   TextToSpeech mTTS = null;
   private final int ACT_CHECK_TTS_DATA = 1000;

   @Override
   protected void onCreate(Bundle savedInstanceState) {
      super.onCreate(savedInstanceState);
      setContentView(R.layout.main);
      final EditText ettext = (EditText)findViewById(R.id.ettext);
      final Button bsay = (Button)findViewById(R.id.bsay);
      bsay.setOnClickListener(new View.OnClickListener() {
         public void onClick(View v) {
            saySomething(ettext.getText().toString().trim(), 1);
         }
      });

      // Check to see if we have TTS voice data
      Intent ttsIntent = new Intent();
      ttsIntent.setAction(
         TextToSpeech.Engine.ACTION_CHECK_TTS_DATA);
      startActivityForResult(ttsIntent, ACT_CHECK_TTS_DATA);
   }

   private void saySomething(String text, int qmode) {
      // mTTS is created only after the voice data check returns
      if (mTTS == null)
         return;
      if (qmode == 1)
         mTTS.speak(text, TextToSpeech.QUEUE_ADD, null);
      else
         mTTS.speak(text, TextToSpeech.QUEUE_FLUSH, null);
   }

   @Override
   protected void onActivityResult(int requestCode, int resultCode,
         Intent data) {
      if (requestCode == ACT_CHECK_TTS_DATA) {
         if (resultCode ==
               TextToSpeech.Engine.CHECK_VOICE_DATA_PASS) {
            // Data exists, so we instantiate the TTS engine
            mTTS = new TextToSpeech(this, this);
         } else {
            // Data is missing, so we start the TTS
            // installation process
            Intent installIntent = new Intent();
            installIntent.setAction(
               TextToSpeech.Engine.ACTION_INSTALL_TTS_DATA);
            startActivity(installIntent);
         }
      }
   }

   public void onInit(int status) {
      if (status == TextToSpeech.SUCCESS) {
         if (mTTS != null) {
            int result = mTTS.setLanguage(Locale.US);
            if (result == TextToSpeech.LANG_MISSING_DATA ||
                  result == TextToSpeech.LANG_NOT_SUPPORTED) {
               Toast.makeText(this,
                  "TTS language is not supported",
                  Toast.LENGTH_LONG).show();
            } else {
               saySomething("TTS is ready", 0);
            }
         }
      } else {
         Toast.makeText(this, "TTS initialization failed",
            Toast.LENGTH_LONG).show();
      }
   }

   @Override
   protected void onDestroy() {
      if (mTTS != null) {
         mTTS.stop();
         mTTS.shutdown();
      }
      super.onDestroy();
   }
}

Listing 4: Basic TTS example

File-based TTS

Sometimes, you will want to read the same text over and over again, so for performance reasons, you can save the result into an audio file instead of doing speech synthesis conversion repeatedly. Before you output the result to the external storage, you need to request the permission dynamically as well as add it to the AndroidManifest.xml file. Here is the required permission:

<uses-permission
   android:name="android.permission.WRITE_EXTERNAL_STORAGE" />

In Figure 5 and Listing 5, we add a new button for saving the result into an audio file and use checkSelfPermission() to make sure that the permission specified in the manifest file, Manifest.permission.WRITE_EXTERNAL_STORAGE, has been granted. If not, we request the permission dynamically through requestPermissions(). Then, we create the folder for the audio files, if needed. The file-saving code is in saveToAudioFile(). To maintain backward compatibility, it calls a different overload of TextToSpeech.synthesizeToFile() depending on whether the platform is older than Android 5.0 (Lollipop).

Figure 5: TTS saved into file
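
One detail of the dynamic permission request is easy to overlook: if the user dismisses or interrupts the dialog, the result array delivered to onRequestPermissionsResult() can be empty, so reading its first element without a length check may throw. The sketch below shows a defensive check in plain Java. The helper class is hypothetical, and its constant reuses the value of PackageManager.PERMISSION_GRANTED so that the example runs off-device.

```java
// Hypothetical helper; on Android the constant comes from PackageManager.
final class PermissionResults {
    // Same value as PackageManager.PERMISSION_GRANTED.
    static final int PERMISSION_GRANTED = 0;

    // True only if at least one result is present and all are granted.
    static boolean allGranted(int[] grantResults) {
        if (grantResults == null || grantResults.length == 0) {
            return false; // treat a cancelled request as a denial
        }
        for (int result : grantResults) {
            if (result != PERMISSION_GRANTED) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(allGranted(new int[] {PERMISSION_GRANTED}));
        System.out.println(allGranted(new int[0]));
    }
}
```

A callback could pass its grantResults argument to such a helper and enable the save button only when it returns true.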

   public class TutorialOnTTS extends Activity implements
      TextToSpeech.OnInitListener {
   ...
   private final int REQUEST_PERMISSION_WRITE_EXTERNAL_STORAGE =
      2000;
   private int permissionCount = 0;
   private String mAudioFilename = "";
   private final String mUtteranceID = "totts";

   @Override
   protected void onCreate(Bundle savedInstanceState) {
      ...

      final Button bsave = (Button)findViewById(R.id.bsave);
      bsave.setOnClickListener(new View.OnClickListener() {
         public void onClick(View v) {
            saveToAudioFile(ettext.getText().toString().trim());
         }
      });

      // Perform the dynamic permission request
      if (checkSelfPermission(
            Manifest.permission.WRITE_EXTERNAL_STORAGE)
            != PackageManager.PERMISSION_GRANTED)
         requestPermissions(new String[]{
               Manifest.permission.WRITE_EXTERNAL_STORAGE},
               REQUEST_PERMISSION_WRITE_EXTERNAL_STORAGE);

      // Create audio file location
      File sddir = new File(Environment
         .getExternalStorageDirectory() + "/TutorialOnTTS/");
      sddir.mkdirs();
      mAudioFilename = sddir.getAbsolutePath() + "/" +
         mUtteranceID + ".wav";

      ...
   }

   @Override
   public void onRequestPermissionsResult(int requestCode,
         String[] permissions, int[] grantResults) {
      super.onRequestPermissionsResult(requestCode, permissions,
         grantResults);
      switch (requestCode) {
         case REQUEST_PERMISSION_WRITE_EXTERNAL_STORAGE:
            // grantResults is empty if the request was cancelled
            if (grantResults.length > 0 && grantResults[0] ==
               PackageManager.PERMISSION_GRANTED)
               permissionCount++;
            break;
         default:
            break;
      }
   }

   private void saveToAudioFile(String text) {
      // Register the listener before synthesis starts so that the
      // completion callback is not missed
      mTTS.setOnUtteranceCompletedListener(new TextToSpeech
            .OnUtteranceCompletedListener() {
         public void onUtteranceCompleted(String uid) {
            if (uid.equals(mUtteranceID)) {
               // The callback may arrive on a background thread
               runOnUiThread(new Runnable() {
                  public void run() {
                     Toast.makeText(TutorialOnTTS.this,
                        "Saved to " + mAudioFilename,
                        Toast.LENGTH_LONG).show();
                  }
               });
            }
         }
      });

      if (Build.VERSION.SDK_INT >= Build.VERSION_CODES.LOLLIPOP) {
         mTTS.synthesizeToFile(text, null, new
            File(mAudioFilename), mUtteranceID);
      } else {
         HashMap<String, String> hm = new HashMap<>();
         hm.put(TextToSpeech.Engine.KEY_PARAM_UTTERANCE_ID,
            mUtteranceID);
         mTTS.synthesizeToFile(text, hm, mAudioFilename);
      }
   }
   
   ...
}

Listing 5: Adding file-based TTS support
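
Listing 5 always writes to the same file, so each save overwrites the previous one. If the goal is to reuse audio for text that is read repeatedly, one possible approach (an assumption on my part, not something the listing does) is to derive the file name from the text itself, so that identical text maps to the same file and a later request can check whether that file already exists. The helper below is hypothetical and uses only the standard Java library.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

// Hypothetical helper for illustration; not part of the tutorial's app.
final class AudioCacheNames {
    // Maps a text to a stable file name such as "tts_<16 hex digits>.wav".
    static String fileNameFor(String text) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-256");
            byte[] digest = md.digest(text.getBytes(StandardCharsets.UTF_8));
            StringBuilder hex = new StringBuilder();
            // The first 8 bytes keep the name short yet practically unique.
            for (int i = 0; i < 8; i++) {
                hex.append(String.format("%02x", digest[i] & 0xff));
            }
            return "tts_" + hex + ".wav";
        } catch (NoSuchAlgorithmException e) {
            // SHA-256 is required on every Java platform.
            throw new IllegalStateException("SHA-256 unavailable", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(fileNameFor("TTS is ready"));
    }
}
```

With such a name in hand, saveToAudioFile() could skip synthesis when a file with that name is already present in the audio folder.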

Conclusion

This tutorial introduced Android's Text-to-Speech API and demonstrated, through the examples, how straightforward it is to add to your apps. We also looked at how to check your device settings and language availability. With this handy addition, many apps from different areas can improve their quality and presence as far as user interaction and accessibility are concerned.

TensorFlow, listed in the References section, is one of the advanced applications that use the TTS feature. After an image is captured from the camera, it is converted and piped into an object recognition model that identifies what is in the image. Once the image is classified and labeled, the label text is read out directly to the user. I strongly recommend checking out its implementation.

References

About the Author

Author Chunyen Liu has been a software veteran in Taiwan and the United States. He is a published author of 40+ articles and 100+ tiny apps, a software patentee, technical reviewer, and programming contest winner by ACM/IBM/SUN. He holds advanced degrees in Computer Science with 20+ graduate-level classes. On the non-technical side, he is enthusiastic about the Olympic sport of table tennis, being a USA certified umpire, certified coach, certified referee, and categorized event winner at State Championships and the US Open.




