This tutorial gives a brief introduction to the Android Speech API used for voice recognition, an area of computational linguistics that develops methods and technologies for automatically recognizing spoken language and translating it into text, known as Speech-to-Text (STT). An earlier tutorial, “Adding Basic Android Text-To-Speech to Your Apps,” covered the opposite direction, Text-to-Speech (TTS); you are welcome to check it out as well. STT has numerous practical applications, including home automation, security authentication, data entry, subtitling and translation, robotics, and gaming.
The Android Speech API provides recognition control, background services, intents, and support for multiple languages. It can look like a simple addition to the user input options in your apps, but it is a powerful feature that makes them stand out. Imagine how helpful it can be for people with disabilities who have difficulty using a keyboard, or for anyone looking to increase productivity and improve their workflow.
API Package and Device Support
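    import android.app.Activity;
    import android.os.Bundle;
    import android.speech.RecognizerIntent;
    import android.speech.SpeechRecognizer;
    import android.widget.TextView;

    public class ShowSupportedLanguages extends Activity {
        private TextView mTextView;

        @Override
        protected void onCreate(Bundle savedInstanceState) {
            super.onCreate(savedInstanceState);
            setContentView(R.layout.lang);

            if (!SpeechRecognizer.isRecognitionAvailable(this)) {
                updateResults("\nNo voice recognition support on your device!");
            } else {
                // Ask the recognizer for its language details; the answer
                // arrives in LanguageDetailsReceiver.onReceive().
                LanguageDetailsReceiver ldr = new LanguageDetailsReceiver(this);
                sendOrderedBroadcast(RecognizerIntent.getVoiceDetailsIntent(this),
                        null, ldr, null, Activity.RESULT_OK, null, null);
            }
        }

        // Shows the given text in the language list view.
        void updateResults(String s) {
            mTextView = (TextView) findViewById(R.id.tvlanglist);
            mTextView.setText(s);
        }
    }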
Listing 1: Display Supported Speech Languages
    import android.content.BroadcastReceiver;
    import android.content.Context;
    import android.content.Intent;
    import android.os.Bundle;
    import android.speech.RecognizerIntent;
    import java.util.ArrayList;
    import java.util.List;

    public class LanguageDetailsReceiver extends BroadcastReceiver {
        List<String> mLanguages;
        ShowSupportedLanguages mSSL;

        public LanguageDetailsReceiver(ShowSupportedLanguages ssl) {
            mSSL = ssl;
            mLanguages = new ArrayList<String>();
        }

        @Override
        public void onReceive(Context context, Intent intent) {
            // The recognizer places its answer in the ordered broadcast's result extras.
            Bundle extras = getResultExtras(true);
            mLanguages = extras.getStringArrayList(
                    RecognizerIntent.EXTRA_SUPPORTED_LANGUAGES);

            if (mLanguages == null) {
                mSSL.updateResults("No voice data found.");
            } else {
                String s = "\nList of language voice data:\n";
                for (int i = 0; i < mLanguages.size(); i++) {
                    s += (mLanguages.get(i) + ", ");
                }
                s += "\n";
                mSSL.updateResults(s);
            }
        }
    }
Listing 2: Language Details Broadcast Receiver
Figure 3: Supported Speech Data
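The list built in Listing 2 shows raw language tags such as "en-US". As a small optional refinement, the tags could be turned into readable names before they are displayed. The sketch below is plain Java with no Android dependency; the class name LanguageNames and its methods are hypothetical helpers, not part of the Android Speech API, and it assumes the recognizer reports standard IETF language tags.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

// Hypothetical helper (not part of android.speech): turns language tags
// such as "en-US" into human-readable English names.
final class LanguageNames {
    private LanguageNames() {}

    // Converts each tag to a display name, keeping the raw tag when the
    // runtime has no name for it.
    static List<String> toDisplayNames(List<String> tags) {
        List<String> names = new ArrayList<>();
        for (String tag : tags) {
            Locale locale = Locale.forLanguageTag(tag);
            String name = locale.getDisplayName(Locale.ENGLISH);
            names.add(name.isEmpty() ? tag : name);
        }
        return names;
    }

    // Joins the display names with ", " and no trailing separator.
    static String format(List<String> tags) {
        return String.join(", ", toDisplayNames(tags));
    }

    public static void main(String[] args) {
        System.out.println(format(Arrays.asList("en-US", "fr-FR")));
    }
}
```

In Listing 2, the loop that builds the string could then be replaced with a single call such as LanguageNames.format(mLanguages), which also avoids the trailing comma.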
Basic Voice Recognition Example
We are now ready to build a basic example that uses voice recognition. RecognizerIntent.ACTION_RECOGNIZE_SPEECH is the intent that defines the request. The only required extra is RecognizerIntent.EXTRA_LANGUAGE_MODEL, which we set to RecognizerIntent.LANGUAGE_MODEL_FREE_FORM. If a different language is needed, you can supply it through RecognizerIntent.EXTRA_LANGUAGE; otherwise, the recognizer simply uses the default locale. To make the example more interesting, we also use RecognizerIntent.EXTRA_PROMPT to display a question. Then we start the recognition intent.
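One way to decide whether to set RecognizerIntent.EXTRA_LANGUAGE is to compare the desired locale against the supported language list obtained earlier. The following is a minimal sketch in plain Java, assuming the supported languages are reported as IETF language tags; LanguageChooser and chooseTag are hypothetical names introduced only for illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

// Hypothetical helper (not part of android.speech): picks the language tag
// to hand to the recognizer, or null to keep the recognizer's default locale.
final class LanguageChooser {
    private LanguageChooser() {}

    static String chooseTag(Locale requested, List<String> supported) {
        // Locale.US becomes "en-US", Locale.FRANCE becomes "fr-FR", and so on.
        String tag = requested.toLanguageTag();
        for (String candidate : supported) {
            if (candidate.equalsIgnoreCase(tag)) {
                return candidate;
            }
        }
        return null; // not supported: let the recognizer use its default
    }

    public static void main(String[] args) {
        List<String> supported = Arrays.asList("en-US", "fr-FR");
        System.out.println(chooseTag(Locale.FRANCE, supported));
        System.out.println(chooseTag(Locale.JAPAN, supported));
    }
}
```

Inside an activity, a non-null result would be passed along with intent.putExtra(RecognizerIntent.EXTRA_LANGUAGE, tag) before the intent is started; when the result is null, the extra is simply left out.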
Once the recognition results are returned, they are stored in the data bundle under RecognizerIntent.EXTRA_RESULTS. In this example, we check whether the answer contains the substring “Amazon” and display a message on screen that depends on your voice input. The code is implemented in Listing 3.
When the app runs, it displays the question along with a microphone icon and waits for you to say something, as in Figure 4. In Figure 5, I intentionally responded with “Google”, which does not contain the substring “Amazon”, so the app reported the answer as incorrect.
    import android.app.Activity;
    import android.content.Intent;
    import android.os.Bundle;
    import android.speech.RecognizerIntent;
    import android.widget.TextView;
    import java.util.List;

    public class StartVoiceRecognition extends Activity {
        private static final int REQUEST_SPEECH_RECOGNIZER = 3000;
        private TextView mTextView;
        private final String mQuestion =
                "Which company is the largest online retailer on the planet?";
        private String mAnswer = "";

        @Override
        protected void onCreate(Bundle savedInstanceState) {
            super.onCreate(savedInstanceState);
            setContentView(R.layout.voicerecog);
            mTextView = (TextView) findViewById(R.id.tvstt);
            startSpeechRecognizer();
        }

        // Launches the recognizer UI with the question as the prompt.
        private void startSpeechRecognizer() {
            Intent intent = new Intent(RecognizerIntent.ACTION_RECOGNIZE_SPEECH);
            intent.putExtra(RecognizerIntent.EXTRA_LANGUAGE_MODEL,
                    RecognizerIntent.LANGUAGE_MODEL_FREE_FORM);
            intent.putExtra(RecognizerIntent.EXTRA_PROMPT, mQuestion);
            startActivityForResult(intent, REQUEST_SPEECH_RECOGNIZER);
        }

        @Override
        protected void onActivityResult(int requestCode, int resultCode, Intent data) {
            super.onActivityResult(requestCode, resultCode, data);
            if (requestCode == REQUEST_SPEECH_RECOGNIZER) {
                if (resultCode == RESULT_OK) {
                    List<String> results = data.getStringArrayListExtra(
                            RecognizerIntent.EXTRA_RESULTS);
                    // Use the first recognition result as the answer.
                    mAnswer = results.get(0);

                    if (mAnswer.toUpperCase().indexOf("AMAZON") > -1)
                        mTextView.setText("\n\nQuestion: " + mQuestion +
                                "\n\nYour answer is '" + mAnswer + "' and it is correct!");
                    else
                        mTextView.setText("\n\nQuestion: " + mQuestion +
                                "\n\nYour answer is '" + mAnswer + "' and it is incorrect!");
                }
            }
        }
    }
Listing 3: Basic Voice Recognition Example
Figure 4: Voice Recognition in Action
Figure 5: Voice Recognition Result
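Listing 3 looks only at the first entry of the result list. The recognizer returns a list of candidate transcriptions, generally ordered from most to least confident, so a more forgiving check could accept the keyword in any candidate. The sketch below shows that variation in plain Java; AnswerChecker and anyContains are hypothetical names, and whether a looser match is appropriate depends on the app.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

// Hypothetical helper (not part of android.speech): checks whether any
// recognition candidate contains the expected keyword, ignoring case.
final class AnswerChecker {
    private AnswerChecker() {}

    static boolean anyContains(List<String> candidates, String keyword) {
        if (candidates == null || keyword == null || keyword.isEmpty()) {
            return false;
        }
        // Locale.ROOT keeps the case conversion independent of the device locale.
        String needle = keyword.toLowerCase(Locale.ROOT);
        for (String candidate : candidates) {
            if (candidate != null
                    && candidate.toLowerCase(Locale.ROOT).contains(needle)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        List<String> results = Arrays.asList("amazing", "Amazon dot com");
        System.out.println(anyContains(results, "Amazon"));
    }
}
```

In onActivityResult, the condition on mAnswer could then become AnswerChecker.anyContains(results, "Amazon"), while mAnswer still holds the top candidate for display.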
Conclusion
Android makes the Speech API easy yet powerful enough for anyone interested in adding voice recognition to their apps. Through some basic examples, we briefly introduced how to set it up, what recognizer intents are, how to find out what your device supports, and how to provide multilingual support. Because Speech-to-Text (STT) technology is popular in many practical applications, ranging from improving personal productivity to controlling complicated robots, it will surely become more and more common in everyday software and hardware alike.
Other sample projects are available in the official Google Android repository. They also integrate voice recognition, so it is well worth checking out the different applications; they may give you some great ideas for your users. Advanced developers should find something interesting in the Speech Recognition API offered by Google Cloud Platform.
References
- Android Developers
- android.speech
- SpeechRecognizer
- RecognizerIntent
- Google Cloud Speech API
- Source Code of this Tutorial
About the Author
Chunyen Liu has been a software veteran in Taiwan and the United States. He is a published author of 40+ articles and 100+ tiny apps, a software patentee, technical reviewer, and programming contest winner by ACM/IBM/SUN. He holds advanced degrees in Computer Science with 20+ graduate-level classes. On the non-technical side, he is enthusiastic about the Olympic sport of table tennis, being a USA certified umpire, certified coach, certified referee, and categorized event winner at State Championships and the US Open.