
Processing Speech with Java

  • September 26, 2002
  • By Sams Publishing

Let's work an example that illustrates how to use JSML to improve speech output. First, we need to learn how to remove the hard-coded text from our program so that it can play different JSML documents. The easiest way to do this is by creating a new class that implements the javax.speech.synthesis.Speakable interface. This interface requires one method, getJSMLText(). It returns a string that contains JSML text. Listing 12.5 shows us an example of this class.

Listing 12.5 The SpeakableDate Class

/*
 * SpeakableDate.java
 *
 * Created on March 12, 2002, 11:54 AM
 */

package unleashed.ch12;

import javax.speech.synthesis.Speakable;

/**
 *
 * @author Stephen Potts
 * @version
 */
public class SpeakableDate implements Speakable
{

  /** Creates new SpeakableDate */
  public SpeakableDate()
  {
  }

  /** getJSMLText is the only method of Speakable */
  public String getJSMLText()
  {
    StringBuffer buf = new StringBuffer();
    String todayString = "3/12/2002";

    // Speak the date twice: once as plain text, once marked up as a date
    buf.append("<jsml>");
    buf.append("Today is " + todayString );
    buf.append("Today is <sayas class=\"date\">"+ todayString +" </sayas>");
    buf.append("</jsml>");

    return buf.toString();
  }
}

This class exists to provide a JSML document to a synthesizer. Some programmers find it more convenient to create their XML documents in a class such as this one instead of in a file. The class builds the text in a StringBuffer because a String is immutable: each concatenation creates a new object, whereas a StringBuffer is modified in place and so performs better when the text changes frequently.
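The date in Listing 12.5 is hard-coded. As a minimal sketch of how the same kind of document could be built from a real java.util.Date, the following helper formats the date and assembles the JSML with a StringBuilder (the unsynchronized sibling of StringBuffer, available since Java 5). The class name JsmlDateText and its build() method are hypothetical and not part of the chapter's source code. The class deliberately does not implement Speakable so that it compiles without the Java Speech API on the classpath; in a real Speakable, getJSMLText() could simply return the result.

```java
import java.text.SimpleDateFormat;
import java.util.Date;
import java.util.GregorianCalendar;
import java.util.Locale;

// Hypothetical helper, not part of the book's source code.
class JsmlDateText
{
  /** Builds a JSML document that marks the given date up as a date. */
  static String build(Date date)
  {
    // M/d/yyyy produces the same unpadded style as "3/12/2002"
    String dateString =
        new SimpleDateFormat("M/d/yyyy", Locale.US).format(date);

    StringBuilder buf = new StringBuilder();
    buf.append("<jsml>");
    buf.append("Today is <sayas class=\"date\">")
       .append(dateString)
       .append("</sayas>");
    buf.append("</jsml>");
    return buf.toString();
  }

  public static void main(String[] args)
  {
    // Months are zero-based in GregorianCalendar, so 2 means March
    Date sample = new GregorianCalendar(2002, 2, 12).getTime();
    System.out.println(build(sample));
  }
}
```

Passing new Date() instead of the fixed sample would produce a document for the current day.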

    buf.append("Today is " + todayString );
    buf.append("Today is <sayas class=\"date\">"+ todayString +" </sayas>");

Two separate strings are added to the JSML document. The first adds the date as plain text. The second wraps the same text in a sayas element that identifies it as a date. Listing 12.6 shows us the synthesizer class that runs this JSML.

Listing 12.6 The JSMLSpeaker Class

/*
 * JSMLSpeaker.java
 *
 * Created on March 5, 2002, 3:32 PM
 */

package unleashed.ch12;

import javax.speech.*;
import javax.speech.synthesis.*;
import java.util.Locale;

/**
 *
 * @author Stephen Potts
 * @version
 */

public class JSMLSpeaker
{

  public static void main(String args[])
  {
    Speakable sAble = new SpeakableDate();
    try
    {
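      // Create a synthesizer for English
      Synthesizer synth = Central.createSynthesizer(
          new SynthesizerModeDesc(Locale.ENGLISH));
      // createSynthesizer returns null when no matching engine is installed
      if (synth == null)
      {
        System.err.println("No English synthesizer was found.");
        return;
      }
      synth.allocate();
      synth.resume();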

      // Speak the string
      synth.speak(sAble, null);
      System.out.println("You are hearing the JSML output now.");

      // Wait till speaking is done
      synth.waitEngineState(Synthesizer.QUEUE_EMPTY);

      // release the resources
      synth.deallocate();
    } catch (Exception e)
    {
      e.printStackTrace();
    }
  }
}

This class resembles the synthesizer classes that we saw earlier in that it allocates and deallocates the synthesizer. It differs in that it passes a Speakable object, rather than a string literal, to the synthesizer. The Speakable implementation is SpeakableDate, shown in Listing 12.5.

    Speakable sAble = new SpeakableDate();

Conceptually, this object is a class version of a JSML document. We can pass a reference to it to the synthesizer's speak() method, which treats the returned text much like an XML document that is parsed and interpreted as speech.
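Because the returned text is handled as markup, a malformed string (an unclosed element, for example) may be rejected when it is spoken. As a rough sketch, the string can be checked for XML well-formedness with the standard JAXP parser before it is handed to speak(). The class JsmlCheck and its isWellFormed() method are hypothetical names introduced for illustration. The check confirms only that the tags balance; it says nothing about whether the element names are valid JSML.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper, not part of the book's source code.
class JsmlCheck
{
  /** Returns true if the text parses as a well-formed XML document. */
  static boolean isWellFormed(String jsml)
  {
    try
    {
      DocumentBuilder builder =
          DocumentBuilderFactory.newInstance().newDocumentBuilder();
      // DefaultHandler rethrows fatal errors without printing to stderr
      builder.setErrorHandler(new DefaultHandler());
      builder.parse(new InputSource(new StringReader(jsml)));
      return true;
    } catch (SAXException e)
    {
      // Thrown for unbalanced tags, a stray '<' or '&', and similar errors
      return false;
    } catch (IOException | ParserConfigurationException e)
    {
      throw new IllegalStateException(e);
    }
  }

  public static void main(String[] args)
  {
    String good =
        "<jsml>Today is <sayas class=\"date\">3/12/2002</sayas></jsml>";
    String bad =
        "<jsml>Today is <sayas class=\"date\">3/12/2002</jsml>";
    System.out.println(isWellFormed(good));
    System.out.println(isWellFormed(bad));
  }
}
```

A check like this is most useful during development, when the JSML is being assembled from several append() calls and a missing closing tag is easy to overlook.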

The output from running this program is both audible and visible. The first time, the date is read as characters, with the slashes pronounced:

Today is three slash twelve slash two thousand two

The second time, the date is pronounced more like a date with the slashes omitted:

Today is three twelve two thousand two

In addition, the following visual message is written to the console:

You are hearing the JSML output now.

Listing 12.7 illustrates another example that shows some interesting JSML elements.

Listing 12.7 The SpeakableSlow Class

/*
 * SpeakableSlow.java
 *
 * Created on March 12, 2002, 11:54 AM
 */

package unleashed.ch12;

import javax.speech.synthesis.Speakable;

/**
 *
 * @author Stephen Potts
 * @version
 */
public class SpeakableSlow implements Speakable
{

  /** Creates new SpeakableSlow */
  public SpeakableSlow()
  {
  }

  /** getJSMLText is the only method of Speakable */
  public String getJSMLText()
  {
    StringBuffer buf = new StringBuffer();

    // Speak the same sentence four times, slowing the number in the last two
    buf.append("<jsml>");
    buf.append("I own 1500 shares");
    buf.append("I own 1500 shares.");
    buf.append("<div>I own <PROS RATE=\"-20%\">1,500</PROS> shares</div> ");
    buf.append("I own <PROS RATE=\"-50%\">1500</PROS> shares");
    buf.append("</jsml>");

    return buf.toString();
  }
}

You can have the synthesizer speak this document by changing one line of the JSMLSpeaker class shown in Listing 12.6 so that it reads as follows:

    Speakable sAble = new SpeakableSlow();

Note - The code file for this modified example is available in the source code for this chapter and is named JSMLSpeaker2.java.


Several features are illustrated in this example. The first line is read at a normal speed. Because nothing separates the first two strings, the synthesizer runs the end of the first line into the start of the second and tries to pronounce "sharesI." Our first attempt to cure this problem was to add a period at the end of the second line, but this only causes "dot" to be uttered:

    buf.append("I own 1500 shares");
    buf.append("I own 1500 shares.");

Finally, we enclosed the next sentence in <div> and </div> elements, which identify the text between them as a division, and the synthesizer stopped slurring the words together. A space will work in some instances, but the explicit <div> tag is normally used.
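The same fix can be applied uniformly by wrapping every sentence in its own <div> element when the document is assembled. The following sketch uses a hypothetical helper, JsmlSentences, that is not part of the chapter's source code. It only builds the string and leaves the speaking to a Speakable such as the ones in the listings.

```java
// Hypothetical helper, not part of the book's source code.
class JsmlSentences
{
  /**
   * Wraps each sentence in <div>...</div> so that adjacent sentences
   * are not run together. The sentences are inserted as-is, so they may
   * contain JSML elements; plain text containing '<' or '&' would need
   * to be escaped by the caller.
   */
  static String build(String... sentences)
  {
    StringBuilder buf = new StringBuilder("<jsml>");
    for (String sentence : sentences)
    {
      buf.append("<div>").append(sentence).append("</div>");
    }
    buf.append("</jsml>");
    return buf.toString();
  }

  public static void main(String[] args)
  {
    System.out.println(build("I own 1500 shares", "I own 1,500 shares"));
  }
}
```

With this approach, no sentence depends on the punctuation or spacing of its neighbors to be kept separate.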

Next, we added some special elements called prosody elements to slow down the speaking of the numbers. Prosody elements provide details on the rate of speech that is desired:

    buf.append("<div>I own <PROS RATE=\"-20%\">1,500</PROS> shares</div> ");

The rate drops by 20% so that the most important part of the sentence, the number of shares, is easier to comprehend. Notice that the pronunciation of the number depends on the presence or absence of the comma in the numerals. If the comma is present, the number is pronounced "one thousand five hundred." Without the comma, it is pronounced "fifteen hundred," like the year 1500.

You can slow the rate down even more if you choose, as we did in the last sentence:

    buf.append("I own <PROS RATE=\"-50%\">1500</PROS> shares");

This will make the output sound as if it needs to take some vitamins.
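When several rates are being tried out, it can help to generate the prosody element from a number instead of typing each string by hand. The sketch below uses a hypothetical helper, JsmlProsody, that is not part of the chapter's source code. It assumes that a positive change is written with an explicit plus sign, by analogy with the minus sign used in the listing.

```java
// Hypothetical helper, not part of the book's source code.
class JsmlProsody
{
  /**
   * Wraps text in a PROS element with a relative rate change.
   * A negative percentage slows the speech down.
   */
  static String withRate(String text, int percent)
  {
    // Assumption: positive changes carry an explicit '+', as in "+20%"
    String sign = percent >= 0 ? "+" : "";
    return "<PROS RATE=\"" + sign + percent + "%\">" + text + "</PROS>";
  }

  public static void main(String[] args)
  {
    StringBuilder buf = new StringBuilder("<jsml>");
    buf.append("<div>I own ")
       .append(withRate("1,500", -20))
       .append(" shares</div>");
    buf.append("</jsml>");
    System.out.println(buf);
  }
}
```

Calling withRate("1500", -50) reproduces the element used in the last sentence of Listing 12.7.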

The output from running this will be the phrase repeated four times with the previously mentioned variations. At the same time, the following phrase will appear on your console to prove that the program ran:

You are hearing the JSML output now.

Finally, we will create a JSML version of the HelloShares example in Listing 12.4. As you recall, the "1999" phrase was pronounced "nineteen ninety-nine" instead of "one thousand nine hundred ninety-nine." The JSML version of the program is shown in Listing 12.8.

Listing 12.8 The SpeakableShares Class

/*
 * SpeakableShares.java
 *
 * Created on March 12, 2002, 11:54 AM
 */

package unleashed.ch12;

import javax.speech.synthesis.Speakable;

/**
 *
 * @author Stephen Potts
 * @version
 */
public class SpeakableShares implements Speakable
{
  
  /** Creates new SpeakableShares */
  public SpeakableShares()
  {
  }
  
  /** getJSMLText is the only method of Speakable */
  public String getJSMLText()
  {
    StringBuffer buf = new StringBuffer();
    String shareString = "1,999";
    
    // Speak the shareString
    buf.append("<jsml>");
    buf.append("I own 1999 shares of stock");
    buf.append("I own <sayas class=\"number\">"+ shareString 
               +" </sayas>" + " shares of stock");
    buf.append("</jsml>");
    
    return buf.toString();
  }
}

Notice that the shareString contains a comma between the first and second digits:

    String shareString = "1,999";
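When the share count starts out as an int rather than a string, the comma can be produced by java.text.NumberFormat instead of being typed by hand. A minimal sketch follows; the class name JsmlShares and its build() method are hypothetical and not part of the chapter's source code.

```java
import java.text.NumberFormat;
import java.util.Locale;

// Hypothetical helper, not part of the book's source code.
class JsmlShares
{
  /** Builds the JSML sentence with the share count grouped by commas. */
  static String build(int shares)
  {
    // Locale.US groups digits in threes with commas
    String shareString =
        NumberFormat.getIntegerInstance(Locale.US).format(shares);

    return "<jsml>I own <sayas class=\"number\">" + shareString
        + "</sayas> shares of stock</jsml>";
  }

  public static void main(String[] args)
  {
    System.out.println(build(1999));
  }
}
```

Fixing the locale keeps the separator a comma regardless of the default locale of the machine that runs the program.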






