[fa.info-mac] Using Smoothtalker

info-mac@uw-beaver (12/07/84)

From: Mike Schuster <MIKES@CIT-20.ARPA>

Smoothtalker Demo Disk
 
 I talked with the people at First Byte last week about their product
 Smoothtalker.  They mentioned that they had a software developers
 package for about $500, a retail package for $150, and a promotional
 demo disk, which they offered to send me free of charge.
 
 I received the disk two days later, along with a few pages of notes.
 The disk contains the Smoothtalker demo application and its companion
 picture resource file.  No licensing agreement was included in the 
 package, and the disk is stamped "Demo - Ok to copy".
 
 I concluded that the software is in the public domain and decided to
 provide some documentation.
 
 The demo application (which I will call ST from now on) is a rolling
 demonstration consisting of a sequence of cartoons and spoken text.
 Using Apple's RMOVER, I found that the text fragments are contained
 in ST as 'STR ' resources with id numbers 256 and up.  Each of these
 resources contains a length byte followed by a sequence of ASCII 
 phoneme codes.  Here is 'STR ' resource 256, which says "The freedom
 of speech":
 
 V6S25DHAXf9rIYS9d5AHS6m3AHv S79spy0IYCH
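
 The resource layout described above (one length byte, then that many
 ASCII characters) can be unpacked with a few lines of C.  This is only
 a minimal sketch: the helper name pstr_to_cstr is hypothetical, and it
 assumes the resource bytes have already been read into memory.

```c
#include <stddef.h>
#include <string.h>

/* Copy a Pascal-style string (one length byte, then that many
 * characters) into a NUL-terminated C buffer.
 * Returns the number of characters copied, or -1 if the destination
 * buffer is too small.  pstr_to_cstr is a hypothetical helper name. */
int pstr_to_cstr(const unsigned char *pstr, char *out, size_t out_size)
{
    size_t len = pstr[0];            /* first byte holds the length, 0..255 */

    if (out_size < len + 1)          /* need room for the trailing NUL */
        return -1;
    memcpy(out, pstr + 1, len);      /* the text starts right after the length byte */
    out[len] = '\0';
    return (int)len;
}
```

 A 256-byte destination buffer is always large enough, since the length
 byte cannot exceed 255.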
 
 After studying all of these examples, I compiled the following dictionary
 of phoneme codes:
 
 AE pAt/EY Ate/b BiB/CH CHew/d DeeD/EH pEt/IY bE/AX thE
 f Fit/g GaG/h Hat/IH hIt/AY pIE/j Just/k KiCK/l seLf
 m MuM/n No/NG siNG/AA pOt/OW gO/AO fOr/OY bOY/AW OUt
 UH tOOk/UW cOO/p PoP/r Run/s SauCe/SH SHy/t To/TH THin
 DH THe/AH cUt/ER vERb/v oF/w Wag/y Yes/z siZe/ZH viSion
 V# volume #/S# speed #/# pitch #/B bass/T treble/] +stress/[ -stress
 
 I have listed each phoneme code along with a short word containing that
 phoneme, capitalizing the corresponding letters in the word.  Notice
 that the two-letter phoneme codes are in upper case, whereas the
 single-letter codes are in lower case.  The volume, speed, and pitch
 values are the ASCII digits 0, 1, ..., 9.  These values apply to all
 subsequent phonemes until changed.  The ] and [ codes seem to add and
 subtract stress from the following phoneme, respectively.  The B and T
 codes select a low bass and a high treble voice, respectively.
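
 To illustrate how the control codes carry forward, here is a rough C
 sketch that walks a phoneme string and records, for each phoneme, the
 volume, speed, and pitch in effect when it is reached.  The names
 Phoneme and trace_phonemes are hypothetical.  The way it resolves
 ambiguous prefixes (a two-letter phoneme such as SH or TH is tried
 before the S# and T codes) is an assumption, as is treating a blank as
 a pause; the real translator may decide differently.

```c
#include <ctype.h>
#include <string.h>

/* Phoneme codes copied from the dictionary above. */
static const char *const TWO_LETTER[] = {
    "AE", "EY", "CH", "EH", "IY", "AX", "IH", "AY", "NG", "AA", "OW",
    "AO", "OY", "AW", "UH", "UW", "SH", "TH", "DH", "AH", "ER", "ZH"
};
static const char ONE_LETTER[] = "bdfghjklmnprstvwyz";

/* One phoneme plus the settings in effect when it is reached. */
typedef struct {
    char code[3];
    int volume, speed, pitch;
} Phoneme;

static int is_two_letter(const char *s)
{
    size_t i;
    if (s[0] == '\0' || s[1] == '\0')
        return 0;
    for (i = 0; i < sizeof TWO_LETTER / sizeof TWO_LETTER[0]; i++)
        if (s[0] == TWO_LETTER[i][0] && s[1] == TWO_LETTER[i][1])
            return 1;
    return 0;
}

/* Walk a NUL-terminated phoneme string.  Stores up to max_out phonemes
 * in out and returns the total number of phonemes seen, or -1 if an
 * unrecognized character is met. */
int trace_phonemes(const char *s, int volume, int speed, int pitch,
                   Phoneme *out, int max_out)
{
    int n = 0;
    while (*s != '\0') {
        unsigned char c = (unsigned char)s[0];
        unsigned char d = (unsigned char)s[1];
        int len;                      /* bytes consumed by this code */
        int is_phoneme = 0;

        if (is_two_letter(s)) {       /* assumption: tried before S# and T */
            len = 2; is_phoneme = 1;
        } else if (c == 'V' && isdigit(d)) {
            volume = d - '0'; len = 2;
        } else if (c == 'S' && isdigit(d)) {
            speed = d - '0'; len = 2;
        } else if (isdigit(c)) {
            pitch = c - '0'; len = 1;
        } else if (strchr("BT][ ", c) != NULL) {
            len = 1;                  /* voice, stress, pause: not tracked here */
        } else if (strchr(ONE_LETTER, c) != NULL) {
            len = 1; is_phoneme = 1;
        } else {
            return -1;
        }
        if (is_phoneme) {
            if (n < max_out) {
                memcpy(out[n].code, s, (size_t)len);
                out[n].code[len] = '\0';
                out[n].volume = volume;
                out[n].speed = speed;
                out[n].pitch = pitch;
            }
            n++;
        }
        s += len;
    }
    return n;
}
```

 Run over resource 256 with any starting values, this sketch reports DH
 with volume 6, speed 2, and pitch 5, and the r of "freedom" with the
 pitch already changed to 9.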
 
 Here is a rough translation of 256:
 
 V6S25DHAXf9rIYS9d5AHS6m3AHv S79spy0IYCH
 
 V6S25DHAX --- "the" spoken with volume 6, speed 2, and pitch 5
 
 f9rIYS9d5AHS6m --- "freedom" spoken with two changes in pitch and two
                    changes in speed

 3AHv --- "of" spoken with a pitch change to 3
 
  S79spy0IYCH --- a pause, speed 7, pitch 9, followed by "speech"
                  with a pitch change to 0 just before the "eech"
 
 Here is another useful example, the "ABC Song":
 
 4S2EY bIY 6sIY dIY 7IY EHf 6jIY    S55EYCH AY 4jEY kEY 3S7EHl EHm EHn OWh 
 1pIY    S46kyUW AHr 5EHs    tIY yUW 4vIY   6dAHblyUW   5EHkz   6wAY   1zIY 
 S45nAWAY6nOWm7AYS3EYbIY6sIYz    5tEHlmIY4wAHtyUW3THIHnkAHv2mIY
 
 In addition to the 'STR ' resources, ST contains four 'CODE' resources.
 Using Apple's "Examine File" and the jump table contained in the 'CODE'
 0 resource, I found that the phoneme-to-speech translator is contained
 in the 'CODE' 2 resource.  Its entry point offset is 4 (that is, its
 first instruction is just after the standard 4 byte segment loader
 header).
 
 By replacing the first instruction with a trap to MacsBug, I found that
 the translator has the following argument interface:
 
 short speak(text, volume, speed, pitch)
    char *text;
    short volume;
    short speed;
    short pitch;
 
 The arguments are pushed on the stack using the standard stack-based
 toolbox conventions.  The text argument points to the phoneme string
 (in Pascal format, with a leading length byte).  The volume, speed, and
 pitch arguments apparently specify the initial values for these
 parameters.
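
 As a sketch of how a caller might prepare these arguments, the C below
 converts an ordinary C string to Pascal format and passes it on with
 the three initial values.  SpeakFn, speak_cstr, and fake_speak are
 hypothetical names.  Loading the 'CODE' 2 resource and jumping to its
 entry point are not shown, and on a real Macintosh the call would also
 have to follow the toolbox stack conventions, which standard C cannot
 express.

```c
#include <stddef.h>
#include <string.h>

/* Assumed shape of the translator entry point, following the argument
 * list above. */
typedef short (*SpeakFn)(const unsigned char *text,
                         short volume, short speed, short pitch);

/* Convert a C string to Pascal format and hand it to the translator.
 * Returns the translator's result, or -1 if the text is longer than
 * the 255 bytes a length byte can describe. */
short speak_cstr(SpeakFn speak, const char *text,
                 short volume, short speed, short pitch)
{
    unsigned char pstr[256];
    size_t len = strlen(text);

    if (len > 255)
        return -1;
    pstr[0] = (unsigned char)len;     /* leading length byte */
    memcpy(pstr + 1, text, len);      /* phoneme codes follow it */
    return speak(pstr, volume, speed, pitch);
}

/* Stand-in for the real translator, so the wrapper can be tried away
 * from the Macintosh.  It only records what it was given. */
int last_len = -1;
short last_pitch = -1;

short fake_speak(const unsigned char *text,
                 short volume, short speed, short pitch)
{
    (void)volume;
    (void)speed;
    last_len = text[0];
    last_pitch = pitch;
    return 0;                         /* pretend everything was spoken */
}
```

 With the real entry point in place of fake_speak, a call such as
 speak_cstr(entry, "DHAX", 6, 2, 5) would pass the text with volume 6,
 speed 2, and pitch 5 as the starting values.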
 
 After some experimentation, I found that the result returned is zero
 unless the text contains something that isn't a valid phoneme or control
 code.  In this case, nothing is spoken and the result contains an offset
 into the text of the illegal item.
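
 A string can be checked in the same spirit before it is handed to the
 translator.  The sketch below returns zero when every item is a code
 from the dictionary above, and otherwise the position of the first
 item it does not recognize.  The names code_length and first_illegal
 are hypothetical.  Counting positions from 1, with the length byte at
 position 0, is an assumption made so that zero can mean "no error"; I
 do not know how the real translator counts its offset, nor whether it
 accepts exactly the same set of codes.

```c
#include <ctype.h>
#include <string.h>

/* Phoneme codes copied from the dictionary above. */
static const char *const TWO_LETTER[] = {
    "AE", "EY", "CH", "EH", "IY", "AX", "IH", "AY", "NG", "AA", "OW",
    "AO", "OY", "AW", "UH", "UW", "SH", "TH", "DH", "AH", "ER", "ZH"
};
static const char ONE_LETTER[] = "bdfghjklmnprstvwyz";

/* Length in bytes of the code that starts at s[0], given that n >= 1
 * bytes remain, or 0 if no known code starts there. */
static int code_length(const unsigned char *s, int n)
{
    size_t i;

    if (n >= 2) {
        for (i = 0; i < sizeof TWO_LETTER / sizeof TWO_LETTER[0]; i++)
            if (s[0] == (unsigned char)TWO_LETTER[i][0] &&
                s[1] == (unsigned char)TWO_LETTER[i][1])
                return 2;             /* two-letter phoneme */
        if ((s[0] == 'V' || s[0] == 'S') && isdigit(s[1]))
            return 2;                 /* volume or speed setting */
    }
    if (s[0] == '\0')
        return 0;                     /* a NUL byte is never a valid code */
    if (isdigit(s[0]))
        return 1;                     /* pitch setting */
    if (strchr("BT][ ", s[0]) != NULL)
        return 1;                     /* voice, stress, or pause (blank) */
    if (strchr(ONE_LETTER, s[0]) != NULL)
        return 1;                     /* one-letter phoneme */
    return 0;
}

/* Returns 0 if the Pascal-format string holds only known codes, else
 * the 1-based position of the first unrecognized item. */
int first_illegal(const unsigned char *pstr)
{
    int len = pstr[0];
    int i = 1;

    while (i <= len) {
        int k = code_length(pstr + i, len - i + 1);
        if (k == 0)
            return i;
        i += k;
    }
    return 0;
}
```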
 
 Armed with this information, you can use the phoneme-to-speech translator
 in your own applications.  For example, I wrote an application called
 "ABSpeak" that presents a dialog box containing two editText boxes.
 The phonetic strings you type in each box can be spoken alternately
 for an A versus B comparison.  I also wrote a desk accessory that
 speaks the time of day.  (This one is not too useful on a thin Mac,
 since the phoneme-to-speech resource is 24K bytes.)  I will distribute
 "ABSpeak" to anyone who wishes to experiment with Smoothtalker.
 
 S8mAYkAOl S56SHUW5stER
 Michael Schuster
 @cit-20
-------