info-mac@uw-beaver (12/07/84)
From: Mike Schuster <MIKES@CIT-20.ARPA>
Smoothtalker Demo Disk
I talked with the people at First Byte last week about their product
Smoothtalker. They mentioned that they had a software developer's
package for about $500, a retail package for $150, and a promotional
demo disk, which they offered to send me free of charge.
I received the disk two days later, along with a few pages of notes.
The disk contains the Smoothtalker demo application and its companion
picture resource file. No licensing agreement was included in the
package, and the disk is stamped "Demo - Ok to copy".
I concluded that the software is in the public domain and decided to
provide some documentation.
The demo application (which I will call ST from now on) is a rolling
demonstration consisting of a sequence of cartoons and spoken text.
Using Apple's RMOVER, I found that the text fragments are contained
in ST as 'STR ' resources with id numbers 256 and up. Each of these
resources contains a length byte followed by a sequence of ASCII
phoneme codes. Here is 'STR ' resource 256, which says "The freedom
of speech":
V6S25DHAXf9rIYS9d5AHS6m3AHv S79spy0IYCH
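As a small illustration of that layout (a length byte followed by the
phoneme codes), here is a sketch in C that packs an ordinary C string
into a length-prefixed buffer. The name to_pascal_string is my own and
hypothetical; nothing like it appears in ST.

```c
#include <stddef.h>
#include <string.h>

/* Copy a C string into buf as a length-prefixed (Pascal-style) string:
   buf[0] holds the length, followed by the characters, with no
   terminating NUL.  Returns the number of bytes written, or 0 if the
   text is longer than 255 characters or does not fit in buf. */
size_t to_pascal_string(const char *text, unsigned char *buf, size_t bufsize)
{
    size_t len = strlen(text);

    if (len > 255 || bufsize < len + 1)
        return 0;
    buf[0] = (unsigned char)len;
    memcpy(buf + 1, text, len);
    return len + 1;
}
```

Because the length has to fit in one byte, a single phoneme string of
this form cannot exceed 255 characters.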
After studying all of these examples, I compiled the following dictionary
of phoneme codes:
AE pAt/EY Ate/b BiB/CH CHew/d DeeD/EH pEt/IY bE/AX thE
f Fit/g GaG/h Hat/IH hIt/AY pIE/j Just/k KiCK/l seLf
m MuM/n No/NG siNG/AA pOt/OW gO/AO fOr/OY bOY/AW OUt
UH tOOk/UW cOO/p PoP/r Run/s SauCe/SH SHy/t To/TH THin
DH THe/AH cUt/ER vERb/v oF/w Wag/y Yes/z siZe/ZH viSion
V# volume #/S# speed #/# pitch #/B bass/T treble/] +stress/[ -stress
I have listed each phoneme code along with a short word containing that
phoneme, capitalizing the corresponding letters in the word. Notice
that the two-letter phoneme codes are in upper case, whereas the
single-letter codes are in lower case. The volume, speed, and pitch values
are the ASCII digits 0, 1, ..., 9. These values apply to all subsequent
phonemes until changed. The ] and [ codes seem to add and subtract
stress from the following phoneme, respectively. The B and T codes
select a low (bass) and a high (treble) voice, respectively.
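The rules above can be sketched as a small scanner in C. This is only my
reading of the examples, not First Byte's parser: the names spoken and
walk_phonemes are hypothetical, codes are not checked against the
dictionary, stress marks and voice selection are simply skipped, and I
assume that T is a control code unless it is followed by H (TH is the
only two-letter code that begins with T).

```c
#include <ctype.h>
#include <stddef.h>

/* One phoneme together with the settings in effect when it is spoken. */
struct spoken {
    char code[3];   /* "DH", "f", ... (NUL-terminated) */
    int volume;     /* 0-9, or -1 if not yet set in the string */
    int speed;
    int pitch;
};

/* Walk a phoneme string.  Two upper-case letters form one phoneme and
   one lower-case letter forms one phoneme.  Returns the number of
   phonemes found (only the first max are stored in out), or -1 if an
   unrecognized character is met. */
int walk_phonemes(const char *s, struct spoken *out, int max)
{
    int volume = -1, speed = -1, pitch = -1;
    int n = 0;
    size_t i = 0;

    while (s[i] != '\0') {
        unsigned char c = (unsigned char)s[i];
        unsigned char next = (unsigned char)s[i + 1];
        size_t width;

        if ((c == 'V' || c == 'S') && isdigit(next)) {
            if (c == 'V')
                volume = next - '0';    /* V# sets the volume */
            else
                speed = next - '0';     /* S# sets the speed */
            i += 2;
            continue;
        }
        if (isdigit(c)) {               /* a bare digit sets the pitch */
            pitch = c - '0';
            i += 1;
            continue;
        }
        if (c == ' ' || c == ']' || c == '[' || c == 'B' ||
            (c == 'T' && next != 'H')) {
            i += 1;                     /* pause, stress, voice: skipped */
            continue;
        }

        if (isupper(c) && isupper(next))
            width = 2;
        else if (islower(c))
            width = 1;
        else
            return -1;

        if (n < max) {
            out[n].code[0] = (char)c;
            out[n].code[1] = (width == 2) ? (char)next : '\0';
            out[n].code[2] = '\0';
            out[n].volume = volume;
            out[n].speed = speed;
            out[n].pitch = pitch;
        }
        n++;
        i += width;
    }
    return n;
}
```

Run over the text of resource 256, this yields fifteen phonemes, each
tagged with the volume, speed, and pitch that were most recently set.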
Here is a rough translation of 256:
V6S25DHAXf9rIYS9d5AHS6m3AHv S79spy0IYCH
V6S25DHAX --- "the" spoken with volume 6, speed 2, and pitch 5
f9rIYS9d5AHS6m --- "freedom" spoken with two changes in pitch and two
changes in speed
3AHv --- "of" spoken with pitch 3
S79spy0IYCH --- a pause, speed 7, pitch 9, followed by "speech"
with a pitch change to 0 just before the "eech"
Here is another useful example, the "ABC Song":
4S2EY bIY 6sIY dIY 7IY EHf 6jIY S55EYCH AY 4jEY kEY 3S7EHl EHm EHn OWh
1pIY S46kyUW AHr 5EHs tIY yUW 4vIY 6dAHblyUW 5EHkz 6wAY 1zIY
S45nAWAY6nOWm7AYS3EYbIY6sIYz 5tEHlmIY4wAHtyUW3THIHnkAHv2mIY
In addition to the 'STR ' resources, ST contains four 'CODE' resources.
Using Apple's "Examine File" and the jump table contained in the 'CODE'
0 resource, I found that the phoneme-to-speech translator is contained
in the 'CODE' 2 resource. Its entry point offset is 4 (that is, its
first instruction is just after the standard 4 byte segment loader
header).
By replacing the first instruction with a trap to MacsBug, I found that
the translator has the following argument interface:
short speak(text, volume, speed, pitch)
char *text;
short volume;
short speed;
short pitch;
The arguments are pushed on the stack using the standard stack-based
toolbox conventions. The text argument points to the phoneme string
(in Pascal format, with a leading length byte). The volume, speed, and
pitch arguments apparently specify the initial values for these
parameters.
After some experimentation, I found that the result returned is zero
unless the text contains something that isn't a valid phoneme or control
code. In this case, nothing is spoken and the result contains an offset
into the text of the illegal item.
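To make that return convention concrete, here is a stand-in checker in
C that mimics it using the dictionary given earlier. It is not the
Smoothtalker code, and check_phonemes is a hypothetical name. I have
not pinned down where the offset is counted from; this sketch assumes
it is counted from the length byte, so that the first character has
offset 1 and a zero result can only mean "no error".

```c
#include <ctype.h>
#include <string.h>

/* Phoneme codes taken from the dictionary in this note. */
static const char *const TWO_LETTER[] = {
    "AE", "EY", "CH", "EH", "IY", "AX", "IH", "AY", "NG", "AA", "OW",
    "AO", "OY", "AW", "UH", "UW", "SH", "TH", "DH", "AH", "ER", "ZH"
};
static const char ONE_LETTER[] = "bdfghjklmnprstvwyz";

static int is_two_letter(unsigned char a, unsigned char b)
{
    size_t k;

    for (k = 0; k < sizeof TWO_LETTER / sizeof TWO_LETTER[0]; k++)
        if (TWO_LETTER[k][0] == (char)a && TWO_LETTER[k][1] == (char)b)
            return 1;
    return 0;
}

/* ptext is a length-prefixed string.  Returns 0 if every item is a
   known phoneme or control code; otherwise returns the offset of the
   first unknown item, counted from the length byte (an assumption). */
short check_phonemes(const unsigned char *ptext)
{
    unsigned len = ptext[0];
    unsigned i = 1;

    while (i <= len) {
        unsigned char c = ptext[i];
        unsigned char next = (i < len) ? ptext[i + 1] : 0;

        if ((c == 'V' || c == 'S') && isdigit(next))
            i += 2;                         /* volume or speed */
        else if (isdigit(c) || c == ' ' || c == ']' || c == '[' || c == 'B')
            i += 1;                         /* pitch, pause, stress, bass */
        else if (c == 'T' && next != 'H')
            i += 1;                         /* treble (TH is a phoneme) */
        else if (is_two_letter(c, next))
            i += 2;
        else if (c != 0 && strchr(ONE_LETTER, c) != NULL)
            i += 1;
        else
            return (short)i;                /* first illegal item */
    }
    return 0;
}
```

With this convention, a caller can treat any nonzero result as an index
into the same buffer it passed in, and point at the offending character
directly.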
Armed with this information, you can use the phoneme-to-speech translator
in your own applications. For example, I wrote an application called
"ABSpeak" that presents a dialog box containing two editText boxes. The
phonetic strings you type in the boxes can be spoken alternately for an
A versus B comparison. I also wrote a desk accessory that speaks the
time of day. (This one is not too useful on a thin Mac, since the
phoneme-to-speech resource is 24K bytes.) I will distribute "ABSpeak"
to anyone who wishes to experiment with Smoothtalker.
S8mAYkAOl S56SHUW5stER
Michael Schuster
@cit-20
-------