[comp.sys.amiga] amiga speech synthesis

kosma%human-torch@stc.lockheed.com (Monty Kosma) (06/01/90)

Here's a question along the lines of the recent discussions of speech
synthesis on the amiga.

Can somebody give me a brief rundown or pointer to a reference on how 
it works?  Does the amiga actually have a speech synthesis chip, or is
it done through audio samples of phonemes, or what?  And the real question,
how good **can** it be with the current **hardware** limitations? 

monty
kosma@alan.decnet.lockheed.com

peterk@cbmger.UUCP (Peter Kittel GERMANY) (06/08/90)

In article <20704@snow-white.udel.EDU> kosma%human-torch@stc.lockheed.com (Monty Kosma) writes:
>
>it works?  Does the amiga actually have a speech synthesis chip, or is
>it done through audio samples of phonemes, or what?  
                 ^^^^^^^^^^^^^^^^^^^^^^^^^ 
Yes, this way. And there is very tricky software to bind these phonemes
together and make fluent speech of them. What is to be said about quality:
The Amiga's max. sample rate, beyond 18 kHz, is more than enough for speech
(compare the 3 kHz bandwidth of a telephone!); the only limitation is in
the quality of the digitized phonemes. Frankly speaking, they are such a
pure Texan-style American (at least to a German ear) that they are nearly
useless for languages other than American English. I believe even Britons
would need different phonemes.
All this was done by an external software company and sold to Commodore.
They even offered to prepare it for other languages, but the charge
was so high that no one took it up.
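
To put rough numbers on the sample-rate point: a sample rate of fs can
represent frequencies up to fs/2 (the Nyquist limit). Below is a minimal
Python sketch, assuming only the 18 kHz and 3 kHz figures quoted above. The
function and constant names are made up for illustration.

```python
def nyquist_bandwidth_hz(sample_rate_hz):
    """Highest frequency a given sample rate can represent (fs / 2)."""
    return sample_rate_hz / 2.0

# Figures taken from the post above; they are round numbers, not measurements.
AMIGA_RATE_HZ = 18_000     # "beyond 18 kHz"
TELEPHONE_BW_HZ = 3_000    # "a telephone with 3 kHz"

usable_hz = nyquist_bandwidth_hz(AMIGA_RATE_HZ)
headroom = usable_hz / TELEPHONE_BW_HZ

print(usable_hz)   # 9000.0
print(headroom)    # 3.0
```

So even at 18 kHz the representable bandwidth is about three times that of a
telephone line, which supports the point that the sample rate is not the
bottleneck.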

-- 
Best regards, Dr. Peter Kittel       E-Mail to 
Commodore Frankfurt, Germany         rutgers!cbmvax!cbmbsw!cbmger!peterk

kosma%human-torch@stc.lockheed.com (Monty Kosma) (06/09/90)

   >
   >it works?  Does the amiga actually have a speech synthesis chip, or is
   >it done through audio samples of phonemes, or what?  
		    ^^^^^^^^^^^^^^^^^^^^^^^^^ 
   Yes, this way. And there is very tricky software to bind these phonemes
   together and make fluent speech of them. What is to be said about quality:
   The Amiga's max. sample rate, beyond 18 kHz, is more than enough for speech
   (compare the 3 kHz bandwidth of a telephone!); the only limitation is in
   the quality of the digitized phonemes. Frankly speaking, they are such a
   pure Texan-style American (at least to a German ear) that they are nearly
   useless for languages other than American English. I believe even Britons
   would need different phonemes.
   All this was done by an external software company and sold to Commodore.
   They even offered to prepare it for other languages, but the charge
   was so high that no one took it up.

   -- 
   Best regards, Dr. Peter Kittel       E-Mail to 
   Commodore Frankfurt, Germany         rutgers!cbmvax!cbmbsw!cbmger!peterk

hmm, so let's say I (or somebody) wanted to improve on this, or just fool
around a bit.  Would having different/better samples help? Or does the 
"binding together" software need work?  What I guess I'm asking is what
is the weak link that makes the voice so poor.

monty

bard@jessica.stanford.edu (David Hopper) (06/09/90)

In article <21526@snow-white.udel.EDU> kosma%human-torch@stc.lockheed.com (Monty Kosma) writes:
>   >
>   >it works?  Does the amiga actually have a speech synthesis chip, or is
>   >it done through audio samples of phonemes, or what?  
>		    ^^^^^^^^^^^^^^^^^^^^^^^^^ 
>   Yes, this way. And there is very tricky software to bind these phonemes
>   together and make fluent speech of them. What is to be said about quality:
	[...]
>   -- 
>   Best regards, Dr. Peter Kittel       E-Mail to 
>   Commodore Frankfurt, Germany         rutgers!cbmvax!cbmbsw!cbmger!peterk
>
>hmm, so let's say I (or somebody) wanted to improve on this, or just fool
>around a bit.  Would having different/better samples help? Or does the 
>"binding together" software need work?  What I guess I'm asking is what
>is the weak link that makes the voice so poor.

This has indeed been done.  I have a program called 'Talk' that was written by
one Jon L. Sherling on 6/18/88.  To quote the documentation (I have the
source, but I am hardly a programmer yet):

(Note:  this is to be read by Jon's 'Talk' program, so forgive the atrocious
phonetics)

"This is the TALK program written by John Sherling on June eighteenth, nuynteen
eighty eight.  It is an attempt to reeplace the Amiga speech synthesis with my
own voice.  Digitized phownemes created with fewchersound are red in to the 
program and used in conjunction with the Amiga translaytor luybrary.

The phownemes are in standerd fewchersound format and can be reeplaced in
order to improve the voice quality or to change the voice entuyerly."

It is a valiant attempt and a fascinating-sounding voice, although I'm not
sure if it is more understandable.  Certainly, it is more humanlike.
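
For anyone who wants to fool around with the "binding together" step, here is
a minimal Python sketch of one common approach: join the digitized phoneme
samples end to end and blend a few samples at each seam with a linear
crossfade so the joins don't click. This is NOT taken from the Talk source or
from the Amiga's own speech software. The function name, the phoneme codes,
and the tiny sample lists are all hypothetical.

```python
def crossfade_concat(segments, overlap):
    """Join lists of samples, blending `overlap` samples at each seam."""
    if not segments:
        return []
    out = list(segments[0])
    for seg in segments[1:]:
        # Never blend more samples than either side actually has.
        n = min(overlap, len(out), len(seg))
        tail_start = len(out) - n
        for i in range(n):
            w = (i + 1) / (n + 1)  # weight of the incoming segment
            out[tail_start + i] = round((1 - w) * out[tail_start + i] + w * seg[i])
        out.extend(seg[n:])
    return out

# Hypothetical phoneme bank: code -> signed 8-bit samples (toy data).
phonemes = {
    "HH": [100, 100, 100, 100],
    "AY": [0, 0, 0, 0],
}

word = crossfade_concat([phonemes[p] for p in ["HH", "AY"]], overlap=2)
print(word)  # [100, 100, 67, 33, 0, 0]
```

Swapping in better recordings only changes the `phonemes` table. Changing how
the pieces are joined (overlap length, blend shape, pitch handling) is the
other knob, and this sketch only shows the simplest version of it.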

I'd post the source, but it's 11416 bytes, and I figure this is long enough
already.

>
>monty

Dave Hopper      |      ///  Yesterday, CS.           | My favorite icebreaker:
                 |     ///    Today, Anthro/History.  | "If you were really my
bard@jessica.    | \\\///                             | friend, you'd kill me
   Stanford.EDU  |  \XX/ Tomorrow... bleeding ulcers. | now."

LEEK@QUCDN.QueensU.CA (06/09/90)

I wouldn't mind using extra memory for something that sounds a little better
- how about a sexy female voice for my Amiga :) (Research has shown that pilots
are more alert to a female voice than to a male voice from the on-board
instrumentation.)  Hmmm.  I don't need research to tell me that... :)

Seriously, I think most Amigas now have enough memory for better phoneme
samples, and maybe for samples of different inflections instead of computed
ones.  If the memory is available, this should improve the quality of the
speech synthesizer, and more and more applications would use this ignored
feature of the Amiga.
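
A back-of-the-envelope estimate of what such a sample bank might cost in
memory, as a minimal Python sketch. The phoneme count, the clip length, and
the number of inflection variants are guesses for illustration only. The
18 kHz rate is the figure mentioned earlier in the thread, and one byte per
sample matches the Amiga's 8-bit audio.

```python
def phoneme_bank_bytes(n_phonemes, ms_each, sample_rate_hz,
                       variants=1, bytes_per_sample=1):
    """Rough size of a phoneme sample bank, using integer arithmetic."""
    samples = n_phonemes * variants * ms_each * sample_rate_hz // 1000
    return samples * bytes_per_sample

# Hypothetical figures: 40 phonemes, 200 ms per clip, 18 kHz, 8-bit samples.
plain = phoneme_bank_bytes(40, 200, 18_000)
inflected = phoneme_bank_bytes(40, 200, 18_000, variants=3)

print(plain)      # 144000
print(inflected)  # 432000
```

Under those assumptions a single set of phonemes is about 141 KB, and three
inflection variants of each come to about 422 KB.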

K. C. Lee