[comp.misc] Request for information on phoneme speech synthesis

dbb@aicchi.UUCP (Burch) (11/26/86)

Perhaps you can help me.
 
I am designing a system for a client that will use phoneme synthesis to
communicate with a user.  I am looking for references to papers on the
subject.  Due to parts count limitations, we will not be able to use
any of the commercial speech synthesis chips.  In this system, phonemes
will be stored as sampled A/D patterns, and spliced together in real
time to create speech.  Also, if anybody has public domain source code
for such a system, that would be greatly appreciated.
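
A minimal sketch of the splicing idea described above, not code from any
existing system.  It assumes each phoneme is a short list of signed
samples and that a brief linear crossfade is enough to hide the joints.
The phoneme names, the sample values, and the functions splice and speak
are all hypothetical.

```python
# Hypothetical phoneme table: name -> list of signed 8-bit sample values.
# Real entries would come from digitized recordings; these are made up.
PHONEMES = {
    "HH": [0, 10, 20, 10, 0, -10, -20, -10],
    "AY": [0, 40, 80, 40, 0, -40, -80, -40] * 2,
}


def splice(units, overlap=16):
    """Concatenate sample lists, crossfading `overlap` samples at each joint."""
    out = []
    for unit in units:
        # Never overlap more samples than either side actually has.
        n = min(overlap, len(out), len(unit))
        for i in range(n):
            w = (i + 1) / (n + 1)          # weight ramps toward the new unit
            j = len(out) - n + i           # position in the tail of the output
            out[j] = round(out[j] * (1 - w) + unit[i] * w)
        out.extend(unit[n:])               # append the part that did not overlap
    return out


def speak(names, overlap=4):
    """Look up each phoneme by name and splice the sample lists together."""
    return splice([PHONEMES[name] for name in names], overlap)
```

Calling speak(["HH", "AY"]) returns one sample list ready to be clocked
out to a D-to-A converter.  With overlap=0 the units are simply butted
together, which is the cheapest option but is likely to click at the
joints.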
 
Thank You.
 

-- 
-David B. (Ben) Burch
 Analysts International Corp.
 Chicago Branch (ihnp4!aicchi!dbb)

"Argue for your limitations, and they are yours"

dennisg@fritz.UUCP (Dennis Griesser) (12/06/86)

In article <854@aicchi.UUCP> dbb@aicchi.UUCP (Burch) writes:
> 
>I am designing a system for a client that will use phoneme synthesis to
>communicate with a user.  I am looking for references to papers on the
>subject.  Due to parts count limitations, we will not be able to use
>any of the commercial speech synthesis chips.  In this system, phonemes
>will be stored as sampled A/D patterns, and spliced together in real
>time to create speech.  Also, if anybody has public domain source code
>for such a system, that would be greatly appreciated.

There's nothing wrong with WHAT you want.  In fact, I am also interested in
this subject... perhaps you could post your results.

I have a problem with WHY you want it.

I think that storing sampled analog waveforms and playing them back to
avoid a synthesis chip is false economy.  You will need a large EPROM
with the samples in it.  The D-to-A will take a chip too.  Then, there is
the load placed on your CPU to keep the D-to-A fed.  There are plenty of
single-chip synthesizers around that you can poll every few hundred
milliseconds, handing over a phoneme whenever the chip is ready.
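
A rough back-of-the-envelope version of this argument, as a sketch only.
Every figure below (8 kHz sampling, one byte per sample, 64 stored
phonemes averaging 100 ms each) is an assumption chosen for illustration,
not a number from the thread or from any data sheet.

```python
# All constants are illustrative assumptions, not measured values.
SAMPLE_RATE_HZ = 8000     # assumed sampling rate
BYTES_PER_SAMPLE = 1      # assumed 8-bit samples
PHONEME_COUNT = 64        # assumed size of the phoneme inventory
AVG_PHONEME_MS = 100      # assumed average phoneme duration


def rom_bytes(count=PHONEME_COUNT, ms=AVG_PHONEME_MS,
              rate=SAMPLE_RATE_HZ, width=BYTES_PER_SAMPLE):
    """Storage needed to hold every phoneme as raw samples."""
    return count * ms * rate * width // 1000


def dac_writes_per_second(rate=SAMPLE_RATE_HZ):
    """The CPU must hand the D-to-A one sample per sample period."""
    return rate


def chip_writes_per_second(avg_phoneme_ms=AVG_PHONEME_MS):
    """A synthesis chip only needs one phoneme code per phoneme spoken."""
    return 1000 / avg_phoneme_ms


if __name__ == "__main__":
    print("ROM for samples:", rom_bytes(), "bytes")
    print("D-to-A writes/s:", dac_writes_per_second())
    print("Chip writes/s:  ", chip_writes_per_second())
```

Under these assumptions the sample table takes about 50K bytes and the
CPU services the D-to-A 8000 times a second, against roughly ten writes
a second to a synthesis chip.  Different sampling rates or phoneme
lengths move the numbers, but the gap stays large.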

I would suggest one of the commercial allophone synthesis speech chips.
Makers include TI, National, and General Instrument, as well as
Votrax.

Radio Shack sells a GI chip that is part of a family that
usually comes pre-programmed with interesting words.  The version at
the 'Shack comes with allophones on it instead.  They also have a
companion chip that performs the text-to-allophone conversion.

The Circuit Cellar column in Byte has covered several voice synthesizers over
the years.  I think that the latest was about two years ago.

Good luck.

daveh@cbmvax.UUCP (12/10/86)

In article <854@aicchi.UUCP> dbb@aicchi.UUCP (Burch) writes:
> 
>I am designing a system for a client that will use phoneme synthesis to
>communicate with a user.  I am looking for references to papers on the
>subject.  Due to parts count limitations, we will not be able to use
>any of the commercial speech synthesis chips.  In this system, phonemes
>will be stored as sampled A/D patterns, and spliced together in real
>time to create speech.  Also, if anybody has public domain source code
>for such a system, that would be greatly appreciated.

Parts count!?!?  Exactly what are you using to drive this system?  I've
found two essentially equivalent phoneme- or allophone-based speech synthesis
systems for the C64 computer in my work.  The first of these takes the
software approach.  It requires a microprocessor, some simple kind of
sound chip (I know the generally available GI or Yamaha chips work fine),
and about 32K-64K of ROM.  The program called "SAM" for the C64 does exactly
this.  A hardware alternative is available in the speech cartridge made by
Steve Currah.  This device contains a GI speech chip, a PLA, and I believe
an 8K ROM.  The pure software solution ends up costing about the same as
the hardware assisted solution.  The GI phonemes replace the soft coded
phonemes.  You might end up with 1-1/2 extra chips in the hardware
solution, but you also end up with much more processor time available for
things other than talking.  

-- 
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Dave Haynie	{caip,ihnp4,allegra,seismo}!cbmvax!daveh

	"Laws to suppress tend to strengthen what they would prohibit.
	 This is the fine point on which all the legal professions of
	 history have based their job security."
						-Bene Gesserit Coda

These opinions are my own, though for a small fee they may be yours too.

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

ctp@pop.utexas.edu (Clyde T. Poole) (12/15/86)

>In article <854@aicchi.UUCP> dbb@aicchi.UUCP (Burch) writes:
> 
>I am designing a system for a client that will use phoneme synthesis to
>communicate with a user.  I am looking for references to papers on the
>subject.  Due to parts count limitations, we will not be able to use
>any of the commercial speech synthesis chips.  In this system, phonemes
>will be stored as sampled A/D patterns, and spliced together in real
>time to create speech.  Also, if anybody has public domain source code
>for such a system, that would be greatly appreciated.
>
I don't know if this will help, but I have a few basic references to
research on translating English text to speech using phonetics.
If your intention is to store phonemes and then put them together, these
may help.

Unfortunately I didn't keep good notes so these may be incomplete.

Automatic Translation of English Text to Phonetics by Means of
Letter-to-Sound Rules, Elovitz, H.S., Johnson, R.W., McHugh, A.,
Shore, J.E., Naval Research Laboratory, NRL Report 7948, January
21, 1976.

Synthetic English Speech by Rule, McIlroy, M.D., Bell Telephone
Laboratories, Computing Science Technical Report #14, March, 1974 and
revised September 14, 1977.
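
As a toy illustration of what letter-to-sound rules can look like, here
is a sketch that assumes an ordered list of rules, each with a left
context, the letters matched, a right context, and the phonemes produced,
where the first rule that fits wins.  The rules and phoneme names below
are made up for the example and are not taken from either report.

```python
# Toy rule table: (left context, letters matched, right context, phonemes).
# An empty context matches anything.  Order matters: first match wins.
RULES = [
    ("", "ch", "", ["CH"]),
    ("", "ee", "", ["IY"]),
    ("", "c", "e", ["S"]),    # "c" directly before "e" is soft
    ("", "c", "", ["K"]),     # any other "c" is hard
    ("", "e", "", ["EH"]),
    ("", "a", "", ["AE"]),
    ("", "n", "", ["N"]),
    ("", "p", "", ["P"]),
    ("", "s", "", ["S"]),
    ("", "t", "", ["T"]),
]


def to_phonemes(word):
    """Scan the word left to right, applying the first rule that fits."""
    word = word.lower()
    out = []
    i = 0
    while i < len(word):
        for left, match, right, phones in RULES:
            end = i + len(match)
            if (word.startswith(match, i)
                    and word[:i].endswith(left)
                    and word.startswith(right, end)):
                out.extend(phones)
                i = end
                break
        else:
            i += 1            # no rule covers this letter: skip it
    return out


if __name__ == "__main__":
    for w in ("speech", "cent", "cat"):
        print(w, to_phonemes(w))
```

The resulting phoneme names could then be used as keys into a table of
stored phoneme samples or sent as codes to a synthesis chip.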

-----
Clyde T. Poole, Computing Resources Manager 
ARPA:     ctp@sally.utexas.edu               VOICE: (512) 471-9551
UUCP:     {harvard,ihnp4,seismo}!ut-sally!ctp  CIS: 75226,3135
Overland: UT at Austin, Department of Computer Sciences
          Taylor Hall 2.124, Austin, TX  78712-1188
"Life is a bitch ... and then you die"