[comp.music] Formants

allyn@milton.u.washington.edu (Allyn Weaks) (04/14/91)

gaspar@urz.unibas.ch writes:
>There is in fact a relationship between vowels and overtones. If you sing
>the vowel A and the vowel E at the same pitch it's the overtones that are
>forming the different sound of the vowels. 

Nope.  Vowels are recognizable because of formant frequencies caused by the
shape of the mouth and vocal tract, and are largely independent of pitch.  The
formant spectrum is basically an envelope that encloses the pitch spectrum.
Pitch is determined primarily by tension in the vocal cords.  You can test it
yourself by singing 'ah' (or any other vowel) at various pitches.  The pitch
overtones (and timbre) go with the pitch, but the 'ah' sound stays largely the
same - the formant frequency (and _its_ overtones, which aren't simple
harmonics) stay the same for a given vowel.  The 'ah' as in father has strong
peaks at about 700, 1100, and 2600 Hz.  'oo' as in pool has its strongest
peaks at about 300, 700, and 2500 Hz.  All of this is somewhat oversimplified
of course.  I suggest looking up Arthur Benade's _Fundamentals of Musical
Acoustics_ for more details.
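
Here's a rough numpy sketch of the idea, if it helps.  The Gaussian bump
shapes and bandwidths are made up for illustration (real vocal-tract
resonances aren't that tidy), but the formant peaks are the 'ah' values
quoted above:

import numpy as np

SR = 44100  # sample rate, Hz

def formant_envelope(freqs, formants=(700.0, 1100.0, 2600.0), bw=150.0):
    # Amplitude weight for each harmonic: a sum of Gaussian bumps centred
    # on the formant frequencies.  The bump shape and bandwidth are
    # illustrative assumptions, not measured vocal-tract data.
    env = np.zeros_like(freqs, dtype=float)
    for f in formants:
        env += np.exp(-0.5 * ((freqs - f) / bw) ** 2)
    return env

def sing(f0, dur=1.0):
    # Additive synthesis: harmonics of f0, weighted by the *fixed* formant
    # envelope.  Changing f0 moves the harmonics around underneath the
    # envelope, but the envelope (the vowel colour) stays put.
    t = np.arange(int(SR * dur)) / SR
    harmonics = np.arange(1, int(5000 // f0) + 1) * f0   # partials up to ~5 kHz
    amps = formant_envelope(harmonics)
    tone = sum(a * np.sin(2 * np.pi * f * t) for a, f in zip(amps, harmonics))
    return tone / np.max(np.abs(tone))

low  = sing(110.0)   # 'ah'-ish tone at A2
high = sing(220.0)   # an octave up: new harmonics, same formant envelope

Play low and high back to back and the pitch changes but the vowel colour
doesn't, which is the whole point.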

This is the reason that sampled voice sounds on a keyboard are so
unsatisfactory - they only sound like a voice at the pitch they were sampled
at, and if you get more than a whole tone away it all goes to hell, since the
sampler is shifting the formant envelope frequencies as well as the pitch.
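
And here's roughly why the sampler loses, reusing sing() from the sketch
above.  Naive vari-speed resampling (which is more or less what a simple
sampler does when you play away from the original key) multiplies every
frequency by the same ratio, formant peaks included:

import numpy as np   # sing() comes from the previous sketch

def naive_pitch_shift(signal, semitones):
    # Vari-speed playback: resample by 2**(semitones/12).  Every frequency
    # in the signal, formant peaks included, gets scaled by the same
    # ratio - which is exactly the 'chipmunk' problem described above.
    ratio = 2 ** (semitones / 12.0)
    old_idx = np.arange(len(signal))
    new_idx = np.arange(0, len(signal) - 1, ratio)
    return np.interp(new_idx, old_idx, signal)

ah = sing(110.0)                       # formant peaks near 700/1100/2600 Hz
up_a_fifth = naive_pitch_shift(ah, 7)  # peaks land near 1050/1650/3900 Hz instead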


>That is also the way you distinguish different instruments, by the overtones.

Only partly.  Much more important is the attack - the wild ravings of the
first several milliseconds of a note.  If you hear only a sustained tone, it's
often hard to tell a violin from a flute from a horn.  Pianos are more
distinctive because of non-linearities in the overtones.

Allyn Weaks
allyn@milton.u.washington.edu

sandell@aristotle.ils.nwu.edu (Greg Sandell) (04/14/91)

 allyn@milton.u.washington.edu writes: 

> >That is also the way you distinguish different instruments, by the overtones.
> 
> Only partly.  Much more important is the attack - the wild ravings of the
> first several milliseconds of a note.  If you hear only a sustained tone, it's
> often hard to tell a violin from a flute from a horn.  Pianos are more
> distinctive because of non-linearities in the overtones.

The interesting thing is that in years of experiments, spectrum always
turns out to be the most prominent dimension, and onset comes in second
place.  If you look at the semantic terms listeners use to describe
timbre, the most frequently used ones pertain more to spectrum (dark,
bright, rich, brilliant, mellow, warm, nasal, dull).  The ones pertaining to
attack (biting, incisive, soft, hard) are used less often.

Outside of the laboratory, timbre is apprehended over time, over exposure 
to a number of notes.  Certainly part of the apprehending process is
hearing a number of different onsets over several different pitches.  The
attack's duration, noisiness, presence of inharmonicity changes a bit from one
note to the next, but not that much.  Not as dramatically as does the
spectral envelope from one pitch to the next, which in fact show rather
beautiful patterns of change across an instrument's playing range.
And spectrum changes in interesting ways according to the force with
which the note is played; the changes is onset caused by dynamic are
salient, but not as rich in patterned information.

Most of the seminal studies that revealed the priority of the attack 
portion were from around 1963, and were based on isolated notes.
We could design more sophisticated experiments now.  Here's my question:
if you took a clarinet melody of substantial length and melodic span,
and replaced all the attacks with trumpet attacks, would anybody 
think they were hearing a trumpet tune?

Roger Kendall has a study (MUSIC PERCEPTION 4/2) where he played
clarinet, violin and trumpet melodies in conditions with the
attacks excised, just steady states.  Listeners could tell what
instruments they were hearing.  They showed poorer performance
in conditions where only the attack portions were presented.  Oddly
enough, when listeners had to identify instruments from a single,
isolated note, they did equally well with attack-only and
steady-state-only presentations.  So the priority given to the attack from those
old 1963 studies needs to be re-evaluated.

Greg Sandell
--
Greg Sandell
sandell@ils.nwu.edu

galetti@uservx.afwl.af.mil (04/15/91)

In article <1991Apr13.235927.5503@milton.u.washington.edu>, allyn@milton.u.washington.edu (Allyn Weaks) writes:
> 
>>That is also the way you distinguish different instruments, by the overtones.
> 
> Only partly.  Much more important is the attack - the wild ravings of the
> first several milliseconds of a note.  If you hear only a sustained tone, it's
> often hard to tell a violin from a flute from a horn.  Pianos are more
> distinctive because of non-linearities in the overtones.
> 
I have to agree with you here.  I see this a lot when I edit a sample.  Without
a distinctive attack the sample could be a simple analog synth.  Vocal samples
sound really stupid without some form of complex attack.  For example, if you
sample someone singing "ah" and then edit it so the attack is gone, it sounds
pretty boring.  If you sample "Tah" and preserve the attack, it sounds much
better.
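
The edit itself is nothing fancy; something like this is all it takes
(the file names and the 80 ms figure are placeholders for whatever your
sample actually needs):

import numpy as np
import wave

ATTACK_MS = 80   # rough guess at the length of the 'T' transient

with wave.open("tah.wav", "rb") as w:        # placeholder file: mono, 16-bit
    sr = w.getframerate()
    samples = np.frombuffer(w.readframes(w.getnframes()), dtype=np.int16)

ah_only = samples[int(sr * ATTACK_MS / 1000):]   # throw away the attack

with wave.open("ah_only.wav", "wb") as w:
    w.setnchannels(1)
    w.setsampwidth(2)     # 16-bit
    w.setframerate(sr)
    w.writeframes(ah_only.tobytes())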

The Roland D-50 uses this concept.  It takes advantage of the primacy effect
that attack sounds have on the human ear.  When a distinctive attack is 
followed by a simple oscillation, the result is a realistic and pleasant sound.
Well, I like it anyway!
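
Very roughly, you can fake the idea like this.  This is a toy imitation of
the concept only, not Roland's actual LA synthesis engine: a short sampled
attack transient crossfaded into a plain synthesized sustain.

import numpy as np

SR = 44100

def attack_plus_osc(attack, f0, sustain_dur=1.0, fade_ms=20):
    # Sampled attack transient followed by a plain sawtooth 'oscillator',
    # joined by a short linear crossfade.  'attack' is assumed to be a
    # float array a good deal longer than the fade.
    t = np.arange(int(SR * sustain_dur)) / SR
    saw = 2.0 * ((f0 * t) % 1.0) - 1.0           # naive sawtooth sustain
    n = int(SR * fade_ms / 1000)
    fade = np.linspace(0.0, 1.0, n)
    joint = attack[-n:] * (1.0 - fade) + saw[:n] * fade
    out = np.concatenate([attack[:-n], joint, saw[n:]])
    return out / np.max(np.abs(out))

# e.g. with the first ~50 ms of a sampled "Tah" as a float array:
# note = attack_plus_osc(attack, 220.0)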

> Allyn Weaks
> allyn@milton.u.washington.edu
  ___________________________________________________________________________
 /   Ralph Galetti                  Internet:   galetti@uservx.afwl.af.mil   \
|    PL/LITT                        Interests:  computers, music, computers   |
|    Kirtland AFB, NM 87117-6008                and music, golf, sleep.       |
 \__"No, they couldn't actually prove that it was HIS vomit" - Nigel Tufnel__/