[comp.music] programs that can infer key/meter

hardt@linc.cis.upenn.edu (Dan Hardt) (10/06/89)

I'd like to know what programs exist that can
infer the key and meter of a melody, just based
on the pitch and duration information.  Does anyone
know about programs that can do this?

briang@bari.Sun.COM (Brian Gordon) (10/06/89)

In article <15170@netnews.upenn.edu> hardt@linc.cis.upenn.edu (Dan Hardt) writes:
>I'd like to know what programs exist that can
>infer the key and meter of a melody, just based
>on the pitch and duration information.  Does anyone
>know about programs that can do this?

Isn't that what Finale is supposed to do?  The early write-ups (when the
beta-quality version was $1,000+) said it could listen to you play something on
the piano and print it out virtually unassisted (more or less).  It is
supposedly in much better shape now, and a lot cheaper.  Unfortunately, I've
never actually met an owner/user, so I'm just guessing.
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Brian G. Gordon	briang@Corp.Sun.COM (if you trust exotic mailers)     |
|			...!sun!bari!briang (if you route it yourself)	      |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

brownd@thor.acc.stolaf.edu (David H. Brown) (10/07/89)

In article <125936@sun.Eng.Sun.COM> briang@sun.UUCP (Brian Gordon) writes:
>In article <15170@netnews.upenn.edu> hardt@linc.cis.upenn.edu (Dan Hardt) writes:
>>I'd like to know what programs exist that can
>>infer the key and meter of a melody, just based
>>on the pitch and duration information.

>Isn't that what Finale is supposed to do?

	Well, yes, this is what everybody thought Finale was supposed to do.
However, I've always had to tell it what key and meter I'm playing in...
even in the transcription mode.  Even after it's been told what key a piece is
in, it will sometimes put Ab and F# in the same measure (should be G# and F#)
if these chroma are not part of the key sig.

	After playing a piece for transcription and setting the meter,
the tempo is specified by tapping on some MIDI event for each beat while
the computer plays the piece back.  This doesn't really feel like the
program is "inferring" much of anything.

	Even so, Finale is the best program I've ever encountered for handling
complete pieces of music (as opposed to collections of short sequences) in a
computer/MIDI environment.  It's also so complex that it may well be possible
to get it to infer such things as meter and key, but it probably isn't as easy
as we've all been told.

St. Olaf College has very little to     | M M | M M M | M M | M M M | M M |   
do with the things I talk about!        | M M | M M M | M M | M M M | M M |   
                                        | M M | M M M | M M | M M M | M M |  
Dave Brown: brownd@thor.acc.stolaf.edu  | | | | | | | | | | | | | | | | | |  
"I _like_ programming the DX-7!"        |_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|  

sandell@ferret.ils.nwu.edu (Greg Sandell) (10/10/89)

>In article <15170@netnews.upenn.edu> hardt@linc.cis.upenn.edu (Dan Hardt) writes:
>>I'd like to know what programs exist that can
>>infer the key and meter of a melody, just based
>>on the pitch and duration information.  Does anyone
>>know about programs that can do this?

Now that some responses directly addressing Dan's question have come in, let
me suggest an indirect method.  There have been several writings on meter in
music that attempt to list exactly those features which instantiate this or
that time signature.  This theoretical material can (and has, I think) become
the basis for something more computational.  In particular I am thinking about
A GENERATIVE THEORY OF TONAL MUSIC by Fred Lerdahl and Ray Jackendoff (MIT Press,
1983).  Lerdahl (a composer and theorist) and Jackendoff (a linguist) use grouping
mechanisms from Gestalt psychology theory to define musical meter.  For example,
given a recurring pattern of two beats and silence, where the silence is longer
than the duration of time separating the onsets of the two beats, the second of
the two beats tends to be heard as a strong beat, or, the first beat of a measure.
If the silence is roughly twice as long as the inter-onset time, then 3/4 meter
will be perceived.  Of course, competing musical elements could contradict that
and create a different meter, but all things being equal, 3/4 time will be heard.
Of course, real music is more complicated than this simple example, but the book
also describes more complicated conditions.
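
The two-beats-plus-silence example above can be sketched in a few lines.  This
is a toy illustration of the idea only, not L&J's actual formalism; the
threshold values are my own.

```python
# A minimal sketch of the Lerdahl/Jackendoff example: given onset times for
# a recurring two-note-plus-silence pattern, compare the gap after the pair
# to the inter-onset time within the pair and guess a duple or triple
# grouping.  The 0.25 tolerance is illustrative, not from L&J.

def guess_grouping(onsets):
    """onsets: times of a repeating two-note pattern, e.g. [0, 1, 3, 4, 6]."""
    within = onsets[1] - onsets[0]    # inter-onset time inside the pair
    silence = onsets[2] - onsets[1]   # gap before the pattern repeats
    ratio = silence / within
    if abs(ratio - 2.0) < 0.25:
        return "3/4"                  # pair + silence fills three beats
    if abs(ratio - 1.0) < 0.25:
        return "2/4"
    return "ambiguous"

# beat, beat, silence twice the inter-onset time -> heard as triple meter
guess_grouping([0.0, 1.0, 3.0, 4.0, 6.0])  # "3/4"
```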

Regarding the ability to figure out the key signature of the piece, I'm sure that
what you are concerned with are non-trivial examples where more than seven
diatonic pitch-classes are present.  One method uses a rather simple statistical
approach of counting the number of each instance of every pitch-class in the
piece, and comparing it with a prototypical distribution for every possible
major and minor key.  This method grew out of the research described 
by Carol Krumhansl in "Perceptual Structures for Tonal Music" (MUSIC PERCEPTION 1, 
1983, pp. 28-62). (The actual key-finding algorithm has never been published). To 
successfully match C major, for example, the count will have to show a tendency to 
have more C's than any other note, G's running in second place, E's third, and so on.  
I can't remember the exact order of the remaining nine notes, but the remainder of
the 'white notes' follow before any of the 'black notes.'  The actual prototypical
distribution is somewhat more detailed than a simple descending distribution
(e.g., the ratio of G's to C's is not identical to that of E's to G's).  Despite
the seemingly anti-musical approach of a statistical count, I have seen the 
approach work quite well, especially when the pitch classes are weighted according
to their duration and amplitude.  Using MIDI, I once used the algorithm to analyze
the keys of real-time performances using exactly that weighting scheme (key velocity
to measure amplitude), and it had pretty good accuracy.
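
The counting-and-matching approach described above can be sketched roughly as
follows.  Since the thread notes the actual key-finding algorithm was never
published, this is a reconstruction; the profile numbers are illustrative
stand-ins for Krumhansl's ratings, not the real prototypical distribution.

```python
# Sketch of the statistical key-finding approach: build a duration-weighted
# pitch-class histogram and correlate it against a prototype profile for
# each of the 24 major and minor keys.  Profile values are illustrative.

MAJOR = [6.35, 2.23, 3.48, 2.33, 4.38, 4.09, 2.52, 5.19, 2.39, 3.66, 2.29, 2.88]
MINOR = [6.33, 2.68, 3.52, 5.38, 2.60, 3.53, 2.54, 4.75, 3.98, 2.69, 3.34, 3.17]
NAMES = ['C', 'C#', 'D', 'D#', 'E', 'F', 'F#', 'G', 'G#', 'A', 'A#', 'B']

def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

def find_key(notes):
    """notes: (pitch_class 0-11, duration) pairs; returns the best-scoring key."""
    hist = [0.0] * 12
    for pc, dur in notes:
        hist[pc % 12] += dur                    # weight each note by duration
    scores = {}
    for tonic in range(12):
        for profile, mode in ((MAJOR, 'major'), (MINOR, 'minor')):
            # rotate the profile so its tonic entry lines up with this tonic
            rotated = [profile[(i - tonic) % 12] for i in range(12)]
            scores[NAMES[tonic] + ' ' + mode] = pearson(hist, rotated)
    return max(scores, key=scores.get)

# A C-major scale in quarter notes matches C major most strongly:
find_key([(pc, 1.0) for pc in [0, 2, 4, 5, 7, 9, 11, 0]])  # 'C major'
```

As the post says, all 24 keys get a strength score; in ambiguous pieces the
top two or three scores will run close together.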

***************************************************************
* Greg Sandell, Institute for Learning Sciences, Evanston, IL *
* sandell@ferret.ils.nwu.edu                                  *
***************************************************************
'Oh Hey, Look at that: there's a fish on a hat;
 And we'd like to treat everyone here to a cow souvenir.'
	- Peter & Lou Berryman

mgresham@artsnet.UUCP (Mark Gresham) (10/11/89)

In article <7203@thor.acc.stolaf.edu> brownd@thor.stolaf.edu () writes:
>In article <125936@sun.Eng.Sun.COM> briang@sun.UUCP (Brian Gordon) writes:
>>In article <15170@netnews.upenn.edu> hardt@linc.cis.upenn.edu (Dan Hardt) writes:
>>>I'd like to know what programs exist that can
>>>infer the key and meter of a melody, just based
>>>on the pitch and duration information.
>
>>Isn't that what Finale is supposed to do?
>
>[Well, sort of,...]
>Even so, Finale is the best program I've ever encountered for handling
>complete pieces of music (as opposed to collections of short sequences) in a
>computer/MIDI environment.  It's also so complex that it may well be possible
>to get it to infer such things as meter and key, but it probably isn't as easy
>as we've all been told.

Good luck. Especially on things like changing meters and
syncopations within a strict meter.  If musicians can't agree
on how to notate such (and believe me, that's rampant) how can
someone program 'definitive' decisions about it?

>St. Olaf College has very little to
>do with the things I talk about!

"Vell, ve yust verk, und verk, und verk!" --F. Melius Christiansen

Cheers,

--Mark

========================================
Mark Gresham  ARTSNET  Norcross, GA, USA
E-mail:       ...gatech!artsnet!mgresham
or:          artsnet!mgresham@gatech.edu
========================================

mgresham@artsnet.UUCP (Mark Gresham) (10/13/89)

In article <1240@accuvax.nwu.edu> sandell@ferret (Greg Sandell) writes:
>>In article <15170@netnews.upenn.edu> hardt@linc.cis.upenn.edu (Dan Hardt) writes:
>>>I'd like to know what programs exist that can
>>>infer the key and meter of a melody, just based
>>>on the pitch and duration information.  Does anyone
>>>know about programs that can do this?
>
>Now that some responses directly addressing Dan's question have come in, let
>me suggest an indirect method.  There have been several writings on meter in
>music that attempt to list exactly those features which instantiate this or
>that time signature.  This theoretical material can (and has, I think) become
>the basis for something more computational.  In particular I am thinking about
>A GENERATIVE THEORY OF TONAL MUSIC by Fred Lerdahl and Ray Jackendoff (MIT Press,
>1983).
>[...] mechanisms from Gestalt psychology theory to define musical meter.  For example,
>given a recurring pattern of two beats and silence, where the silence is longer
>than the duration of time separating the onsets of the two beats, the second of
>the two beats tends to be heard as a strong beat, or, the first beat of a measure.

One simple musical example that directly contradicts the
above is the typical 'sarabande' where the first beat of the meter
is the shorter of the two stresses that are felt behind the 3/4 or
3/2 meter of that dance, the second being twice as long.  (Half + whole in 3/2.)

>If the silence is roughly twice as long as the inter-onset time, then 3/4 meter
>will be perceived.

(Maybe it would be a slow 3/8 or a fast 3/2?)

>Of course, competing musical elements could contradict that
>and create a different meter, but all things being equal, 3/4 time will be heard.

But competing musical elements are as often the norm as they are
the exception.  The process would have a difficult time (pardon
the pun :-)) with even such familiar examples as Schumann's
"Traumerei" or Beethoven's Piano Sonata #5 in C-minor (Op. 10, #1).
Just the opening measures of either of these would give such a
program fits, even if played 'squarely.'
  I suspect the "Traumerei" would be recognized as a 5/4 bar with an
anacrusis followed by a 3/4 bar then followed by 4/4.  The
'strong' accents (thinking only metrically, even) do not
necessarily fall on the first pulse of any given measure.
  Most of Brahms' works in 3/4 have significant passages which
contradict the 'tyranny of the bar-line' ("How Lovely Is..." from
his Requiem, for instance, in the "They praise Thee..." passages);
a great deal of his piano music in 3/4 likewise includes significant
passages which 'go against' the meter.

>Of course, real music is more complicated than this simple example, but the book
>also describes more complicated conditions.

But there are more problems with simple conditions:
Where a pulse is divided into three parts, what will your program
automatically decide is 'correct'?
1) a 4/4 meter with triplet eighths?
2) a 6/8 meter?
3) a 12/8 meter? or 12/16?
4) a very fast 3/4 with one pulse to the measure? same for 3/8?
5) some other possibility? (Groups of 3 syncopated 16ths against
normal groupings of 4 16ths?)

If the passage has a series of, say, some 2 dozen repeated
G#'s in 3/4 before anything else happens (this could easily happen in
a pseudo-Beethoven scherzo):
1) is the first note an anacrusis?
2) is the first note on the first beat of the measure?
3) is the first beat silent?
4) will the program even recognize, in that instance, the 3/4 meter?

Part of the point of all this is differentiating between
'rhythmic' features of music and 'metrical.'

Even an example from the sixteenth century, "Ecce quomodo moritur"
(Jacob Handl) would be made into a metrical hash by the
Lerdahl/Jackendoff approach to analysis, eliminating the wonderful
rhythmic overlap that occurs throughout the constant duple (in the
example I have, 4/4) meter.  It would result in this:

2/2 |3/4 | |3/2 |4/2 |3/4 | |3/2 |3/4 | |5/4 |3/4 |4/4 etc.

...as a likely computational 'solution.'

(I'd like to see what such a program would do with examples from
the 'ars nova' period or Elliott Carter's Piano Concerto!)

>Regarding the ability to figure out the key signature of the piece, I'm sure that
>what you are concerned with are non-trivial examples where more than seven
>diatonic pitch-classes are present.

How about 'tonal' music without key signature?

>One method uses a rather simple statistical
>approach of counting the number of each instance of every pitch-class in the
>piece, and comparing it with a prototypical distribution for every possible
>major and minor key.  This method grew out of the research described 
>by Carol Krumhansl in "Perceptual Structures for Tonal Music" (MUSIC PERCEPTION 1, 
>1983, pp. 28-62). (The actual key-finding algorithm has never been published). To 
>successfully match C major, for example, the count will have to show a tendency to 
>have more C's than any other note, G's running in second place, E's third, and so on.  

Greg, most music of an oral tradition contradicts that, with the
dominant being the most *frequently occurring* note, followed by
the mediant, then the submediant; the same goes for spontaneous songs
made by children.

Certainly we can spell 'fish' with
the series of letters 'ghoti' and find justifiable reasons for
doing so in some statistical-psychological manner.  But beyond
the curiosity, would it be meaningful and helpful in terms of
reproducing and communicating the music (the real thing, aside from its
representation on paper)?

Cheers,

--Mark

========================================
Mark Gresham  ARTSNET  Norcross, GA, USA
E-mail:       ...gatech!artsnet!mgresham
or:          artsnet!mgresham@gatech.edu
========================================

brownd@thor.acc.stolaf.edu (David H. Brown) (10/13/89)

In article <484@artsnet.UUCP> mgresham@artsnet.UUCP (Mark Gresham) writes:

>Good luck. Especially on things like changing meters and
>syncopations within a strict meter.  If musicians can't agree
>on how to notate such (and believe me, that's rampant) how can
>someone program 'definitive' decisions about it?

	The way to make 'definitive' decisions:

		1.	Define an arbitrary standard.
		2.	Use it consistently.

If you (a program) insist on such a standard, some people will come to believe
that it's correct.   (My Acoustics prof. referred to this as "proof by
intimidation.")  If all goes well, the standard will become accepted
generally.  Now, how to get all programs to agree (e.g. Finale, SCORE, etc.)?
I'm sure I don't know.  But a standard format for indicating syncopations
would be extremely useful-- apart from the silico-musical environ, too.

>Mark Gresham  ARTSNET  Norcross, GA, USA
>E-mail:       ...gatech!artsnet!mgresham
>or:          artsnet!mgresham@gatech.edu


St. Olaf College has very little to     | M M | M M M | M M | M M M | M M |   
do with the things I talk about!        | M M | M M M | M M | M M M | M M |   
                                        | M M | M M M | M M | M M M | M M |  
Dave Brown: brownd@thor.acc.stolaf.edu  | | | | | | | | | | | | | | | | | |  
"I _like_ programming the DX-7!"        |_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|  

sandell@ferret.ils.nwu.edu (Greg Sandell) (10/13/89)

***************************************************************
* Greg Sandell, Institute for Learning Sciences, Evanston, IL *
* sandell@ferret.ils.nwu.edu                                  *
***************************************************************

sandell@ferret.ils.nwu.edu (Greg Sandell) (10/14/89)

For some reason, my response got posted as a null posting.  Here's another
try....

Mark Gresham writes:
>But competing musical elements are as often the norm as they are
>the exception.

Competing musical elements are often local ambiguities that are 
resolved at higher levels.  One gestalt construct that L & J uses
is the "theory of good continuation" which says that if the 
piece has had previous indications of being in 3/4 time, then it
probably will stay in 3/4 time.  If an ambiguity is localized to
a small number of measures, and the preceding and following
portions suggest 3/4, and 3/4 can continue through the passage
correctly on a correct-number-of-beats level, then the theory is
biased to assume that 3/4 stayed in effect.
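
The "good continuation" bias just described can be reduced to a toy sketch:
local per-measure meter guesses (with ambiguous measures left undecided) are
smoothed by carrying the established meter forward.  This is purely
illustrative and is my own simplification, not L&J's rule system.

```python
# Toy sketch of the "good continuation" bias: an ambiguous measure (None)
# inherits the meter established by the preceding unambiguous measures.

def smooth_meter(local_guesses):
    result, current = [], None
    for guess in local_guesses:
        if guess is not None:
            current = guess            # unambiguous evidence resets the meter
        result.append(current)         # ambiguity inherits the prior meter
    return result

# Two ambiguous measures inside a 3/4 context stay in 3/4:
smooth_meter(["3/4", "3/4", None, None, "3/4"])
```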

> The
>'strong' accents (thinking only metrically, even) do not
>necessarily fall on the first pulse of any given measure.

L&J distinguish between 'phenomenal' accents (accents overtly cued
by the presence of a note and reinforced by, say, dynamic markings)
and 'metrical' accents.  Metric accents are expectations that are
set up by other non-ambiguous cues.  Although there may be silence
on the first beat, metrical expectations cause a listener to 
experience it as a strong beat anyway.

>>Of course, real music is more complicated than this simple example, but the book
>>also describes more complicated conditions.
>
>But there's more problems with simple conditions:
>  (various metric ambiguities cited...)

I used the wrong phrase; I meant that there are additional, more complex
rules to handle the richness of "real" music.  Many classic ambiguities
such as hemiola rhythm are handled by rules in the system.  The system
would be able to handle other ambiguities it had never encountered
before, provided there were other cues for the meter, which there
frequently are.  But it would be easy to try to systematically
confound the approach by removing all cues.  When you get to that
point, even the listener has trouble coming up with a "correct meter."
And in any case, what L&J are trying to do is to model the
perceptions of an experienced listener.

>Even an example from the sixteenth century, "Ecce quomodo moritur"
>(Jacob Handl) would be made into a metrical hash by the
>Lerdahl/Jackendoff approach to analysis

Why would you want to 'discover the meter' of music of this period?
I don't know the specific piece, but I recognize the genre.  Even
if written in 4/4 time nobody *hears* a 4/4 meter marching along
in this music.  In any case, L&J limit their scope to music of
the common practice period.  (Complain if you like, but you try to
create an exhaustive theory of all musics of all periods, and see
how much success you have!)
>
>> (Stuff on the Krumhansl key-finding approach here...)

>Greg, most music of an oral tradition contradicts that, with the
>dominant being the most *frequently occurring* note, followed by
>the mediant, then the submediant; the same goes for spontaneous songs
>made by children.

I'm aware of this; in Irish folk music, for example, tunes frequently
end on scale-degree 5.  The algorithm might work in folk genres anyway,
because even if there are pieces which have more mediant and submediant
than tonic (really?), the tonic notes which are indeed present will
probably fall at metrically important locations.  (Recall that I
said the algorithm's note count is *weighted* by surface
emphases of notes, such as metric position, dynamics.) If you don't even
have this condition, then I submit that you have a key-ambiguous piece,
and in fact the algorithm would find more than one key vying for "first
place."  (The algorithm outputs a strength match with all 24 major and
minor keys, and the strongest key, assuming that one really stands out,
is selected as the answer.)

Mark, your questions are interesting, and your challenges are merited.  But
I think maybe you thought that what L&J are trying to do is discover
the *notated* meter.  To the contrary, they proceed from the assumption
that the notated meter is frequently in contrast to the experienced
meter.  On the Krumhansl algorithm, its success rate speaks for itself.
Except for the weighting mechanisms, it is an unmusical approach, so I
was expecting to get a rise out of somebody on the net when I brought
it up!

>
>Cheers,
>
>--Mark

Greg


***************************************************************
* Greg Sandell, Institute for Learning Sciences, Evanston, IL *
* sandell@ferret.ils.nwu.edu                                  *
***************************************************************

mgresham@artsnet.UUCP (Mark Gresham) (10/21/89)

In article <7354@thor.acc.stolaf.edu> brownd@thor.stolaf.edu () writes:
>In article <484@artsnet.UUCP> mgresham@artsnet.UUCP (Mark Gresham) writes:
>
>>Good luck. Especially on things like changing meters and
>>syncopations within a strict meter.  If musicians can't agree
>>on how to notate such (and believe me, that's rampant) how can
>>someone program 'definitive' decisions about it?
>
>	The way to make 'definitive' decisions:
>
>		1.	Define an arbitrary standard.
>		2.	Use it consistently.

Part of the power of musical notation is the fact that it is
flexible.  Aside from that, you're asking (to get back to the
psychological method mentioned) for the meter to be inferred as
the result of small amounts of information.  I would suggest that
1) a larger quantity would be helpful, and 2) we should recognize
that notation is a 'map' by which we produce music, and different
methods of cartography may have different advantages/disadvantages
in any given situation.

BTW, a related programming puzzle would be to take photos from an
airplane (or satellite) and write a program that would create
roadmaps according to Rand McNally's arbitrary cartographical standards.
Try it sometime. :-)

>If you (a program) insist on such standard, some people will come to believe
>that it's correct.   (My Acoustics prof. referred to this as "proof by
>intimidation.")  If all goes welll, the standard will become accepted
>generally.

When the emperor has no clothes, there is always someone who will
gladly point it out.

>Now, how to get all programs to agree (e.g. Finale, SCORE, etc.)?

You have to get the musicians to agree first!

>I'm sure I don't know.  But a standard format for indicating syncopations
>would be extremely useful-- apart from the sillico-musical environ, too.

Certainly, because that's where the ultimate tests lie
(repeatedly).

I suggest you look at Joseph Schillinger's treatise on theory
which includes an attempt to come to 'definitive standards' for
explaining musical structures mathematically.  It is all but
ignored now, and is generally thought of only in negative terms
these days.  But you might want to check it out.

>St. Olaf College has very little to     | M M | M M M | M M | M M M | M M |   
>do with the things I talk about!        | M M | M M M | M M | M M M | M M |   
>                                        | M M | M M M | M M | M M M | M M |  

Really?  Say 'hi' to Jennings and the choir.

>Dave Brown: brownd@thor.acc.stolaf.edu  | | | | | | | | | | | | | | | | | |  
>"I _like_ programming the DX-7!"        |_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|_|  


Cheers,

--Mark

========================================
Mark Gresham  ARTSNET  Norcross, GA, USA
E-mail:       ...gatech!artsnet!mgresham
or:          artsnet!mgresham@gatech.edu
========================================

mgresham@artsnet.UUCP (Mark Gresham) (10/21/89)

In article <1277@accuvax.nwu.edu> sandell@ferret (Greg Sandell) writes:
>Mark Gresham writes:
>>Even an example from the sixteenth century, "Ecce quomodo moritur"
>>(Jacob Handl) would be made into a metrical hash by the
>>Lerdahl/Jackendoff approach to analysis
>
>Why would you want to 'discover the meter' of music of this period?

I wouldn't.  But if I were to notate it for performers in our
time, I might want to use metrical indications for practicality.
In light of that, I have seen published editions of music using
the kind of metrical hash I described, especially where it is
either unnecessary or damaging.

>I don't know the specific piece, but I recognize the genre.  Even
>if written in 4/4 time nobody *hears* a 4/4 meter marching along
>in this music.

Of course not.  (Note the Hindemith example I'll cite later...)

>In any case, L&J limit their scope to music of
>the common practice period.  (Complain if you like, but you try to
>create an exhaustive theory of all musics of all periods, and see
>how much success you have!)

I wouldn't try (as other theorists have tried before) because of
the differences in intent and historical perspective if nothing
else.

>Mark, your questions are interesting, and your challanges are merited.  But
>I think maybe you thought that what L&J are trying to do is discover
>the *notated* meter.  To the contrary, they proceed from the assumption
>that the notated meter is frequently in contrast to the experienced
>meter.

I'll buy that.  But this discussion (I thought) began with a desire
to have a program 'perceive' the midi input and then *notate* it,
including key and meter.  Like the "Ecce..." example (where you
might not want to notate any meter anyway) the results are likely
to be far afield from the kinds of results desired.

Incidentally, I'd be interested in what the L&J method would turn
out as results (yes, notated!) from a hearing of the double
fugue ("Lo, body and soul, this land...") from Hindemith's
requiem, "When Lilacs Last In The Dooryard Bloomed."

Yes, perceived and notated meter are often quite different; that's
one of my points.  The notation is a 'roadmap' to the performance
of the music, not vice versa.

Cheers,

--Mark

========================================
Mark Gresham  ARTSNET  Norcross, GA, USA
E-mail:       ...gatech!artsnet!mgresham
or:          artsnet!mgresham@gatech.edu
========================================