sandell@ils.nwu.edu (Greg Sandell) (06/16/91)
Vance Maverick writes:

> To reverse the decay of comp.music into rec.music.synth.backup, Greg
> Sandell proposes chasing all the synth people away, and then says such
> researchers as he have no time to contribute.  I think the latter is
> the real problem -- the only way to influence the tone of the newsgroup
> is by positive contributions.

Bravo, Vance.  I'll make a contribution too.  Get ready, though, it's
150 or so lines long...

For the last three years I have been doing research on the perception
of musical timbre, with an eye towards learning something about
orchestration.  I am working on a Ph.D. in Music Theory at Northwestern
University.  Orchestration is not considered normal territory for a
music theorist, but I don't think that is a good state of affairs for
the field.  The interests that currently dominate music theory, still
mostly pitch set theory and Schenker theory, further the belief that
the key to unlocking a piece's meaning lies in discovering some tightly
organized network of pitch relations.  As a result, works that are
masterpieces partly because of their orchestration, yet whose pitch
structures resist both pitch set theory and Schenker theory, are in
effect snubbed by music theorists.  This stands in stark contrast to
the interests of composers, who are quick to recognize effective
exploitation of the timbral domain and at times exhibit a language for
discussing qualities of sounds that theorists are not privy to.

So what can a theorist do to bring orchestration into the fold of
music theory?  Remember, theorists are creatures who shun vague,
qualitative descriptions of personal listening experiences (although
lately narrative analyses have come into fashion) and seek to create
and use tools that categorize and quantify the elements which make up
a piece of music.  My choice was to focus on the aspect of
orchestration pertaining to choosing combinations of instruments for
concurrent presentation.
This is one of the great mysteries of orchestration which fascinates
many musicians, and which is clearly not merely a pragmatic issue of
`instrumentation.'  Next, I chose to restrict the field of interest to
largely homophonic combinations: melodies in unison or other
semi-fixed intervals, or vertical sonorities.  (This is purely
practical; research has to start somewhere.)  But before any
categorizing or quantifying of this domain can begin, we need to know
how such combinations are evaluated in orchestration practice.

Many orchestration manuals, especially the ones by Rimsky-Korsakov,
Rogers and Piston, spend a great deal of time instructing the student
how to choose timbres that "blend."  With few exceptions, the use of
the term suggests a fusion phenomenon: blended combinations are those
in which the timbral line of demarcation cannot be distinguished,
e.g., a cello and bass clarinet which merge into some hybrid, single
timbral quality.  At the other end of this spectrum are combinations
which clearly separate into distinct timbres (tuba and piccolo, to
name an extreme example).

Other than the obvious factors that separate timbres (gross
differences in attack time, intonation, etc.), what are the acoustics
underlying the phenomenon of blend?  Although orchestration manuals
offer plenty of prescriptions (do's and don'ts) and examples ("Ravel
did it, so it must be good"), they offer no metric for evaluating any
particular instance.  High-speed analysis and visualization of timbre
by computer is here and it's cheap; Rimsky-Korsakov's and Joseph
Schillinger's dream of a scientific basis for orchestration is within
reach, or at least the ability to investigate its possibility is.  (I
might add that Piston and one or two others deplored the idea of a
systematization of orchestration, but there will always be
conservatives.)
The ideal orchestration manual would offer the student not only
isolated cases, but empower him or her with (to borrow a concept from
my current employer, Roger Schank) case-based reasoning.  Students
could generalize from the acoustical properties which lead to the good
blend between cello and bass clarinet and apply the principle to other
combinations: for example, perhaps low-register violin and English
horn blend for analogous reasons.

While blend is certainly not the only quality one evaluates in
concurrent timbres, I decided to focus on blend and investigate it in
a series of perceptual experiments using musical listeners.  I ran
three experiments using synthesized musical instrument tones (the same
tones as in John Grey's 1975 dissertation).  Tones were presented in
concurrent pairs, and listeners rated how well they "blended."  Poor
blend was defined as the case where individual timbres could be
clearly heard (say, piccolo and tuba), and good blend as the case
where the timbres fused to form a single sonic impression (say, French
horn and trumpet).  A single trial in one of my experiments consisted
of two instruments playing one short note each, in rhythmic unison.  I
explored instruments in unison and in minor thirds, and timbral
modifications such as "artificially bright," "increased
inharmonicity," and unequal intensity levels, to observe the effect on
blend.

The main findings have to do with the centroid (level of
brightness/darkness) and the length of the attack portion.  (Examples
of dark timbres are bassoon and French horn; bright timbres are oboe
and clarinet in the clarion register; instruments of moderate
brightness include flute and trumpet.)  For unisons, dark tones blend
best of all, and the blend steadily worsens as a function of the
increasing brightness of one or both of the tones in a concurrent
pair.
This was true in the case of artificially darkened or brightened
tones: for example, you can improve the blend of oboe and clarinet by
artificially darkening the oboe (like turning up the "bass knob" on a
stereo).  Analogous effects to darkening and brightening were found
with respect to length of attack time: the longer the attack time of
one or both of the tones, the worse the blend.  However, the effect
for attack time is not as strong as the effect for centroid.

For non-unison intervals (the minor third), an additional centroid
factor played a role: instruments that were close in centroid (similar
degrees of brightness/darkness) blended well.  So, although the
presence of low centroids strongly influenced the blend of the pair,
pairs not including an instrument with a low centroid nonetheless
blended fairly well if they were similarly bright.  Still other
factors appeared to involve mechanisms of auditory stream segregation:
instruments with highly correlated amplitude envelopes (also, centroid
envelopes, the change in centroid over the duration of the tone)
tended to blend well, while non-correlated envelopes led to greater
separation.

(Note: Unfortunately this medium does not allow me to present all the
data which led me to these conclusions; that requires a number of
high-resolution graphs, tables of correlation coefficients, and a lot
of space devoted to explaining exactly how each analysis was done.
But the statistical methods employed in reaching these conclusions
included regression, correlation, t-tests, and multidimensional
scaling, and the measures of statistical significance were those used
in standard psychological experimentation.  Eventually the
dissertation will be available through standard channels (NU's
dissertations are carried by University Microfilms) or directly from
me, for those who want to follow up on the details.  The dissertation
will be completed in December 1991.)
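The envelope-correlation factor mentioned above is easy to illustrate.
Here is a minimal sketch (not the dissertation's actual analysis code;
the envelope samples are invented for illustration) of a Pearson
correlation between two amplitude envelopes sampled at the same
instants:

```python
def pearson(x, y):
    """Pearson correlation between two equal-length sampled envelopes."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

# Two hypothetical amplitude envelopes with similar fast attacks
# correlate highly (predicting good blend); a fast attack against a
# slow swell correlates poorly (predicting separation).
fast_a = [0.0, 0.9, 1.0, 0.8, 0.6, 0.3, 0.1]
fast_b = [0.0, 0.8, 1.0, 0.9, 0.5, 0.2, 0.1]
slow   = [0.0, 0.1, 0.3, 0.6, 0.9, 1.0, 0.4]

print(pearson(fast_a, fast_b))   # close to 1
print(pearson(fast_a, slow))     # near zero
```

The same machinery applies to centroid envelopes: just correlate the
two instruments' centroid-over-time curves instead.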
John Grey's landmark 1975 dissertation in timbre perception ("An
Exploration of Musical Timbre," Stanford University) proposed a timbre
space of three dimensions, pertaining to (and I am somewhat
generalizing here) centroid, attack, and harmonic synchrony.  These
dimensions emerged from a statistical analysis of timbral similarity
judgments from a group of musical listeners.  The results of the
present experiments, although involving a different task, largely
corroborate Grey's timbre space.  A "blend space" can be constructed
for each instrument with centroid, attack duration, and envelope
correlation as three of the primary dimensions.

The experiments I ran were obviously limited in musical breadth, so my
findings only suggest a beginning for further research in timbre
perception and orchestration.  First of all, one would want to
investigate blend using melodies rather than isolated note-pairs, and
other intervals should be investigated.  Furthermore, other ways of
evaluating concurrent timbres could be explored.  For one thing, the
relative salience of two timbres in a pair is an important factor:
loud flute and soft trumpet is very different from soft flute and loud
trumpet.  Part of how we evaluate those combinations depends on which
of the two qualities dominates the sum perception of the sound (e.g.,
flute in the former, trumpet in the latter).  Next, it is expected
that masking among harmonics, or masking due to noisy aspects of
instruments (air streams and bow scrapes), plays an important role in
evaluating combinations.  One frequent observation in orchestration
manuals is that the flute somehow "softens" the effect of other
instrumental combinations (for example, Rimsky-Korsakov says the flute
can soften the harsh combination of oboes and clarinets; see p. 78 of
his orchestration manual).
The way in which some timbres seem to modify others is a point of
interest to many musicians, and experiments to explicate these effects
would make an interesting investigation.  Finally, one can investigate
the acoustic dissonance of a pair of timbres based on Plomp and
Levelt's measures of "roughness" (in fact, I haven't eliminated this
as a possibility in my dissertation).

Greg Sandell (sandell@ils.nwu.edu)
Northwestern Computer Music
Northwestern University, Evanston, IL

p.s. I call my research area "Concurrent Timbre."  Eliot ("Music
Mediates Mind[tm]") Handelman, if you're reading this, I could use
some advice on how I can patent this phrase and license it for
profit... :-)
--
Greg Sandell sandell@ils.nwu.edu
curt@cynic.wimsey.bc.ca (Curt Sampson) (06/17/91)
Greg, I found your posting on the timbre perception research you've
been doing quite fascinating.

One thing that I am struck by, however, is the rather unquantified
nature of your descriptions of the timbre of various instruments.  For
example, you call a clarinet "dark" and an oboe "bright."  Not having
seen any of the details of your research, it could well be that you
have quantified the timbres much better there than in your summary.
If not, have you considered doing a Fourier analysis of the various
instruments and looking for correlations in the data from that?

My ears tell me that as well as some instruments being darker or
brighter than others (that is, the average energy in the harmonics
well above the fundamental being higher or lower), different
instruments often have very different distributions of the energy
within those upper harmonics.  Some instruments have a lot of energy
concentrated in a few harmonics (such as the oboe--or so my ears tell
me :-)) and some have their energy spread out more evenly over many
harmonics (piano).  I suspect that these differing distributions would
make quite a difference in the blending characteristics (and
recognition characteristics, for that matter).  Another thing to look
at would be the amount and distribution of non-harmonic energy in an
instrument's sound (the scraping of the bow, and the like).

This might lead to some interesting experiments with computer-generated
tones of varying harmonic structure.  Synthesized tones created with a
decent additive synthesizer would give you far more flexibility when
testing blends of various kinds.  It would also provide a good control,
in that one would expect that synthesized waveforms with
characteristics similar to acoustic instruments would generate similar
results when blended for listeners.
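The "concentrated vs. spread" distinction above can be given a toy
quantification.  A minimal sketch (the spectra below are invented for
illustration, not measured instrument data, and the measure itself is
just one crude possibility among many):

```python
def energy_concentration(amps):
    """Fraction of total spectral energy carried by the single strongest
    harmonic: near 1.0 means the energy is concentrated in one harmonic,
    near 1/len(amps) means it is spread evenly across all of them."""
    energy = [a * a for a in amps]
    return max(energy) / sum(energy)

# Hypothetical harmonic-amplitude spectra, purely illustrative:
oboe_like  = [0.2, 0.3, 1.0, 0.2, 0.1, 0.1]   # peaked in one harmonic
piano_like = [0.6, 0.5, 0.5, 0.4, 0.4, 0.3]   # spread more evenly

print(energy_concentration(oboe_like))    # high (about 0.84)
print(energy_concentration(piano_like))   # lower (about 0.28)
```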
That is to say, if you have two waveforms with a concentrated peak in
the upper harmonics and they don't blend, but two acoustic waveforms
with a concentrated peak in the upper harmonics do blend, there's
obviously something else we should be looking for as an important
factor in blending.

So perhaps you could do a few experiments in this area too.  It's only
June, so I'm sure that you'll have plenty of time to research this
whole area and fit it into a brief Appendix in your dissertation. :-)

cjs
--
Curt Sampson             | "This sound system comes to you with fuel injection.
curt@cynic.uucp          | Toes tapping, the unthinking masses dance to a new
curt@cynic.wimsey.bc.ca  | tune...." --Gary Clail
maverick@mahogany.Berkeley.EDU (Vance Maverick) (06/18/91)
Sounds interesting.  I'd love to see orchestration get a better rap in
music theory.  I'll bet (for example) that most analyses of the
"Tombeau" from /Pli Selon Pli/ look at its pitch content, even though
(at least for this listener) it's the orchestration that makes it go*
-- particularly since Boulez personally takes the position that timbre
is just icing on the cake of pitch.

Will you have time, in your thesis, to take on the pedagogical aspects
of the teaching of orchestration?  Surely the goal of an orchestration
course is to enable the composer to hear the combination of
instruments mentally; such apparent prescriptions as R-K's dictum
about flutes softening the combination of clarinets and oboes may
serve their real function when, armed with knowledge of a score, the
student listens for this effect, and hears, not "softness," but the
sound of flutes, clarinets and oboes.  Do you think computer
representations of the sounds of instruments are far enough along that
we could write software to help teach composers this skill of the
mental ear?

	Vance

* until it stops going, before the ludicrous entrance of the voice....
sandell@ils.nwu.edu (Greg Sandell) (06/18/91)
In article <1991Jun17.030934.499@cynic.wimsey.bc.ca>,
curt@cynic.wimsey.bc.ca (Curt Sampson) writes:

> One thing that I am struck by, however, is the rather unquantified
> nature of your descriptions of the timbre of various instruments.
> For example, you call a clarinet "dark" and an oboe "bright."

The posting itself never said anything about the clarinet being
"dark," but I get your point.

> Not having seen any of the details of your research, it could well
> be that you have quantified the timbres much better there than in
> your summary.

The scale that I used for brightness and darkness was "centroid."
Centroid refers to the distribution of spectral energy in a complex
sound.  You calculate it by weighting each frequency component by its
amplitude, summing all such values, and dividing by the sum of the
amplitudes alone.  The division step factors out the amplitude and
leaves a single frequency which identifies the midpoint of spectral
energy concentration.

Consider the following 4-harmonic spectra, each with a fundamental of
100 Hz.  The amplitude scale shown is linear.  The spectra are
identical except for the third harmonic; in the second spectrum the
distribution of spectral energy shifts slightly higher in frequency.

   8                           8         8
   |                           |         |
   |    6                      |    6    |
   |    |                      |    |    |
   |    |    4                 |    |    |
   |    |    |                 |    |    |
   |    |    |    2            |    |    |    2
   |    |    |    |            |    |    |    |
   |____|____|____|______      |____|____|____|______
   100  200  300  400          100  200  300  400

The first spectrum's centroid is calculated as:

   ((8*100)+(6*200)+(4*300)+(2*400)) / (8+6+4+2) = 4000/20 = 200 Hz.

The second spectrum, if you work it out, yields a higher centroid
(216.7 Hz).  This measure has been used with great success in
perceptual experiments on timbre; that is to say, the magnitude of
listeners' evaluations of timbres of different degrees of brightness
and darkness is frequently paralleled by (correlated with) the
centroids of those sounds.
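The centroid calculation translates directly into code.  A minimal
sketch (an illustration, not anyone's research software) that
reproduces the worked example:

```python
def spectral_centroid(amps, freqs):
    """Amplitude-weighted mean frequency of a line spectrum."""
    return sum(a * f for a, f in zip(amps, freqs)) / sum(amps)

harmonics = [100, 200, 300, 400]                    # 100 Hz fundamental
print(spectral_centroid([8, 6, 4, 2], harmonics))   # 200.0 Hz
print(spectral_centroid([8, 6, 8, 2], harmonics))   # about 216.7 Hz
```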
Research showing these results has been reported in Grey (1975), Grey
& Gordon (1978), and Wessel (1978).  I think that the first
experiments to show its perceptual significance were by Lichte (1941)
and von Bismarck (1974).  Beauchamp (1982) provides the most explicit
published definition of centroid.  (Citations below.)

> different instruments often have very different distributions of the
> energy within those upper harmonics.  Some instruments have a lot of
> energy concentrated in a few harmonics (such as the oboe--or so my
> ears tell me :-)) and some have their energy spread out more evenly
> over many harmonics (piano).  I suspect that these differing
> distributions would make quite a difference in the blending
> characteristics (and recognition characteristics, for that matter).

Right you are.  Centroid is a statistical convenience, but obviously
an impoverished representation of timbre.  I have experimented with
other ways of comparing spectra but haven't found any especially
effective ones yet.  One thing I haven't tried, which is suggested to
me by what you say here, is defining some upper frequency region and
taking *its* centroid.  I'll let you know what I learn.

But there is another representation of spectrum which collapses it
down into three values (rather than one, as in centroid): the
"tristimulus method" of Pollard & Jansson (1982).  They break up the
spectrum into the percentage of energy in the fundamental, the
percentage of energy in harmonics 2-5, and the percentage of energy in
all harmonics above 5.  I haven't yet found a great use for this
measure, myself.

> Another thing to look at would be the amount and distribution of
> non-harmonic energy in an instrument's sound (the scraping of the
> bow, and the like).

In my study, I account for this by quantifying the amount of precedent
noise at attack time.  This turns out to be a pretty strong cue for
blend.
However, I found that a more general measure, the duration of the
attack time, matched the judgments more closely.

> This might lead to some interesting experiments with
> computer-generated tones of varying harmonic structure.  Synthesized
> tones created with a decent additive synthesizer would give you far
> more flexibility when testing blends of various kinds.

The "John Grey tones" that I used *were* additive synthesis
descriptions of the sound, by the way... that's what made it possible
for me to analyze higher-level acoustic properties such as harmonic
synchrony, inharmonicity, etc.

> It would also provide a good control in that one would expect that
> synthesized waveforms with characteristics similar to acoustic
> instruments would generate similar results when blended for
> listeners.  That is to say, if you have two waveforms with a
> concentrated peak in the upper harmonics and they don't blend, but
> two acoustic waveforms with a concentrated peak in the upper
> harmonics do blend, there's obviously something else we should be
> looking for as an important factor in blending.

Well, centroid comes in "first place" in my experiment, but of course
it's not the only acoustic factor.  It would not be hard to magnify
the differences in attack characteristics and envelope similarity to
override what should be a "good blend" from the perspective of
spectrum content.

> So perhaps you could do a few experiments in this area too.  It's
> only June, so I'm sure that you'll have plenty of time to research
> this whole area and fit that into a brief Appendix in your
> dissertation. :-)

If one of my committee members dies on me, you'll be the first person
I call.... :-)

Thanks for your response!
--
Greg Sandell sandell@ils.nwu.edu

Here are the sources I cited:

Beauchamp, J.W. (1982). Synthesis by spectral amplitude and
  'brightness' matching of analyzed musical instrument tones.
  Journal of the Audio Engineering Society 30, 396-406.

Grey, J.M., & Gordon, J.W. (1978). Perceptual effects of spectral
  modifications on musical timbres.  Journal of the Acoustical
  Society of America 63, 1493-1500.

von Bismarck, G. (1974). Timbre of steady sounds: a factorial
  investigation of its verbal attributes.  Acustica 30, 146.

Lichte, W.H. (1941). Attributes of complex tones.  Journal of
  Experimental Psychology 28, 455-480.

Wessel, D.L. (1978). Low dimensional control of musical timbre.
  Tech. Rept. 12, IRCAM, Paris.

Pollard, H.F., & Jansson, E.V. (1982). A tristimulus method for the
  specification of musical timbre.  Acustica 51, 162-171.
eliot@phoenix.Princeton.EDU (Eliot Handelman) (06/18/91)
In article <2118@anaxagoras.ils.nwu.edu> sandell@ils.nwu.edu (Greg Sandell) writes:
;
;p.s. I call my research area "Concurrent Timbre." Eliot ("Music Mediates
;Mind[tm]") Handelman, if you're reading this, I could use some advice
;on how I can patent this phrase and license it for profit... :-)
I made it down this far, anyhow.
There is no distinction between music and theory.
I don't have to LISTEN to a piece of music in order to find out
how it goes. I am conversant with mountains of music sufficiently
"the same" that my thumb can listen: I can read a 12 or 15 minute
long orchestra piece in about 10 or 15 seconds.
It's more difficult to scan CD's, but it can be done. Speeds
roughly 20 to 30 times specs, especially for slow computer music,
are completely adequate to the aim of framing a few quick
perceptions. Bear in mind: not the realism of these perceptions,
only their formation, is of interest.
This faculty is less pronounced in some theorists.
mig@cunixb.cc.columbia.edu (Meir) (06/18/91)
Meir Green
(Internet) mig@cunixb.cc.columbia.edu  meir@msb.com
           mig@asteroids.cs.columbia.edu
(Amateur Radio) N2JPG
sandell@ils.nwu.edu (Greg Sandell) (06/18/91)
In article <1991Jun17.170258.17498@agate.berkeley.edu>,
maverick@mahogany.Berkeley.EDU (Vance Maverick) writes:

> -- particularly since Boulez personally takes the position that
> timbre is just icing on the cake of pitch.

Can you think of a particular source where he says this?

> Will you have time, in your thesis, to take on the pedagogical
> aspects of the teaching of orchestration?  Surely the goal of an
> orchestration course is to enable the composer to hear the
> combination of instruments mentally; such apparent prescriptions as
> R-K's dictum about flutes softening the combination of clarinets and
> oboes may serve their real function when, armed with knowledge of a
> score, the student listens for this effect, and hears, not
> "softness," but the sound of flutes, clarinets and oboes.

What are the mechanics of the orchestrator's ear, though?  When you
hear bass clarinet and cello, does your mind automatically recognize
it as a learned sound, "bass clarinet and cello," or does it first
decompose the sound into "bass clarinet" and "cello"?  Well, maybe for
such frequently used combinations as that one (especially for dramatic
effect in late-19th-century opera) the first mechanism applies.  But
what about the infinite number of other timbre combinations (different
instruments, dynamics, registers, etc.)?  If I want to learn from
someone else's orchestration (whether I have just the recording or the
score as well), I need to be able to (1) decompose the sound, and (2)
hypothesize about the process behind the sum effect.  That's what the
listener does with the flutes/clarinets/oboes example.

I think a lot has been said about visual perception of color mixture
which pertains to the issue of timbre mixture.  I have experienced
firsthand some surprising effects while playing with colors on a color
computer monitor.  Say you have text on top of a background, and you
want to find a combination of colors for foreground (the text) and
background.
Suppose I found a foreground color I like and I'm sampling various
backgrounds.  I swear that somehow different background colors shift
the hue, saturation and brilliance of the foreground color!
Perceptually, of course, they do... because of nifty things like Mach
bands and the eye's natural tendency to supply the complementary hue
of each color (i.e., when you look at a bright red light, then close
your eyes, you see green).  What we need in orchestration is an
explanation of how certain timbres affect others in the perceptual
ear.  I think the two modalities are very analogous on the subject of
mixture.

But to answer your question about pedagogy: I am going to provide a
review of several English-language orchestration manuals of the
current century, mainly concerning what they say about evaluating
concurrent timbres.  Very few have much to say about how a student
should gradually acquire a good ear for orchestration.  One of the
only exceptions is a curious little article by J. Ott, "A new approach
to orchestration," THE INSTRUMENTALIST 23/9 (April 1969), pp. 53-55.
He suggests that students embark on an exploration of their own
personal timbre space by vocally imitating all the instruments of the
orchestra and categorizing them according to the vowels they use to
make the sounds.

> Do you think computer representations of the sounds of instruments
> are far enough along that we could write software to help teach
> composers this skill of the mental ear?

It would be great, wouldn't it?  There are certainly a lot of sounds
of orchestral instruments available on compact disc (McGill and
ProSonus), but the number of instrumental sounds you could store in a
sound library will always be minuscule compared to what performers can
do.  But even if there were only a limited set of online orchestral
instrument sounds available on a menu-driven system for combining
sounds, think how much you could learn about combinations that you
didn't know before.
> Vance

Thanks for helping make this an interesting discussion and for getting
my gears turning.

- Greg

Greg Sandell sandell@ils.nwu.edu
lseltzer@bhupali.esd.sgi.com (Linda Seltzer) (06/18/91)
Vance, you raised an interesting issue.  Too many musical analyses
focus on pitch issues as if music were a flat document on a page
instead of an interaction of people playing instruments.  We should be
paying more attention to the sense of ensemble, the interaction among
performers, etc.  Not that I'm against analysis of pitch content, but
maybe the pendulum has swung too far in that direction.

The same goes for pop music.  The notion of "tracks" has influenced
things to the point of virtually obliterating any sense of dialog
among performers.  Such dialog is clearly present in earlier styles
such as old boogie-woogie recordings.