gvokalek@augean.OZ (George Vokalek) (11/08/89)
> What happens when the listener's head turns? What you really need is a pair of
> headphones that can detect head movements. Then you must adjust the delays and

The ear is a tube with a transducer at one end.  If the sound pattern
is such that a node is present at the eardrum, no sound will be heard.
By turning the head, you change the pressure distribution in the ear
canal, moving away from the node (possibly toward another node at a
different frequency).

Given that the speed of sound is 330 m/s, a 3 kHz sound will have a
wavelength of about 10 cm.  Moving the head by several cm therefore
represents a significant fraction of one wavelength, resulting in a
significantly different sound pattern in the ear.

Note that this means you should be able to localise high frequency
sound more accurately than low frequency sound.  Personally, this seems
reasonable -- for instance, it's easy to find a mosquito.  I can't
think of any low frequency examples.

..G..
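[George's arithmetic is easy to check.  A minimal sketch using the
figures from his post (330 m/s for the speed of sound; the 3 cm head
movement is an illustrative value):]

```python
def wavelength_m(freq_hz, c=330.0):
    """Wavelength in metres of a sound wave at frequency freq_hz (c in m/s)."""
    return c / freq_hz

# A 3 cm head movement is a large fraction of a wavelength at 3 kHz,
# but a tiny one at 100 Hz.
for f in (100.0, 3000.0):
    lam = wavelength_m(f)
    print(f"{f:6.0f} Hz: wavelength {lam * 100:5.1f} cm, "
          f"3 cm shift = {0.03 / lam * 100:4.1f}% of a wavelength")
```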
pvo3366@sapphire.OCE.ORST.EDU (Paul O'Neill) (11/13/89)
In article <1989Nov2.180644.28647@sj.ate.slb.com> greg@sj.ate.slb.com (Greg Wageman) writes:
>I don't think this is possible with two speakers in front of the
>person.

Wrongo.  See below.

>I'm by no means an expert, but my understanding is that the brain
>localizes sound by analyzing the amplitude (volume) and time-delay
>(phase) of direct and reflected sounds as perceived by both ears

You've mentioned 2 mechanisms of localization, but not no. 3.

1] Amplitude
2] Phase
3] Frequency response

Our ears have a different frequency response at different azimuths and
elevations.  We use the frequency content of arriving sounds as one of
our localization inputs.

Demonstration: Plug one ear with your hand.  Close your eyes.  Click
the fingernails of your thumb and forefinger on the other hand at
various positions around your other ear.  Can you localize the clicks?
How?  You're only using one ear.

Ignoring 1] and 2], and building circuits exploiting 3] only, can give
surprisingly good and adjustable stereo imaging -- with headphones or
with speakers.  One starts with the known frequency response curves of
the human ear for various azimuths and the known azimuth of the
speakers or headphones.  Filter the sound source such that when it is
run through the ear's "response at Speaker-Azimuth" it will have the
content of "response at Faked-Azimuth".

I'll see if I can dig out my references on this stuff.  Most is from
the Journal of the Acoustical Society of America (JASA).

Paul O'Neill                  pvo@oce.orst.edu
Coastal Imaging Lab
OSU--Oceanography
Corvallis, OR  97331          503-754-3251
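[Paul's "filter so the speaker-azimuth response yields the
faked-azimuth content" recipe reduces, band by band, to dividing the
two response curves.  A sketch with invented response numbers -- real
curves would come from JASA-style measurements:]

```python
# Hypothetical per-band magnitude responses of one ear at two azimuths.
# These numbers are made up for illustration only.
bands_hz = [500, 1000, 2000, 4000, 8000]
resp_speaker_az = [1.00, 0.95, 0.80, 0.60, 0.40]  # actual speaker azimuth
resp_faked_az   = [1.00, 0.90, 0.55, 0.70, 0.25]  # azimuth we want to fake

def correction_gains(actual, target):
    """Per-band gains so that source * actual ear response matches the
    target ear response."""
    return [t / a for a, t in zip(actual, target)]

for f, g in zip(bands_hz, correction_gains(resp_speaker_az, resp_faked_az)):
    print(f"{f:5d} Hz: multiply by {g:.2f}")
```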
rob@kaa.eng.ohio-state.edu (Rob Carriere) (11/14/89)
In article <13729@orstcs.CS.ORST.EDU> pvo3366@sapphire.OCE.ORST.EDU (Paul O'Neill) writes:
>Our ears have a different frequency response at different azimuths and
>elevations.  We use the frequency content of arriving sounds as one of
>our localization inputs.
>
>Demonstration: Plug one ear with your hand.  Close your eyes.  Click
>the fingernails of your thumb and forefinger on the other hand at
>various positions around your other ear.  Can you localize the clicks?
>How?  You're only using one ear.

I agree with the content of the post, but the demonstration is bogus.
Quite apart from any audio clues as to the location of the sound, you
have the clues arising from the fact that you know where your fingers
are.  There is no obvious way to tell whether or not this extra
information is used to ``cheat''.  A proper demonstration (which works
quite well, incidentally) would be to have a second person produce the
sounds.

SR
brianw@microsoft.UUCP (Brian Willoughby) (11/15/89)
>Demonstration: Plug one ear with your hand.  Close your eyes.  Click
>the fingernails of your thumb and forefinger on the other hand at
>various positions around your other ear.  Can you localize the clicks?
>How?  You're only using one ear.
>
>Paul O'Neill                  pvo@oce.orst.edu

Are you satisfied that plugging one ear with your hand is TOTALLY
blocking sound from entering that ear?  I'm not.  Try disconnecting
your auditory nerve :-)

The following is a followup that I composed to the original posting.  I
believe my system had trouble sending it.  If this is a duplicate,
please ignore it (it's very long).  I don't mention many DSP
techniques, but once the concepts are revealed it's almost trivial to
think of DSP applications.  I also don't mention the effects of
frequency response -- I'll leave that to someone who knows a little
more about how sounds are filtered by the irregular shape of the outer
ear.

-------------------------------------------------------------------------

In article <1989Oct31.193130.1685@eddie.mit.edu> rich@eddie.mit.edu (Richard Caloggero) writes:
>     Is anyone out there interested in talking about
>psycho-acoustics/psycho-acoustic phenomena?
>[...]
>As it turns out, most of this information is
>spacial in nature.  Psychoacoustics, then, is the study of phenomena
>related to the *realness* of sound.  This *realness* is not only a
>function of frequency response, frequency balance, harmonic
>distortion(s), etc.  It is also a function of things having to do with
>spacial information.  My problem is that I have no name for these
>*spacial things*.  I've heard them, but I can't talk about my
>experience in concrete terms without the necessary vocabulary.

I think that psycho-acoustics is very relevant to this group because
DSP techniques make it much easier to simulate realistic sound.  It is
currently possible to do a lot of experimenting with sound space using
today's technology, but there is much room for improvement.
>     Can anyone out there shed some light on this muck?  I am
>convinced that it is possible to build a *box* which can take one
>channel and generate a stereo signal which is a representation of the
>original signal plus 3d positional information.  In other words, I can
>generate sound, using just two speakers, which actually comes from
>*behind* the listener even if he/she is facing the speakers.  Does
>anyone agree with this?  Does anyone know how to build such a thing?

How your brain locates sound sources:

I think that it is very possible with a stationary listener, or a
system which adjusts to the listener's changing position.  (Say, like
headphones?)  In fact, one of my pet peeves is that hi-fi audio
salesmen have been selling *monophonic* subwoofers for years.  They
claim that low frequencies cannot be located by your brain because they
are non-directional.

Actually, your brain uses (at least) two methods of locating sound
sources: delay and amplitude.  Low frequencies are located by the delay
between the arrival of the sound at each ear.  The fact that low
frequencies are less directional helps by presenting approximately
equal amplitudes to each ear, only phase shifted.  Sounds directly
ahead (or behind) arrive simultaneously, while a sound emanating from
90 degrees to the left or right has the maximum delay, based on the
distance between your ears, the speed of sound, and the period of the
frequency.  High frequencies are more directional (directionality of
sound increases with pitch), and your ear locates higher tones by the
difference in amplitude between each ear.  As with phase shift, the
amplitude difference is smallest when straight ahead, and largest to
the left or right.  Basically your head gets in the way of the
directional highs, thus lowering the amplitude as the angle increases.

Why the brain uses two methods:

Both methods of directional sensing are used together because each
breaks down at different frequencies.
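[The delay cue Brian describes is easy to quantify with a simple
straight-line path model; the ear spacing and speed of sound below are
assumed round figures:]

```python
import math

EAR_SPACING_M = 0.18     # assumed round figure for distance between the ears
SPEED_OF_SOUND = 343.0   # m/s at room temperature (assumed)

def interaural_delay_s(azimuth_deg):
    """Arrival-time difference between the ears: zero straight ahead,
    maximal at 90 degrees (simple path-length model)."""
    return EAR_SPACING_M * math.sin(math.radians(azimuth_deg)) / SPEED_OF_SOUND

print(f"straight ahead: {interaural_delay_s(0.0) * 1e6:5.1f} microseconds")
print(f"90 deg right:   {interaural_delay_s(90.0) * 1e6:5.1f} microseconds")
```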
Delay, or phase shift, can only be used for low frequencies, because as
you increase pitch the period of the wave gets smaller and eventually
the wavelength is less than the distance between your ears.  Your brain
can determine the delay when comparing two versions of the same cycle
of a wave, but things get muddy and confused if the sound completes an
entire period in one ear before it arrives at the other.  Amplitude
differences start to diminish at lower pitches because directionality
is also decreasing and eventually there is very little difference in
volume.  The moral of the story is that, in truth, it is the *mid
range* speakers that cannot be located by your ears and brain.

Binaural Recording:

Binaural recording was based on recording sound from two microphones
which were the same distance apart as the average person's ears (no
jokes about cranium size, please), thus preserving the natural time
delays between the sound 'images' presented to each ear.  A good
analogy is 3D movies, which record two light images with cameras spaced
horizontally like our eyes.  When the two images are delivered
independently to each eye (using polarized lenses set at 90 degree
angles), the brain interprets depth that is no longer real.  For some
reason binaural recording is not as popular as it was, even though we
probably have the processing power to synthesize the effect rather than
merely recording it accurately.

Multi-Monophonic Recording:

Standard audio mixers pan sound using volume only.  There was an
article this year in Electronic Musician magazine which referred to
this as 'multi-monophonic recording', which is more accurate.  The
result is a distinct lack of 'spacial' clues.  This is probably why
mono subwoofers sell.  The funny thing is that many audiophiles listen
to classical music, which doesn't suffer from volume panning since it
is recorded via microphone.
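[The crossover point described above -- the wavelength shrinking to the
distance between the ears -- falls out of the same assumed constants,
landing near 1.9 kHz:]

```python
SPEED_OF_SOUND = 343.0   # m/s, assumed
EAR_SPACING_M = 0.18     # assumed round figure

# Phase cues become ambiguous once a full period fits between the ears,
# i.e. once the wavelength drops below the ear spacing.
ambiguity_freq_hz = SPEED_OF_SOUND / EAR_SPACING_M
print(f"phase cues break down above roughly {ambiguity_freq_hz:.0f} Hz")
```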
The EM article described how incorrect micing could destroy phase shift
clues (if the mics were spaced too far apart -- often on opposite sides
of the stage) or destroy amplitude cues.  Proper micing results in good
imaging.

Have you ever donned your headphones and panned the sound all the way
to the extreme left or right?  This really bothers me (gives me a kind
of a small headache :-), because my brain is trying to compute what
location this sound could be arriving from such that the other ear
would hear nothing at all.  It just doesn't fit into the algorithm.

Drawbacks to 3D audio:

Headphones and loudspeakers are two different media, yet we pipe the
same source material through them.  Perfect binaural recordings are
ruined when auditioned over free-standing speakers, because each ear
hears *both* channels.  Your brain re-interprets the delays and
amplitude differences and ends up computing WHERE THE SPEAKERS ARE!  It
is possible to make a recording which sounds 3D over loudspeakers, but
the effect would be different through headphones.

How can you experiment with psycho-acoustics today?

As the EM article mentioned, digital synthesizers can precisely repeat
the same sound based on an algorithm.  If a stereo synth (or two mono
synths connected through MIDI) were programmed so that each channel had
a different delay, then you could create 3D soundscapes.  One of the
first things I tried with my EPS (a year before I read the article,
BTW) was to pan two copies of the same digital sample so that one was
hard left and the other was hard right.  This took very little extra
memory, since the data was shared between the two voices.  I set one
channel to ignore the pitch bender so I could change the time delay in
real time.

Think of an analogy to a record player.  If two identical records are
playing on turntables with matched speeds, then the sounds will be in
sync.
Grabbing one platter and slowing that channel down for an instant
before letting go would make that record lag with a slight delay after
it returned to normal speed.  The pitch bender on the sampler did the
same, so I was able to move a sound around the room (without a volume
change) and leave it there by releasing the pitch bender to standard
speed.  If I needed to make the pitch bender channel advance ahead of
the channel which was ignoring the bender, then moving the pitch up for
an instant and then releasing it would do the trick.  This was pretty
cumbersome, but there is much you can do with this idea.  Many digital
delay processors can listen to MIDI controllers and change their delay
in real time.  If the programmable EPS MIDI controllers were used to
affect the volume of each wave, then one of these delay processors
could cause the phase shift to track along with volume changes.  I sure
hope there are a few synth owners reading comp.dsp.

Carver makes a 'Sonic Hologram' device which tricks your brain into
treating loudspeakers like headphones, i.e. each ear only hears the
respective channel.  By computing the distance to the optimum listener
position, requiring the speakers to be placed a certain distance apart,
and allowing for the speed of sound, Carver delays each channel the
correct amount, inverts the signal's polarity, and mixes it with the
opposite channel.  This results in any sound from the *right* speaker
which hits the *left* ear being canceled out by an inverted copy of the
same wave.  The effect is so easy to accomplish that many portables
have a 'stereo wide' switch -- although how you could fully appreciate
the effect while both you and the box are moving is beyond me...

Drawback #2:

Any 3D loudspeaker system will be dependent upon the position of the
listener (I think) unless someone designs an adaptive system which
monitors and adjusts to your movements.
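[A first-order sketch of the delayed, inverted cross-feed described for
the Carver box.  It ignores the fact that the cancellation signal
itself crosses over to the far ear, which a real unit must account for,
and the delay value here is illustrative rather than computed from
speaker geometry:]

```python
def crosstalk_cancel(left, right, delay_samples, gain=1.0):
    """Mix a delayed, inverted copy of each channel into the opposite
    channel, intended to cancel acoustic crosstalk at the listener's
    ears.  delay_samples would be derived from speaker spacing, listener
    distance, and the speed of sound."""
    out_l = list(left)
    out_r = list(right)
    for i in range(len(left)):
        j = i - delay_samples
        if j >= 0:
            out_l[i] -= gain * right[j]
            out_r[i] -= gain * left[j]
    return out_l, out_r

# Impulse on the right channel only: the left output picks up a
# delayed, inverted copy, meant to cancel the right speaker's sound at
# the left ear.
L, R = [0.0] * 8, [1.0] + [0.0] * 7
out_l, out_r = crosstalk_cancel(L, R, delay_samples=3)
print(out_l)   # inverted impulse at index 3
```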
Imagine one of those flight simulator helmets with headphones and a
computer which relocates objects as you turn and walk around (such a
thing is in the works, but it still won't solve the free-standing
loudspeaker problem).

Other experimental ideas:

Back when digital delays started to become affordable, I dreamed up a
multi-channel mixer system which had an independent adjustable delay
for each channel which tracked the pan pot.  The delay would be set so
that either channel could be delayed with respect to the other.  Thus,
both amplitude and phase would change in unison -- as they do when an
object moves around you.  I thought this would be an expensive device,
but I was thinking of combining an analog mixer and digital delay.
With a totally digital mixer, it would be interesting to allow very
short time delays in order to synthesize a binaural-style recording.

Other areas:

My experience has only been with time delays and amplitude changes.  I
wonder how much processing your brain does to reverse compute the
changes in the sound as it passes through our irregularly shaped outer
ear, i.e. is it possible that a sound from behind is distinguished by
the path it takes around your outer ear?

I have read of an individual who was convinced that he could coax
stereo sound out of a monophonic speaker!  I assume that he thought he
knew how the brain interprets frequency info.  He released a few
recordings; perhaps someone else knows the name of this person?

>     How do you use your ears?  Most people use them for spoken
>communication, and listening to music.  How many out there use them
>for navigation?  Since I am blind, I use audio information to *see* my
>environment.

I have often been walking in the dark down a familiar hallway, where I
was expecting a doorway at some distance, only to stop a few feet short
because I *thought* that I was at the door already.
It is usually around two feet short of the door, so I figure my
unconscious is taking in clues and warning me to stop at a safe
distance.  I've wondered if these 'clues' were sound-based, but
sometimes there is faintly detectable light.  Don't ask me why I
occasionally walk around in the dark...

>--
>			-- Rich (rich@eddie.mit.edu).

Brian Willoughby
UUCP:		...!{tikal, sun, uunet, elwood}!microsoft!brianw
InterNet:	microsoft!brianw@uunet.UU.NET
  or:		microsoft!brianw@Sun.COM
Bitnet:		brianw@microsoft.UUCP
martin@cod.NOSC.MIL (Douglas W. Martin) (11/16/89)
In the past few weeks, there have been several articles about human
perception of sound, and how such psychoacoustic cues are interpreted
spatially.  The two areas I wish to address are those of binaural
recording, and the obstacle detection sense used by many blind people
for navigation.  I myself am totally blind, use this obstacle sense
extensively, and have an MS in acoustics from Penn State.

The ability to detect obstacles, find doorways, estimate the size of
rooms, etc., was first discussed in the literature by D. Diderot in
1749.  He thought that a blind person could judge the proximity of
bodies by "the action of air on his face."  The sensation of
approaching an obstacle is somewhat like light pressure on the face.
Thus, this sense has been misnamed "facial vision" in much of the early
literature.  The obstacle sense is more accurately referred to as
echolocation; it is an auditory perception.  Confirmation that this is
an auditory sense and not some kind of "facial vision" was first
obtained by researchers at Cornell University in the late 1940's.

Obstacles can be detected either passively (using reflections of
ambient sound in the room) or actively (using a self-generated noise
such as a click or whistle).  Learning to use this echolocation is
sudden and insightful rather than gradual; a person merely needs to
learn what to listen for.  Of course, the use of this perception is not
limited to blind people.  Anyone can easily demonstrate it: simply
close your eyes, and walk with hard shoes on a hard floor toward a
wall.  You should sense the presence of the wall before actually
contacting it.  However, walking barefoot across a carpeted floor will
usually result in impact with the wall, because there is much less
reflected sound to work with.

Many of the parameters of this echo detection capability were
quantified by Charles Rice and his colleagues at Stanford in the mid
and late 1960's.
The ability to detect an object depends on its size, distance, and
reflectivity.  Rice found that blind people could detect obstacles
spanning an angle of about four degrees.  Area ratios between disks as
small as 1.06 to one could be discriminated.  Some subjects could also
reliably discriminate circles, squares, and triangles using their
echolocation.  Large obstacles can be detected at distances exceeding
ten to fifteen metres.  Distance cues appear to be related to both
pitch and loudness, and directional cues result from the same auditory
localization phenomena described by earlier articles in this group,
mainly interaural time and amplitude differences.

It was mentioned in an earlier article that binaural recordings can be
made by separating two microphones by a distance equal to the diameter
of the head.  Actually, this is not sufficient to make a binaural
recording; it will only make a stereo recording.  In order to obtain
the binaural effect of localization, an obstacle (like a head) must be
present between the microphones.  This is necessary to create the
auditory shadow which is critical for high-frequency localization.

When listening to a stereo recording through headphones, the sound
image is "lateralized", as opposed to being "localized" as it is when
listening through headphones to a true binaural recording.  In
lateralization, with stereo headphones, the sound image appears to be
coming from somewhere inside the head, often closer to either left or
right, but still within the head.  When listening with headphones to a
true binaural recording, the sound image is "out there in space", with
a perceived distance and direction.

Again, to make a binaural recording, it is necessary to have a
head-sized obstacle between the microphones.  The actual shape of the
obstacle, the presence of hair or facial features, and other similar
factors are not very critical.
However, if sounds are to be localized in elevation as well as in
azimuth, there must be a reflecting surface below the head, e.g. a
torso.

It has been mentioned that the ear has a different frequency response
for sounds arriving from different angles.  In fact, the structure of
the pinna (outer ear) is such that an impinging sound wave undergoes
multiple reflections in the pinna before reaching the eardrum.  The
amplitudes and relative time delays associated with these multiple
reflections are, of course, angle dependent.  An excellent paper on
this topic was published by Wayne Batteau, 1965, in Proceedings of the
Royal Society, London.

I have hundreds of references in all these areas: blind echolocation,
binaural hearing and recording with dummy heads, and sound
transformations in the outer ear.  If there is interest, I will compile
a bibliography as I have time, and will send it to anyone who wants it.

Doug Martin                     martin@nosc.mil
Naval Ocean Systems Center, San Diego, CA 92152
phone: (619) 553-3659
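[A single Batteau-style pinna reflection can be modeled as a
feed-forward comb filter, y[n] = x[n] + g*x[n-d]; the notch pattern it
carves into the spectrum moves with the arrival angle, which is the cue
Doug describes.  The gain and delay values here are illustrative, not
measured:]

```python
def pinna_comb(signal, delay_samples, reflection_gain=0.6):
    """Feed-forward comb filter modeling one angle-dependent pinna
    reflection: y[n] = x[n] + g * x[n - d]."""
    out = []
    for n, x in enumerate(signal):
        y = x
        if n >= delay_samples:
            y += reflection_gain * signal[n - delay_samples]
        out.append(y)
    return out

# An impulse acquires an echo at the reflection delay:
print(pinna_comb([1.0, 0.0, 0.0, 0.0], 2))   # [1.0, 0.0, 0.6, 0.0]
```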
bloch@mandrill.ucsd.edu (Steve Bloch) (11/17/89)
gvokalek@augean.OZ (George Vokalek) writes:
>...Moving the head by several cm therefore represents a
>significant fraction of one wavelength, resulting in a significantly
>different sound pattern in the ear.
>
>Note that this means that you should be able to localise high
>frequency sound more accurately than low frequency sound.  Personally,
>this seems reasonable - for instance it's easy to find a mosquito.
>I can't think of any low freq examples.

Well, this makes sense as long as the wavelengths don't get shorter
than the diameter of the head; after that everything goes to hell.  But
in general, that's pretty good, sonny, but that ain't the way I heerd
it.  Way I heerd it, you can more easily localize high BANDWIDTH sound
than low BANDWIDTH sound.  A mosquito is an example of this too, of
course.  A good example (since we're talking wildlife) is from
ornithology: many common songbirds use a high but nearly pure whistle
when a predator shows up and they want to warn one another without
being located, but they use a wide-band "chuck" when the intruder is
one they think they can scare away; this way they can find one another
easily and gang up on it.

"Writers are a funny breed -- I should know." -- Jane Siberry
bloch%cs@ucsd.edu
bloch@mandrill.ucsd.edu (Steve Bloch) (11/17/89)
brianw@microsoft.UUCP (Brian Willoughby) writes:
>I wonder how much processing your brain does to reverse compute the
>changes in the sound as it passes through our irregularly shaped outer
>ear.  i.e. is it possible that a sound from behind is distinguished
>by the path it takes around your outer ear?

I seem to remember a discussion in Runstein & Huber,
_Modern_Recording_Techniques_, that described fairly precisely the
delays stemming from reflection from the outer and inner pinnae of the
ear, and in particular that reflections from one set of pinnae gave
predominantly front/back information, the other predominantly up/down
information.  The book's at home, so I don't have the figures here.

"Writers are a funny breed -- I should know." -- Jane Siberry
bloch%cs@ucsd.edu