lehnert@aea.e-technik.uni-bochum.de (Hilmar Lehnert) (04/25/91)
Hi everybody,

One of the sensory modalities which I believe to be vital for many VR applications is our sense of hearing. This aspect has not been discussed very much in this newsgroup, so I would like to add some philosophy on the auditory part of VR.

Summary: (For those who do not like to read long articles.) Treating sound is as complex as treating light. Human perception of the auditory environment is very complex, especially because all spatial information is squeezed through only two input channels. The auditory part of VR is nowhere near solved; we've just scratched the surface.

What's the performance of our ears? (mainly literature stuff)

Our auditory system offers a dynamic range of 120 dB (even more for boom-car drivers) over a frequency range of about 10 octaves. Sounds can be perceived even if the signal-to-noise ratio is down to -85 dB. The spatial coverage is 360x180 degrees, the spatial resolution better than one degree (azimuth). The sensitivity is close to the physical limit; a few dB more and you would hear the thermal noise of the air particles. Hearing can't be switched off (that may have saved your life a couple of times), and the auditory sense is probably the most important sensory modality for communication.

Our auditory system has only two input channels, and the human head can be viewed as a stereo microphone with special directional characteristics caused by the influence of pinna, head, shoulders and torso. These characteristics are called head transfer functions (HTF). During the process of perception, the HTF are used to determine the direction of incidence and, to a certain extent, also the distance of a sound source. The HTF modify the spectrum of a sound source drastically according to the direction of incidence; we've measured variations at one ear of up to 60 dB. Surprisingly enough, we do not perceive a change of timbre for different directions.
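To make the HTF idea concrete, here is a minimal sketch (not taken from any of the systems mentioned; the impulse responses below are made-up placeholders) of how a mono signal is turned into a binaural one by convolving it with a left/right pair of head-related impulse responses:

```python
# Minimal sketch: placing a mono source at a given direction by
# convolving it with a left/right head-related impulse response pair.
# Real impulse responses come from measurements of pinna/head/torso
# filtering; the arrays here are crude hypothetical stand-ins.
import numpy as np

def binauralize(mono, hrir_left, hrir_right):
    """Convolve a mono signal with an HRIR pair -> (left, right) signals."""
    left = np.convolve(mono, hrir_left)
    right = np.convolve(mono, hrir_right)
    return left, right

# Toy example: a unit-impulse "source" and fake HRIRs that only model
# the interaural time and level difference for a source off to the left.
mono = np.zeros(256)
mono[0] = 1.0
hrir_left = np.array([1.0])                          # near ear: full level, no delay
hrir_right = np.concatenate([np.zeros(30), [0.5]])   # far ear: ~0.7 ms later (at 44.1 kHz), quieter
left, right = binauralize(mono, hrir_left, hrir_right)
```

A real display would select (or interpolate) a measured HRIR pair per azimuth/elevation instead of these two toy filters.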
It seems that we are not only able to recognize directions, but also to do a kind of inverse filtering of the HTF when perceiving timbre.

How do we perceive the environment with our ears?

In any natural environment, the sound emitted by a source is reflected by the surrounding surfaces, and thus an image of the environment is created in our perceptual space. Changes in room size or shape, in the acoustic properties of absorbing, reflecting or scattering surfaces, or in the directivity of a sound source cause clearly audible (and sometimes dramatic) changes in our perception. We hear significant differences when listening to the same source in the same direction but in different rooms, even if the rooms have the same reverberation times.

The process of human perception of the auditory environment is not well understood yet. Probably the human auditory system analyzes certain reflection patterns, temporal and spatial reflection densities, long- and short-term interaural correlations, energy decay processes, and a lot more. The propagation of sound in an environment is as complex as it is for light, or maybe even more complex, because the wavelength of sound varies from 2 cm to 20 m, so most surfaces reflect neither geometrically nor diffusely, but something in between.

What about the current real-time 3D audio displays?

Systems like the Convolvotron (Crystal River), Focal Point (Bo Gehring) or the binaural mixing console (Head Acoustics) create localization cues by filtering the signal with the HTF for the desired direction of incidence, specified by the azimuth and elevation angles. That allows placing the auditory event on the surface of a (more or less distorted) sphere around the listener's head. Well, I personally have some difficulties with the term 3D in this case. Azimuth plus elevation makes two; what about number three, which is, of course, distance in this case.
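To illustrate what the missing third coordinate might involve, here is a toy sketch of three commonly cited distance cues: the inverse-distance level drop, high-frequency air absorption, and the direct-to-reverberant energy ratio. The filter mapping and the reverb level are illustrative guesses of mine, not measured data or any shipped system's method:

```python
# Toy sketch of three distance cues, with made-up parameters.
import numpy as np

def apply_distance_cues(signal, r, reverb_level=0.2):
    """Shape a direct-path signal by simple distance cues (r in metres, >= 1)."""
    direct = signal / r                   # inverse-distance level drop
    # Crude air absorption: a one-pole low-pass whose damping grows with
    # distance. The 0.02*r mapping is an arbitrary illustrative choice.
    alpha = min(0.9, 0.02 * r)
    lp = np.empty_like(direct)
    acc = 0.0
    for i, x in enumerate(direct):
        acc = (1.0 - alpha) * x + alpha * acc
        lp[i] = acc
    # Diffuse room energy stays roughly constant with distance, so the
    # direct-to-reverberant ratio falls as the source moves away.
    drr = np.sum(lp ** 2) / (reverb_level ** 2 * len(lp))
    return lp, drr

sig = np.ones(1000)                       # steady toy test signal
near, drr_near = apply_distance_cues(sig, 1.0)
far, drr_far = apply_distance_cues(sig, 10.0)
```

The point of the sketch is only that several independent quantities change with distance; azimuth/elevation filtering alone touches none of them.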
Human perception of distance is maybe more complex than just judging the loudness of a signal, especially if a sound source has no 'natural' loudness. I believe that without adding some more sophisticated distance cues, no real 3-D image can be achieved. I would rather call these systems 2D (a surface), just as I would call mono 0D (a point) and intensity stereo 1D (a line). Another point is that no environmental information is encoded into the audio signals (with the possible exception of AKG's CAP machine). Only the direct sound is simulated. That corresponds to an anechoic chamber, which is neither a very natural situation nor suitable to give you a feeling of 'being there'. Still, I think that these systems do a great job given the hardware capabilities and the knowledge presently available. They surely are a big step in the right direction.

What could the auditory part of a VR system look like?

When designing VR systems, one should keep in mind that the objects the virtual world is made of have different properties for different sensory modalities. The properties relevant to the auditory sense are the geometric extent and the acoustic properties (e.g. reflectance, wall impedance, degree of diffusion, transmittance, etc.). Each sound source should be assigned a directivity. Just as a rendering process is needed for visualization, an "acoustic rendering" has to be performed for auralization. Current research projects (I know of labs in Sweden, Denmark, France and Germany) deal with so-called binaural room simulation systems. These systems model the sound field using some approximations and perform the auralization of the results using binaural technology. The results sound more like real-world signals than pure direct-sound simulations do. However, these systems are far from operating in real time, because they deal with several thousand reflections.
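One common approximation behind such room simulation systems is the image-source model: each wall reflection is replaced by a mirror-image source that contributes a delayed, attenuated copy of the direct sound. A first-order sketch for a rectangular room (room dimensions and positions below are made up, and wall absorption is omitted for brevity):

```python
# Sketch of the image-source idea: mirror the source across each wall
# of a shoebox room and treat every image as an extra delayed source.
# All geometry below is invented for illustration; absorption is omitted.
import numpy as np

C = 343.0  # speed of sound in m/s

def first_order_images(src, room):
    """Mirror src=(x,y,z) across the 6 walls of a room=(Lx,Ly,Lz) box."""
    images = []
    for axis, L in enumerate(room):
        for wall in (0.0, L):
            img = list(src)
            img[axis] = 2.0 * wall - src[axis]
            images.append(tuple(img))
    return images

def echo_times(listener, sources):
    """Arrival time (s) and 1/r amplitude for each (image) source."""
    out = []
    for s in sources:
        r = float(np.linalg.norm(np.array(s) - np.array(listener)))
        out.append((r / C, 1.0 / r))
    return out

room = (5.0, 4.0, 3.0)
src = (1.0, 2.0, 1.5)
listener = (4.0, 2.0, 1.5)
direct = echo_times(listener, [src])[0]
echoes = echo_times(listener, first_order_images(src, room))
```

Iterating the mirroring to higher orders is what produces the several thousand reflections mentioned above, which is exactly where the real-time budget breaks down.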
Conclusion:

Some of the articles recently posted may give the impression that the auditory part of VR is no big deal any more, and that only the problem of individual corrections remains to be solved. This is not true. Creating natural-sounding or even authentic auditory environments in real time is a very difficult task, and a lot of research work (mainly psychoacoustics and work on the effects of combining different sensory modalities) is still required. The interaction of sound with any environment is as complex as it is for light, and humans are able to perceive a lot of that interaction very well. The auditory sense can strongly support the feeling of "being there" in VR, because it does the same job in the real world.

A simple way to check the quality of a simulation is to simulate the room you're in, play the results to any listener and wait for the reaction. If he/she turns in the right direction and says "oops, who's talking there?", you're on the right path.

Hilmar Lehnert (lehnert@aea.e-technik.uni-bochum.de)

Re: Where are the women (minorities in this group)? I must admit I'm male and white, but at least I'm European.
--