BAT AUDITORY SYSTEM

                       

 

(a) If a moth detects a bat call and begins to fly away from the attentive bat, will the bat increase or decrease the pitch of its voice? (and why does it do this?) (b) Explain why vowels sound almost the same when the pitch of a person's voice is raised or lowered. (c) What aspect of bat CF/CF processing resembles vowel recognition across human speakers? (d) What does the bat compute from the frequency-modulated parts of its outgoing call and echo?

 

 

 

   The mustached bat, Pteronotus parnellii, emits complex biosonar signals consisting of long CF (constant-frequency) and short FM (frequency-modulated) components (Suga, 1988). Target range is conveyed to the bat by the delay of the echo relative to the emitted biosonar signal, which is measured using the FM component; target velocity is carried by the CF component, which is also used for target detection (Suga, 1988).

 

 

  a)  If a moth begins to fly away from an echolocating bat so that the distance between them is increasing (that is, the moth is moving away faster than the bat is approaching, or the bat is stationary), the bat will increase the pitch of the sound it emits to echolocate the moth.

            This is because of the Doppler effect: the sound waves the bat has emitted will be more spaced out after they bounce off the moth, because the moth is moving away from them (i.e. if the moth were stationary, two successive waves might hit it at times t and t+1; because the moth is moving away, the second wave might not hit it until t+2). In other words, the frequency of the echo returning from the moth will be lower (Doppler-shifted downward) because the moth is moving away. A similar result would be obtained if the bat itself were to fly backwards, away from the moth, since in both cases the distance between the sound source and the target is increasing. To compensate for the decrease in echo frequency, the bat raises the pitch of its voice (emits a higher frequency), so that the CF2 of the Doppler-shifted echo is stabilized at the "right", predetermined frequency: the frequency (around 61 kHz) to which the bat's basilar membrane is most sensitive, owing to a greater number of hair cells tuned to the 61-62 kHz range. This phenomenon is called Doppler-shift compensation (Suga, 1988). Combination-sensitive neurons, which respond to the combination of an emitted biosonar sound and its echo, are found in the CF/CF area of the bat auditory cortex, where the Doppler shift is represented, and in the medial geniculate body (Suga, 1988).
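The compensation argument above can be checked numerically. The following is a minimal, idealized sketch rather than a model of real bat behavior: it assumes a stationary bat, a moth receding along the line of sight, a speed of sound of 343 m/s, and a 61 kHz reference frequency taken from the text. The function names are hypothetical.

```python
SPEED_OF_SOUND = 343.0       # m/s in air; assumed value, not from the source
CF2_REFERENCE_HZ = 61_000.0  # approximate frequency the bat stabilizes the echo at


def echo_frequency(emitted_hz, recede_speed, c=SPEED_OF_SOUND):
    """Echo frequency heard by a stationary bat from a target receding at recede_speed (m/s).

    The sound is Doppler-shifted twice: once on arrival at the moving target,
    and once more when the target re-radiates it as a receding source.
    """
    return emitted_hz * (c - recede_speed) / (c + recede_speed)


def compensated_emission(reference_hz, recede_speed, c=SPEED_OF_SOUND):
    """Emission frequency that brings the echo back at exactly reference_hz."""
    return reference_hz * (c + recede_speed) / (c - recede_speed)


if __name__ == "__main__":
    moth_speed = 2.0  # m/s away from the bat; illustrative value only
    uncompensated = echo_frequency(CF2_REFERENCE_HZ, moth_speed)
    new_call = compensated_emission(CF2_REFERENCE_HZ, moth_speed)
    print(f"echo without compensation: {uncompensated:.1f} Hz")
    print(f"call raised to:            {new_call:.1f} Hz")
    print(f"echo after compensation:   {echo_frequency(new_call, moth_speed):.1f} Hz")
```

In this sketch the uncompensated echo falls below 61 kHz, the compensated call lies above 61 kHz, and the echo of the compensated call returns at the reference frequency, which is the behavior described in the answer to (a).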

 

b)  The vowel recognition problem (across different speakers) is thought to be solved in humans by combination-sensitive neurons similar to those in the mustached bat. Formants are bands of emphasized frequencies, and each vowel is recognized by its combination of several formants. The frequency ratios between the formants of a vowel are nearly constant across speakers (Suga, 1988); i.e. the relative spacing between the bands of emphasized harmonics in a given vowel is the same regardless of the pitch of the voice. In humans with different voice pitch, the formants (the counterpart of the bat's CF components) are shifted up or down, depending on whether the pitch is high or low. Thus all a listener has to do when hearing a single vowel is cancel out the "Doppler shift", that is, the frequency shift between speakers; the vowel is still coded the same way, and only the bands of emphasized frequencies are shifted up or down depending on whether the pitch is high or low, respectively.
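The ratio argument can be illustrated with a short sketch. It assumes, as the answer does, that a change of speaker shifts all formants by a common factor; the formant values and the scale factor below are made-up illustrative numbers, not measured data, and the function names are hypothetical.

```python
import math


def formant_ratios(f1, f2, f3):
    """Return the ratios F2/F1 and F3/F1 that characterize a vowel in this sketch."""
    return (f2 / f1, f3 / f1)


def shift_speaker(formants, factor):
    """Scale every formant by the same factor (assumed model of a speaker change)."""
    return tuple(f * factor for f in formants)


if __name__ == "__main__":
    vowel = (300.0, 2300.0, 3000.0)      # illustrative F1, F2, F3 in Hz
    higher = shift_speaker(vowel, 1.2)   # hypothetical higher-pitched speaker
    same = all(
        math.isclose(a, b)
        for a, b in zip(formant_ratios(*vowel), formant_ratios(*higher))
    )
    print("ratios unchanged by the shift:", same)
```

Under this assumption the absolute formant frequencies change but the ratios do not, so a detector tuned to the ratios would respond to the same vowel from either speaker.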

 

c)  CF/CF neurons are combination-sensitive neurons, i.e. they are tuned to the combination of two CF tones that are more or less harmonically related to each other (Suga, 1988). Similarly, speech-sound processing is assumed to be achieved by combination-sensitive neurons. In the bat CF/CF area, the frequency of the outgoing signal is compared with the frequency of the incoming echo; if the combination is right, the combination-sensitive neuron will fire. In humans, the corresponding frequencies are the emphasized bands of frequencies (formants). If the bands are the right distance apart (corresponding to a particular vowel), the neurons coding for that vowel will fire regardless of pitch or stimulus level. Just as bats cancel out the Doppler shift, humans cancel out variations in pitch. Suga (1988) proposed that vowels are recognized by combinations of F1, F2, and F3 (as opposed to CF1, CF2, and CF3), and are expressed as loci on the coordinates of F1 versus F2 frequencies and F1 versus F3 frequencies. The formant ratios correspond to neural activity along oblique lines in the frequency vs. frequency coordinates.

 

 

d) The FM (frequency-modulated) component of a bat's call and echo carries target range (distance) information. This is because the FM component is used to measure the delay of the echo: the farther away the moth, the longer the delay of the returning echo. The FM component is also used for characterizing a target, because its sound energy is distributed over many frequencies.
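The relation between echo delay and target range can be written out as a small sketch. It assumes a speed of sound of 343 m/s in air and ignores any movement of the bat or moth during the round trip; the function name is hypothetical.

```python
SPEED_OF_SOUND = 343.0  # m/s in air; assumed value, not from the source


def target_range(echo_delay_s, c=SPEED_OF_SOUND):
    """Distance to the target in metres for a given call-to-echo delay in seconds.

    The sound travels to the target and back, so the one-way distance is
    half of the total path length c * delay.
    """
    return c * echo_delay_s / 2.0


if __name__ == "__main__":
    for delay_ms in (2.0, 5.0, 10.0):  # illustrative delays
        print(f"{delay_ms:4.1f} ms delay -> {target_range(delay_ms / 1000.0):.2f} m")
```

With these assumptions a 10 ms delay corresponds to a range of about 1.7 m, and doubling the delay doubles the range.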

 

e) The DSCF area of the bat auditory cortex has frequency-versus-amplitude coordinates that represent the amplitude spectrum of the CF2 component of a Doppler-shifted echo (Suga, 1988). Its two axes thus represent target velocity (echo frequency: 61-63 kHz) and subtended target angle (echo amplitude), respectively (Suga, 1988).