Introduction to Hearing in Vertebrates: A Psychophysics Databook (1988)

by Richard R. Fay

The Uses of This Book

I hope that this review will be of value and interest to researchers and students in many fields which combine to form hearing science. Those researchers in animal psychophysics will find a comprehensive, ready reference to the published work in their field. Auditory physiologists will find data to help in the design and interpretation of their experiments on the neural codes underlying the perceptions of the species they study. Those who study biophysics and physiology of the ears of diverse animal species will have efficient access to the rather far-flung literature on the auditory perception of “their” species. Neuroscientists, engineers, and other modelers interested in the brain mechanisms underlying perception will have quantitative descriptions of system behavior that may help refine their questions and evaluate their models. Evolutionary biologists and anatomists will have a guide to the ultimate functional correlates of the structures they study. Animal behaviorists and experimental psychologists will have a reference for estimating the potential information content and likely discriminability of the stimuli they may use in controlling and analyzing behavior. Otolaryngologists and audiologists might discover among these data useful animal models for their study of auditory pathology and hearing impairment. Finally, those concerned with the management, breeding, and husbandry of different animal groups can use these data to help design and evaluate acoustical environments appropriate for given species.

Scope of the Book

This book presents psychophysical data in vertebrate hearing obtained from published literature. The data presented here are in the form of new figures original to this book, and tables giving the numerical values of all plotted points. The book contains separate sections on the Lateral Line System (7 figures), Fish Hearing (62 figures), Amphibians and Reptiles (16 figures), Birds (63 figures), Mammals (over 115 figures), and Comparisons among Vertebrates (13 figures). In addition to data on hearing sensitivity, discrimination, and directional hearing, data are included on hearing development and infant hearing, aspects of echolocation, and the psychophysics of electrical stimulation of the auditory system. The book contains a Topical Index for each section and also combined for all vertebrates, a Species Index containing all scientific and common animal names with references to the kinds of data available for each, complete bibliographic references (placed on the text pages displaying the data derived from them, and as a combined Bibliography), a Journal List, and an Author Index. Introductions precede each section. A Comparative Section brings together selected data from different animal groups for comparison across all vertebrates.

Data on vertebrate hearing have been derived from original research papers, book chapters, theses, and, rarely, from abstracts or unpublished papers (e.g. papers presented at meetings). An attempt was made to obtain the data from published research papers whenever possible, and references to theses, abstracts and unpublished papers are kept to a minimum.

An attempt was made to include all published auditory (and lateral line) psychophysical data for non-human vertebrates (183 species of fishes, amphibians, reptiles, birds and mammals), and selected data for humans. A partial list of the hearing functions included is: auditory sensitivity and hearing range (audiograms), frequency and intensity discrimination, pitch and loudness estimation, all aspects of masking (including broad band noise, narrow band and tonal masking, non-simultaneous masking, psychophysical tuning curves, and auditory filter shape), temporal summation, temporal resolution, duration discrimination, echolocation acuity, amplitude modulation detection and discrimination, detection of electrical stimulation of the auditory nerve and brain, temporal pattern discrimination, hearing development, and all aspects of directional hearing.


There are several reasons why a given data set might not be included in the book:

1. I may have missed relevant papers in my literature search. Since I plan to update this book in later editions, I would be most grateful to be notified of any omissions that may have occurred.

2. Any given reference may have presented so much data (e.g. many functions of frequency, or sensation level, many individual subjects' data, many parameters, etc.) that to include it all would have swollen the book well beyond its present size. In these cases, I have attempted to select representative functions or subjects, or I have averaged across parameters where no important experimental effects were apparent. In such cases, I have noted the existence of further data in the reference notes. Some psychometric functions are included as examples in different species, tasks, and conditioning paradigms, but the majority have not been included.

3. A given reference may not contain useful data in spite of a title that suggests otherwise. In some references, experiments were done and thresholds obtained, but the relevant data may not have been given. For example, some studies on the effects of trauma to the auditory system present changes in thresholds as a result of experimental manipulation, but not the control thresholds themselves.

4. In general, I have omitted thresholds obtained after lesions of the ear or auditory system. The focus of this book is on the data from normal auditory systems. In some cases, I have noted that a given lesion was made and what the general finding was. I have left out most studies of temporary and permanent threshold shift due to ototoxicity and acoustic trauma; there are too many data. Perhaps another book like this one could focus on these. In a few cases, I have included acoustic trauma effects where the data are manageable and make important points about the mechanisms of normal hearing (e.g. for some of the non-mammals in which the functioning of the ear is not yet well understood, or for some mammals where the mechanisms of discrimination are revealed by the lesions).

5. Data from a paper may not have been included here if, after several concerted efforts to make sense out of the paper and the data within it, I failed to do so.

6. Finally, I have attempted to include representative human thresholds for comparison with non-human data, but have not attempted a complete review of the human literature. Again, there are too many data. In several cases, the human data included come from papers on animal psychophysics in which humans were tested in the same way.

Format of the Book

The book is organized by animal group (class), with the addition of a section on the lateral line system of fishes and amphibians. Each section begins with an Introduction and a Topical Index which functions as a Table of Contents. This index directs the reader's attention to the kinds of psychophysical data that exist for each animal group, and is designed to summarize each figure and table. The index is arranged alphabetically on major topics (e.g. frequency discrimination) but subordinate topics are arranged logically (as in the body of the book), rather than alphabetically. The idea is to make the Topical Index useful both as an index and as a Table of Contents. At the end of the book, the indices for each class section are combined, alphabetically, on major topic. This allows the reader to find everything on a major topic across animal groups. This combined index also functions as a snap-shot look at what aspects of vertebrate hearing have (or have not) been studied. In addition to the Topical Index, there are a Species Index (for common and scientific names) listing for each species the kinds of data available, an Author Index, a Bibliography, and a Journal List.

Each class section begins with audiograms, and then continues with various studies of auditory detection and discrimination.

Record Format

The book is organized as a series of two-page mini-reviews, or records. Opening the book anywhere reveals a complete two-page record. The left page begins with a figure, which plots the data in question. The figure is labeled with a capital letter (L, F, A, B, M, or C), referring to the lateral line, fishes, amphibians and reptiles, birds, mammals, or comparative sections, respectively, and a hyphenated number which indicates the sequence of figures. The label format (e.g. M6-0) is designed so that new records may be added in later editions (e.g. M6-1) without the need to completely renumber the figures and tables.

All species are listed first by genus and species (italicized), and then by common name, as they have been presented by the authors of the papers referenced. Although this seems awkward and unnecessary for the well known species (e.g. cat, goldfish, and human), it is necessary for consistency in the context of so many different, and sometimes unfamiliar, species. The Species Index gives scientific names for all common names and vice versa. No systematic attempt was made to correct the authors in their use of common or scientific names.

Figure Format

An attempt was made to construct the figures in a standard way, for ease of interpretation and comparison across figures. (The figures were made using Symphony (Lotus Development), cleaned up and edited using Freelance Plus (Lotus Development), and laser printed (Hewlett Packard Laser-Jet Plus) at 300 dpi on the top half of the page.) In most cases, logarithmic axes begin and end at integer powers of 10. Figures within a section that are useful to compare have the same axes with the same scale and dimensions. Sometimes this is awkward and leaves the data points crowded to one side or the other, but I believe that this is preferable to axis scales that change from figure to figure. The motivation is to facilitate comparisons across figures.

Each data set in a figure has its own symbol, and every symbol or line is coded with a number in the figure. The data corresponding to each symbol and number are briefly identified below the figure. At the bottom of the left page, the complete bibliographic references corresponding to all data plotted are given in alphabetical order. This saves the reader from flipping back and forth to the combined bibliography at the end of the book to identify the complete reference of interest.

Table Format

On the right page, a table lists the coordinates of all the points plotted in the figure. The tabled numbers are those which were entered into the Symphony spreadsheet to create the figure. These numbers were taken from published tables in original references, or were extracted from published figures. To obtain numbers from figures, I used a pair of good dividers and a millimeter rule. The errors arising from using this procedure depended on the quality and size of the original figure, and my own eyeballing error (estimated to be about 0.2 mm). Great care was taken in reading numbers from figures, but it is likely that some errors were made. I would most appreciate knowing about any errors that readers may find.

Note Format

Below the table are notes on various aspects of the data displayed in the figures and tables, identified by the appropriate number. The notes may include a mention of the conditioning and psychophysical method used, the number of subjects, the measure of central tendency used, any averaging or omission of data that I may have done, important details about the acoustic signals used, brief explanations or interpretations of the results (where appropriate or interesting), brief notes on other results contained in the reference but not given in the figure and table, “see also” references to other records in the book, and citations of other relevant bibliographic references. These additional references are listed in the combined bibliography at the end of the book.

In some cases, the notes are lacking in some detail that the reader may consider important. Often, this reflects the omission of details in the original reference. In some cases, however, certain details are omitted in order to keep to a two-page format. Each of these two-page records stands by itself to tell its story. However, the data presented in this book do not substitute for the original published papers. When it is critical to know the details of the methods and results from one of the references cited, the original paper should be read.

Conventions and Definitions

A number of formal conventions are used throughout the book for consistency and to promote intuitive understanding of the figures and tables. These conventions are noted here along with brief explanations and definitions of terms and concepts used throughout the book.

Frequency is given in Hz rather than kHz unless using the latter saves needed space in the text.

Sound pressure is given in dB re: 1 dyne cm^-2 for all underwater conditions (fishes and marine mammals), and as dB SPL (sound pressure level) for all in-air conditions. Proper comparisons of hearing sensitivity in air and water are difficult to make. One common method of comparison is to express both air and water thresholds in units of sound intensity (e.g. Watts cm^-2), which takes into consideration the impedance of the medium.

Note that:
0 dB SPL = 0.0002 dynes cm^-2
0.0002 dynes cm^-2 is 20 micropascal, or 2x10^-5 Pascal
1 Pascal = 1 Newton m^-2
1 dyne cm^-2 = 0.1 Pascal
0.0002 dynes cm^-2 is -73.98 dB re: 1 dyne cm^-2

In air, 0 dB SPL = 10^-16 Watts cm^-2

In water, 1 dyne cm^-2 = 6.8x10^-13 Watts cm^-2

For equal intensity (in Watts), sound pressure in water is 35.6 dB above sound pressure in air.
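The relations above can be checked numerically. The following is a minimal sketch, not part of the original text: the function names are illustrative, and the two intensity constants are simply the rounded values quoted above.

```python
import math

SPL_REF_DYNE_CM2 = 0.0002         # 0 dB SPL, in dynes cm^-2 (from the text)
AIR_W_CM2_AT_0_DB_SPL = 1e-16     # intensity in air at 0 dB SPL (from the text)
WATER_W_CM2_AT_1_DYNE = 6.8e-13   # intensity in water at 1 dyne cm^-2 (from the text)


def spl_to_db_re_dyne(db_spl):
    """Convert dB SPL (re: 0.0002 dynes cm^-2) to dB re: 1 dyne cm^-2."""
    return db_spl + 20.0 * math.log10(SPL_REF_DYNE_CM2)


def air_intensity_w_cm2(db_spl):
    """Sound intensity in air (Watts cm^-2) for a level given in dB SPL."""
    return AIR_W_CM2_AT_0_DB_SPL * 10.0 ** (db_spl / 10.0)


def water_intensity_w_cm2(db_re_dyne):
    """Sound intensity in water (Watts cm^-2) for a level in dB re: 1 dyne cm^-2."""
    return WATER_W_CM2_AT_1_DYNE * 10.0 ** (db_re_dyne / 10.0)


def water_minus_air_pressure_db():
    """Pressure difference (dB) between water and air at equal intensity."""
    # Water level (dB re: 1 dyne cm^-2) that carries the 0 dB SPL air intensity.
    water_db = 10.0 * math.log10(AIR_W_CM2_AT_0_DB_SPL / WATER_W_CM2_AT_1_DYNE)
    # The 0 dB SPL air pressure expressed on the same 1 dyne cm^-2 reference.
    air_db = spl_to_db_re_dyne(0.0)
    return water_db - air_db
```

With these rounded constants, water_minus_air_pressure_db() comes out within about a tenth of a dB of the 35.6 dB figure quoted above; the small difference reflects rounding of the constants.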

Particle motion values are given in microns (micrometers) of displacement, or dB re: 1 micron. It is possible in many cases that the lateral line or auditory receptor responds in proportion to particle acceleration, however. Acceleration may be calculated by multiplying the displacement by (2 x pi x frequency)^2. 1 Angstrom = 10^-10 meter.
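As an illustration of the displacement-to-acceleration relation, here is a short sketch. It assumes sinusoidal motion and SI units (meters and Hz); the function names are illustrative and not taken from the book.

```python
import math


def displacement_to_acceleration(displacement_m, frequency_hz):
    """Peak acceleration (m s^-2) of sinusoidal motion with the given peak displacement (m)."""
    omega = 2.0 * math.pi * frequency_hz
    return displacement_m * omega ** 2


def microns_to_db_re_1_micron(displacement_um):
    """Displacement level in dB re: 1 micron."""
    return 20.0 * math.log10(displacement_um)
```

For example, a displacement of 1 Angstrom (10^-10 meter, or 0.0001 micron) corresponds to -80 dB re: 1 micron, and displacement_to_acceleration(1e-10, 100.0) gives the equivalent peak acceleration at 100 Hz.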

Frequency discrimination thresholds are given as the frequency difference (in Hz) between two tones that are just discriminably different.

Intensity discrimination thresholds are given as the sound pressure difference (in dB) between two signals that are just discriminably different.

Sensation level (SL) refers to a level of a signal in dB relative to the signal level at absolute detection threshold.

Masking is defined as the reduction of the audibility of one sound caused by the introduction of another sound. In many studies of the effect of masker level, the signal threshold tends to rise one dB for every one dB increase in masker level. This means that the increment caused by adding the signal to the masker is constant in dB (i.e. is a constant proportion of masker sound pressure). This is equivalent to Weber's Law, which states that a just-detectable increment in stimulus intensity is a constant proportion of the base intensity.

Forward Masking refers to a case in which the signal to be detected is presented after the masker stimulus has ended. This is a way to probe the persistence of the effect of the masker in the auditory system. The locus of the persisting effect is likely at the synapse between hair cells and auditory nerve fibers.

Backward Masking refers to the case in which the signal to be detected begins and ends before the masker stimulus has begun. The locus of this masking effect is likely more central than that for forward masking.

Simultaneous Masking refers to the case in which the signal to be detected is presented during the masker.

Critical Masking Ratio, or CR (signal-to-noise ratio at threshold), is the decibel difference between the sound pressure level of the signal tone at threshold and the spectrum level of the masking noise in the frequency region of the signal.

Spectrum level is the sound pressure level of the noise within a one Hz-wide band. This can be calculated from a pressure level (in dB) in a wider band (B Hz wide) by subtracting 10 log10(B) from the pressure level. Equivalent “critical ratio bandwidths” in Hz can be calculated as 10^(CR/10).
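The spectrum level and critical ratio calculations can be sketched as follows. The function names and the example values in the note are hypothetical and chosen only to make the arithmetic easy to follow.

```python
import math


def spectrum_level(band_level_db, bandwidth_hz):
    """Level in a 1 Hz band, given the level (dB) measured in a band bandwidth_hz wide."""
    return band_level_db - 10.0 * math.log10(bandwidth_hz)


def critical_ratio(signal_threshold_db, noise_spectrum_level_db):
    """Critical masking ratio (dB): masked tone threshold minus noise spectrum level."""
    return signal_threshold_db - noise_spectrum_level_db


def critical_ratio_bandwidth(cr_db):
    """Equivalent critical ratio bandwidth (Hz) for a critical ratio given in dB."""
    return 10.0 ** (cr_db / 10.0)
```

For instance, a noise measured at 60 dB in a 10,000 Hz band has a spectrum level of 20 dB; a tone masked at 40 dB in that noise gives a CR of 20 dB, which corresponds to a critical ratio bandwidth of 100 Hz.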

Critical Bandwidth is the frequency range (in Hz) within which the intensity of a stimulus summates over frequency in its effect on the auditory system. Most often, the “effect” measured is the masking effect of a noise band on the detection of a tone centered (in frequency) in the band.

Psychophysical tuning curves (PTC) are measures of the frequency selectivity of a filtering system. They are usually obtained by determining the level of a masker (often a tone) required to just mask a tone or narrow band signal (fixed in level near absolute threshold) as a function of masker frequency. The resulting curve is thought to reflect the frequency selectivity of a small number of channels having center frequencies near the signal frequency.

Q10 dB is a relative measure of the frequency selectivity (tuning) of a filtering system. It is calculated as the center frequency of the filter (in Hz) divided by the bandwidth of the filter (in Hz) at levels 10 dB above the best sensitivity.
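A rough sketch of how Q10 dB might be estimated from a sampled tuning curve follows, taking Q10 dB as the center (tip) frequency divided by the bandwidth measured 10 dB above the tip. The function name, the use of the lowest point as the tip, and the straight-line interpolation on a linear frequency axis are all assumptions made for illustration; published analyses may interpolate differently (e.g. on a log frequency axis).

```python
def q10_db(freqs_hz, levels_db):
    """Estimate Q10 dB from a tuning curve given as two parallel lists.

    freqs_hz must be ascending; levels_db are the masker levels at threshold.
    Returns None if the curve never rises 10 dB above its tip on both sides.
    """
    tip = min(range(len(levels_db)), key=lambda i: levels_db[i])
    target = levels_db[tip] + 10.0

    def crossing(indices):
        # Walk away from the tip until the curve first reaches the target level,
        # then interpolate linearly between the two bracketing points.
        prev = tip
        for i in indices:
            if levels_db[i] >= target:
                f0, f1 = freqs_hz[prev], freqs_hz[i]
                l0, l1 = levels_db[prev], levels_db[i]
                return f0 + (target - l0) * (f1 - f0) / (l1 - l0)
            prev = i
        return None

    low = crossing(range(tip - 1, -1, -1))
    high = crossing(range(tip + 1, len(freqs_hz)))
    if low is None or high is None:
        return None
    return freqs_hz[tip] / (high - low)
```

With a hypothetical curve sampled at 800, 900, 1000, 1100 and 1200 Hz with levels of 50, 30, 20, 30 and 50 dB, the 10 dB bandwidth is 200 Hz around a 1000 Hz tip, so the estimate is 5.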

Binaural masking level difference (BMLD) is a phenomenon of binaural hearing in which there is a release from masking under conditions producing different inter-aural relations (e.g. inter-aural differences in phase or intensity) for the signal and for the masker. For example, if a tone and noise masker are identical in waveform at both ears (diotic, or N0S0), the masking effect is the greatest. If the signal is then simply inverted in polarity in one ear relative to the other (N0Spi), the masking effect is reduced by several dB (i.e. the signal is more detectable). This effect is probably best viewed as an adaptation for directional filtering, since such inter-aural differences would arise if the signal and masker sources were located at different azimuths. The BMLD is one way to define the “cocktail party effect.”

Amplitude modulation. For amplitude modulated signals, the modulation depth is indicated by m which is equal to (P-T)/(P+T), where P is the sound pressure at an envelope maximum and T is the sound pressure at a minimum. This value varies between zero and one; one indicating 100% modulation. The value m is usually scaled as -20 log10(m) so as to expand the scale at low modulation depths. For sinusoidal amplitude modulation (SAM), 20 log10(m) is the attenuation of the side bands (in dB) relative to the 100% modulated case.
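The modulation depth calculation can be written out as a short sketch. The function names are illustrative; the second function returns 20 log10(m), which is 0 dB at 100% modulation and increasingly negative for shallower modulation, and the sign used when plotting is a matter of convention.

```python
import math


def modulation_depth(p_max, p_min):
    """Modulation depth m = (P - T) / (P + T) from envelope maximum P and minimum T."""
    return (p_max - p_min) / (p_max + p_min)


def modulation_depth_db(m):
    """Modulation depth on a decibel scale, 20 log10(m), for 0 < m <= 1."""
    return 20.0 * math.log10(m)
```

For example, an envelope that swings between sound pressures of 1.5 and 0.5 (arbitrary units) has m = 0.5, or about -6 dB relative to the 100% modulated case.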

Amplitude modulated noise is used to measure the “temporal modulation transfer function” (TMTF). This is defined as the smallest modulation depth that still allows the animal to discriminate between modulated and un-modulated noise, determined at a number of modulation rates. The resulting function can be thought of as the frequency response of a hypothetical internal low-pass filter through which envelope fluctuations pass, and which limits the effective modulation depth of the internal (physiological) representation of the envelope.

Amplitude modulated signals are also used to measure the temporal resolution of the auditory system in representing and processing the time structure of the modulated envelope. In this case, animals are asked to discriminate between different modulation rates. Such thresholds can also be thought of as estimates of the animal's ability to measure the time interval between successive peaks in the envelope, and may be similar to duration discrimination thresholds.

Temporal summation. In studies of temporal summation at threshold, perfect summation or integration of sound intensity (energy detection) occurs when a 10-fold increase in sound duration results in a 10 dB (intensity factor of 10) reduction of sound pressure at threshold.
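The perfect-summation rule amounts to a 10 dB drop in threshold per 10-fold increase in duration. A minimal sketch of that prediction follows; the function name and the reference values in the note are hypothetical, and the sketch assumes perfect integration at every duration, which measured thresholds need not follow.

```python
import math


def perfect_summation_threshold(ref_threshold_db, ref_duration, duration):
    """Threshold (dB) predicted by perfect energy integration.

    ref_threshold_db is the threshold measured at ref_duration; duration and
    ref_duration must be in the same units (e.g. milliseconds).
    """
    return ref_threshold_db - 10.0 * math.log10(duration / ref_duration)
```

For example, a threshold of 30 dB at 10 ms would be predicted to fall to 20 dB at 100 ms. Comparing measured thresholds against this line is one way to see how closely a data set approaches energy detection.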

Repetition noise is a stimulus created by splitting the output from one noise source into two channels, delaying one channel by T sec, and adding the two channels back together. This produces a noise whose amplitude spectrum is a sinusoidal function of frequency, and which has a peak in the autocorrelation function at the delay time (T sec). The autocorrelation peak indicates an essential periodicity in the noise fine structure caused by adding back a delayed version of the original noise (as an echo). The first peak of the spectrum above zero Hz occurs at 1/T (in Hz), and all subsequent peaks are spaced by 1/T Hz. Humans perceive this stimulus to have a pitch equal to a pure tone of 1/T Hz. The pitch could arise either from an autocorrelation-like process (time-domain analysis) or a filter bank-like process (frequency-domain analysis), and the interesting question concerns which of these the auditory system uses. Attenuating the delayed channel reduces the spectral modulation depth (peak-to-trough differences across the frequency spectrum).

If the delayed channel of repetition noise is inverted before adding – producing “cos-” noise – an autocorrelation null appears at the delay time, and the first trough of the spectrum occurs at 1/T Hz. This produces a slightly weaker and ambiguous pitch for human observers.
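The spectral ripple of repetition noise can be illustrated with the standard delay-and-add relation: adding a copy of a signal delayed by T with linear gain g multiplies the power spectrum by 1 + g^2 + 2g cos(2 x pi x f x T). The sketch below uses that relation; the function and parameter names are illustrative and not from the book.

```python
import math


def repetition_noise_power_gain(freq_hz, delay_s, delayed_gain=1.0, inverted=False):
    """Power gain at freq_hz after adding a delayed copy of a noise to itself.

    delayed_gain is the linear amplitude of the delayed channel (1.0 = unattenuated);
    inverted=True flips the polarity of the delayed channel ("cos-" noise).
    """
    g = -delayed_gain if inverted else delayed_gain
    return 1.0 + g * g + 2.0 * g * math.cos(2.0 * math.pi * freq_hz * delay_s)


def spectral_modulation_depth_db(delayed_gain):
    """Peak-to-trough difference (dB) of the rippled spectrum; requires 0 < gain < 1."""
    return 20.0 * math.log10((1.0 + delayed_gain) / (1.0 - delayed_gain))
```

With a delay of 2 msec (1/T = 500 Hz) and an unattenuated delayed channel, the gain peaks at 500 Hz and falls to zero at 250 Hz; inverting the delayed channel swaps the peaks and troughs. Attenuating the delayed channel to half amplitude leaves a peak-to-trough difference of roughly 9.5 dB.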

Minimum Audible Angle (MAA) refers to the smallest angular difference between two sound sources that allows the observer to discriminate the difference between successive sounds from one source location and successive sounds from different source locations.

Conditioning and Psychophysical Methods

In the notes for each record, I briefly describe the conditioning procedures used. I have attempted to use the terminology and spelling conventions of the authors of the papers referenced in this description.

Instrumental avoidance conditioning refers to cases in which the animal is trained to perform a response (e.g. crossing a barrier, lifting a paw, licking a tube) in order to avoid shock or other noxious stimulus. The auditory signal in the experiment signals impending shock, and the criterion response is used to indicate hearing or discrimination, and causes the shock to be omitted.

Operant conditioning for reward (usually food or water) refers to cases in which the animal is trained to emit a criterion response (e.g. press a lever, nose a panel, or move to a certain location) in the presence of an auditory signal in order to receive the reward. The motivation is hunger or thirst. A response in the presence of the signal is used to indicate hearing or discrimination, and results in reward. Often, operant paradigms chain several responses in order to control the position of the animal in the sound field. In many cases, the animal can initiate a trial by emitting an “observing response.” This helps ensure that the animal is ready to listen. Then a signal is presented following a random time delay from the observing response, and a second response within a criterion time from the signal onset is used to indicate hearing or discrimination, and results in reward.

Classical (Pavlovian) conditioning refers to the case in which a stimulus (often shock) produces a reflex response (e.g., respiratory or cardiac suppression). The shock is termed the unconditioned stimulus (UCS) and the reflex is termed the unconditioned response (UCR). In delay conditioning, an auditory signal is presented for several seconds prior to the UCS and can be thought of as a signal for the impending UCS. After several pairings of the signal and the UCS, the signal comes to elicit a response often similar to the UCR. At this point, the signal is termed a conditioned stimulus (CS) and the response to it is termed the conditioned response (CR). The CR is used to indicate hearing or discrimination.

Conditioned suppression is a combination of operant and classical conditioning. The animal is trained to emit a steady stream of responses (e.g., pressing a bar or licking a tube) for a reward which is intermittently given. Then a shock (or other noxious stimulus) is introduced (the UCS) which interrupts the operant response stream. An auditory signal precedes the UCS and becomes a CS after several UCS-CS pairings. The interruption of the operant behavior stream is used to indicate hearing or discrimination.

Psychophysical Methods

There are many psychophysical methods and variants used to measure thresholds, and many definitions of threshold used in the literature reviewed. The following are brief descriptions of those most often used.

1) The method of limits is a classical psychophysical procedure in which the magnitude of a stimulus (sound intensity or the size of the difference between two stimuli that are to be discriminated) is reduced on successive trials from a level producing a clear behavioral response to one that does not. A second series then begins with the stimulus ascending in level. These series may be alternated several times. Threshold is defined as the averaged stimulus level half way between levels which result in a response to the signal and levels which do not. Often a “modified” method of limits is used. In these cases, only the descending or ascending series may be run. Sometimes a blocked method of limits is used in which several trials are presented at the same stimulus level before the level is changed.

2) The staircase procedure is another variant of the method of limits and may be identified by some authors as a modified method of limits. In this case, signal level is reduced toward threshold in steps, the size of which may vary, until a “no-response” criterion is reached. The level then begins to increase until the response criterion is reached, and then decreases again. The averaged signal levels at the transitions between response and no-response are designated as threshold. This is an efficient procedure but sometimes fails because the signal levels are constantly very near threshold.

The staircase procedure and its variants are often termed tracking, adaptive tracking, or adaptive procedures. Sometimes, two positive responses in a row at the same signal level are required before the signal level is reduced (known as a two-down, one-up rule). In a two-alternative forced choice paradigm, the threshold converges on about 71 percent correct. In some cases, the tracking procedure continues until some running statistical criterion for stability is reached.
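The tracking logic described above can be sketched in a few lines. This is only an illustration under simplifying assumptions: a fixed step size, a caller-supplied respond function standing in for the animal, and threshold taken as the mean of the levels at which the track reversed direction. All names are hypothetical. The second function gives the percent-correct point on which an n-down, one-up rule converges, found by solving p^n = 0.5.

```python
def run_staircase(respond, start_level, step, n_reversals=8, n_down=1, max_trials=10000):
    """Track threshold with an n_down-down, one-up staircase.

    respond(level) is a caller-supplied function returning True for a positive
    (correct) response. Returns the mean of the levels at which the direction
    of the track reversed.
    """
    level = start_level
    direction = 0            # -1 while descending, +1 while ascending, 0 before the first step
    run_of_positives = 0
    reversals = []
    for _ in range(max_trials):
        if len(reversals) >= n_reversals:
            break
        if respond(level):
            run_of_positives += 1
            if run_of_positives < n_down:
                continue     # more positives in a row are needed before stepping down
            run_of_positives = 0
            new_direction = -1
        else:
            run_of_positives = 0
            new_direction = 1
        if direction != 0 and new_direction != direction:
            reversals.append(level)
        direction = new_direction
        level += new_direction * step
    if not reversals:
        raise ValueError("no reversals were reached")
    return sum(reversals) / len(reversals)


def convergence_probability(n_down):
    """Probability of a positive response at which an n_down-down, one-up track settles."""
    return 0.5 ** (1.0 / n_down)
```

With an idealized observer that responds whenever the level is at least 20 dB, a track started at 40 dB with 4 dB steps oscillates between 16 and 20 dB and returns their midpoint. For the two-down, one-up rule, convergence_probability(2) is about 0.707, which is the "about 71 percent correct" mentioned above.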

3) The method of constant stimuli is a classical procedure in which blocks of trials are presented, all at the same signal level. A “percent correct” is calculated for the given signal level, and a new block begins at a different signal level. Several levels are chosen to “bracket” the suspected threshold. Percent correct plotted as a function of signal level produces the psychometric function, and a threshold may be defined as the interpolated signal level corresponding to some percent correct value. In a “yes-no” or “go, no-go” paradigm (in which the animal either responds or does not on a given trial), a value near 50 percent correct defines the threshold. In a “two-alternative forced-choice” paradigm (in which the animal responds on every trial with one response or another, such as “go-right” or “go-left”), a value near 75 percent correct may define the threshold.
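Reading a threshold off a psychometric function can be sketched as a simple interpolation. The function name and the data in the note are hypothetical, and straight-line interpolation between neighboring levels is an assumption; individual studies may instead fit a smooth curve to the points.

```python
def interpolate_threshold(levels, percent_correct, criterion):
    """Signal level at which the psychometric function first crosses `criterion`.

    levels must be ascending; percent_correct holds the score for each block.
    Uses straight-line interpolation between the two bracketing levels and
    returns None if the criterion is never bracketed.
    """
    points = list(zip(levels, percent_correct))
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        if y0 < criterion <= y1:
            return x0 + (criterion - y0) * (x1 - x0) / (y1 - y0)
    return None
```

For a hypothetical two-alternative forced-choice data set with scores of 50, 55, 70, 90 and 100 percent correct at levels of 0, 5, 10, 15 and 20 dB, the 75 percent point falls between 10 and 15 dB, at 11.25 dB.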

In sophisticated psychophysical paradigms, analyses of errors as well as correct responses can be usefully made. Errors are “false alarms” (responding “yes” when the signal was not present) and “misses” (responding “no” in the presence of a signal). Correct responses are “hits” (correctly detecting the signal) and “correct rejections” (correctly reporting that the signal was not presented). With such measures, response bias (such as an overall tendency to say “yes” in a “yes-no” task) can be measured, and overall performance measures such as d-prime can be obtained that are free of the effects of bias.
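As an illustration, d-prime and a bias measure can be computed from trial counts using the standard equal-variance Gaussian signal detection formulas. This is a sketch under that assumption; the function names are illustrative, and "correct rejections" here means trials with no signal and no response. Hit or false-alarm rates of exactly 0 or 1 are not handled and would need a correction before the inverse normal transform can be applied.

```python
from statistics import NormalDist

_z = NormalDist().inv_cdf   # inverse of the standard normal cumulative distribution


def d_prime(hits, misses, false_alarms, correct_rejections):
    """Sensitivity index d' = z(hit rate) - z(false-alarm rate)."""
    hit_rate = hits / (hits + misses)
    fa_rate = false_alarms / (false_alarms + correct_rejections)
    return _z(hit_rate) - _z(fa_rate)


def response_bias(hits, misses, false_alarms, correct_rejections):
    """Criterion c = -(z(hit rate) + z(false-alarm rate)) / 2.

    Negative values indicate an overall tendency to respond "yes".
    """
    hit_rate = hits / (hits + misses)
    fa_rate = false_alarms / (false_alarms + correct_rejections)
    return -(_z(hit_rate) + _z(fa_rate)) / 2.0
```

An observer with 84 hits and 16 misses on signal trials, and 16 false alarms and 84 correct rejections on no-signal trials, has a d' near 2 with no bias; equal hit and false-alarm rates give a d' of zero regardless of how often the observer says "yes".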

Some of the data presented in this book are obtained from stimulus generalization experiments, and are thus not strictly psychophysical data. In this type of experiment, animals are trained to respond to a given stimulus, and then tested for response to other stimuli that differ from the training stimulus along one or more physical dimensions. The value of this kind of measure is that it helps to show which dimensions of the stimulus control behavior and are salient to the animal.

Richard Rozzell Fay

Richard R. Fay, Ph.D., is Professor Emeritus and Distinguished University Research Professor at Loyola University Chicago. He was a Professor of Psychology and the Director of Parmly Hearing Institute, 1974-2011. He earned his Bachelor's degree from Bowdoin College and his Ph.D. from Princeton University, working with E.G. Wever. He did a post-doc with Dr. Georg von Békésy before eventually settling in Chicago. He and his lifelong colleague, Dr. Art Popper, have edited 55 volumes of the Springer Handbook of Auditory Research (SHAR). After spending a sabbatical year in the library, Dick also published Hearing in Vertebrates: A Psychophysics Databook. The data from this book are being made available as a special collection by the Fay Foundation. Professor Fay is married to his lifelong love, Cathy, and has two children, Christian and Amanda.