172A. Cognitive Psychology of Music (Introduction)
Undergraduate, non-major course
Week 3
Summary of lectures
Issues covered: The Ear. Seashore's model of music & his test for musical talent. The breakdown of Seashore's model. The Circular Dimensionality of Instrument Timbre.
Figure 4: Simplified graph of the Ear.
The picture on the right is a microphotograph of a dissected and isolated cochlea, the organ of the inner ear that translates mechanical sound motion into neurological signals. This complex structure has many different cell types, the exact functions of which are still not fully understood.
Figure 6: Simplified graph of a stretched-out cochlea
Figure 7: Cross section of a stretched-out cochlea
The sound signal* generated by a source (vibrating system) reaches the eardrum through the air and sets it into motion. The motion is transmitted to the cochlea through three delicate bones. A membrane (oval window) receives the vibration and imparts it to a viscous fluid (perilymph) inside the cochlea. The fluid in turn sets the basilar membrane in motion (at the left of the graph, near the oval window, for high frequencies and at the right of the graph for low frequencies), disturbing sensitive filaments (hair cells) that generate electric impulses. The impulses travel along the auditory nerve pathways to the brain, where they enter a complex electrochemical network and the sensation of sound is registered.
Basilar membrane: A piece of flexible tissue located in the inner ear (cochlea). It vibrates at different places in response to incoming vibrations/signals of different frequencies, and with different amplitudes in response to incoming vibrations/signals of different intensities. For high frequencies the basilar membrane vibrates close to the entrance of the cochlea, while for low frequencies it vibrates towards the end of the cochlea (look at figures 3, 4, below). The basilar membrane is a tuned resonator that can analyze complex vibrations/signals. Tiny hair cells (nerve endings) are disturbed by the motion of the basilar membrane, translating it into electric impulses.
Exposure to high-intensity sounds can result in temporary or permanent damage to the hair cells. The temporary form is called temporary threshold shift (TTS): a temporary reduction of the sensitivity of the hair cells.
The basilar membrane gets stiffer with age, resulting in loss of sensitivity, especially for high frequencies (presbycusis).
Critical band: The specific area on the basilar membrane that goes into vibration in resonance with an incoming simple tone. Its length is determined by the elastic properties of the basilar membrane, and psychoacoustical studies indicate an average value of approx. 1.2 mm, representing ~1/3 of an octave.
The actual width of the critical band corresponds, therefore, to a frequency value called critical bandwidth. If the frequency difference between two simultaneous sines is within the critical bandwidth, then those sines will interact in a specific and musically important way. If the frequency difference is <10 Hz (approx.) then their interaction will be perceived as a slow loudness fluctuation called beating. If the frequency difference is > 10 Hz (approx.) but smaller than the critical bandwidth, then the interaction of the two simultaneous sines will be perceived as a change in the character of the combined sound called roughness. So:
Critical bandwidth: The frequency separation in Hz between two simultaneous sines necessary for beats/roughness to disappear and for the tones to sound clearly apart.
We will talk more about beating and roughness when we discuss scales and the concepts of consonance and dissonance. For some additional information and explanatory graphs on what we've discussed so far click on the link below.
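The thresholds described above can be sketched in a few lines of Python. This is only an illustration, not a psychoacoustic model: the function names are hypothetical, the 10 Hz boundary is the approximate value quoted above, and the critical bandwidth is assumed to be exactly 1/3 of an octave centred on the mean of the two frequencies (the notes give 1/3 octave only as an average).

```python
def critical_bandwidth_hz(center_hz):
    """Assumed critical bandwidth: 1/3 octave centred on center_hz."""
    # A third of an octave spans from center * 2**(-1/6) to center * 2**(1/6).
    return center_hz * (2 ** (1 / 6) - 2 ** (-1 / 6))


def classify_interaction(f1_hz, f2_hz):
    """Classify how two simultaneous sines are expected to interact."""
    difference = abs(f1_hz - f2_hz)
    center = (f1_hz + f2_hz) / 2
    if difference == 0:
        return "unison"
    if difference < 10:  # approximate boundary quoted in the notes
        return "beating"
    if difference < critical_bandwidth_hz(center):
        return "roughness"
    return "separate tones"


print(classify_interaction(440, 444))  # 4 Hz apart
print(classify_interaction(440, 480))  # 40 Hz apart, inside the assumed band
print(classify_interaction(440, 660))  # 220 Hz apart, outside the assumed band
```

In the beating case, the rate of the loudness fluctuation equals the frequency difference, so 440 Hz and 444 Hz together fluctuate about 4 times per second.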
Seashore's approach to musical talent (model first published in 1919.)
Seashore's Model and summary of discussion; mapping together physical and perceptual attributes of sound.
("...everything that is rendered as music or heard as music may be expressed in terms of the concepts of the sound wave" Seashore p.2.)
Each row of the table below lists four frames of reference: Physical ('objective'), Psychological/Perceptual ('subjective'), Notational, and Elemental capacities (Seashore's "trunks of musicality").

Row 1
Physical: Frequency: cycles per second. Unit: Hertz (Hz). Audible range (absolute threshold): 20 Hz - 20 kHz.
Perceptual: Pitch: 'high' versus 'low'. Unit: 'mel'. JND (difference threshold) ~1% of frequency (for middle frequencies).
Notational: Pitch names and symbols on a musical staff, e.g. "A", etc.
Elemental capacity: Tonal

Row 2
Physical: Intensity (I): measure associated with the energy in a vibration. Unit: W/m^2. Adjusted unit: decibel (dB); an intensity of I W/m^2 corresponds to 10*log10(I / 10^-12) dB. Range: 10^-12 W/m^2 or 0 dB (threshold of hearing) to 1 W/m^2 or 120 dB (threshold of pain).
Perceptual: Loudness: 'loud' versus 'soft'. Unit: 'phon'. JND ~1 dB (for pure tones, middle frequencies, in environments with some background noise).
Notational: Dynamic markings, e.g. ppp - fff; relative changes in dynamics, e.g. <, >, etc.
Elemental capacity: Dynamic

Row 3
Physical: Signal form/Spectrum/Harmonic composition: description of the frequency and amplitude characteristics of the sine components (partials) that sum up to make a complex signal.
Perceptual: Timbre: sound quality. Multidimensional perceptual attribute. Its principal dimension depends on the energy distribution of a spectrum: nasal (more energy in the high components) versus not nasal (more energy in the low components).
Notational: Indications on instruments to perform the score-parts.
Elemental capacity: Qualitative

Row 4
Physical: Time: length (in terms of 'clock time') of sound and silence (measured by convention in hours, minutes, seconds etc.).
Perceptual: Duration: 'long' versus 'short'. Perceived time. No specific unit. The minimum length of an acoustical event that can be perceived as musical sound is approximately 50 ms.
Notational: a) Tempo: events per unit time (e.g. beats per minute); b) relative note durations; c) speed changes: accel., rit., etc.
Elemental capacity: Temporal

Seashore's model is, as we have seen, an example of Cartesianism, giving a privileged status to the frames of mathematics and physics. It treats the human perceptual mechanism as machine-like. Our ear receives some input from the external/objective/knowable world (e.g. frequency) and translates it into a perceptual output (e.g. pitch) that relates isomorphically to the physical input. Whenever perception is non-isomorphic with the physical world the result is, according to Seashore, a normal illusion (a perceptual attribute that does not relate isomorphically to a physical variable). Everything in music can therefore be expressed in terms of sound waves, which can be broken down into the above 4 variables, and only those. This approach is also called Atomism or Reductionism.
Atomism/Reductionism: Breaking down a complex system into its elementary units and studying those units tells us everything we need to know about the complex system itself. (The whole is the sum of its parts).
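As a worked illustration of the physical frame in the table above, the conversion between intensity (W/m^2) and intensity level (dB) can be sketched as follows. The function and constant names are hypothetical; the reference intensity of 10^-12 W/m^2 is the threshold of hearing given in the table.

```python
import math

# Threshold of hearing from the table, in W/m^2 (corresponds to 0 dB).
REFERENCE_INTENSITY = 1e-12


def intensity_level_db(intensity_w_m2):
    """Convert an intensity in W/m^2 to an intensity level in dB."""
    return 10 * math.log10(intensity_w_m2 / REFERENCE_INTENSITY)


def intensity_from_db(level_db):
    """Convert an intensity level in dB back to an intensity in W/m^2."""
    return REFERENCE_INTENSITY * 10 ** (level_db / 10)


print(round(intensity_level_db(1e-12)))  # threshold of hearing
print(round(intensity_level_db(1.0)))    # threshold of pain
print(intensity_from_db(60))             # intensity of a 60 dB tone
```

Because the scale is logarithmic, every tenfold increase in intensity adds 10 dB, which is how a range of twelve orders of magnitude fits into 0-120 dB.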
According to Seashore, musical talent is manifested in the capacity and fineness by which one can hear the relationships between physical and perceptual variables mapped in his model (look at the above table). Those relationships, which constitute elemental capacities, are genetically endowed. Testing them (with physical variables being the independent variables and perceptual variables being the dependent variables) gives profiles that may predict one's level of success in music. In other words, musical talent is defined operationally in terms of the JND (just noticeable difference) of 4 variables. According to this model musical geniuses have high scores that are evenly distributed across all 4 variables. Empirical research using Seashore's test over a period of 3 decades indicates that the test has a reasonable reliability.
A great number of objections can obviously be raised about its validity (that is, about the relevance his operational definition of musical talent has to the performance, understanding, and experience of music).
Additionally, if elemental capacities were genetically endowed, then an individual's test scores should not change with time. When Seashore observed that most people had better scores the second time they took the test, he dismissed the observation as indicating that subjects were simply becoming test-wise rather than indicating a change in their elemental capacities. Such an explanation is discredited by the fact that scores do not simply improve with time but fluctuate in such a way as to suggest that the so-called elemental capacities are influenced by factors other than genetic ones. The controversial issue of nature versus nurture developed out of approaches such as Seashore's and presents a false dichotomy.
Auditory Cross-mapping & Seashore's Model
The most serious problem however with the Seashore model as a theory is that it does not satisfy an important requirement of any theory: Theories should be self-consistent and non-contradictory.
I) Pitch does not depend only on frequency.
Figure 8: Dependence of the pitch of pure tones on intensity.
B) Spectrum also influences the pitch of complex tones. As opposed to Ohm's law, the pitch of a complex tone does not always match the frequency of its lowest sine component or partial.
Phenomenon of the missing fundamental: If the fundamental frequency (or even the first few harmonics) is removed from the spectrum of a periodic sound wave, the perceived pitch remains unchanged and matches the pitch of a sinusoidal tone with frequency equal to that of the 'missing fundamental'.
This is why, for example, although the speakers in small radios or telephones cannot respond to low frequencies (they essentially remove the fundamental component from the majority of complex sounds), the pitch does not rise to match the frequency of the lowest component present.
C) Time is another important factor in terms of pitch. A tone has to last more than a minimum amount of time (~50 ms) before we can get a clear sense of pitch, and this minimum amount is different for different intensities.
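The missing-fundamental phenomenon described under B) can be illustrated with a small arithmetic sketch. It assumes idealized, exactly harmonic partials with integer frequencies and uses their greatest common divisor as a stand-in for the implied fundamental. This is a simplification chosen for illustration, not a model of how the auditory system derives pitch; the function name and the example frequencies are hypothetical.

```python
from functools import reduce
from math import gcd


def implied_fundamental(partials_hz):
    """Greatest common divisor of integer partial frequencies, in Hz."""
    return reduce(gcd, partials_hz)


# A harmonic tone on 200 Hz, and the same tone after a small speaker
# has removed the fundamental and the second harmonic.
full_tone = [200, 400, 600, 800, 1000]
filtered_tone = [600, 800, 1000]

print(implied_fundamental(full_tone))      # 200
print(implied_fundamental(filtered_tone))  # 200
print(min(filtered_tone))                  # 600, the lowest partial present
```

The remaining partials are still spaced 200 Hz apart, so the implied fundamental stays at 200 Hz even though the lowest component actually present is 600 Hz.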
II) Loudness does not depend only on intensity.
A) Frequency can also influence loudness. Equal loudness contours are graphs that demonstrate how the loudness of simple tones with the same intensity changes with frequency.
Figure 9: Equal loudness contours for pure tones.
The sensitivity of the ear drops dramatically at low frequencies (approx. <450 Hz). A simple tone with an intensity level of 60 dB will sound moderately loud at 1000 Hz but will be just audible at 50 Hz. The sensitivity of the ear also drops at high frequencies (approx. >4000 Hz) but not as dramatically as at low frequencies. We are most sensitive to frequencies in the range 1000 Hz - 3000 Hz (approx.). This may have some evolutionary significance, since speech sounds have most of their energy within this range.
B) For complex tones, loudness also depends on their spectrum (that is, on the way energy is distributed among their components). Suppose that two complex tones have the same intensity level (in dB) but different spectra. The complex tone with the more 'spread-out' spectrum (with components spreading across many critical bands) will sound louder than the one with the less 'spread-out' spectrum (with components spreading across fewer critical bands).
III) The timbre of complex tones does not depend only on spectrum.
The spectrum of a complex signal describes the amount of energy in each of the signal's partials but does not describe how this amount changes with time. This time-variancy also influences timbre. A way to represent this time-variancy is through the envelope of a signal.
Envelope: A graph that traces the boundaries of a signal, representing how its amplitude (overall energy) changes with time.
Attack: The part of the envelope that traces the development of a sound signal towards its maximum amplitude. Attack is the result of supplying energy to a system at rest, setting it into vibration. It represents how energy is built up in a vibrating system.
Steady state: The part of the envelope in which the amplitude of the signal remains fairly constant. Steady state is the result of continuous supply of energy in a vibrating system.
Decay: The part of the envelope that traces the drop in amplitude of a sound signal from its maximum value to zero. Decay occurs when we stop supplying a vibrating system with energy, and represents how energy stored in a system eventually dies out.
Envelope has a significant effect on the timbre (quality) of sounds, as can be demonstrated by playing a sound backwards. Envelope does not describe another factor that influences timbre: how the frequency and amplitude of individual components change with time.
Based on envelopes we can classify signals in two, very broad, categories:
i) Continuous (Most of the energy is contained in the steady state of the envelope)
ii) Impulse (The envelope has no steady state. The attack portion is much shorter than the decay portion and most importantly much steeper.)
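The envelope concepts above (attack, steady state, decay, and the two broad categories) can be sketched with a piecewise-linear envelope. This is a minimal illustration under stated assumptions: real envelopes are not linear, the function names are hypothetical, the envelope is sampled once per millisecond, energy is taken as the sum of squared amplitudes, and the durations are invented for the example.

```python
def envelope_parts(attack_ms, steady_ms, decay_ms):
    """Return (attack, steady, decay) amplitude lists, one sample per ms."""
    attack = [(i + 1) / attack_ms for i in range(attack_ms)]  # rises to 1.0
    steady = [1.0] * steady_ms                                # stays at 1.0
    decay = [1.0 - (i + 1) / decay_ms for i in range(decay_ms)]  # falls to 0.0
    return attack, steady, decay


def steady_energy_share(attack, steady, decay):
    """Fraction of the envelope's energy that lies in the steady state."""
    def energy(part):
        return sum(a * a for a in part)
    return energy(steady) / (energy(attack) + energy(steady) + energy(decay))


def samples_before_peak(envelope):
    """Number of samples that precede the first maximum of the envelope."""
    return envelope.index(max(envelope))


# Continuous signal: long steady state (hypothetical durations).
continuous_parts = envelope_parts(50, 900, 50)
# Impulse signal: no steady state, attack much shorter than decay.
impulse_parts = envelope_parts(5, 0, 500)
impulse = impulse_parts[0] + impulse_parts[1] + impulse_parts[2]

print(steady_energy_share(*continuous_parts) > 0.5)  # most energy is steady
print(steady_energy_share(*impulse_parts))           # no steady-state energy
print(samples_before_peak(impulse))                  # short attack
print(samples_before_peak(impulse[::-1]))            # reversed: slow build-up
```

Reversing the impulse envelope turns its long decay into a long attack, which is one way to see why a sound played backwards has a different timbre even though its spectrum is unchanged.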
According to Seashore most of the above constitute illusions.
Davies, in contrast, follows a much more Humean approach. According to him it is impossible to adequately understand music in purely physical terms. Any examination of music must consider the responses of the listeners and the intentions of the composer(s)/performer(s), in addition to the physics of the sound. (Davies p.26). Compare this with Seashore's view discussed earlier (music is in the acoustic signal).
Given that instruments all have unique timbral characteristics (e.g., a flute sounds different from a piano), is it possible to map distances of timbre difference from instrument to instrument?
Taking a cognitive Humean approach to the differences between instruments may reveal degrees of similarity and difference. In doing so, we may be able to translate qualitative categories into quantitative relationships.
I) Examples of qualitative terminology for timbre: Hermann von Helmholtz, On the Sensations of Tone (1862)
A) Nasal we may associate with the timbral quality of the oboe. In some linguistic models, this attribute corresponds to acute.
B) Round we may associate with the timbral quality of the French horn. In some linguistic models, this attribute corresponds to lax.
II) Timbres can range from simple to complex depending on the presence of overtones: Jean Baptiste Joseph Fourier (1768-1830)
Timbres arise from the presence of different frequencies that (seen visually as sine waves) add to and subtract from the amplitude of the sound signal. (Think of the synthesizer, which synthetically combines different sine waves into different "shapes," imitating various instruments.)
Fourier's theorem states that any periodic signal can be expressed as a sum of sine signals, which together make up its spectrum.
III) Psychological test to map categories to scales of timbral difference (quantitative relationships).
A) Asking test subjects to rate the similarity and difference between instruments generates data about perceived timbral difference (for example: OBOE/FRENCH HORN, TRUMPET/FRENCH HORN, etc.).
B) Gathering that one-dimensional data and adding dimensionality can reveal a multi-dimensional representation of degrees of timbral difference. It is assumed that the judgment of similarity between pairs of timbres on a single scale captures the interrelationships of a complex decision process involving multiple variables. Multidimensional scaling statistically reveals the dimensionality of the judgments.
*From Kendall, R. A. "Empirical Approaches to Musical Meaning." In R. A. Kendall and R. W. H. Savage, Perspectives in Systematic Musicology, Selected Reports in Ethnomusicology, Vol. 12, 2005.
The above model shows the limits of a one-dimensional model when trying to reconcile the BPN-CPT distance with the other known distances. The two-dimensional model represents the differences more accurately.
C) The cognitive map developed through such data collection reveals that timbral differences among the various instruments of the orchestra may be understood as circular.
D) The circular relationship reveals that instruments of the orchestra achieve a broad (and full) range of timbre.
*From Kendall, R. A. "Empirical Approaches to Musical Meaning." In R. A. Kendall and R. W. H. Savage, Perspectives in Systematic Musicology, Selected Reports in Ethnomusicology, Vol. 12, 2005.
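The one-dimensional versus two-dimensional comparison discussed above can be illustrated geometrically. The sketch below is not multidimensional scaling itself, and it does not use Kendall's data: it takes three hypothetical pairwise dissimilarity ratings, checks whether the three items could sit on a single line, and otherwise places them in a plane by triangulation. All names and values are assumptions made for illustration.

```python
import math


def fits_on_line(d_ab, d_bc, d_ac, tol=1e-9):
    """True if three points with these pairwise distances can be collinear."""
    # On a line, the longest distance equals the sum of the other two.
    longest = max(d_ab, d_bc, d_ac)
    return abs(d_ab + d_bc + d_ac - 2 * longest) <= tol


def place_in_plane(d_ab, d_bc, d_ac):
    """Place A at the origin, B on the x-axis, and C by triangulation."""
    x = (d_ab ** 2 + d_ac ** 2 - d_bc ** 2) / (2 * d_ab)
    y_squared = d_ac ** 2 - x ** 2
    if y_squared < -1e-9:
        raise ValueError("distances violate the triangle inequality")
    return (0.0, 0.0), (d_ab, 0.0), (x, math.sqrt(max(y_squared, 0.0)))


# Hypothetical dissimilarity ratings between three timbres A, B and C.
print(fits_on_line(2, 3, 5))    # True: one dimension is enough
print(fits_on_line(3, 4, 5))    # False: one dimension distorts a distance
print(place_in_plane(3, 4, 5))  # two dimensions reproduce all three distances
```

With more than three items, multidimensional scaling plays the role of the triangulation step: it searches for coordinates, in as few dimensions as possible, whose distances best match the rated dissimilarities.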
Note that the above analysis of timbre difference is only possible by shifting to a psycho-acoustical cognitive study, a DEPARTURE from Seashore's insistence on understanding cognition as the ability to judge the acoustical world with accuracy (hence Seashore's preoccupation with measuring perceptual accuracy to determine musical "talent").
* Although many references refer to any plot of amplitude by time as a "wave," these are simply graphs, and it is far more correct to call them "signals," as in electrical signal. Waves exist (three dimensionally) in media such as air and water … not as graphs on the blackboard or representations on an oscilloscope.
Further reading on the relationship between physical and perceptual attributes of sound
Campbell, M. and Greated, C. (1987). The Musician's Guide to Acoustics. New York: Schirmer Books.
Deutsch, D. (ed.) (1999). The Psychology of Music. San Diego: Academic Press.
Moore, B. C. J. (ed.) (1995). Hearing. In the series "Handbook of Perception and Cognition, 2nd Edition", E. Carterette & M. Friedman (eds.). London: Academic Press.
Plomp, R. (1976). Aspects of Tone Sensation: A Psychophysical Study. London: Academic Press.
Ethnomusicology Department - UCLA