Lecture:
In today's lecture the class and I learned a lot about the human ear and how it recognises and processes sound.
The ear is a magnificent 'tool', if you will, that allows us to hear sounds. While many people take hearing for granted, others have looked closely into how we are able to hear sounds.
The ear acts as a receiver for sound waves and then begins the complicated process of converting the waves into a form our brains can understand. To do this, it makes use of many parts.
(See figure 1)
Figure 1 - Sectional View of the Ear
Our lecturer went on to discuss the operation each part of the ear carries out to allow us to hear sound.
Sound waves travel through the ear canal to the eardrum. The eardrum then vibrates, and the ossicles act like amplifiers in the sense that they amplify the vibrations from the eardrum; the frequencies from these vibrations then travel through the cochlea. As higher frequencies die off more quickly than low frequencies, the cochlea picks up the high frequencies first, at the beginning of the cochlea, while lower frequencies are picked up further along (a short code sketch of this place-to-frequency mapping follows the figures below). Once the cochlea picks up these frequencies, small hairs on the cochlea cause neurons connected to the auditory nerve to fire. These then transmit all the necessary factors of a sound wave - timing, amplitude and frequency - to the brain stem, where a hierarchy of neural processing begins.
(See figures 2, 3 and 4)
Figure 2 - The Middle and Inner Ear
Figure 3 - Cochlear Structures
Figure 4 - Frequency Response of the Cochlea
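As an aside, this place-to-frequency mapping along the cochlea is often modelled with the Greenwood function. Below is a minimal Python sketch using the commonly published constants for the human cochlea; the function and constants come from the literature rather than the lecture, so treat it as illustrative.

```python
# Sketch of the Greenwood function, a standard model of how characteristic
# frequency maps onto position along the human cochlea.

def greenwood_frequency(x):
    """Characteristic frequency (Hz) at position x along the cochlea,
    where x = 0 is the apex (far end) and x = 1 is the base (start)."""
    A, a, k = 165.4, 2.1, 0.88  # published human constants
    return A * (10 ** (a * x) - k)

# The base responds to high frequencies and the apex to low ones,
# matching the lecture's point that high frequencies are picked up first.
for x in (0.0, 0.25, 0.5, 0.75, 1.0):
    print(f"x = {x:.2f}: {greenwood_frequency(x):8.1f} Hz")
```

Running this spans roughly 20Hz at the apex to about 20kHz at the base, which lines up nicely with the human hearing range given later in the lecture.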
A good website that covers much of the information discussed up to this point is:
http://www.deafnessresearch.org.uk/content/your-hearing/how-you-hear/how-ears-works/
A picture from the above-mentioned website is below.
(See figure 5)
Figure 5 - Outer, Middle and Inner Ear
The website has a couple of other helpful images and a lot of useful text; it makes a good reference for study.
The outer ear may seem like just a basic piece of flesh on either side of one's head, but really it has a job to do just like the rest of the ear: when sound enters the outer ear, the outer ear filters the sound wave and then carries out transduction (in this context, the conversion of sound energy from one form to another - not the genetics sense of the word) and amplification.
The middle ear begins the process of non-linear compression and then impedance matching. Impedance matching is one of the important functions of the middle ear: it transfers the incoming vibration from the comparatively large, low-impedance tympanic membrane to the much smaller, high-impedance oval window.
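To see why this matters, here is a rough back-of-envelope sketch of the pressure gain the middle ear achieves. The areas and lever ratio are typical textbook approximations rather than figures from the lecture.

```python
import math

# Back-of-envelope sketch of the middle ear's impedance-matching gain.
eardrum_area = 55.0      # effective tympanic membrane area, mm^2 (textbook value)
oval_window_area = 3.2   # oval window area, mm^2 (textbook value)
ossicular_lever = 1.3    # mechanical advantage of the ossicles (textbook value)

# The same force acting on a smaller area gives a higher pressure, so the
# pressure gain is the area ratio multiplied by the lever ratio.
pressure_gain = (eardrum_area / oval_window_area) * ossicular_lever
print(f"Pressure gain: ~{pressure_gain:.0f}x "
      f"(~{20 * math.log10(pressure_gain):.0f} dB)")
```

That works out at roughly a 22x pressure gain, or about 27dB, which is why sound can cross from air into the fluid-filled cochlea without most of its energy being reflected.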
In the inner ear, spectral analysis is carried out and the result is transferred to the auditory nerve. (See figure 6)
Figure 6 - Schematic of Auditory Periphery
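As a crude stand-in for this spectral analysis, the sketch below splits a signal into a few bandpass channels with SciPy. A real cochlear model would use many more, non-linearly spaced bands (e.g. a gammatone filterbank); the band edges here are my own choice.

```python
import numpy as np
from scipy.signal import butter, lfilter

fs = 16000                                    # sample rate, Hz
t = np.arange(0, 0.1, 1 / fs)
# Test signal: a 440Hz tone plus a 3000Hz tone.
signal = np.sin(2 * np.pi * 440 * t) + np.sin(2 * np.pi * 3000 * t)

edges = [100, 500, 1000, 2000, 4000, 6000]    # band edges, Hz (arbitrary)
for lo, hi in zip(edges[:-1], edges[1:]):
    b, a = butter(2, [lo / (fs / 2), hi / (fs / 2)], btype="band")
    band = lfilter(b, a, signal)
    print(f"{lo:4d}-{hi:4d} Hz band: RMS = {np.sqrt(np.mean(band ** 2)):.3f}")
```

The 100-500Hz and 2000-4000Hz channels show large outputs while the others stay near zero: the filterbank picks out the two tones, just as different places along the cochlea respond to different frequencies.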
A video I looked at on YouTube explains quite a bit about the human ear, which I found very useful.
http://www.youtube.com/watch?v=0jyxhozq89g
Below is a picture of the processes the auditory brainstem goes through.
(See figure 7)
Figure 7 - Auditory Brainstem (Afferent Processes)
Auditory processing works on a two-channel set of time-domain signals in contiguous, non-linearly spaced frequency bands. The system can distinguish between the signals that pass through the left ear and the right ear, between high and low frequencies, and between timing and intensity information.
At various specialised processing centres in the hierarchy this information is re-integrated and re-distributed.
(See figure 8)
Figure 8 - Features of Auditory Processing
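As a toy illustration of the timing side of this, the sketch below estimates an interaural time difference (ITD) by cross-correlating a made-up left/right signal pair; the signal and the delay are invented purely for illustration.

```python
import numpy as np

fs = 44100                              # sample rate, Hz
true_delay = 20                         # true ITD in samples (~0.45ms)
rng = np.random.default_rng(0)
left = rng.standard_normal(2048)        # made-up 'sound' at the left ear
right = np.roll(left, true_delay)       # same sound, arriving later at the right ear

# The lag at which the cross-correlation peaks is the estimated ITD.
corr = np.correlate(right, left, mode="full")
lag = np.argmax(corr) - (len(left) - 1)
print(f"Estimated ITD: {lag} samples = {lag / fs * 1e3:.2f} ms")
```

The brain is thought to do something loosely analogous in the superior olive, comparing arrival times between the two ears to work out where a sound came from.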
Below is a picture showing the audible frequency ranges of certain animals.
(See figure 9)
Figure 9 - Audible Frequency Range
Below are some details on the normal hearing factors for humans.
- Hearing threshold - 0dB SPL = 20µPa @ 1kHz
- Dynamic range - 140dB (up to pain level)
- Frequency range (in air) - 20Hz to 20kHz
- Most sensitive frequency range - 2kHz to 4kHz
- Frequency discrimination - 0.3% @ 1kHz
- Minimum audible angle - 1°
- Minimum binaural time difference - 11µs
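To make the dB figures above concrete, here is a quick sketch converting dB SPL back to pressure in pascals. The 20µPa reference is the standard definition; the code itself is just my own illustration.

```python
P_REF = 20e-6  # reference pressure: 20 micropascals (0dB SPL)

def spl_to_pressure(db_spl):
    """Convert a sound pressure level in dB SPL to pascals."""
    return P_REF * 10 ** (db_spl / 20)

print(f"  0dB SPL = {spl_to_pressure(0):.1e} Pa (hearing threshold)")
print(f"140dB SPL = {spl_to_pressure(140):.1e} Pa (pain level)")
```

The 140dB dynamic range therefore corresponds to a 10,000,000:1 ratio in sound pressure, which shows just how wide a range the ear copes with.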
Regarding the frequency discrimination figure of 0.3%, this means that two sounds whose frequencies differ by less than that amount (about 3Hz at 1kHz) will still sound the same to the human ear, even though they are technically out of tune.
Only a larger frequency difference would be recognised by the human ear.
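For example, the two tones below differ by exactly 0.3% (3Hz at 1kHz), so to most listeners they would sound identical; the sample rate and duration are arbitrary choices on my part.

```python
import numpy as np

fs = 44100                                 # sample rate, Hz (arbitrary)
t = np.arange(0, 1.0, 1 / fs)              # one second of samples
tone_a = np.sin(2 * np.pi * 1000.0 * t)    # 1kHz reference tone
tone_b = np.sin(2 * np.pi * 1003.0 * t)    # 0.3% higher: at the discrimination limit
print(f"Frequency difference: {1003.0 - 1000.0:.0f} Hz ({3 / 1000:.1%} of 1 kHz)")
# Played back (e.g. written out as WAV files), these two tones are
# practically indistinguishable to the human ear.
```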
Human hearing covers a range of 20Hz to 20kHz. A large part of this range is covered by human speech. (See figure 10)
Figure 10 - Human Hearing and Speech Data
Below is an image that is not required knowledge for the module, but is still useful in its own right. (See figure 11)
Figure 11 - Threshold of Hearing
The MPEG/MP3 audio coding process uses lossy compression (where data the listener would not perceive is discarded to save space) together with a psychoacoustic model (a model of human hearing).
A quote from the lecture - "The use in MP3 of a lossy compression algorithm is designed to greatly reduce the amount of data required to represent the audio recording and still sound like a faithful reproduction of the original uncompressed audio for most listeners. An MP3 file that is created using the setting of 128 kbit/s will result in a file that is about 11 times smaller than the CD file created from the original audio source. An MP3 file can also be constructed at higher or lower bit rates, with higher or lower resulting quality. The compression works by reducing accuracy of certain parts of sound that are considered to be beyond the auditory resolution ability of most people. This method is commonly referred to as perceptual coding. It uses psychoacoustic models to discard or reduce precision of components less audible to human hearing, and then records the remaining information in an efficient manner." (See figure 12)
Figure 12 - MPEG/MP3 Audio Coding
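As a very rough illustration of the perceptual coding idea - and definitely not the actual MP3 algorithm, which uses filterbanks, an explicit psychoacoustic masking model and Huffman coding - the sketch below simply drops the weakest frequency components of a signal and reconstructs it.

```python
import numpy as np

fs = 8000
t = np.arange(0, 0.5, 1 / fs)
# Test signal: a 440Hz tone buried in a little noise.
rng = np.random.default_rng(1)
signal = np.sin(2 * np.pi * 440 * t) + 0.01 * rng.standard_normal(t.size)

spectrum = np.fft.rfft(signal)
threshold = 0.05 * np.abs(spectrum).max()   # crude stand-in for an 'inaudible' cutoff
kept = np.abs(spectrum) >= threshold
decoded = np.fft.irfft(np.where(kept, spectrum, 0), n=signal.size)

print(f"Kept {kept.sum()} of {kept.size} frequency components")
print(f"RMS reconstruction error: {np.sqrt(np.mean((signal - decoded) ** 2)):.4f}")
```

Almost all of the spectrum is thrown away, yet the reconstruction error is tiny and sits at the noise floor - the essence of discarding what the listener would not perceive.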
Lab Session:
In today's lab we opened up a sound file of a soprano voice and noted that it had a duration of 7 seconds.
We then opened up another sound file called "english words" in a package called Adobe Soundbooth CS4.
In Adobe Soundbooth CS4 we messed around with the available functions and edits that we could apply to the sound. We then saved our own copy that we cropped so we could edit it fully.
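Soundbooth is a GUI tool, but the same kind of crop can be sketched in code. Below is a minimal example using SciPy; the filename and the start/end times are placeholders, not the actual lab files.

```python
from scipy.io import wavfile

# Programmatic equivalent of the crop we did in Soundbooth: keep a slice.
fs, samples = wavfile.read("english_words.wav")   # hypothetical filename
start, end = int(1.0 * fs), int(3.0 * fs)         # keep seconds 1 to 3 (made up)
wavfile.write("english_words_cropped.wav", fs, samples[start:end])
```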
I edited mine to the point where I had four different words being spoken. The first and last words sounded different from each other and from the middle words, but the two middle words sounded the same even though they looked different on the displayed waveform and the Spectral Frequency Display. This illustrates the point about frequency discrimination (0.3% @ 1kHz). (See figure 13)
Figure 13 - Snapshot from Adobe Soundbooth CS4