Speech comparison using discrete wavelet transform
Main Authors: | , , , |
---|---|
Format: | text |
Language: | English |
Published: | Animo Repository, 2003 |
Subjects: | |
Online Access: | https://animorepository.dlsu.edu.ph/etd_bachelors/14222 |
Institution: | De La Salle University |
Summary: | Speech recognition is a fascinating application of digital signal processing that is used to automate many tasks that in the past required hands-on human interaction. Speech comparison is one component of speech recognition.
The speech comparison system identifies speech by its transitory characteristics. Comparison is done between 48 sets of minimal pairs. A minimal pair is a pair of words that differ in only one segment and are therefore likely to be confused with each other in speech. The system recognizes these barely discernible differences and identifies them in a speech signal.
To compare the speech signals, the Discrete Wavelet Transform is used because of its advantages over other methods such as Fourier analysis: it represents signals with fewer coefficients, it is better suited to non-stationary signals, and it retains time-domain information as the frequency components change. Other processing steps performed on the speech signals include endpoint detection, noise reduction, normalization, and windowing. Downsampling and Dynamic Time Warping are optional processing methods that may also be applied.
The system is completely speaker-independent. It can determine that speakers are saying the same word even though they may be speaking at different rates, pitches, and intensities. The program may aid people who have difficulty pronouncing certain phonemes, since they can easily play back the correct pronunciation of each word in the system. |
---|
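The comparison pipeline named in the summary (normalization, Discrete Wavelet Transform, optional Dynamic Time Warping) can be illustrated with a minimal sketch. The record does not state which wavelet family, decomposition depth, or distance measure the thesis used, so the Haar wavelet, the three decomposition levels, the absolute-difference cost, and every function name below are assumptions chosen only for illustration. Endpoint detection, noise reduction, and windowing are omitted.

```python
import math


def normalize(signal):
    """Scale a signal so its peak absolute amplitude is 1.0."""
    peak = max(abs(x) for x in signal)
    return [x / peak for x in signal] if peak > 0 else list(signal)


def haar_dwt(signal):
    """One level of the Haar DWT; returns (approximation, detail)."""
    signal = list(signal)
    if len(signal) % 2:
        signal.append(signal[-1])  # pad odd lengths by repeating the last sample
    s = math.sqrt(2.0)
    approx = [(signal[i] + signal[i + 1]) / s for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) / s for i in range(0, len(signal), 2)]
    return approx, detail


def haar_features(signal, levels=3):
    """Approximation coefficients after up to `levels` Haar decompositions.

    Assumption: only the coarse approximation is kept as the feature vector.
    """
    approx = list(signal)
    for _ in range(levels):
        if len(approx) < 2:
            break
        approx, _detail = haar_dwt(approx)
    return approx


def dtw_distance(a, b):
    """Dynamic Time Warping distance with an absolute-difference local cost."""
    inf = float("inf")
    n, m = len(a), len(b)
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            # Extend the cheapest of the three allowed predecessor alignments.
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


def compare(sig_a, sig_b, levels=3):
    """Hypothetical end-to-end comparison: smaller values mean more similar."""
    fa = haar_features(normalize(sig_a), levels)
    fb = haar_features(normalize(sig_b), levels)
    return dtw_distance(fa, fb)
```

In this sketch, normalization removes differences in intensity and the warping step absorbs differences in speaking rate, which is the kind of variation the summary says the system tolerates. Pitch differences are not addressed here.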