SPEECH RECOGNITION USING RECURRENT NEURAL NETWORK AND CONVOLUTIONAL NEURAL NETWORK /
Atosha, Pascal Bahavu
SPEECH RECOGNITION USING RECURRENT NEURAL NETWORK AND CONVOLUTIONAL NEURAL NETWORK / PASCAL BAHAVU ATOSHA ; SUPERVISOR, ASST. PROF. DR. EMRE ÖZBİLGE - 69 sheets : illustrations, tables ; 30 cm +1 CD ROM
Thesis (MSc) - Cyprus International University. Institute of Graduate Studies and Research Computer Engineering
Recent years have seen tremendous advances in speech recognition
technology, which has become essential to applications such as virtual
assistants and transcription services. To improve the accuracy and robustness
of speech recognition systems, this thesis investigates the combined use of recurrent
neural networks (RNNs) and convolutional neural networks (CNNs). The study starts
with a thorough review of state-of-the-art speech recognition models, highlighting
the strengths and weaknesses of CNNs and RNNs. CNNs are well suited to extracting
structured features from spectrogram representations, whereas RNNs
excel at capturing temporal dependencies in sequential data. Motivated by these
complementary strengths, this study proposes a hybrid model that combines the
sequential learning ability of RNNs with the spatial feature extraction of CNNs.
Common metrics such as word error rate, match error rate, word information
lost, and word information preserved were used to evaluate the performance of the
combined model. The system achieved a word error rate of 0.2476, a match error rate
of 0.0732, a word information lost score of 0.36, and a word information preserved
score of 0.53. This research contributes to the ongoing development of
speech recognition technology by presenting a new method for combining
RNNs and CNNs in a way that exploits their complementary
strengths. For applications requiring accurate and reliable speech-to-text conversion,
the proposed combined model shows promise, as speech recognition remains an
essential part of human-computer interaction.
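
The four evaluation metrics named in the abstract can be illustrated with a minimal sketch. It assumes the commonly used definitions based on a word-level edit-distance alignment (hits H, substitutions S, deletions D, insertions I); the thesis's own evaluation code is not reproduced in this record, so the function names `align_counts` and `metrics` and the example sentences are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Counts:
    hits: int
    subs: int
    dels: int
    ins: int


def align_counts(reference: str, hypothesis: str) -> Counts:
    """Count hits, substitutions, deletions and insertions on a
    minimum-edit-distance word alignment (hypothetical helper)."""
    ref, hyp = reference.split(), hypothesis.split()
    # dp[i][j] = (cost, hits, subs, dels, ins) for ref[:i] versus hyp[:j]
    dp = [[(0, 0, 0, 0, 0)] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(1, len(ref) + 1):
        dp[i][0] = (i, 0, 0, i, 0)  # only deletions
    for j in range(1, len(hyp) + 1):
        dp[0][j] = (j, 0, 0, 0, j)  # only insertions
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            c, h, s, d, n = dp[i - 1][j - 1]
            if ref[i - 1] == hyp[j - 1]:
                diag = (c, h + 1, s, d, n)      # matching word: a hit
            else:
                diag = (c + 1, h, s + 1, d, n)  # substitution
            c, h, s, d, n = dp[i - 1][j]
            delete = (c + 1, h, s, d + 1, n)
            c, h, s, d, n = dp[i][j - 1]
            insert = (c + 1, h, s, d, n + 1)
            # Keep the cheapest option; ties prefer the diagonal move.
            dp[i][j] = min(diag, delete, insert, key=lambda t: t[0])
    _, h, s, d, n = dp[-1][-1]
    return Counts(h, s, d, n)


def metrics(reference: str, hypothesis: str) -> dict:
    """WER, MER, WIP and WIL under the assumed standard definitions."""
    c = align_counts(reference, hypothesis)
    n_ref = c.hits + c.subs + c.dels  # words in the reference
    n_hyp = c.hits + c.subs + c.ins   # words in the hypothesis
    if n_ref == 0:
        raise ValueError("reference must contain at least one word")
    errors = c.subs + c.dels + c.ins
    wer = errors / n_ref              # word error rate
    mer = errors / (c.hits + errors)  # match error rate
    wip = (c.hits / n_ref) * (c.hits / n_hyp) if n_hyp else 0.0
    return {"wer": wer, "mer": mer, "wip": wip, "wil": 1.0 - wip}


if __name__ == "__main__":
    print(metrics("the cat sat on the mat", "the cat sit on mat"))
```

Under these definitions lower is better for word error rate, match error rate, and word information lost, while higher is better for word information preserved.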
Computer Engineering--Dissertations, Academic