000 02716nam a22002657a 4500
003 KOHA
005 20241014092710.0
008 240924d2024 cy d a|| |||| 00| 0 eng d
040 _aCY-NiCIU
_beng
_cCY-NiCIU
_erda
041 _aeng
090 _aYL 3365
_bA86 2024
100 1 _aAtosha, Pascal Bahavu
245 1 0 _aSPEECH RECOGNITION USING RECURRENT NEURAL NETWORK AND CONVOLUTIONAL NEURAL NETWORK /
_cPASCAL BAHAVU ATOSHA ; SUPERVISOR, ASST. PROF. DR. EMRE ÖZBİLGE
264 _c2024
300 _a69 sheets :
_billustrations, tables ;
_c30 cm
_e+1 CD ROM
336 _2rdacontent
_atext
_btxt
337 _2rdamedia
_aunmediated
_bn
338 _2rdacarrier
_avolume
_bnc
502 _aThesis (MSc) - Cyprus International University. Institute of Graduate Studies and Research Computer Engineering
520 _aRecent years have seen tremendous advances in speech recognition technology, which has become essential to many applications, such as virtual assistants and transcription services. To improve the accuracy and robustness of speech recognition systems, this thesis investigates the combined use of recurrent neural networks (RNNs) and convolutional neural networks (CNNs). The study starts with a thorough analysis of state-of-the-art speech recognition models, stressing the advantages and disadvantages of CNNs and RNNs. CNNs are well suited to extracting structured features from spectrogram representations, whereas RNNs excel at capturing temporal dependencies in sequential data. Motivated by these complementary strengths, this study proposes a combined model that joins the sequential learning ability of RNNs with the spatial feature extraction of CNNs. The combined model was evaluated with common metrics: word error rate, match error rate, word information lost, and word information preserved. The system achieved a word error rate of 0.2476, a match error rate of 0.0732, a word information lost of 0.36, and a word information preserved of 0.53. These results contribute to the ongoing discussion on the development of speech recognition technology by presenting a new method for combining the advantages of RNNs and CNNs in a way that exploits their complementary strengths. Because speech recognition remains an essential part of human-computer interaction, the proposed combined model shows promise for applications requiring accurate and reliable speech-to-text conversion.
650 0 _aComputer Engineering
_vDissertations, Academic
700 1 _aÖzbilge, Emre
_esupervisor
942 _2ddc
_cTS
999 _c292842
_d292842