Formalization of the keyword recognition process in speech signal

Keywords: keyword recognition, process formalization, speech signal, conceptual model, deep learning, neural networks, Markov models.

Abstract

This work formalizes the process of keyword recognition in speech signals and uses that formalization as the basis for a detailed conceptual model. The model gives a systematic description of how effective recognition tools are built, covering feature extraction, modeling methodology, training strategy, testing procedures, and usage mechanisms. Particular attention is given to deep learning methods, in particular neural networks, used together with Markov models and data augmentation techniques to improve keyword recognition accuracy. The work takes into account the main challenges of the task: variability of speech signals, noise resilience, limited training data, and real-time performance requirements. The formalized approach makes it possible to optimize the interaction among components and to improve the overall efficiency of a keyword recognition system.

Published
2024-06-16
How to Cite
Didus, A., & Tereikovskyi, I. (2024). Formalization of the keyword recognition process in speech signal. COMPUTER-INTEGRATED TECHNOLOGIES: EDUCATION, SCIENCE, PRODUCTION, (55), 78-86. https://doi.org/10.36910/6775-2524-0560-2024-55-09
Section
Computer science and computer engineering