Method for Constructing Keyword Recognition Tools in Low-Resource Computer Systems
Abstract
Keyword spotting (KWS) in low-resource autonomous systems, such as ground drones, faces a fundamental trade-off between accuracy and computational efficiency. This paper presents a method for constructing recognition systems that addresses this trade-off by optimizing classical approaches rather than employing resource-intensive neural networks. The method is based on the principle of prioritizing feature informativeness, implemented through a weighted acoustic fingerprinting mechanism: Mel-Frequency Cepstral Coefficients (MFCCs) are weighted, aggregated, and transformed into compact string "fingerprints," which are then compared using the Levenshtein distance. The method was validated experimentally on a system that recognizes 100 drone commands, achieving an F1-score of 0.92 under ideal conditions and 0.78 at a signal-to-noise ratio of 5 dB. A comparative analysis showed that the developed approach significantly outperforms baseline classical methods and serves as an effective autonomous alternative to cloud services. The proposed method thus enables the creation of accurate and computationally lightweight keyword spotting systems that are fully adapted for operation on edge devices without network access.
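The fingerprinting pipeline described above (weight, aggregate, convert to a string, compare by Levenshtein distance) can be illustrated with a minimal sketch. It is not the implementation evaluated in the paper: the 16-symbol alphabet, the block size, the quantization range, the weighting scheme, and the function names `fingerprint`, `levenshtein`, and `spot` are all assumptions made for illustration, and the MFCC frames are taken as already computed.

```python
from typing import Dict, Optional, Sequence

# Hypothetical 16-symbol quantization alphabet (an assumption, not from the paper).
ALPHABET = "abcdefghijklmnop"


def fingerprint(mfcc_frames: Sequence[Sequence[float]],
                weights: Sequence[float],
                block: int = 4,
                lo: float = -1.0,
                hi: float = 1.0) -> str:
    """Turn precomputed MFCC frames into a compact string.

    Each frame is reduced to a weighted mean of its coefficients, the
    per-frame scores are averaged in blocks of `block` frames, and each
    block mean is clipped to [lo, hi] and quantized to one symbol.
    """
    total_w = sum(weights)
    scores = [sum(w * c for w, c in zip(weights, frame)) / total_w
              for frame in mfcc_frames]
    symbols = []
    for start in range(0, len(scores), block):
        chunk = scores[start:start + block]
        mean = sum(chunk) / len(chunk)
        clipped = min(max(mean, lo), hi)
        idx = int((clipped - lo) / (hi - lo) * (len(ALPHABET) - 1))
        symbols.append(ALPHABET[idx])
    return "".join(symbols)


def levenshtein(a: str, b: str) -> int:
    """Edit distance computed row by row with the standard dynamic program."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # deletion
                           cur[j - 1] + 1,             # insertion
                           prev[j - 1] + (ca != cb)))  # substitution
        prev = cur
    return prev[-1]


def spot(query: str, templates: Dict[str, str], max_dist: int) -> Optional[str]:
    """Return the closest command name, or None if nothing is within max_dist."""
    if not templates:
        return None
    best = min(templates, key=lambda name: levenshtein(query, templates[name]))
    return best if levenshtein(query, templates[best]) <= max_dist else None
```

In this sketch, a command template is simply the fingerprint of a reference recording, and a query is accepted only when its edit distance to the nearest template stays within `max_dist`. Both operations use integer and string arithmetic over short sequences, which is the kind of workload that suits an edge device.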


