Spanish Speech Recognition Oriented to a Wheelchair Control
Published 2016-03-03
Keywords
- closed vocabulary
- environmental noise
- language model
- speech recognition
- Microsoft SAPI
Copyright (c) 2016 Revista UIS Ingenierías
This work is licensed under a Creative Commons Attribution-NoDerivatives 4.0 International License.
Abstract
This paper presents a computer application that recognizes Spanish voice commands from a closed vocabulary in a speaker-independent manner. The Spanish language model adopted is the one provided with Microsoft® SAPI (Speech Application Programming Interface). This language model was restricted to recognize only the grammar related to the functions available to the user of the automated wheelchair studied by the Automatica research group of the Universidad Autónoma de Manizales. Tests to measure the performance of the recognition system were carried out separately by gender, in three environments whose noise-level ranges were differentiated according to current Colombian legislation on maximum permissible ambient noise levels. Notably, the recognition obtained is speaker independent and does not require the extensive prior training that other tools demand.
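The idea of restricting recognition to a closed command grammar can be illustrated with a minimal Python sketch. It does not use Microsoft SAPI and is not the authors' implementation: the command words, the action names, the confidence threshold, and the `interpret` function are all assumptions introduced here for illustration, since the abstract does not list the actual vocabulary.

```python
from typing import Optional

# Hypothetical closed vocabulary (Spanish command -> wheelchair action).
# The abstract does not list the real commands; these are placeholders.
COMMANDS = {
    "adelante": "FORWARD",
    "atrás": "BACKWARD",
    "izquierda": "LEFT",
    "derecha": "RIGHT",
    "alto": "STOP",
}

# Assumed rejection threshold; not a value taken from the paper.
MIN_CONFIDENCE = 0.6


def interpret(hypothesis: str, confidence: float) -> Optional[str]:
    """Map a recognizer hypothesis to an action, or None if it is rejected."""
    phrase = hypothesis.strip().lower()
    if confidence < MIN_CONFIDENCE:
        return None  # too uncertain to act on
    # Phrases outside the closed vocabulary yield None as well.
    return COMMANDS.get(phrase)


if __name__ == "__main__":
    for text, conf in [("Adelante", 0.92), ("adelante", 0.30), ("hola", 0.95)]:
        print(text, conf, interpret(text, conf))
```

Rejecting both low-confidence and out-of-vocabulary hypotheses is one plausible way to keep a noisy environment from triggering unintended movement; a real system would obtain the hypothesis and confidence from the speech engine rather than from hard-coded values.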