ORCID
https://orcid.org/0000-0002-8092-7927
Year
2018
Season
Fall
Paper Type
Master's Thesis
College
College of Computing, Engineering & Construction
Degree Name
Master of Science in Computer and Information Sciences (MS)
Department
Computing
NACO controlled Corporate Body
University of North Florida. School of Computing
First Advisor
Dr. Ching-Hua Chuan
Second Advisor
Dr. Roger Eggen
Third Advisor
Dr. Sanjay Ahuja
Department Chair
Dr. Sherif Elfayoumy
College Dean
Dr. William F. Klostermeyer
Abstract
Powerful, handheld computing devices have proliferated among consumers in recent years. When combined with new cameras and sensors capable of detecting objects in three-dimensional space, these devices enable new gesture-based paradigms of human-computer interaction. One possible application of these developments is an automated sign language recognition system. This thesis reviews the existing body of work regarding computer recognition of sign language gestures as well as the design of systems for speech recognition, a similar problem. Little work has been done to apply the well-known architectural patterns of speech recognition systems to the domain of sign language recognition. This work creates a functional prototype of such a system, applying three architectures drawn from speech recognition and using a Hidden Markov Model classifier that achieves 75-90% accuracy. A thorough search of the literature indicates that no cloud-based system has yet been created for sign language recognition, making this the first implementation of its kind. Accordingly, there have been no empirical performance analyses of a cloud-based Automatic Sign Language Recognition (ASLR) system, which this research provides. The performance impact of each architecture, as well as of the data interchange format, is then measured in terms of response time and CPU, memory, and network usage across an increasing vocabulary of sign language gestures. The results discussed herein suggest that a partially-offloaded client-server architecture, where feature extraction occurs on the client device and classification occurs in the cloud, is the best choice for all but the smallest vocabularies. Additionally, the results indicate that for the potentially large data sets transmitted for 3D gesture classification, a fast binary interchange protocol such as Protobuf vastly outperforms a text-based protocol such as JSON.
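To illustrate the abstract's final point about interchange formats, the following is a minimal sketch, not code from the thesis: it compares the wire size of a JSON encoding against a raw binary encoding for a hypothetical gesture sample. Python's struct module stands in for a compact binary protocol such as Protobuf, and the frame and joint counts are assumptions chosen only to approximate the shape of 3D gesture data.

    import json
    import random
    import struct

    # Hypothetical gesture sample: 60 frames of 22 hand joints,
    # each joint an (x, y, z) float triple.
    frames = [[(random.random(), random.random(), random.random())
               for _ in range(22)] for _ in range(60)]

    # Text-based interchange: JSON renders every float as a decimal string.
    json_payload = json.dumps(frames).encode("utf-8")

    # Binary interchange: pack the same coordinates as raw 32-bit floats,
    # a stand-in for a compact wire format such as Protobuf.
    flat = [c for frame in frames for joint in frame for c in joint]
    binary_payload = struct.pack(f"{len(flat)}f", *flat)

    print(f"JSON:   {len(json_payload):6d} bytes")
    print(f"Binary: {len(binary_payload):6d} bytes")

Under these assumptions the binary payload is a fixed 4 bytes per coordinate, while the JSON payload is several times larger before compression, which is consistent with the bandwidth advantage the abstract attributes to binary protocols when gesture data is streamed to the cloud.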
Suggested Citation
Blair, James M., "Architectures for Real-Time Automatic Sign Language Recognition on Resource-Constrained Device" (2018). UNF Graduate Theses and Dissertations. 851.
https://digitalcommons.unf.edu/etd/851
Included in
Computer and Systems Architecture Commons, Digital Communications and Networking Commons, Systems and Communications Commons