Sign Gesture Spotting In American Sign Language Using Dynamic Space Time Warping
American Sign Language (ASL) is the primary sign language used by approximately 500,000 deaf and hearing-impaired people in the United States and Canada. ASL is a visually perceived, gesture-based language that employs signs made by moving the hands, combined with facial expressions and body postures. Several software tools are available online for learning the sign for a given word, but no software gives the meaning of an arbitrary sign video. To find documents containing a given word, we can simply type it into a search engine such as Google or Bing; to find videos containing a given sign, however, we cannot simply type a word. One solution is to add English tags to each ASL video and run a keyword-based search over the tags. This method can be inefficient, because each sign video must be tagged manually with approximate English translations of each ASL gesture. Our objective is to develop a system that lets users efficiently search a database of ASL videos for a particular sign. Given an ASL story dataset, the user can use the system to obtain temporal information (start and end frames) about occurrences of the sign in the dataset. Recognizing gestures when their start and end frames in the video are unknown is called gesture spotting. The existing system evaluates the similarity between the dataset video and the sign video database using Dynamic Time Warping (DTW). DTW measures the similarity between two sequences that may vary in time or speed. In this paper we use a previously defined similarity measure, Dynamic Space Time Warping (DSTW). DSTW was defined as an extension of DTW to handle more than one hand candidate per frame. This makes it possible to find a better match by relaxing the assumption of correct hand detection.
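To make the idea of spotting with several hand candidates per frame concrete, the following is a minimal sketch, not the implementation evaluated in this work. It assumes that each query frame is a single 2D hand position, that each dataset frame carries a non-empty list of candidate hand positions, and that the local cost is Euclidean distance. The function name `dstw_spot`, the parameter `transition_weight`, and the choice to let the path switch freely between candidates are all assumptions made for illustration.

```python
import math

def dstw_spot(query, frames, transition_weight=0.0):
    """Spot `query` inside `frames` with a simplified DSTW-style recurrence.

    query  : list of feature tuples, one hand position per query frame.
    frames : list of candidate lists; frames[j] holds the hand candidates
             detected in dataset frame j (assumed non-empty).
    transition_weight : hypothetical penalty on the jump between the
             candidates chosen at consecutive steps of the warping path.
    Returns (cost, start_frame, end_frame) of the cheapest match.
    """
    prev = None  # prev[j][k] = (accumulated cost, start frame) for query row i-1
    for i, q in enumerate(query):
        cur = []
        for j, cands in enumerate(frames):
            cells = []
            for cand in cands:
                local = math.dist(q, cand)
                if i == 0:
                    # A match may begin at any dataset frame at no extra cost.
                    cells.append((local, j))
                    continue
                # Predecessors (i-1, j), (i-1, j-1), (i, j-1), over every candidate.
                sources = [(prev[j], frames[j])]
                if j > 0:
                    sources.append((prev[j - 1], frames[j - 1]))
                    sources.append((cur[j - 1], frames[j - 1]))
                best_cost, best_start = math.inf, -1
                for table, src_cands in sources:
                    for (cost, start), src in zip(table, src_cands):
                        total = cost + transition_weight * math.dist(src, cand)
                        if total < best_cost:
                            best_cost, best_start = total, start
                cells.append((best_cost + local, best_start))
            cur.append(cells)
        prev = cur
    # The match may end at any dataset frame: take the cheapest cell in the last row.
    return min((cost, start, j)
               for j, cells in enumerate(prev)
               for cost, start in cells)

# Toy data: the true hand follows (0,0) -> (1,1) -> (2,2) in frames 1..3,
# and every frame also contains a distractor candidate.
query = [(0, 0), (1, 1), (2, 2)]
frames = [
    [(9, 9), (5, 5)],
    [(9, 9), (0, 0)],
    [(1, 1), (9, 9)],
    [(9, 9), (2, 2)],
    [(7, 7), (9, 9)],
    [(9, 9), (6, 6)],
]
print(dstw_spot(query, frames))  # (0.0, 1, 3)
```

With `transition_weight` set to zero, the recurrence reduces to subsequence DTW in which the local cost of a frame is the minimum over its candidates. A positive weight penalizes paths that jump between distant candidates, which is one plausible way to express a transition cost.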
This paper contributes by establishing a baseline method for measuring state-of-the-art performance on this problem. We implemented a DSTW gesture spotting algorithm and evaluated it extensively on a real-world dataset built from ASL stories. We used transition costs in DSTW to improve accuracy, and we also evaluated accuracy based on motion scores for various signs. We used one-handed queries from the ASL dictionary and a symmetric rule to evaluate the classification accuracy of our system for each query.