Automatically Translating Sign Language


"Ready-made functions in LabVIEW perform most of the processing to extract image features and information for translation."

- Dr. Rini Akmeliawati, Monash University Malaysia

The Challenge:
Using pattern matching to identify hand and finger movements to translate Malaysian sign language in real time with more than 80 percent accuracy.

The Solution:
Developing a vision-based system using the NI Vision Development Module and NI LabVIEW software to translate Malaysian sign language.

Author(s):
Dr. Rini Akmeliawati - Monash University Malaysia

We chose LabVIEW and the NI Vision Development Module to build an improved sign language translator based on work done by previous Monash students using MathWorks, Inc. MATLAB® software. One of the challenges the previous students faced was the inability of MATLAB software to perform parallel processing: each captured image had to be preprocessed by the algorithm before image processing and recognition could proceed, which often delayed translation. We overcame this challenge by using LabVIEW and the Vision Development Module to run these tasks in parallel.

This system processes sign language in real time by using LabVIEW and the Vision Development Module to acquire, process, and recognize images of signs. It consists of four main stages: image acquisition and handling; image processing; image analysis and feature extraction; and the recognition stage, also known as computer vision. The system uses a custom-made colored glove (see Figure 1) to identify hand and finger motions. Ready-made functions in LabVIEW perform most of the processing to extract image features and information to translate.

The recognition, or computer-vision, stage is the most crucial part of the system. The system uses mathematical algorithms to match the unique features of the signs against the database. Static signs are recognized by the positions of the colored finger tips, and dynamic signs are identified by tracking the path of the motion using a pattern-matching algorithm.

Image Acquisition and Handling

The image acquisition stage acquires images and prepares them for the image processing stage by storing and organizing them appropriately. In brief, image acquisition involves acquiring, sampling, storing, and deleting images.

You start the image acquisition by clicking the run button on the LabVIEW VI front panel (see Figure 2). The process automatically terminates image acquisition when the hand remains static for two successive frames. The system tracks the movement of the yellow region (palm) of the hand and measures the distance of movement between successive frames. The centroid of the yellow region is the point of measurement.

The centroid is the center of energy of an image and is calculated using an available LabVIEW function called IMAQ Centroid. To obtain the centroid of the yellow region, the region is first segmented using color thresholding, with the thresholding limits determined in the NI Vision Assistant. The Point Distances function in the Vision Development Module then computes the distance between the centroids.
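IMAQ Centroid and Point Distances are LabVIEW/Vision Development Module functions; purely as a rough illustration of the same steps outside LabVIEW, the Python/OpenCV sketch below segments a yellow region by color thresholding, computes its centroid, and measures the distance between two centroids. The HSV limits are placeholder values, not the limits found with the Vision Assistant.

    # Illustrative sketch only, not the LabVIEW implementation.
    import cv2
    import numpy as np

    # Placeholder HSV limits for the yellow palm patch; the real limits
    # were determined with the NI Vision Assistant.
    YELLOW_LO = np.array([20, 100, 100])
    YELLOW_HI = np.array([35, 255, 255])

    def yellow_centroid(frame_bgr):
        """Return the (x, y) centroid of the yellow region, or None if it is absent."""
        hsv = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2HSV)
        mask = cv2.inRange(hsv, YELLOW_LO, YELLOW_HI)      # 255 inside the limits, 0 outside
        m = cv2.moments(mask, binaryImage=True)
        if m["m00"] == 0:
            return None
        return (m["m10"] / m["m00"], m["m01"] / m["m00"])

    def centroid_distance(c1, c2):
        """Euclidean distance between two centroids (the role of Point Distances)."""
        return float(np.hypot(c1[0] - c2[0], c1[1] - c2[1]))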

After color thresholding, the system obtains the centroid of the yellow region in each frame and measures the distance between the centroids of successive frames. The measured distance is used to confirm there is no movement (for example, when the distance between centroids is less than one pixel). The distance condition is not set at zero because it is difficult for the hand not to move at all. However, with this condition alone, image acquisition stops almost immediately after the run button is clicked, because the frames acquired while the signer prepares to sign are static and meet the condition. To prevent this, the condition is checked only after more than three images have been acquired, giving the signer more time to prepare for a movement.
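As a minimal sketch of that stopping logic (reusing the yellow_centroid and centroid_distance helpers from the snippet above, and assuming a standard webcam; the frame count and one-pixel threshold follow the description above):

    # Illustrative acquisition loop: stop once the palm centroid moves less
    # than one pixel between successive frames, but only after more than
    # three frames have been acquired.
    import cv2

    def acquire_sign(camera_index=0, min_frames=3, stop_threshold=1.0):
        cap = cv2.VideoCapture(camera_index)
        frames, prev_centroid = [], None
        try:
            while True:
                ok, frame = cap.read()
                if not ok:
                    break
                frames.append(frame)
                centroid = yellow_centroid(frame)           # helper from the previous sketch
                if centroid is not None and prev_centroid is not None:
                    moved = centroid_distance(centroid, prev_centroid)
                    if len(frames) > min_frames and moved < stop_threshold:
                        break                               # hand is effectively static
                prev_centroid = centroid
        finally:
            cap.release()
        return frames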

Image Processing

In this stage, the system extracts features from every frame for the pattern-matching algorithm used later in the recognition stage. These processes include color thresholding to track the hand, creating a bounding box to track the positions of the finger tips, and analyzing the features that characterize the images within the bounding box. This information is passed on to the last stage for recognition.

The image processing method is the same for processing alphabets and numbers, but different for processing words. For alphabets and numbers, the first stage performs color thresholding to track the location of the yellow region (palm) followed by the colored finger tips. To process static signs, the program needs only to read one image because all images acquired are the same. In this case, the program reads the first frame from the acquisition.

A word is formed based on the motion path of the signer’s hand. When a word is signed, a set of images is acquired. The program reads and processes the images sequentially, one by one, to obtain the features of each frame, such as the centroids and the motion path.
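A minimal sketch of that per-frame loop, again reusing the yellow_centroid helper from the acquisition sketch (here the motion path is just the ordered sequence of palm centroids; the real program also extracts the finger-tip features described below):

    def motion_path(frames):
        """Collect the palm centroid of every frame, in order, as the motion path."""
        path = []
        for frame in frames:
            c = yellow_centroid(frame)
            if c is not None:
                path.append(c)
        return path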

The first step in the image processing stage is color segmentation using the color thresholding method, followed by extracting features and information from the segmented colors. In color thresholding, the color model for thresholding is selected before determining the threshold limits. The thresholding limit is determined with the NI Vision Assistant.

The system then incorporates the threshold limits in the main program. In the color thresholding function, all values within the threshold range are set to 255, the value of white pixels, and all values outside the range are set to zero, the value of black pixels.
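The rule itself is simple; a one-function NumPy sketch of it is shown below (the limits are placeholders for the values found with the Vision Assistant, and the real system applies the rule per plane of the chosen color model):

    import numpy as np

    def threshold_plane(plane, lo, hi):
        """Pixels inside [lo, hi] become 255 (white); everything else becomes 0 (black)."""
        return np.where((plane >= lo) & (plane <= hi), 255, 0).astype(np.uint8)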

The resulting image from the color thresholding is in binary form (black and white). In such images, the white pixels are treated as the object and the black pixels as the background. However, some pixels outside the region of interest may have values within the threshold limits. These are normally small particles, and they reduce the accuracy of object analysis, such as computing the centroid of the object. The system removes these small particles using the RemoveParticle function.
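RemoveParticle is the Vision Development Module function used here; a rough equivalent, shown only to illustrate the idea, is to drop connected white regions whose area falls below an assumed limit:

    import cv2
    import numpy as np

    def remove_small_particles(binary_mask, min_area=50):       # min_area is an assumed value
        num, labels, stats, _ = cv2.connectedComponentsWithStats(binary_mask, connectivity=8)
        cleaned = np.zeros_like(binary_mask)
        for label in range(1, num):                              # label 0 is the background
            if stats[label, cv2.CC_STAT_AREA] >= min_area:
                cleaned[labels == label] = 255                   # keep only large particles
        return cleaned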

Image Analysis and Feature Extraction

After color thresholding, the resulting images are in binary form. Each finger tip in the image is represented by a white particle, and different colors are separated into different images. These binary images are passed to the Count Objects 2 function in the NI Vision Development Module. This function computes the properties of the particles in the image, including the number of particles, each particle's centroid and area, the orientation of the particle, and the number of holes in the particle.

From the set of object properties available, only the particle centroids (colored finger tips) are used for recognition. At the present stage, this information is sufficient because different signs have unique combinations of centroid positions. The centroid of each particle is extracted in terms of the X and Y coordinates. These values are inserted into an array and passed on to the recognition stage to identify the sign. This only works for static signs, such as alphabets and numbers. The centroids represent the position of each finger tip with respect to the bounding box created earlier. The bounding box is 100 pixels by 110 pixels.
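Count Objects 2 is the Vision Development Module function used in the project; purely as an illustration, the sketch below extracts particle centroids from each color's binary image with a connected-component pass and collects them into one feature array, normalized to the 100 by 110 pixel bounding box (the color names, ordering, and normalization are assumptions for the example):

    import cv2
    import numpy as np

    BOX_W, BOX_H = 100, 110                                      # bounding box size from the text

    def fingertip_centroids(binary_mask):
        """Centroids (x, y) of all white particles in one color's binary image."""
        num, _, _, centroids = cv2.connectedComponentsWithStats(binary_mask)
        return [tuple(centroids[i]) for i in range(1, num)]      # skip background label 0

    def build_feature_vector(masks_by_color):
        """Concatenate the fingertip centroids of every color into one flat array."""
        features = []
        for color in sorted(masks_by_color):                     # fixed order keeps vectors comparable
            for (x, y) in fingertip_centroids(masks_by_color[color]):
                features.extend([x / BOX_W, y / BOX_H])          # normalize to the bounding box
        return np.array(features)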

Sign Recognition

This is the last stage of the process. The system uses different algorithms for recognizing static and dynamic signs. Recognition of a static sign is based on the positions of the fingers within the bounding box. For dynamic signs, the system needs to identify the trajectory of the signer’s hand and fingers: the trajectory of the hand is tracked and represented as a curve, and a curve-matching algorithm compares the motion trajectory of the input sign with those in the database.
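The project relies on LabVIEW pattern-matching and curve-matching functions for this step; the sketch below is only a simplified stand-in that shows the two paths, matching a static feature vector and a resampled motion trajectory against reference entries by nearest neighbor:

    import numpy as np

    def recognize_static(feature_vec, database):
        """database maps a sign label to a reference feature vector of the same length."""
        return min(database, key=lambda label: np.linalg.norm(feature_vec - database[label]))

    def trajectory_distance(traj_a, traj_b, samples=32):
        """Resample two centroid trajectories to a common length and compare point-wise."""
        def resample(traj):
            traj = np.asarray(traj, dtype=float)
            idx = np.linspace(0, len(traj) - 1, samples)
            return np.stack([np.interp(idx, np.arange(len(traj)), traj[:, d]) for d in range(2)], axis=1)
        return float(np.mean(np.linalg.norm(resample(traj_a) - resample(traj_b), axis=1)))

    def recognize_dynamic(trajectory, trajectory_database):
        """Pick the word whose reference trajectory lies closest to the input motion path."""
        return min(trajectory_database,
                   key=lambda label: trajectory_distance(trajectory, trajectory_database[label]))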

Results

The sign language translator we developed recognizes isolated alphabet letters (A to Z) and numbers (0 to 9) with 97.79 percent accuracy, and thirteen words forming sentences with 80.38 percent real-time accuracy.

MATLAB® is a registered trademark of MathWorks, Inc.

Author Information:
Dr. Rini Akmeliawati
Monash University Malaysia
JLN Lagoon Selatan, Sunway
Selangor
Malaysia
Fax: +60 3 55146207
rini.akmeliawati@eng.monash.edu.my
