International Journal of Robotic Engineering
(ISSN: 2631-5106)
Volume 4, Issue 1
Original Article
DOI: 10.35840/2631-5106/4114
Application of Model Based 3D Hand Tracking for Mimicking Robotic Wrist with Three Degrees of Freedom Using Microsoft Kinect Xbox One
Keith T Montorio1*, Eugene Nico D Hermano1, Chiara Donita I Martin1, Aileen Jovy E Guieb1, Jezerie A Capuluan1 and Roselito E Tolentino2,3
Table of Content
Figures
Figure 2: Segmented observed hand: a) Original RGB image; b) Skin detection; c) Skin color and depth segmentation; d) Depth of extracted hand by setting threshold.
Figure 3: a) Nodes of the 3D hand model; b) Actual look of the 3D hand model.
Figure 4: Movement of human wrist with a wearable device: a) Pitch; b) Yaw; c) Roll.
Figure 6: Different poses assigned by the previous study: a) Pose 1; b) Pose 2; c) Pose 3; d) Pose 4; e) Pose 5.
Figure 7: Graph for the angles made by the user and robotic wrist: a) For pitch movement; b) For yaw movement; c) For roll movement.
Tables
Table 1: Null and Alternative hypothesis.
Table 2: Resulting Angles after calculating the given quaternion values performed by the user.
Table 3: Resulting angles of the 3D hand model compared to the actual angles from each hand position and its angle difference and averages.
Table 4: Z-test evaluation of the angular data from the human wrist and robotic wrist.
Table 5: Resulting angles of both algorithm and robotic wrist for five poses in model based 3D hand tracking.
Table 6: Resulting angles of both algorithm and robotic wrist for five poses in the previous study which uses skeletal tracking.
References
- Rudio DJC, Esma KDB, Rosal LE, Caringal ABD, Yape DJE, et al. (2016) Application of microsoft xbox one for mimicking robotic wrist with three degree of freedom in different poses. International Journal of Engineering Research 5: 253-259.
- Oikonomidis I, Kyriazis N, Argyros AA (2011) Efficient model-based 3d tracking of hand articulations using kinect. BMVC 1-11.
Author Details
Keith T Montorio1*, Eugene Nico D Hermano1, Chiara Donita I Martin1, Aileen Jovy E Guieb1, Jezerie A Capuluan1 and Roselito E Tolentino2,3
1Member of Association of Electronics and Communications Engineering, Institute of Electronics Engineers of the Philippines, Philippines
2Instructor, Santa Rosa Campus, Polytechnic University of the Philippines, Philippines
3Instructor, De La Salle University - Dasmarinas, Philippines
Corresponding author
Keith T Montorio, Member of Association of Electronics and Communications Engineering, Institute of Electronics Engineers of the Philippines, Philippines.
Accepted: July 04, 2019 | Published Online: July 06, 2019
Citation: Montorio KT, Hermano END, Martin CDI, Guieb AJE, Capuluan JA, et al. (2019) Application of Model Based 3D Hand Tracking for Mimicking Robotic Wrist with Three Degrees of Freedom Using Microsoft Kinect Xbox One. Int J Robot Eng 4:014.
Copyright: © 2019 Montorio KT, et al. This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited.
Abstract
This study aimed to mimic human wrist movement in three degrees of freedom. Previous studies on robotic wrist mimicking used the Skeletal Tracking function of the Kinect with vector multiplication as their algorithm. In that approach, misalignment of the thumb with respect to the palm resulted in larger discrepancies, especially in the roll movement, which depends rigidly on the relationship between the thumb and the palm. The proponents address this problem by applying Model Based 3D Hand Tracking, which uses only one reference node, located at the center of the palm, to acquire the data needed to control the robotic wrist, so the misalignment of the thumb and palm no longer matters. The algorithm was implemented using Microsoft Xbox One, Python, LabVIEW and Arduino. The proponents then verified it by evaluating the angle differences between the algorithm and the actual user, and by comparing Skeletal Tracking with Model Based 3D Hand Tracking using the poses defined in the previous study.
Keywords
Microsoft Xbox One, Robotic Wrist, Mimicking, 3D Model Based Hand Tracking
Introduction
Substantial advances have been made in humanoid wrist robotics in recent years, and the field has extended to three-dimensional motion. The robotic wrist, whether part of a humanoid robot or of an industrial robot, has received great attention. Current research is targeting structures, designs and implementations that are more functional, more refined and closer to the human hand. The human wrist combines several degrees of freedom, articulations and positions but produces only small movements, which makes it more complicated to mimic than other body parts. This complexity is the challenging part for researchers, and it has driven the use of more recent algorithms and sensors to improve the imitation of human wrist motion.
The most recent study on wrist mimicking, "Application of Microsoft Xbox One for Mimicking Robotic Wrist with Three Degree of Freedom in Different Poses" by Rudio DJC, et al. [1], made the controller wireless in free space. The authors of that study used the Microsoft Kinect for Xbox One as their sensor, together with its Skeletal Tracking function. With Skeletal Tracking, they were able to detect the overall anatomy of the user as 25 joints, including the tips of the left and right index fingers, the left and right thumbs, and other body joints. They used the thumb and index finger as the basis for three imaginary vectors that allow the user to move the hand freely, and then obtained the angles of the human wrist movements using the vector dot product. With skeletal tracking, however, there was a larger angle difference between the human wrist and the robotic wrist for the pitch, yaw and especially the roll movement. The average angle differences for yaw, roll and pitch were 0.0532, -1.3842 and 0.9482, respectively. This is because the thumb vector is misaligned with the plane of the index finger vector. Whenever the user unconsciously moves the thumb while using the system, the acquired angle deviates further from the angle of the robotic wrist, especially for roll.
The proponents will solve this problem by using the concept of 3D hand tracking based on the study "Efficient Model-based 3D Tracking of Hand Articulations using Kinect" by Oikonomidis, et al. [2]. With this algorithm, a hand model will be fitted to the image of the hand captured by the Kinect sensor. The quaternion representation of the hand model will then be transformed into Euler angles to determine the yaw, pitch and roll angles to be mimicked.
Methodology
The proponents used the concept of Model Based 3D Hand Tracking, implemented in Python. The process begins with the segmentation of the observed hand. The segmented hand will be fitted with a 3D hand model consisting of nodes that are used to obtain the quaternion parameters. The acquired quaternion representation will then be converted to Euler angles, which are fed as input to the microcontroller. The microcontroller communicates with the robotic wrist, transferring the required pitch, yaw and roll angles to the servo motors installed on the robotic wrist and enabling it to mimic the movement of the human wrist. The proponents will use LabVIEW to check and view the response of the system.
Figure 1 illustrates the concept behind the design: the features of Model Based 3D Hand Tracking are used to acquire the angles that the robotic wrist needs in order to mimic the movement of the human wrist.
Application of model based 3D hand tracking for acquiring human wrist angle
Segmentation of the observed hand: The hand tracking process starts by extracting the observed hand. To extract it, the proponents applied skin color detection and depth segmentation. Figure 2 illustrates the segmentation of the observed hand from the original RGB image. The original image first passes through skin color detection, which removes the colors that do not correspond to skin. The resulting image then passes through depth segmentation, which isolates the region of interest, the hand.
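The two-stage segmentation described above can be sketched as follows. This is only an illustration under stated assumptions, not the proponents' implementation: the paper does not specify its skin color model or depth threshold, so the RGB rule of thumb, the 500 to 900 mm depth window and the function names `skin_mask` and `segment_hand` are all hypothetical.

```python
import numpy as np


def skin_mask(rgb):
    """Return a boolean mask of skin-like pixels.

    Assumption: a simple RGB rule of thumb with illustrative thresholds;
    the paper does not state which skin color model it uses.
    """
    rgb = rgb.astype(np.int32)  # avoid uint8 overflow in the differences
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    spread = rgb.max(axis=-1) - rgb.min(axis=-1)
    return (
        (r > 95) & (g > 40) & (b > 20)
        & (spread > 15)
        & (np.abs(r - g) > 15) & (r > g) & (r > b)
    )


def segment_hand(rgb, depth_mm, near_mm=500, far_mm=900):
    """Keep depth values only where a pixel is skin colored AND inside
    the depth window (hypothetical window, in millimetres)."""
    in_window = (depth_mm >= near_mm) & (depth_mm <= far_mm)
    mask = skin_mask(rgb) & in_window
    hand_depth = np.where(mask, depth_mm, 0)  # zero out everything else
    return hand_depth, mask
```

Combining the two masks with a logical AND mirrors the order in the text: color first narrows the candidates, and depth then discards skin-colored regions, such as the face, that lie outside the expected hand distance.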
Acquisition of Global Hand Node from the 3D Hand Model: The palm of the 3D hand model is an elliptical cylinder with two ellipsoids as caps. Each finger is made up of three cylinders and four spheres, except for the thumb, which has an ellipsoid, two cylinders and three spheres. The hand is thus modeled using two basic 3D primitives, a sphere and a cylinder (Figure 3). The 3D hand model is color coded: elliptical cylinders are yellow, ellipsoids are red, spheres are green and cylinders are blue. Alongside the colors, the hand model has 16 nodes, which are not displayed during the actual process: the global hand node located at the palm, 5 nodes for the basic finger joints, and 10 nodes for the remaining finger joints. This hand model is then estimated and positioned on the hand observation with respect to the calibration.
To make sure that the 3D hand model is fitted closely to the hand observation, Particle Swarm Optimization (PSO) is used. Through this process, the discrepancy between the estimated 3D hand model and the hand observation is minimized for a better result.
The velocity update of each particle can be expressed by this formula:
$$v_i(t+1) = w\left[v_i(t) + c_1 r_1\left(p_i(t) - x_i(t)\right) + c_2 r_2\left(g(t) - x_i(t)\right)\right]$$
Where:
$w$ = Constriction factor
$c_1$ = Cognitive component
$c_2$ = Social component
$r_1, r_2$ = Random samples of a uniform distribution
$v_i(t+1)$ = New velocity of the particle
$v_i(t)$ = Current velocity of the particle
$p_i(t) - x_i(t)$ = Vector from the current position $x_i(t)$ to $p_i(t)$, the best position found so far by the particle
$g(t) - x_i(t)$ = Vector from the current position $x_i(t)$ to $g(t)$, the best position found so far by the whole swarm
The particle's new position is simply the sum of its current position and its new velocity:
$$x_i(t+1) = x_i(t) + v_i(t+1)$$
Where:
$x_i(t+1)$ = New position of the particle
$x_i(t)$ = Current position of the particle
$v_i(t+1)$ = New velocity of the particle
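The velocity and position updates above can be sketched in a few lines of Python. This is a toy illustration and not the hand-fitting code: the objective function, the search range, the swarm size and the constants `w=0.7298`, `c1=2.8`, `c2=1.3` are assumptions chosen for the example, and `pso_step` and `minimize` are hypothetical names. One pair of random numbers is drawn per particle update, as the formula is written.

```python
import random


def pso_step(x, v, p_best, g_best, w=0.7298, c1=2.8, c2=1.3, rng=random):
    """Apply one velocity and position update to a single particle.

    x, v, p_best and g_best are equal-length lists of floats.
    """
    r1, r2 = rng.random(), rng.random()  # uniform samples in [0, 1)
    new_v = [
        w * (vi + c1 * r1 * (pi - xi) + c2 * r2 * (gi - xi))
        for xi, vi, pi, gi in zip(x, v, p_best, g_best)
    ]
    new_x = [xi + nvi for xi, nvi in zip(x, new_v)]
    return new_x, new_v


def minimize(objective, dim, n_particles=20, n_iters=100, seed=0):
    """Minimize `objective` with a basic swarm (illustrative settings)."""
    rng = random.Random(seed)
    xs = [[rng.uniform(-5, 5) for _ in range(dim)] for _ in range(n_particles)]
    vs = [[0.0] * dim for _ in range(n_particles)]
    p_bests = [x[:] for x in xs]
    p_scores = [objective(x) for x in xs]
    best = min(range(n_particles), key=p_scores.__getitem__)
    g_best, g_score = p_bests[best][:], p_scores[best]
    for _ in range(n_iters):
        for i in range(n_particles):
            xs[i], vs[i] = pso_step(xs[i], vs[i], p_bests[i], g_best, rng=rng)
            score = objective(xs[i])
            if score < p_scores[i]:  # update the particle's own best
                p_bests[i], p_scores[i] = xs[i][:], score
                if score < g_score:  # update the swarm's best
                    g_best, g_score = xs[i][:], score
    return g_best, g_score
```

In the hand tracking setting, the objective would be the discrepancy between the rendered hand model and the observed hand, and each particle position would be a candidate set of hand model parameters.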
Acquisition of quaternion representation: Of the 16 nodes in the 3D hand model, the study uses only the Global Hand Node, which is located at the center of the palm. The hand tracking algorithm represents orientation with quaternions because they avoid gimbal lock, the loss of one degree of freedom in three-dimensional space. A quaternion can be represented by the coordinates (w, x, y, z), where w, x, y and z are all real numbers. More specifically, the study uses a unit quaternion, and the measurements start from the identity orientation (1, 0, 0, 0).
The relation of x, y and z to w can be expressed by these formulas:
$$w = \cos(\Theta/2)$$
$$\lVert (x, y, z) \rVert = v = \sin(\Theta/2)$$
Where:
$v$ = Magnitude of the vector formed by the x, y and z coordinates
$\Theta$ = Rotation angle of the quaternion
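As a small illustration of this relation, the sketch below builds a unit quaternion from a rotation axis and an angle. The function name `quaternion_from_axis_angle` is hypothetical, and the axis-angle input is an assumption made for the example; the tracker itself delivers the (w, x, y, z) values directly.

```python
import math


def quaternion_from_axis_angle(axis, theta_rad):
    """Return the unit quaternion (w, x, y, z) for a rotation of
    theta_rad radians about `axis` (any non-zero 3-vector)."""
    ax, ay, az = axis
    norm = math.sqrt(ax * ax + ay * ay + az * az)
    if norm == 0.0:
        raise ValueError("rotation axis must be non-zero")
    # The vector part has magnitude sin(theta/2) and points along the axis.
    s = math.sin(theta_rad / 2.0) / norm
    return (math.cos(theta_rad / 2.0), ax * s, ay * s, az * s)
```

A zero rotation gives (1, 0, 0, 0), which matches the starting orientation described in the text, and for any angle the four components have unit length overall.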
Conversion of quaternion representation to Euler angles: After acquiring the (w, x, y, z) values, these quaternion components must be converted to yaw, pitch and roll angles. This is done by converting the quaternion to Euler angles, expressed in degrees:
$$\text{Roll} = \operatorname{atan2}\left(2(yw - xz),\; 1 - 2(y^2 + z^2)\right) \times \frac{180}{\pi}$$
$$\text{Yaw} = \arcsin\left(2(xy + zw)\right) \times \frac{180}{\pi}$$
$$\text{Pitch} = \operatorname{atan2}\left(2(xw - yz),\; 1 - 2(x^2 + z^2)\right) \times \frac{180}{\pi}$$
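The three conversion formulas translate directly into Python. The sketch below follows them term by term; the function name `quaternion_to_euler_deg` is hypothetical, and the clamping of the arcsine argument is an added safeguard against floating-point rounding that is not mentioned in the paper.

```python
import math


def quaternion_to_euler_deg(w, x, y, z):
    """Convert a unit quaternion to (pitch, yaw, roll) in degrees,
    using the same expressions as the formulas in the text."""
    roll = math.degrees(math.atan2(2 * (y * w - x * z), 1 - 2 * (y * y + z * z)))
    # Clamp to [-1, 1] so rounding error cannot push asin out of its domain
    # (an assumption added for robustness, not part of the paper).
    sin_yaw = max(-1.0, min(1.0, 2 * (x * y + z * w)))
    yaw = math.degrees(math.asin(sin_yaw))
    pitch = math.degrees(math.atan2(2 * (x * w - y * z), 1 - 2 * (x * x + z * z)))
    return pitch, yaw, roll
```

With these expressions, a pure rotation about the quaternion's x axis appears as pitch, about the y axis as roll, and about the z axis as yaw. How those axes map onto the physical wrist depends on the sensor's coordinate frame, which the text does not specify.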
Evaluation of the angles made by the user and the angles measured from the robotic wrist
The proponents will evaluate whether the algorithm is sufficient for acquiring the angle of the wrist in three degrees of freedom. A wearable tester composed of three potentiometers, which correspond to the pitch, yaw and roll movements, will be devised to measure the angles made by the user (Figure 4). As the user moves the hand, the potentiometers will produce varying, simultaneous signals that are fed to the microcontroller for interpretation. Through LabVIEW and Arduino, the gathered data will be transferred directly to Microsoft Excel for evaluation.
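One way a potentiometer reading could be turned into a wrist angle is a linear mapping, sketched below. Everything specific here is an assumption, since the paper gives no calibration details: a 10-bit analog reading (0 to 1023), a potentiometer with 270 degrees of linear travel, a calibrated reading of 512 at the neutral wrist position, and the hypothetical function name `pot_reading_to_angle`.

```python
def pot_reading_to_angle(raw, raw_at_zero=512, travel_deg=270.0, adc_max=1023):
    """Map a raw analog reading to an angle in degrees.

    Assumes a linear potentiometer whose full electrical travel
    (`travel_deg`) spans the whole 0..adc_max range, and a calibrated
    reading `raw_at_zero` at the neutral (0 degree) position.
    """
    if not 0 <= raw <= adc_max:
        raise ValueError("reading outside the converter range")
    return (raw - raw_at_zero) * travel_deg / adc_max
```

In practice each of the three potentiometers would need its own `raw_at_zero`, measured with the wrist held in the neutral pose.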
The proponents will evaluate the angular movement of the wrist produced by the 3D hand model against the angles measured from the robotic wrist. The proponents will use LabVIEW to simulate and evaluate the gathered data. LabVIEW Robotics provides a way to interface the robotic wrist with the angles supplied as input by the Python program. Communication between LabVIEW and Arduino is possible through the LabVIEW Interface for Arduino (LIFA) toolkit, which allows developers to acquire data from the Arduino microcontroller and process it in the LabVIEW Integrated Development Environment (IDE).
After getting the angular data from the 3D hand model and from the robotic wrist, the proponents will test whether there is a significant difference between the angles produced by the algorithm and the robotic wrist angles. The proponents will apply a Z-test to the acquired angles to evaluate the response of the control system. A Z-test requires defining the null hypothesis (Ho), the alternative hypothesis (H1) and the critical value against which the computed z value is compared (Figure 5).
To find the critical value for a two-tailed test, the significance level (α) is set to the standard value of 5%. This significance level gives a confidence level of 95% (from 100% - α), and the cumulative area under the normal curve at the critical value is 0.975 (from 1 - (α/2)). Using this area, the proponents consulted the z-test table (areas under the normal curve) and obtained a critical value of 1.96.
Table 1 shows the null and alternative hypotheses and the condition under which each is accepted in the Z-test. Depending on the computed z value, the proponents will accept either the null hypothesis or the alternative hypothesis. If the computed z value is within the range -1.96 to 1.96 (the negative and positive critical values), the null hypothesis is accepted; otherwise the alternative hypothesis is accepted.
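The critical value and the decision rule can be reproduced with Python's standard library instead of a printed table. This is a minimal sketch; the function names `critical_value` and `is_significant` are hypothetical.

```python
from statistics import NormalDist


def critical_value(alpha=0.05):
    """Two-tailed critical z value: the point of the standard normal
    distribution below which an area of 1 - alpha/2 lies."""
    return NormalDist().inv_cdf(1 - alpha / 2)


def is_significant(z, alpha=0.05):
    """Return True when z falls outside the range of the critical values,
    i.e. when the alternative hypothesis would be accepted."""
    return abs(z) > critical_value(alpha)
```

For α = 5% the computed critical value rounds to 1.96, in agreement with the table lookup described above.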
The proponents will obtain the z value using the Z-test equation:
$$Z = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\dfrac{\sigma_1^2}{n_1} + \dfrac{\sigma_2^2}{n_2}}}$$
Where:
$Z$ = z-test result
$n_1$ = Number of samples in the first sample group
$n_2$ = Number of samples in the second sample group
$\bar{x}_1$ = Mean value of the first sample group
$\bar{x}_2$ = Mean value of the second sample group
$\sigma_1$ = Standard deviation of the first sample group
$\sigma_2$ = Standard deviation of the second sample group
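A two-sample z statistic built from the quantities listed above can be sketched as follows, assuming the standard form: the difference of the two means divided by the square root of the sum of each group's variance over its sample count. The function name `two_sample_z` is hypothetical, and using the sample standard deviation (`statistics.stdev`) as the estimate of σ is an assumption, since the paper does not say how σ1 and σ2 were computed.

```python
import math
from statistics import mean, stdev


def two_sample_z(sample1, sample2):
    """Return the two-sample z statistic for two lists of angles."""
    n1, n2 = len(sample1), len(sample2)
    m1, m2 = mean(sample1), mean(sample2)
    # Assumption: sample standard deviations stand in for sigma1 and sigma2.
    s1, s2 = stdev(sample1), stdev(sample2)
    return (m1 - m2) / math.sqrt(s1 ** 2 / n1 + s2 ** 2 / n2)
```

Here `sample1` would hold the angles produced by the algorithm and `sample2` the angles measured from the robotic wrist for one movement, giving one z value each for pitch, yaw and roll.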
Comparison of the angles made by the model based 3D hand tracking with the previous skeletal tracking algorithm
The proponents will evaluate and compare the data gathered from the Skeletal Tracking Algorithm and Model Based 3D Hand Tracking. For this comparison, the proponents will use the poses defined by the previous study (Figure 6) and the reference angles assigned for yaw, pitch and roll, which are 15, 25 and 30 degrees, respectively. The proponents will compare the computed angle differences of the five poses under the Skeletal Tracking and Model Based 3D Hand Tracking algorithms for the yaw, roll and pitch movements to verify the accuracy of the system.
Results and Discussion
Application of model based 3D hand tracking for acquiring human wrist angle
From the movement made by the user, the proponents obtained the quaternion representation corresponding to the position of the hand. This quaternion representation consists of the X, Y, Z and W values of the global hand node, the reference node located at the center of the palm. Because only the global hand node is used, the other nodes on the fingers, including the thumb, do not affect the quaternion representation being gathered, and the misalignment of the thumb with respect to the palm is nullified. Table 2 shows the quaternion representation acquired by the algorithm. The proponents applied the Euler angle formulas to the quaternion values from the global hand node to acquire the pitch, yaw and roll angles made by the human hand.
Evaluation of the angles made by the user and the angles measured from the robotic wrist
The proponents evaluated the algorithm by computing the difference between the angles made by the user and the angles produced by the 3D hand model. The angles gathered for the pitch, yaw and roll movements from both the wearable device and the algorithm are shown in Table 3.
Table 3 shows the actual angles made by the user, measured with the wearable device, the resulting angles from the algorithm, and the average difference for each movement. The proponents observed only minimal angle differences for the pitch, yaw and roll movements: the average angle differences are -0.2285, -0.5508 and 0.5729, respectively. This implies that the algorithm acquired the movement of the human wrist with only minimal error.
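The average angle difference for one movement could be computed as sketched below. The sign convention (algorithm angle minus wearable angle) and the function names are assumptions, as the paper does not state them. A mean absolute difference is included as well, because signed differences of opposite sign cancel in a plain average.

```python
from statistics import mean


def average_angle_difference(actual_deg, estimated_deg):
    """Signed mean of (estimated - actual); the sign convention is assumed."""
    if not actual_deg or len(actual_deg) != len(estimated_deg):
        raise ValueError("both lists must be non-empty and of equal length")
    return mean(e - a for a, e in zip(actual_deg, estimated_deg))


def mean_absolute_difference(actual_deg, estimated_deg):
    """Mean of |estimated - actual|, which does not let errors cancel."""
    if not actual_deg or len(actual_deg) != len(estimated_deg):
        raise ValueError("both lists must be non-empty and of equal length")
    return mean(abs(e - a) for a, e in zip(actual_deg, estimated_deg))
```

Each function takes the wearable-device angles and the algorithm angles for a single movement as two lists of equal length.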
Figure 7 shows the movement of the user with the 3D hand model applied and, underneath it, the corresponding response of the 3D hand model and the robotic wrist. The graph shows the angle measured from the actual user by Model Based 3D Hand Tracking (white line) and the angle of the robotic wrist (red line). The white line is observed to lead the red line.
Table 4 shows the z-test evaluation of the angular data gathered from the human wrist and the robotic wrist. The table lists the number of samples ($n_1$ and $n_2$), the means of the samples ($\bar{x}_1$ and $\bar{x}_2$), the standard deviations of the samples ($\sigma_1$ and $\sigma_2$) and the z-test results. As presented, pitch has the lowest z-test value, which implies that pitch is the movement that the robotic wrist mimics best. Overall, the table shows that every z-test result is within the range of -1.96 to +1.96. Thus, there is no significant difference between the robotic wrist angles and the actual user angles, which means that the robotic wrist angles are close to the actual wrist angles.
Comparison of the angles made by the model based 3D hand tracking with the previous skeletal tracking algorithm
Using the poses defined by the previous study, the proponents compared the angles produced by the Model Based 3D Hand Tracking Algorithm and the Skeletal Tracking Algorithm. The proponents used the reference angles assigned by the previous study for pitch, yaw and roll, which are 25, 15 and 30 degrees, respectively.
Table 5 shows the results for the pitch, yaw and roll movements for the five poses using model based 3D hand tracking, while Table 6 shows the resulting angles for the same movements and poses in the previous study, which used skeletal tracking as its algorithm. By using the same five poses and the same reference angles as the previous researchers, the proponents were able to compare their results with those of the previous study. Comparing the average angle differences, the proponents observed that model based 3D hand tracking produced smaller average angle differences than the previous skeletal tracking algorithm. Specifically, the average angle differences for model based 3D hand tracking in the pitch, yaw and roll movements are 0.03, -0.02092 and 0.29032, respectively. Referring to Table 6, the average angle differences using skeletal tracking are 0.9482, 0.0532 and -1.3842 in the pitch, yaw and roll movements, respectively. This implies that model based 3D hand tracking is more effective than the skeletal tracking algorithm in mimicking the human wrist in three degrees of freedom.
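Because some of the reported averages are negative, the comparison is clearest when made on magnitudes. The short check below uses only the average angle differences quoted in the text; the dictionary and function names are hypothetical.

```python
# Average angle differences quoted in the text (Tables 5 and 6).
MODEL_BASED = {"pitch": 0.03, "yaw": -0.02092, "roll": 0.29032}
SKELETAL = {"pitch": 0.9482, "yaw": 0.0532, "roll": -1.3842}


def smaller_error_method(axis):
    """Name the method whose average difference is smaller in magnitude."""
    if abs(MODEL_BASED[axis]) < abs(SKELETAL[axis]):
        return "model based"
    return "skeletal"


for axis in ("pitch", "yaw", "roll"):
    print(axis, abs(MODEL_BASED[axis]), abs(SKELETAL[axis]), smaller_error_method(axis))
```

For all three movements the magnitude is smaller for model based 3D hand tracking, and the largest gap is in roll.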
The proponents also observed that the roll angle difference in each pose is lower with model based 3D hand tracking than with skeletal tracking. This indicates that the misalignment of the thumb with respect to the palm, which greatly affected the roll movement because the previous algorithm depended on the thumb, was resolved. The proponents also noticed that pose four has the smallest angle difference. This is because the user's hand directly faces the Kinect, so the hand is best detected in the image captured by the sensor. Pose two, specifically its roll, has the largest angle difference. This is because a large portion of the hand is hidden from the sensor when the hand does not directly face the Kinect, which causes poor placement of the 3D hand model on the actual hand. Roll also has the largest average difference, which indicates that roll is more difficult to mimic than yaw and pitch. As seen in Figure 8, roll movement obscures most of the hand, unlike yaw and pitch movement, and this causes poor detection of the observed hand and ambiguity in placing the 3D hand model on the captured image. Because the Global Hand Node is located in the 3D hand model itself, an incorrect fit of the 3D hand model produces an incorrect quaternion representation. As a result, a large difference between the resulting angle and the actual angle occurs.
Conclusion
This study eliminated the effect of the misalignment of the thumb with respect to the palm, since only the node on the palm, the Global Hand Node, was used. Among the three movements, roll was the most difficult to mimic based on the data obtained, because the movement itself obscures most of the hand. Thus, in applying model based 3D hand tracking to mimic the human wrist, proper detection of the hand is necessary.
The z-test results for the pitch, yaw and roll movements in this study are 0.5003, 0.1257 and 0.3885, respectively. These results are within the range of the critical values, which implies that there is no significant difference between the robotic wrist angles and the angles produced by the algorithm.
The Model Based 3D Hand Tracking Algorithm is more effective in mimicking the human wrist in three degrees of freedom than the Skeletal Tracking Algorithm of the previous study. The proponents also noticed that pose four has the smallest angle difference, because the hand is properly detected, and that pose two, specifically its roll, has the largest angle difference, because the hand is poorly detected.
Recommendation
Based on the evaluated data, most of the angle differences are caused by poor detection of the user's hand. Angle discrepancies between the robotic wrist and the user's hand vary depending on how well the hand is detected. Although the misalignment of the thumb with respect to the palm was resolved by the use of 3D hand tracking, the proponents suggest that future researchers improve the algorithm by enhancing the detection of the user's hand even when a large portion of the hand is obscured from the sensor or when the hand is in a closed fist.
Based on the presented graph, the Model Based 3D Hand Tracking (white line) leads the robot (red line), indicating a delay in the transmission of data. The proponents suggest improving the mechanical robot for a better response.
From the comparison made, the proponents observed that model based 3D hand tracking in different poses is most effective when the hand directly faces the Kinect sensor, where the hand is best detected. The proponents therefore recommend that future researchers use an algorithm that can better mimic the human hand even when the hand is not directly facing the Kinect.
Acknowledgements
The authors would like to express their deepest gratitude to the following, who made this research paper possible within the time provided:
First of all, to the researchers' professor and thesis adviser, Engr. Roselito E. Tolentino, for unselfishly sharing his knowledge and time and for his continuous support in guiding the authors throughout the completion of the paper; second, to the panel members, Engr. Marie Grace P. Tolentino, Engr. Teresita B. Gonzales and Engr. Katrina B. Acapulco, for their constructive criticism of the research; third, to the researchers' families, for extending their emotional, moral and financial support. Their presence helped the researchers to be always ready and equipped whenever they engaged in writing this research. Their sacrifices in staying up late to make sure that the researchers got home safely after working on the paper will always be appreciated; last but surely not least, the researchers offer this success to Almighty God. All of the researchers' sacrifices paid off because of His presence. His guidance kept the researchers from losing their patience and from losing track of what they were doing. The researchers are grateful for God's unending grace, for granting them presence of mind in the times when they needed it most, and for keeping their group united to finish what they had started. For this, the researchers humbly offer this research and their deepest gratitude to Him.