Master Thesis: Using MS Kinect Device for Natural User Interface
         Contents
Figure 3-30: A class diagram describing an object model of the interaction info data structure .......... 47
Figure 3-31: An object model of the touch-less interaction .......... 49
Figure 3-32: A class diagram describing an object model of the action detector .......... 50
Figure 3-33: An object model of the gesture .......... 53
Figure 3-34: A state diagram of the wave gesture .......... 53
Figure 3-35: An object model of the wave gesture .......... 54
Figure 3-36: A state diagram of the swipe gesture .......... 55
Figure 3-37: An object model of the swipe gestures .......... 56
Figure 3-38: An iteration process of the NUI development .......... 56
Figure 3-39: A block diagram of the WPF touch device implementation .......... 57
Figure 3-40: An illustration of cursor actions, from the left: point and wait timer, grip released, grip pressed
Figure 3-44: A chart showing the results of the level of comfort for the planar physical interaction zone .......... 68
Figure 3-45: A chart showing the results of the level of comfort for the curved physical interaction zone
2.1 Natural User Interface .......... 2
  2.1.1 Multi-touch Interface .......... 3
  2.1.2 Touch-less Interface .......... 4
2.2 Microsoft Kinect Sensor
  2.2.1 Inside the Kinect
  2.2.2 Field of View .......... 7
  2.2.3 Software Development Kits .......... 8
2.3 Microsoft Kinect for Windows SDK
  2.3.1 Depth Stream
  2.3.2 Color Stream .......... 11
  2.3.3 Skeletal Tracking .......... 12
  2.3.4 Face Tracking Toolkit .......... 13
  2.3.5 Interaction Toolkit .......... 14
3 Realization .......... 16
  3.1 Design and Analysis .......... 16
    3.1.1 Kinect Device Setup .......... 16
    3.1.2 Interaction Detection
    3.1.3 Interaction Quality
Figure 3-52: A screenshot of the touch-less interface in the GraphWorX64 application.

3.5.3.3 Safety and Reliability

The ICONICS Company required safe and reliable functionality of the touch-less interface in order to prevent any unwanted situations due to unpredictable behavior of the interactions. The requirement has been resolved by integrating the interaction detection designed in chapter 3.1.2 and the interaction quality system designed in chapter 3.1.3. As a result, the system is able to recognize intended interactions by observing the user's body and face pose, and it is also able to determine whether the user's position is suitable for a comfortable interaction.

4 Conclusion

This thesis has dealt with the design, implementation and integration of the touch-less interface in a real case scenario. The work was divided into three parts, each dealing with one aspect of the assignment.

The first part of the thesis dealt with the design of the touch-less interface and its implementation as a prototype library. Based on this library, a set of five prototypes has been implemented. Four prototypes demonstrate different designs for the touch-less interactions, and the fifth prototype integrates the touch-less interactions with the Windows 8 operating system in order
[Figure D-2: Windows 8 Touch-less Application: using the touch-less interface with the web browser.]

E. A Form for User Subjective Tests

The form records a subject number and the subjective ratings for:
- Level of comfort, rated separately for the planar interaction zone and the curved interaction zone
- Level of usability of the action triggers (scale: usable, unintuitive, requires habit, difficult to use), including the Point & Wait trigger
- Level of usability for the test tasks (scale: usable, fatiguing, challenging): targeting and selecting small items, using Windows 8 maps, using the Windows 8 web browser
D. Windows 8 Touch-less Application Screenshots .......... A-6
E. A Form for User Subjective Tests .......... A-7

1 Introduction

Computers have evolved and spread into every field of industry and entertainment. We use them every day at work, at home, at school, almost everywhere, and computers, in any form, have become an integral part of our lives. Today, when someone speaks about using a computer, we usually imagine typing on a keyboard and moving a mouse on a table. These input methods were invented in the 1960s as a kind of artificial control allowing users to operate computers with limited computational power. Today, technological advancement is making significant progress in the development of sensing technology and makes it possible to gradually substitute this artificial way of human-computer interaction with more natural interactions, called the Natural User Interface (NUI).

The NUI has already found its place in mobile devices in the form of multi-touch screens. Selecting items and manipulating images and multimedia using touch makes human-computer interaction more natural than it is with the traditional peripherals. However, in the past years the evolution of sensing technology has gone far beyond the limits of the currently used human-computer interaction. The technological advancement in computer vision enabled computers
Figure 3-7: An illustration of the quality determination for each particular joint individually; the green joints have the highest quality, the red joints the lowest quality .......... 21
Figure 3-8: A planar physical interaction zone design (green) .......... 23
Figure 3-9: An illustration of mapped coordinates into the planar mapped hand spaces .......... 23
Figure 3-10: An illustration of the curved physical interaction zone (green) .......... 24
Figure 3-11: An illustration of mapped coordinates in the curved physical interaction zone .......... 25
Figure 3-12: Cursor's position filtering and a potential lag [14] .......... 27
Figure 3-13: A weight function for the modified low-pass filter dependent on the cursor's acceleration .......... 28
Figure 3-14: A concept of the user's hand visualization .......... 28
Figure 3-15: An illustration describing joints of interest and their relative position for wave gesture recognition [33] .......... 32
Figure 3-16: An illustration describing joints of interest and their relative position for the swipe gesture recognition; the green area indicates a horizontal movement range that is recognized as a swipe gesture .......... 33
Figure 3-17: A block diagram of the implementation architecture .......... 34
Figure 3-18: A class diagram describing the depth frame data structure
Figure 3-19: A class diagram describing the color frame data structure .......... 36
Figure 3-20: A class diagram describing the architecture of the skeleton frame data structure .......... 38
Figure 3-21: A class diagram describing the architecture of the face frame data structures .......... 39
Figure 3-22: A block diagram describing the data sources architecture and their output .......... 39
Figure 3-23: A class diagram describing an object model of the depth data source .......... 40
Figure 3-24: A class diagram describing an object model of the color data source .......... 41
Figure 3-25: A class diagram describing an object model of the skeleton data source .......... 41
Figure 3-26: A class diagram describing an object model of the face data source .......... 43
Figure 3-27: A class diagram describing an object model of the Kinect data source .......... 44
Figure 3-28: A class diagram describing an object model of the Kinect source collection .......... 46
Figure 3-29: A class diagram describing an object model of the interaction recognizer
Figure 2-4: An illustration of the depth stream values
Figure 2-5: An illustration of the depth space range .......... 10
Figure 2-6: An illustration of the skeleton space .......... 12
Figure 2-7: Tracked skeleton joints overview .......... 12
Figure 2-8: An illustration of the face coordinate space .......... 14
Figure 2-9: Tracked face points [25]
Figure 2-10: Head pose angles [25]
Figure 2-11: Grip action states, from the left: released, pressed .......... 15
Figure 3-1: An illustration of the Kinect's setup .......... 17
Figure 3-2: An illustration of the intended and unintended user interaction based on a face angle .......... 18
Figure 3-3: An illustration of the unsuitable scenario for the recognition of the touch-less user interaction .......... 18
Figure 3-4: An illustration of advices for helping the user for a better experience .......... 19
Figure 3-5: An illustration of the example of a problematic scenario for touch-less interaction .......... 20
Figure 3-6: An illustration of the sensor's field of view (FOV) with inner border and the interaction quality function Q(d) .......... 20
The computer's ability to understand body movements led to the design of a whole new kind of human-computer interaction, which was termed the Touch-less Interface [6]. The touch-less interface indicates that touch interaction and mouse input will not be the only broadly accepted ways that users will engage with interfaces in the future.

The most common design for a touch-less interface uses the user's hands for moving a cursor over the screen. This technique uses the Skeletal Tracking, which can be combined with a Hand Detection for performing a click. A usual scenario for the use of such a touch-less interface is that the user stands facing the sensor and, with his hand at a certain distance from his body and high above the floor, moves a cursor on the screen by his hand's movement. This kind of NUI is used by Microsoft for the Kinect for Xbox 360 dashboard (Figure 2-1), and the company also promotes it for use with the Kinect for Windows targeted for PC. The design, however, requires it to be combined with a traditional GUI for creating the user interface and giving advice, which means that this kind of natural interaction is still not a pure NUI, but it is getting closer to it.

Figure 2-1: An illustration of the Kinect for Xbox 360 touch-less interface [7].

Another design for a touch-less interface takes advantage of the possibility to track the user's body movements and translate them into specific gestures. Gestures are something that all p
3.2.5 Integration with WPF
3.2.6 Integration with Windows 8
3.2.7 Visualization .......... 59
  3.2.7.1 Overlay Window .......... 59
  3.2.7.2 Cursors Visualization .......... 60
  3.2.7.3 Assistance Visualization .......... 60
3.3 Prototypes .......... 62
  3.3.1 Test Application .......... 62
  3.3.2 Touch-less Interface for Windows 8 .......... 64
3.4 User Usability Tests .......... 65
  3.4.1 Test Methodology .......... 65
  3.4.2 Test Scenarios .......... 67
  3.4.3 Tests Evaluation .......... 71
    3.4.3.1 The Level of Comfort .......... 71
    3.4.3.2 The Level of Usability
    3.4.3.3 The Level of Usability for the Real Case Scenario
http://en.wikipedia.org/wiki/Finite-state_machine.

Field of view. Wikipedia. [Online] [Cited: 04/30/2013.] http://en.wikipedia.org/wiki/Field_of_view.

Face Tracking. MSDN. [Online] 2012. [Cited: 04/13/2013.] http://msdn.microsoft.com/en-us/library/jj130970.aspx.

Depth projector system with integrated VCSEL array. [Online] 11/27/2012. [Cited: 04/30/2013.] https://docs.google.com/viewer?url=patentimages.storage.googleapis.com/pdfs/US8320621.pdf.

Definition of: touch-less user interface. PCMag. [Online] [Cited: 04/30/2013.] http://www.pcmag.com/encyclopedia/term/62816/touchless-user-interface.

[38] Definition of the Simplest Low-Pass. Stanford. [Online] [Cited: 04/30/2013.] https://ccrma.stanford.edu/~jos/filters/Definition_Simplest_Low_Pass.html.

[39] Bing Maps WPF Control. MSDN. [Online] [Cited: 04/30/2013.] http://msdn.microsoft.com/en-us/library/hh750210.aspx.

[40] Bayer filter. Wikipedia. [Online] [Cited: 04/26/2013.] http://en.wikipedia.org/wiki/Bayer_filter.

[41] BACnet. [Online] [Cited: 04/30/2013.] http://www.bacnet.org/.

A Point and Wait Action Detection State Chart

[The state chart spans this page; it covers right and left cursor activation and deactivation, initial and action timeouts, right click, right and left drag, and the multi-touch zoom gesture. See Figure A-1.]
Face Source

For this work the Face Tracking feature, distributed as an additional library by Microsoft, has been designed as a separate data source, implemented by the KinectFaceSource class. This class implements the logic for processing depth, color and skeletal data into the tracked face data structure. The face tracking itself is handled by the external native library FaceTrackLib. Calls to the native face tracking methods are made through the .NET wrapper for the external native library, implemented by the Microsoft.Kinect.Toolkit.FaceTracking assembly.

The face source extends the basic face tracking functionality with a timer measuring the tracked face's lifetime. Depending on the environmental conditions, the face tracker can lose track of the tracked face unpredictably. The lifetime timer can prevent a loss of the tracked face caused by noise in several frames. While the face is tracked, the lifetime timer is active and holds its highest value. If the face tracker loses the face's track while the lifetime timer is active, the last tracked face is used as the currently tracked data; when the timer ticks out, the tracked face is identified as not tracked. As a result the tracking is more stable; however, when the face tracker loses track of the face, the face source may use outdated and thus inaccurate data. The target lifetime value in milliseconds can be set by the FaceTrackingTTL property.
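The lifetime logic can be sketched as a small helper. The following strip is only an illustration of the described behavior; apart from the FaceTrackingTTL value, the class and member names are assumptions and do not reproduce the actual KinectFaceSource implementation:

using System;

// An illustrative sketch of the face lifetime (time-to-live) logic described
// above; FaceTrackingTTL mirrors the property mentioned in the text, the
// remaining names are assumptions.
public class FaceLifetimeFilter<TFace> where TFace : class
{
    // Target lifetime in milliseconds.
    public double FaceTrackingTTL { get; set; } = 500;

    private TFace _lastFace;           // last successfully tracked face data
    private DateTime _lastTrackedTime; // time stamp of the last successful track

    // Called for every processed frame; returns the face data to report,
    // or null when the face is considered not tracked.
    public TFace Update(TFace trackedFace)
    {
        if (trackedFace != null)
        {
            // The face is tracked: refresh the timer and remember the data.
            _lastFace = trackedFace;
            _lastTrackedTime = DateTime.Now;
            return trackedFace;
        }

        // The track was lost: bridge short drop-outs while the timer is alive.
        double elapsed = (DateTime.Now - _lastTrackedTime).TotalMilliseconds;
        if (_lastFace != null && elapsed < FaceTrackingTTL)
            return _lastFace; // possibly outdated, but more stable

        _lastFace = null;     // the timer ticked out: report the face as lost
        return null;
    }
}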
GraphWorX64 displays. Given the application of the displays in industrial and energy environments, the interactions must be safe and secured against any unpredictable behavior. The following list shows the crucial requirements for the touch-less interface integration:

- Possibility of use with WPF and standard Windows controls, such as scrollbar, list view, ribbon bar, etc.
- Prevent random user tracking by using a login gesture.
- Prevent losing the engagement with the system when someone else comes in front of the sensor.
- Prevent unpredictable behavior caused by a user's unintended interaction.
- Provide an appropriate visual feedback.

3.5.3 Touch-less Interface Integration

The integration of the touch-less interface with the GraphWorX64 application can be divided into two parts. The first part deals with integrating the interactions into the user input system of the application, and the second part creates a visualization layer on top of it.

Following the conclusion of the user usability tests evaluated in chapter 3.4.4, the integrated touch-less interface is based on the Curved Physical Interaction Zone described in chapter 3.1.4.2 and the Grip action trigger described in chapter 3.1.6.2. A combination of these two designs has resulted in the most natural and comfortable touch-less interface solution.

To protect the company's know-how, the integration of the touch-less interface with the product is described only superficially and
It means that the user's behavior itself can influence the natural interaction experience. This finding leads us to think about designing a system for detecting the user's intent to interact.

We can find inspiration by observing our own interaction with other people. This observation tells us that when one person is talking to someone else, he or she is looking at the other person's face. The current solutions for natural interaction offer a system for tracking faces (described in chapter 2.3.4) which is able to evaluate head angles and even the facial expression. For detecting whether the user wants to interact with the system, we can use the Face Tracking to determine whether the user is looking toward the sensor. We can get three head pose angles: pitch, yaw and roll, described by the Figure 2-10. For instance, imagine a scenario where the user is standing in front of the sensor surrounded by other people during a presentation. We can expect that when the user is presenting, he or she is talking toward the audience and gesticulating, which means that the user's head is turned left or right from the sensor. This is the most frequent scenario in practice, and it leads us to the finding that to detect whether the user is looking toward the sensor with the aim to interact, we can use one of the head pose angles: the yaw angle. The value of this angle is in the range from -90 to 90 degrees.
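For illustration, the intent test over the yaw angle can be as simple as a threshold comparison. In the sketch below the 30 degree tolerance is an assumed value, not the one used in the thesis:

using System;

public static class InteractionIntentDetector
{
    // Assumed tolerance around 0 degrees (0 = head turned straight toward the sensor).
    private const double MaxYawDegrees = 30.0;

    // The yaw pose angle is reported in the range from -90 to 90 degrees.
    public static bool IsLookingAtSensor(double yawDegrees)
    {
        return Math.Abs(yawDegrees) <= MaxYawDegrees;
    }
}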
disposes of a very high resolution, which allows much more precise finger tracking.

2.2 Microsoft Kinect Sensor

The Kinect sensor has been developed and patented [11] by Microsoft, originally under the project Natal, since 2006. The intention to create a revolutionary game controller for the Xbox 360 was initiated by the unveiling of the Wii console at the 2005 Tokyo Game Show conference. The console introduced a new gaming device called the Wii Remote, which can detect movement along three axes and contains an optical sensor that detects where it is pointing. This induced Microsoft's Xbox division to start on a competitive device which would surpass the Wii. Microsoft created two competing teams to come up with the intended device: one working with PrimeSense technology and the other working with technology developed by a company called 3DV. Eventually, the final product was named Kinect for Xbox 360 and was built on PrimeSense's depth sensing technology.

At this time, Microsoft offers two versions of the Kinect device. The first one, Kinect for Xbox 360, is targeted at entertainment with the Xbox 360 console and was launched in November 2010. After the Kinect was hacked and many various applications spread through the Internet, Microsoft noticed the existence of a whole new market. On the basis of this finding, Microsoft designed a second version of the sensor, Kinect for Windows, targeted at the development of commercial applications
Figure A-1: A state chart for the Point and Wait action detector.

B User Manual

The Test Application is an executable application KinectInteractionApp.exe and it is located in the bin folder. The executable application takes a path to the configuration XML file as its argument:

C:\TestApp\bin\KinectInteractionApp.exe config.xml

There are four prepared batch files, one for each combination of the physical interaction zone and action trigger:

1-curved-paw.bat
2-curved-grip.bat
3-planar-paw.bat
4-planar-grip.bat

The Touch-less Interface for Windows 8 is an executable application KinectInteractionWin8App.exe located in the bin folder
The system allows switching the interaction between tracked users standing toward the sensor. The current user is selected by waving his or her hand. The system notifies the application about the change of the interacting user by raising the TrackedUserChanged event. In case there is at least one tracked user, the system state, accessible through the State property, is set to the Tracking state; otherwise, if there is no tracked user, the system's state is set to Idle and it waits for the user's login by the wave gesture.

A class diagram describing an object representation of the touch-less interaction interface is illustrated by the Figure 3-31.

[Figure 3-31: A class diagram of the touch-less interaction object model: the TouchlessInteractionInterface class (properties ActionDetector: ActionDetectorBase, ActionTTL: Time2Live, PrimaryCursor: TouchlessCursorType, a TouchlessCursorCollection of cursors; methods Dispose and Initialize) and the TouchlessCursor class (properties including Acceleration, ActualHandSpaceCenter, ActualInteractionZoneOffset, IsActive, IsInteractive, IsOutOfBounds and Position), related to the InteractionRecognizer, GestureRecognizer and KinectSource classes; both implement IDisposable.]
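A hypothetical usage strip for this interface is shown below. The TrackedUserChanged event and the State property follow the description above, while the delegate shape and the state-type name are assumptions:

using System;

public static class TouchlessInterfaceExample
{
    public static void Attach(TouchlessInteractionInterface interactions)
    {
        // Raised when another tracked user takes over the interaction by waving.
        interactions.TrackedUserChanged += (sender, e) =>
            Console.WriteLine("The interacting user has changed.");

        // Idle means no tracked user: the system waits for the wave-gesture login.
        if (interactions.State == TouchlessInteractionState.Idle)
            Console.WriteLine("Waiting for a user to log in by waving.");
    }
}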
[Figure 3-27: A class diagram describing an object model of the Kinect data source: the KinectSource class exposes the ColorSource (KinectColorSource), SkeletonSource (KinectSkeletonSource) and FaceSource (KinectFaceSource) components, properties such as MaximumTiltAngle, MinimumTiltAngle, TiltAngle and UniqueId, methods Initialize, Uninitialize, Dispose, StartElevationTask and ToString, and events AllFramesReady and StatusChanged; the sensor status enumeration covers values such as Connected, Initializing, Error, NotPowered, NotReady, DeviceNotGenuine, DeviceNotSupported and InsufficientBandwidth.]

3.2.3.6 Kinect Source Collection

Dynamic Kinect source instantiation and disposal, depending on whether the device has been connected or disconnected, is implemented by the KinectSourceCollection class. The class is implemented using the Singleton design pattern. The implementation is based on the Microsoft.Kinect.KinectSensorCollection class and handles its StatusChanged event, which indicates a sensor's status change. Depending on the status, the collection creates or removes an instance of the KinectSource class for the given sensor. For each status change the KinectSourceStatusChanged event is raised in order
3.4.4 Tests Conclusion
3.5 Touch-less Interface Integration with ICONICS GraphWorX64 .......... 73
  3.5.1 About ICONICS GraphWorX64
  3.5.2 Requirements
  3.5.3 Touch-less Interface Integration .......... 74
    3.5.3.1 Interactions .......... 75
    3.5.3.2 Visualization .......... 76
    3.5.3.3 Safety and Reliability .......... 76
4 Conclusion .......... 77
List of Abbreviations .......... 78
List of Equations .......... 79
List of Tables .......... 79
List of Figures .......... 79
Bibliography .......... 82
A Point and Wait Action Detection State Chart .......... A-1
B User Manual .......... A-2
C Test Application Screenshots
NUI: Natural User Interface
UI: User Interface
API: Application Programming Interface
FOV: Field of View
PhIZ: Physical Interaction Zone
RGB: An additive color model consisting of Red, Green and Blue components
IR: Infrared light spectrum
CPU: Central Processing Unit
HMI: Human Machine Interface
SCADA: Supervisory Control and Data Acquisition
SNMP: Simple Network Management Protocol

List of Equations

Equation 3-1: An equation of the interaction quality function .......... 21
Equation 3-2: A formula for the Curved Physical Interaction Zone .......... 25
Equation 3-3: Low-pass filter with two samples .......... 27
Equation 3-4: A weight function for the modified low-pass filter .......... 28

List of Tables

Table 1: The level of usability rating scale .......... 66
Table 2: The level of comfort rating scale .......... 67
Table 3: The level of usability rating scale for the real case scenario .......... 67

List of Figures

Figure 2-1: An illustration of the Kinect for Xbox 360 touch-less interface [7] .......... 5
Figure 2-2: Kinect for Windows sensor components [12] .......... 7
Figure 2-3: Kinect for Windows sensor field of view [15] .......... 8
as Kinect Fusion, a library for 3D scanning and reconstruction, and a library for hand grip detection, which has opened the door to a more natural way of interaction.

The API of the Kinect for Windows SDK provides the sensor's depth, color and skeleton data in the form of data streams. Each of these streams can produce the current data frame by polling or by using an event that is raised every time a new frame is available [17]. The following chapters describe the particular data streams and their options.

2.3.1 Depth Stream

Data from the Kinect's depth camera are provided by the depth stream. The depth data are represented as a frame made up of pixels that contain the distance in millimeters from the camera plane to the nearest object, as illustrated by the Figure 2-4.

[Figure 2-4: An illustration of the depth stream values.]

Each pixel merges the distance and player segmentation data. The player segmentation data stores information about a relation to the tracked skeleton, which makes it possible to associate the tracked skeleton with the depth information used for its tracking. The depth data are represented as a 16-bit unsigned integer value where 3 bits are reserved for the player segmentation data and the remaining 13 bits for the distance. It means that the maximal distance stored in the depth data can be up to 8 meters. The depth data representation is illustrated by the Figure 2-5.
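The packing can also be expressed directly in code. The sketch below follows the Kinect SDK convention of keeping the player index in the three low-order bits; the helper names are illustrative:

public static class DepthPixel
{
    private const int PlayerIndexBits = 3;         // bits reserved for the player index
    private const ushort PlayerIndexMask = 0x0007;

    // Player segmentation index (0 means no tracked player at this pixel).
    public static int GetPlayerIndex(ushort pixel)
    {
        return pixel & PlayerIndexMask;
    }

    // Distance in millimeters; 13 bits give 0..8191 mm, i.e. up to about 8 meters.
    public static int GetDepthMillimeters(ushort pixel)
    {
        return pixel >> PlayerIndexBits;
    }
}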
for PC. Technically, there are only slight differences between both versions; however, the official Software Development Kit from Microsoft limits the Kinect for Xbox 360 to development use only. The most important difference between the Kinect for Xbox 360 and the Kinect for Windows is the additional support of depth sensing in near range, which enables the sensor to see from a 40 centimeter distance instead of 80 centimeters.

2.2.1 Inside the Kinect

The Kinect device is primarily based on a depth sensing technology that consists of an infrared (IR) camera and an IR emitter positioned at a certain distance from each other. The principle of the depth sensing is the emitting of a predefined pattern by the IR emitter and the capturing of its reflected image, deformed by physical objects, using the IR camera. The processor then compares the original pattern with its deformed reflected image and determines the depth on the basis of the variations between the two. The resulting depth image has a horizontal resolution of 640 pixels, a vertical resolution of 480 pixels, and a depth range of 8 meters divided into millimeters.

The device is additionally equipped with a color (RGB) camera with up to 1280x960 pixel resolution, which may be used as another data source for recognition. Another of the device's components is a multi-array microphone for spatial voice input with the ability to recognize the direction of a voice source. The device's tilt angle can be set
for the user's pose recognition are based on the Skeletal Tracking (see also 2.3.3), which can recognize and identify each tracked body part. The tracked body part information consists of its position and identification and is expressed as a joint. Such a solution leads us to a concept, illustrated by the Figure 3-7, where we apply the interaction quality to each particular joint. The concept allows us to determine how suitable each body part is for the interaction. For instance, we could use this information for instructing the user what he or she should do to avoid possible undesired behavior.

[Figure 3-7: An illustration of the quality determination for each particular joint individually; the green joints have the highest quality, the red joints the lowest quality.]

3.1.4 Physical Interaction Zone

The touch-less interaction is based on a spatial mapping between the user's hand movements in physical space and the cursor on the screen. The area in front of the user where the user's hand movements are used for the mapping is called the Physical Interaction Zone (PhIZ). The user's hand movements within the boundaries of the physical interaction zone correspond to the cursor's movements within the boundaries of the screen. The physical interaction zone spans from around the head to the navel and is centered on the range of motion of the hand on the left and on the right sides [14].

We could consider two different approaches to the spatial mapping between the user's hand movements in physical space and the cursor on the screen.
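The planar variant of this mapping can be sketched as a simple rectangle-to-screen normalization. In the strip below the zone bounds are plain parameters, whereas the real design derives them from the tracked skeleton (roughly head to navel vertically):

using System;

public sealed class PlanarInteractionZone
{
    private readonly double _left, _top, _width, _height; // zone bounds in meters

    public PlanarInteractionZone(double left, double top, double width, double height)
    {
        _left = left; _top = top; _width = width; _height = height;
    }

    // Maps a hand position (in skeleton space, meters) to normalized screen
    // coordinates in the range 0..1, clamped at the zone boundaries.
    public (double X, double Y) MapToScreen(double handX, double handY)
    {
        double x = Math.Clamp((handX - _left) / _width, 0.0, 1.0);
        double y = Math.Clamp((_top - handY) / _height, 0.0, 1.0); // screen Y grows downward
        return (x, y);
    }
}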
gesture. For recognizing more gestures, a new recognizer has to be designed and implemented for each one. It means that with every new gesture the resulting size of the application grows, and a larger number of algorithms must be executed to determine whether a gesture has been performed. Other, more generic approaches use machine learning algorithms such as Neural Networks [31] or Dynamic Time Warping [32]. These methods are more complicated, but the resulting algorithm is more generic and allows the recognition of more complicated gestures.

In the following chapters the design of two basic gestures, the wave gesture and the swipe gesture, is described. Both gestures are designed for algorithmic detection and demonstrate the basic approach to creating gesture recognition.

3.1.7.2 Wave Gesture

One of the most common gestures is the Wave gesture. People use wave gestures for saying hello or good-bye. In natural user interaction, the wave gesture can be analogously used for saying that the user is ready to begin the experience. The wave gesture has been used by Microsoft and proven as a positive way of determining user intent for engagement [14].

The wave is a gesture with simple movements, which makes it easy to detect using an algorithmic approach [33]. From observing the common way of waving, we can notice the relationship between the hand and the arm during the gesture. The gesture begins in a neutral position when the
information, a contour of the user is separated from the rest of the scene and written as ARGB pixels into a WriteableBitmap instance. The resulting bitmap is then rendered on the window. The final look of the user's contour is illustrated by the Figure 3-41.

[Figure 3-41: An illustration of the user's contour.]

In case of an inconvenient user's pose, the assistance visualization can give corresponding advice to the user, such as an instruction about which way the user should move to get back within the sensor's field of view, or an instruction to turn his body or face toward the sensor. These instructions are shown in a notification bar below the user's contour.

The assistance visualization can also give visual feedback for gestures. It shows instructions for the crucial states of the currently performed gesture. For example, it indicates which way the users should move their hand in order to make the wave gesture correctly, or it indicates whether the gesture was detected or canceled. This visualization is also used for creating a login visualization that helps the user to get engaged with the touch-less interface.

The assistance visualization is implemented as a WPF user control by the class UserAssistantControl. The control is composed into an overlay window implemented by the class AssistantOverlayWindow in order to enable the assistant control to be shown above the rest of the running applications. The assistant
is used as the image. When the cursor is in an action state, a closed palm hand shape is used as the image. This helps the users recognize whether their action is being performed or whether they should perform the action more clearly. In addition, a progress circle indicating the timeout for firing an action is added for the point and wait action trigger. All three described graphical representations of the cursor's state are illustrated by the Figure 3-40.

[Figure 3-40: An illustration of cursor actions, from the left: point and wait timer, grip released, grip pressed.]

The cursor visualization is implemented as an overlay window (described in chapter 3.2.7.1) by the class InteractionOverlayWindow. It contains an instance of the natural interaction interface and handles its update event. When the cursors are updated, the visualization invalidates the visual, maps the cursor's position into the screen space and renders the cursor's graphics on the window.

3.2.7.3 Assistance Visualization

Help in achieving the best experience is provided by the assistance visualization. It shows a contour of the user as seen by the sensor and gives the user an overview of his or her visibility to the sensor. The data for rendering the contour are taken from the current depth frame, which contains the depth information along with the player segmentation data representing a relation between the depth data and the user's tracked skeleton (see also 2.3.1). By combining the
moving around a central point of its movement. For instance, we can assume the shoulder position as the central point. The design of the curved physical interaction zone is based on mapping the user's hand position into the screen space using angles between the central point and the user's hand position, which results in an arc-shaped trajectory of the user's hand movement. As a result, the approach allows the user to move the cursor more naturally by moving his or her hand around his or her shoulder. Thanks to the natural basis of this physical interaction zone design, the user's hand movement doesn't require much concentration and the user doesn't need to move his or her hand in an unnatural manner.

3.1.5 Cursor

The basic natural user interaction is based on the possibility of selecting, clicking or dragging controls on the screen in the same way as when using the mouse or touch input. The fundamental principle is using a cursor which represents the location where the user's action is intended to be performed. Chapter 3.1.4 describes the mapping function which determines the cursor's position on the screen. The function is based on mapping the user's hand position in physical space into the screen space using a defined physical interaction zone.

Inasmuch as the on-screen position acquired by the mapping function may contain inaccuracies caused by imprecise user pose recognition, we can refine the cursor's position by adding a filter.
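Such a filter can be sketched as a two-sample low-pass filter whose weight is raised with the cursor's acceleration, so that fast movements suffer less lag (compare Equations 3-3 and 3-4, and Figures 3-12 and 3-13). The concrete weight function below is an assumption made for illustration:

using System;

public sealed class CursorLowPassFilter
{
    private double _x, _y;
    private bool _hasSample;

    // Filters the raw cursor position: y[n] = w * x[n] + (1 - w) * y[n-1].
    public (double X, double Y) Filter(double rawX, double rawY, double acceleration)
    {
        if (!_hasSample) { _x = rawX; _y = rawY; _hasSample = true; return (_x, _y); }

        // Assumed weight function: strong smoothing (w near wMin) for a steady
        // hand, little smoothing (w near 1) under high acceleration.
        const double wMin = 0.2, gain = 0.05;
        double w = wMin + (1.0 - wMin) * Math.Min(1.0, gain * Math.Abs(acceleration));

        _x = w * rawX + (1.0 - w) * _x;
        _y = w * rawY + (1.0 - w) * _y;
        return (_x, _y);
    }
}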
[Figure 3-2: An illustration of the intended and unintended user interaction based on a face angle. Figure 3-3: An illustration of the unsuitable scenario for the recognition of the touch-less user interaction.]

For detecting the user's interaction we can also consider the Interaction Quality described in chapter 3.1.3. The interaction detector will detect the user's interaction only when the Interaction Quality of the user's body parts of interest is higher than a certain limit. The system can then use advices for instructing the user about what he or she must do for a better experience. For instance, when the user turns his head away from the sensor, the system tells the user that he or she should turn the head toward the sensor; or when the user's right hand moves outside the sensor's field of view, the system instructs the user to move to the left in order to get the hand back into the field of view.

The described concept of detecting the user's interaction prevents the situation where, for example, the interacting user walks out of the field of view and the tracking system is not able to track his or her movements correctly. As a result, this solution ensures that the user's interaction will be performed in the best conditions allowed by the capabilities of the NUI device.

[Figure 3-4: An illustration of advices for helping the user for a better experience; the advices include "Turn your body toward the sensor", "Move to the left" and "Move away from the screen".]
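The advice generation can be sketched over the advice values that appear in the InteractionAdvices enumeration of the implementation (see Figure 3-30); the thresholds and the flags layout below are illustrative assumptions:

using System;

[Flags]
public enum InteractionAdvices
{
    None = 0,
    StepLeft = 1,
    StepRight = 2,
    StepForward = 4,
    StepBack = 8,
    LookAtTheSensor = 16,
    TurnBodyToSensor = 32
}

public static class AdviceGenerator
{
    public static InteractionAdvices Evaluate(
        double headYawDegrees,        // 0 = the head faces the sensor
        double bodyAngleDegrees,      // 0 = the shoulders are parallel to the sensor
        double rightHandFovOverhang)  // meters beyond the right FOV border; <= 0 = inside
    {
        var advices = InteractionAdvices.None;

        if (Math.Abs(headYawDegrees) > 30.0)
            advices |= InteractionAdvices.LookAtTheSensor;

        if (Math.Abs(bodyAngleDegrees) > 30.0)
            advices |= InteractionAdvices.TurnBodyToSensor;

        // The right hand drifted past the right FOV border: stepping to the
        // left brings the hand back into the field of view.
        if (rightHandFovOverhang > 0.0)
            advices |= InteractionAdvices.StepLeft;

        return advices;
    }
}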
to inform about the change. In case at least one sensor is connected at the time the collection is instantiated, the event informing about the status change is raised for the connected Kinect source with the current sensor's state. This implementation is advantageous because it doesn't require a double check of the connected devices during the application start, as an implementation over the native sensor collection would. Everything that is needed for the KinectSource initialization is to register the KinectSourceStatusChanged event and initialize the source in the event handler method, as described by the following code strip:

public void Initialize()
{
    KinectSourceCollection.Sources.KinectSourceStatusChanged += Sources_KinectSourceStatusChanged;
}

private void Sources_KinectSourceStatusChanged(object sender, KinectSourceStatusEventArgs e)
{
    switch (e.Status)
    {
        case Microsoft.Kinect.KinectStatus.Connected:
            // kinect source initialization
            e.Source.Initialize();
            // enables the depth source
            e.Source.DepthSource.Enabled = true;
            // enables the color source
            e.Source.ColorSource.Enabled = true;
            // enables the skeleton source
            e.Source.SkeletonSource.Enabled = true;
            break;

        case Microsoft.Kinect.KinectStatus.Disconnected:
            // disables the depth source
            e.Source.DepthSource.Enabled = false;
            // disables the color source
            e.Source.ColorSource.Enabled = false;
            // disables the skeleton source
            e.Source.SkeletonSource.Enabled = false;
            // kinect source uninitialization
            e.Source.Uninitialize();
            break;
    }
}
trigger
2. Planar physical interaction zone with Grip action trigger
3. Curved physical interaction zone with Point and Wait action trigger
4. Curved physical interaction zone with Grip action trigger

The application is implemented as one executable, KinectInteractionApp.exe, which takes a path to the configuration XML file as its argument. The XML file contains the entire set of configurable variables for the touch-less interaction. Four configuration XML files have been created, one for each configuration described above.

3.3.2 Touch-less Interface for Windows 8

A prototype aimed at using the touch-less interactions in a real case scenario with the Windows 8 operating system. Windows 8 has been designed especially for touch devices, which makes it a suitable candidate for evaluating the user's subjective experience of using the touch-less interactions with current applications and UI.

The application consists of the touch-less interface including gestures, its integration with the system and its visualization. The touch-less interactions are based on the Curved physical interaction zone (see also 3.1.4.2) and the Grip for triggering the actions (see also 3.2.4.5). The integration with the operating system is done via the Touch Injection API, available only in Windows 8 (see also 3.2.6). The visualization consists of the visualization of the cursors and the assistance control described in chapter 3.2.7.

The Wi
user's experience in dragging objects, similar to the previous one. In this test the user moves objects from the middle of the screen into the screen's corners.

A scenario aimed at the user's experience in scrolling among a large number of items in a list box. As the list box control, a touch control from the Microsoft Surface SDK 2.0 [35] is used, enabling scrolling through the items using touch gestures. The test is designed in such a way that the user must select one given item, which is located in the first third of the list. This particular location is chosen for evaluating the precision of scrolling through the items. At the beginning the user doesn't know how far the item is, so he or she starts to go through the list quickly; but then the item shows up and the user must respond in such a way that he or she is able to click on it.

A scenario evaluating the user's experience in using multi-touch gestures with the touch-less interactions. The test uses the WPF Bing Maps control [36]. It supports multi-touch gestures such as pan and pinch-to-zoom combined with a rotation.

The prototype combines two ways of action triggering (described in chapter 3.1.6) with two types of the physical interaction zone (described in chapter 3.1.4). As a result, the test application creates the following four prototypes, each with a different configuration of the touch-less interactions:

1. Planar physical interaction zone with Point and Wait action
using a motor in the range from -27 to 27 degrees, which increases the final vertical sensor's field of view. Additionally, the device contains a 3-axis accelerometer, primarily used for determining the device's tilt angle but usable for further applications. Figure 2-2 describes the layout of the Kinect's components.

[Figure 2-2: Kinect for Windows sensor components [12]: IR emitter, color sensor, IR depth sensor, tilt motor and microphone array.]

2.2.2 Field of View

Because the sensor works in many ways similarly to a camera, it can also see only a limited part of the scene facing it. This part of the scene that is visible to the sensor (or camera generally) is called the Field of View (FOV) [13]. The sensor's FOV for both the depth and the color camera is described by the following vertical and horizontal angles [14]: the horizontal angle is 57.5 degrees and the vertical angle is 43.5 degrees. The vertical angle can be moved within a range from -27 to +27 degrees up and down by using the sensor tilt. Additionally, the depth camera is limited in its view distance. It can see within a range from 0.4 meter to 8 meters, but for practical use values within 1.2 meters to 3.5 meters are recommended. In this range the objects are captured with minimal distortion and minimal noise. The sensor's FOV is illustrated by the Figure 2-3.

[Figure 2-3: Kinect for Windows sensor field of view [15].]

2.2.3 Software Development Kits

There ar
widest range of users in order to evaluate its usability. The iteration process is illustrated by the Figure 3-38. On the basis of the usability test evaluation, a new iteration is initiated. The final setup will never be ideal for all users, but through conducting frequent usability tests the final setup will be a compromise that works for most people.

3.2.5 Integration with WPF

The integration of the touch-less interface with a WPF application is done by implementing the abstract class TouchDevice, which creates the base for any touch input in the WPF framework. The abstract class provides methods for reporting the down, up and move actions. These methods are intended to be called by the particular implementation of the touch input.

The WPF touch input base for the touch-less interactions is implemented by the abstract class NuiTouchDevice. This class implements the basic logic for switching between mouse and touch events, in order to handle actions via the mouse when the touch events have not been handled by the application. Whether the touch events have been handled is indicated by the flag value returned by the methods ReportDown, ReportUp and ReportMove. The invoking of the mouse operations is done through the Windows API, which is wrapped by the static class MouseWin32.

[Figure 3-39: A block diagram of the WPF touch device implementation: a Down call raises the touch down event; if the event has not been handled by the application, the mouse down event is raised instead.]
3, the quality of each particular joint is evaluated. If the user's face angle, body angle and the quality of the joints are within the specified ranges, the interaction is indicated as detected. In case of an insufficient quality, the recognizer generates a list of advices indicating what the user should do for a better interaction experience.

[Figure 3-30: A class diagram describing an object model of the interaction info data structure: the InteractionInfo class (properties Advices: InteractionAdvices, Face: TrackedFace, Skeleton: Skeleton, UserBodyAngle: double, UserDistanceFromFieldOfView: Bounds, UserInFrontOfSensor: bool, UserInteractionDetected: bool, UserPosition: Point3D, UserViewAngle: double, JointInteractionQuality: SkeletonJointQualityInfoCollection; method HasAdvice(): bool), the InteractionAdvices enumeration (None, StepLeft, StepRight, StepForward, StepBack, LookAtTheSensor, TurnBodyToSensor) and the PositionQualityInfo class (properties IsExcelent, IsGood, IsPoor, OverallQuality, QualityBounds: Bounds3D, XQuality, YQuality, ZQuality) with the QualityLevels enumeration (Poor, Good, Excelent).]

This information is stored in an instance of the InteractionInfo class and passed on for evaluation in the further implementation. A class diagram
3.1.4 Physical Interaction Zone
  3.1.4.1 Planar Interaction Zone .......... 22
  3.1.4.2 Curved Interaction Zone .......... 24
  3.1.4.3 Comparison of the Physical Interaction Zone Designs .......... 26
3.1.5 Cursor .......... 27
3.1.6 Action Triggers .......... 29
  3.1.6.1 Point and Wait .......... 29
  3.1.6.2 Grip .......... 30
3.1.7 Gestures
  3.1.7.1 Designing a Gesture .......... 30
  3.1.7.2 Wave Gesture .......... 32
  3.1.7.3 Swipe Gesture .......... 33
3.2 Implementation
  3.2.1 Architecture
  3.2.2 Data Structures .......... 35
    3.2.2.1 Depth Frame .......... 35
    3.2.2.2 Color Frame
[Figure 2-5: An illustration of the depth space range, showing the default range (0.8 to 4.0 meters) and the near range (0.4 to 3.0 meters), with values outside these ranges marked as unknown.]

The depth frame is available in different resolutions. The maximum resolution is 640x480 pixels, and resolutions of 320x240 and 80x60 pixels are also available. Depth frames are captured at 30 frames per second for all resolutions.

The depth camera of the Kinect for Windows sensor can see in two range modes: the default and the near mode. If the range mode is set to the default value, the sensor captures depth values in the range from 0.8 meter to 4.0 meters; when the range mode is set to the near value, the sensor captures depth values in the range from 0.4 meter to 3.0 meters. According to the description of the depth space range in [18], the maximal captured depth value may be up to 8.0 meters in both range modes. However, the quality of depth values exceeding the limit of 4.0 meters in default mode and 3.0 meters in near mode may degrade with distance.

2.3.2 Color Stream

Color data, available in different resolutions and formats, are provided through the color stream. The color image's format determines whether the color data are encoded as RGB, YUV or Bayer data.

The RGB format represents the color image as a 32-bit, linear X8R8G8B8 formatted color bitmap. A color image in RGB format is updated at up to 30 frames per second at 640x480 resolution and at 12 frames per second at 1280x960 resolution.
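With the official SDK the format is selected when the color stream is enabled; the thesis wraps this in its own KinectColorSource class, so the direct calls below are only an illustration:

using Microsoft.Kinect;

public static class ColorStreamSetup
{
    public static void Enable(KinectSensor sensor, bool preferHighResolution)
    {
        // 640x480 updates at up to 30 fps; 1280x960 only at about 12 fps,
        // so the higher resolution trades frame rate for detail.
        ColorImageFormat format = preferHighResolution
            ? ColorImageFormat.RgbResolution1280x960Fps12
            : ColorImageFormat.RgbResolution640x480Fps30;

        sensor.ColorStream.Enable(format);
    }
}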
Frame structure as its parameter. Depth pixel data are copied into the internal buffer and then each pixel is decomposed into the depth and user index components (see also chapter 2.3.1 for the depth pixel's format description). When the depth pixel data processing is done, a new instance of the DepthFrame data structure is created from the processed data and passed on by raising the DepthFrameReady event.

The KinectDepthSource class also provides properties that describe the physical parameters of the depth sensor, such as the minimal and maximal depth values the sensor is able to capture and the nominal horizontal, vertical and diagonal fields of view in degrees. The class also provides a property for selecting between the default and near range modes of the depth sensor (see also chapter 2.3.1).

[Figure 3-23: A class diagram describing an object model of the depth data source: the KinectDepthSource class with properties Enabled, Format: DepthImageFormat, MaximumDepth, NominalDiagonalFieldOfView, NominalFocalLengthInPixels, NominalHorizontalFieldOfView, NominalInverseFocalLengthInPixels, NominalVerticalFieldOfView, RangeMode: DepthRange, Resolution, Sensor: KinectSensor, TooFarDepth, TooNearDepth and UnknownDepth.]

Before the depth image data processing can be started, the depth source must be enabled by setting the Enabled property to true.
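A hypothetical usage strip for the depth source is shown below. The Enabled and RangeMode properties and the DepthFrameReady event follow the class diagram; the shape of the event arguments is an assumption:

using System;
using Microsoft.Kinect;

public static class DepthSourceExample
{
    public static void Start(KinectDepthSource depthSource)
    {
        // Handle each processed DepthFrame raised by the source.
        depthSource.DepthFrameReady += (sender, e) =>
            Console.WriteLine("A new depth frame has been processed.");

        depthSource.RangeMode = DepthRange.Near; // near mode sees from 0.4 m (see 2.3.1)
        depthSource.Enabled = true;              // starts the sensor's depth data stream
    }
}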
3.1 Design and Analysis

In this chapter all the important approaches for the realization of the touch-less interface are described, and it is explained why the particular methods have been chosen.

3.1.1 Kinect Device Setup

For the best experience, the environmental conditions and the sensor's placement are crucial. The sensor is designed to be used indoors, at places with no direct and minimal ambient sunlight, which decreases the depth sensor's functionality. The location of the sensor should be chosen with regard to the intended interaction distance and the place of application.

There should be enough space in front of the sensor for the intended number of engaged users, and one should prevent other people from coming between the engaged user and the sensor, for example by using a sticker on the floor to indicate where the user should stand, or by roping off an area so that people walk around.

The sensor's ideal placement is at the height of the user's shoulders and at the center of the screen. Due to the diversity of the human body, the ideal placement is not reachable. The recommended setup is at the center of the screen and above or under the screen, depending on the screen's vertical size. The situation is illustrated by Figure 3-1 [15].

[Figure 3-1: An illustration of the Kinect's setup.]

3.1.2 Interaction Detection

In a real scenario the NUI device is capturing the user's movements all the time, even when the user is not intending to interact.
x = γ / α,    y = θ / β

Equation 3.2: A formula of the Curved Physical Interaction Zone mapping, where γ and θ are the hand's angles in the angular space and α and β are the horizontal and vertical angular sizes of the sector.

3.1.4.3 Comparison of the Physical Interaction Zone Designs

Each of the two different approaches to the spatial mapping between the user's hand movements in physical space and the cursor on the screen (see also 3.1.5) results in a different behavior of the cursor's movement on the screen.

In the case of the Planar Physical Interaction Zone, the cursor position on the screen corresponds directly to the user's hand position in physical space. It means that the trajectory of the cursor's movement on the screen corresponds exactly to the trajectory of the user's hand movement in the physical space, without any changes. For the user, this effect may be in some aspects unnatural, because the user must move his or her hand along the XY plane, which requires more concentration, combined with the need to move the hand in such a way that it stays within the boundaries of the rectangular area. This movement could be complicated by the fact that the user cannot see how far away the rectangular area is, so the user must concentrate on the hand's distance from his body all the time.

The other approach uses the Curved Physical Interaction Zone, and in contrast to the Planar Physical Interaction Zone it considers the natural movement of the user's hand. This natural movement is based on the fact that the hand is
It can be done by setting the Enabled property to true, which initializes the sensor's RGB camera data stream. By default, the color source is disabled, so the first step before performing any color image data processing is its initialization by setting the Enabled property to true.

A class diagram describing the KinectColorSource object representation is illustrated by the Figure 3.24.

(Figure 3.24: A class diagram describing an object model of the color data source; the KinectColorSource class with properties such as Enabled, Format, Height, IsSupported, Sensor and Width, the nominal field-of-view and focal-length properties, methods KinectColorSource(), OnColorFrameReady() and ProcessColorImage(), and the ColorFrameReady event.)

3.2.3.3 Skeleton Source

Logic for processing skeleton data obtained from the Kinect's skeleton data stream is implemented by the KinectSkeletonSource class. The processing is handled by the method ProcessSkeletonData() that passes a native skeleton frame represented by the Microsoft.Kinect.SkeletonFrame structure as its parameter. Skeletons data are copied into the internal buffer. The processing
LEY. Beginning Kinect Programming with the Microsoft Kinect SDK. New York: Springer Science+Business Media New York, 2012. ISBN-13: 978-1-4302-4104-1.

[4] CORRADINI, ANDREA. Dynamic Time Warping for Off-line Recognition of a Small Gesture Vocabulary. [Online] http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.200.2035&rep=rep1&type=pdf. Beaverton, Oregon: Oregon Graduate Institute.

[5] Touchless Interaction in Medical Imaging. Microsoft Research. [Online] MICROSOFT. [Cited: 04/15/2013.] http://research.microsoft.com/en-us/projects/touchlessinteractionmedical/.

[6] Tobii Gaze Interaction. [Online] TOBII. [Cited: 04/18/2013.] http://www.tobii.com/en/gaze-interaction/global/.

[7] MICROSOFT. Teaching Kinect for Windows to Read Your Hands. Microsoft Research. [Online] 03/2013. [Cited: 04/17/2013.] http://research.microsoft.com/apps/video/dl.aspx?id=185502.

[8] Skeletal Joint Smoothing White Paper. MSDN. [Online] MICROSOFT. [Cited: 04/26/2013.]

[9] Sign Language Recognition with Kinect. [Online] [Cited: 04/18/2013.] http://page.mi.fu-berlin.de/block/abschlussarbeiten/Bachelor-Lang.pdf.

[10] New, Natural User Interfaces. Microsoft Research. [Online] MICROSOFT, 03/02/2010. [Cited: 04/30/2013.] http://research.microsoft.com/en-us/news/features/030210-nui.aspx.

[11] Neural Network. Wikipedia. [Online] [Cited: 04/30/2013.] http://en.wikipedia.org/wiki/Neural_network.

[12] Natural User Interface, the Future is Already Here. Design float blog. [Onlin
ONE ........................................................................................................ 68
FIGURE 3.46: A CHART SHOWING THE RESULTS OF THE LEVEL OF USABILITY FOR POINT AND WAIT ACTION TRIGGER .................... 69
FIGURE 3.47: A CHART SHOWING THE RESULTS OF THE LEVEL OF USABILITY FOR GRIP ACTION TRIGGER .................... 69
FIGURE 3.48: A CHART SHOWING THE RESULTS OF THE LEVEL OF USABILITY FOR SWIPE GESTURES .................... 70
FIGURE 3.49: A CHART SHOWING THE RESULTS OF THE LEVEL OF USABILITY FOR TARGETING AND SELECTING ITEMS .................... 70
FIGURE 3.50: A CHART SHOWING THE RESULTS OF THE LEVEL OF USABILITY FOR USING WINDOWS 8 .................... 71
FIGURE 3.51: OVERVIEW OF THE TOUCH-LESS INTERFACE INTEGRATION WITH GRAPHWORX64 APPLICATION .................... 75
FIGURE 3.52: A SCREENSHOT OF THE TOUCH-LESS INTERFACE IN THE GRAPHWORX64 .................... 76

Bibliography

[1] NORMAN, DON. Natural User Interfaces Are Not Natural. jnd.org. [Online] 2012. [Cited: 04/15/2013.] http://www.jnd.org/dn.mss/natural_user_interfa.html.

[2] MICROSOFT. Kinect for Windows SDK 1.7.0 Known Issues. MSDN. [Online] MICROSOFT. [Cited: 04/30/2014.] http://msdn.microsoft.com/en-us/library/dn188692.aspx.

[3] JARRETT WEBB, JAMES ASH
The test is divided into two parts.

The first part investigates the user's experience in using the particular action triggers described in chapter 3.1.6 and the types of interaction zones described in chapter 3.1.4. As a testing application, the Test Application (described in chapter 3.3.1) is used. The aim of the test is to evaluate the users' subjective levels of usability and levels of comfort.

The level of usability is evaluated for both types of action trigger. The level of usability is defined by a ten-point rating scale. The rating 9 represents an intuitive experience without any requisite need for learning, and the rating 0 represents the worst experience, when the interactions are not usable at all. The rating scale is described by the Table 1.

Intuitive | Usable | Requires a habit | Difficult to use | Unusable
  9 - 8   | 7 - 6  |      5 - 4       |      3 - 2       |  1 - 0

Table 1: The level of usability rating scale.

The level of comfort is evaluated for both types of interaction zone. The level of comfort is defined by a six-point rating scale. The rating 5 represents a comfortable experience without any noticeable fatigue, and the rating 0 represents a physically challenging experience. The rating scale is described by the Table 2.

Comfortable | Fatigue | Challenging
   5 - 4    |  3 - 2  |    1 - 0

Table 2: The level of comfort rating scale.

The second part investigates the usability of the touch-less in
University of West Bohemia
Faculty of Applied Sciences
Department of Computer Science and Engineering

Master Thesis

Using MS Kinect Device for Natural User Interface

Pilsen, 2013                                          Petr Altman

Declaration

I hereby declare that this master thesis is completely my own work and that I used only the cited sources.

Pilsen, May 15th, 2013                    ........................ Petr Altman

Acknowledgements

I would like to thank Ing. Petr …, Ph.D., who gave me the opportunity to explore the possibilities of the Kinect device and provided me with support and resources necessary for the implementation of many innovative projects during my studies. I would like to thank Ing. Vojtěch Kresl as well for giving me the opportunity to be part of the innovative human-machine interfaces development.

Abstract

The goal of this thesis is to design and implement a natural touch-less interface by using the Microsoft Kinect for Windows device and to investigate the usability of different designs of touch-less interactions by conducting subjective user tests. Based on the subjective test results, the most intuitive and comfortable design of the touch-less interface is integrated with the ICONICS GraphWorX64 application as a demonstration of using the touch-less interactions with a real application.

Contents

1 INTRODUCTION ................................................................................................ 1
The grip does not work as well with sleeves or anything else that obstructs the wrist. The grip should be used within 1.5 to 2 meters away from the sensor, and oriented directly facing the sensor.

In the press interaction, the users have their hand open, palm facing towards the sensor, and arm not fully extended towards the sensor. When the user extends the hand toward the sensor, a press is recognized.

Information about the current interaction state is provided through the Interaction Stream, similar to the stream model of the other data sources [26].

3 Realization

In this chapter, the realization of the touch-less interface will be described. The realization consists of the design and analysis, the implementation, the user tests and the description of the touch-less interface integration in a real case scenario. The chapter Design and Analysis describes all important particular approaches and explains the reasons for their choice. The Implementation chapter deals with the implementation of the particular approaches described in the Design and Analysis chapter. The User Tests chapter evaluates tests based on the subjective users' experience of using the touch-less interface with different configurations. In the last chapter, Touch-less Interface Integration With Iconics GraphWorX64, the integration of the touch-less interface with the application from the Iconics Company will be described.

3.1 Design and
ace for the user's hand movement classification, touch integration with the WPF and Windows 8 input, and visualization for giving supportive visual feedback to the user.

3.2.1 Architecture

The Touch-less Interface is designed and implemented on a three-layer architecture. The data layer along with the application layer is implemented in the library named KinectInteractionLibrary. The presentation layer is implemented as a WPF control library and its implementation is located in the KinectInteractionWPFControls library.

The data layer implements a wrapper encapsulating the basic data structures for a more comfortable way of Kinect data processing. It also implements the logic of the data sources for the depth, color, skeleton and facial data input. This layer creates a generic interface between the sensor's implementation and the application layer. The application layer implements the functionality of the touch-less interface: the interaction detection, the interaction quality system and the gesture interface. The presentation layer is based on WPF and is implemented on top of the application layer. This layer provides basic controls for the sensor's data visualization and, foremost, the visualization for the touch-less and gesture interaction. An integration of the touch-less interface with the user input is implemented on the same layer. The architecture is described by the Figure 3.17.
additional 13 points used for 3D mesh reconstructions, information about the 3D head pose, and animation units that are meant to be used for avatar animation. The 3D head pose provides information about the head's X, Y, Z position and its orientation in the space. The head orientation is captured by three angles: pitch, roll and yaw, described by the Figure 2.10.

(Figure 2.9: Tracked face points [25]. Figure 2.10: Head pose angles [25].)

2.3.5 Interaction Toolkit

The latest Kinect for Windows SDK version 1.7 came up with the Interaction toolkit. The interaction toolkit can detect a hand interaction state and decide whether the hand is intended for interaction. In addition, it newly includes a predefined Physical Interaction Zone for mapping the hand's movement on the screen for up to 2 users.

The Interaction toolkit provides an interface for detecting the user's hand state, such as the grip and press interaction [26]. In the grip interaction, it can detect the grip press and release states illustrated by the Figure 2.11. The grip press is recognized when the users have their hand open, palm facing towards the sensor, and then make a fist with their hand. When the users open the hand again, it is recognized as the grip release.

(Figure 2.11: Grip action states, from the left: released, pressed.)

According to the known issues [27] published by Microsoft, the grip detection accuracy is worse for the left hand than it is for the right hand. There is a noticeable delay in grip recognition.
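In code, the toolkit's stream is fed with depth and skeleton frames and reports hand events back; a minimal sketch against the SDK 1.7 interaction toolkit (the IInteractionClient implementation and error handling are omitted):

using Microsoft.Kinect;
using Microsoft.Kinect.Toolkit.Interaction;

// interactionClient is an IInteractionClient that tells the stream
// which screen regions are grip/press targets.
var stream = new InteractionStream(sensor, interactionClient);
var userInfos = new UserInfo[6]; // the sensor reports up to six users

stream.InteractionFrameReady += (s, e) =>
{
    using (InteractionFrame frame = e.OpenInteractionFrame())
    {
        if (frame == null) return;
        frame.CopyInteractionDataTo(userInfos);

        foreach (UserInfo user in userInfos)
            foreach (InteractionHandPointer hand in user.HandPointers)
                if (hand.HandEventType == InteractionHandEventType.Grip)
                {
                    // the user clenched this hand into a fist
                }
    }
};

// The stream must be fed continuously from the sensor's frame events;
// depthPixels, skeletons and the timestamps come from those frames.
stream.ProcessDepth(depthPixels, depthTimestamp);
stream.ProcessSkeleton(skeletons, sensor.AccelerometerGetCurrentReading(),
                       skeletonTimestamp);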
agram describing an object model of the action detector base.)

3.2.4.4 Point and Wait Action Detector

An action detector for the Point and Wait action (described in chapter 3.1.6) is implemented by the class PointAndWaitActionDetector. The class is based on the abstract class ActionDetectorBase and implements the logic for detecting the action by overriding the abstract method OnDetect().

The point and wait action detector is implemented as a finite state machine [34]. Appendix A contains the state chart of the detection algorithm. The action detector is able to detect the primary hand's click and drag and the multi-touch gestures zoom and pan. The actions are timer based, which means that in order to perform an action, a cursor has to stand still for a certain time. This basic detection of an action can be divided into two steps. The first step detects whether the cursor stands still. The second step measures how long the cursor hasn't moved. There is a certain threshold for the detection of the cursor's movement. When the threshold is not exceeded, a timer is activated. When the timer ticks out, an action is detected. Otherwise, if the cursor moves before the timer ticks out, no action is detected.

The point and wait action detector detects the primary hand's click and drag and the multi-touch gestures zoom and pan. The click is the simplest one. When only the primary cursor is tracked and it stands still for a certain time, a click is performed on the cursor
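The two detection steps can be sketched as follows (a simplified stand-alone illustration of the timer principle, not the actual PointAndWaitActionDetector, whose state chart is in Appendix A; the threshold and dwell time are illustrative):

using System;
using System.Windows;

// Step 1: detect whether the cursor stands still (movement threshold).
// Step 2: measure how long it hasn't moved (dwell timer).
class DwellClickDetector
{
    private readonly double _moveThresholdPx;
    private readonly TimeSpan _dwellTime;
    private Point _anchor;
    private DateTime _stillSince;

    public DwellClickDetector(double moveThresholdPx, TimeSpan dwellTime)
    {
        _moveThresholdPx = moveThresholdPx;
        _dwellTime = dwellTime;
        _stillSince = DateTime.MaxValue; // timer not armed yet
    }

    // Returns true once when the cursor has stood still long enough.
    public bool Update(Point cursor, DateTime now)
    {
        double dx = cursor.X - _anchor.X;
        double dy = cursor.Y - _anchor.Y;

        if (Math.Sqrt(dx * dx + dy * dy) > _moveThresholdPx)
        {
            _anchor = cursor;    // the cursor moved: re-anchor, restart timer
            _stillSince = now;
            return false;
        }

        if (now - _stillSince >= _dwellTime)
        {
            _stillSince = DateTime.MaxValue; // fire once, re-arm on next move
            return true;
        }
        return false;
    }
}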
al in opening a new market. The first final version of the SDK was officially released in February 2012 as the Kinect for Windows SDK, along with the unveiling of a commercial version of the sensor, Kinect for Windows. The SDK supports development in C++, C#, VB.NET and other .NET-based languages under the Windows 7 and later operating systems. The latest version of the SDK is available for free on its official website [16].

The Kinect for Windows SDK started with its very first beta version, released in July 2011. The beta was only a preview version with a temporary Application Programming Interface (API); it allowed users to work with depth and color data and also supported an advanced Skeletal Tracking which, in comparison with the open-source SDKs, did not require a T-pose to initialize skeleton tracking, as is needed in other Skeletal Tracking libraries. Since the first beta, Microsoft updated the SDK gradually up to version 1.7 and included a number of additional functions.

The first major update came along with the 1.5 version, which included a Face Tracking library and Kinect Studio, a tool for recording and replaying sequences captured by the sensor. The next version, 1.6, extended the SDK by the possibility of reading an infrared image captured by the IR camera and finally exposed the API for reading accelerometer data. The currently latest Kinect for Windows SDK version 1.7 was released in March 2013 and included advanced libraries such
algorithm goes through all skeletons and finds those which are in the tracked state. The tracked skeletons' data are used for creating new instances of the Skeleton class, and each new instance is inserted into the list of tracked skeletons. After all skeletons are processed, a new instance of the SkeletonFrame class is created on the basis of the list of tracked skeletons.

(Class diagram: the KinectSkeletonSource class with properties Enabled, IsSupported, Sensor, SmoothParameters and TrackingMode, methods KinectSkeletonSource(), MapDepthToWorldPoint(), MapWorldPointToColor(), MapWorldPointToDepth(), OnSkeletonFrameReady(), ProcessSkeletonData() and Restart(), and the SkeletonFrameReady event. Figure 3.25: A class diagram describing an object model of the skeleton data source.)

Before the skeleton data processing can be started, the skeleton source has to be enabled. It can be done by setting the Enabled property to true, which initializes the sensor's skeleton data stream. By default, the skeleton source is disabled, so the first step before performing any skeleton data processing is its initialization by setting the Enabled property to true.

A class diagram describing the KinectSkeletonSource object representation is illustrated by the Figure 3.25.

3.2.3.4
atively from the center axis of the sector. The zero angle of the hand is in the center of the sector.

(Figure 3.10: An illustration of the curved physical interaction zone (green area).)

The mapping function for the curved physical interaction zone transforms the hand's physical coordinates into the angular space, and then it is transformed into the planar screen space. We can divide the mapping function into the following two steps:

1. The first step transforms the hand's physical coordinates into the angular space (γ, θ), where γ is an angle in the range from 0 to α. This angle is the sum of the angle between the hand and the central point in the XZ plane, with a value within the range from -α/2 to α/2; the value already considers the offset angle relative to the user's body angle. Similarly, the angle θ is within the range from 0 to β and it also considers the angle between the hand and the central point in the YZ plane.

2. The second step transforms the angular space into the planar screen space by dividing each angle from the angular space by the given angular size of the sector. After this division we get values within the range from 0 to 1 for both coordinates in the screen space.

(Figure 3.11: An illustration of mapped coordinates in the curved physical interaction zone.)

The described mapping function for both coordinates is given by Equation 3.2.
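A minimal sketch of the second step (the angle names follow the description above; the computation of γ and θ from the skeleton joints in the first step is omitted):

using System;
using System.Windows;

static class CurvedZoneMapping
{
    // Divides each angle of the angular space (gamma, theta) by the
    // angular size of the sector (alpha horizontally, beta vertically),
    // yielding normalized screen coordinates clamped to [0, 1].
    public static Point ToScreenSpace(double gamma, double theta,
                                      double alpha, double beta)
    {
        double x = Math.Max(0.0, Math.Min(1.0, gamma / alpha));
        double y = Math.Max(0.0, Math.Min(1.0, theta / beta));
        return new Point(x, y);
    }
}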
The action detector evaluates the current user's action, such as performing the down or up event, which is analogous to the mouse button down and up events (see also 3.2.4.3).

An interaction starts when the tracked user's wave gesture is recognized. Depending on the hand by which the user waved, the primary hand is set to the left or right hand. The primary hand is meant to be used in further implementations on top of the interaction system, so it is made accessible through the PrimaryCursor property. After the user is logged in, the interaction system starts to track the user's hand movement and monitor the user's interaction quality. Regarding the user's interaction quality, the system processes the user's hand movement into the movement of the cursors on the screen. When the user's interaction is detected and the overall interaction quality is sufficient, the system performs the mapping of the hand's position from the physical space into the screen space using the mapping function given by the current type of physical interaction zone (see also 3.1.4). In case of two cursors on the screen, the resulting position of the cursors is checked in order to prevent their swapping (see also 3.1.5). Then, an action for the cursors is evaluated using the given action detector. Finally, the system raises a CursorsUpdated event where it passes on the updated cursors, represented by a list of instances of the TouchlessCursor class.
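A minimal usage sketch of the interface described above (the construction parameters and the event-args member holding the cursor list are assumptions about the library's exact signatures):

// Hypothetical wiring: construction arguments depend on the chosen
// action detector and physical interaction zone.
var interaction = new TouchlessInteractionInterface(/* ... */);

interaction.CursorsUpdated += (sender, e) =>
{
    // e.Cursors is assumed to expose the updated TouchlessCursor list.
    foreach (TouchlessCursor cursor in e.Cursors)
    {
        // update the cursor visuals, feed the WPF input device, etc.
    }
};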
cal representation. However, most UI toolkits used to construct interfaces executed with such technology are traditional GUI interfaces.

The real crucial moment for NUI came with the unveiling of the Microsoft Kinect as a new revolutionary game controller for the Xbox 360 console which, as the first controller ever, was able to turn body movements into game actions without the need of holding any device in the hands. Initially, the Kinect was intended to be used only as a game controller, but immediately after its release the race to hack the device started, which resulted in the official opening of the device's depth sensing and body tracking capabilities to the public. The potential of the natural and touch-less way of controlling computers, extended by the possibilities of depth sensing, has found its place in entertainment, 3D scanning, advertising, industry and even medicine.

The interfaces commonly referred to as NUI are described further in the following chapters.

2.1.1 Multi-touch Interface

The multi-touch interface allows natural interaction by touching the screen with the fingers. In comparison with the cursor-based interface, the user doesn't have to move the cursor to select an item and click to open it. The user simply touches a graphical representation of the item, which is more intuitive than using the mouse. Additionally, due to the ability to recognize the presence of two or more points of contact with the surface, t
control contains an instance of the natural interaction interface and handles its update event. When the cursors are updated, the visualization invalidates the visual and renders a contour of the tracked user based on the last available depth frame data. If any advice is available, the assistance control shows it by fading in a notification bar below the user's contour. The final look of the assistance visualization is illustrated by the Figure 3.42.

(Figure 3.42: An illustration of the assistance visualization; the user's contour with notifications such as "Wave For Login" and "Step Left".)

3.3 Prototypes

Part of this thesis is an implementation of a set of prototypes demonstrating different approaches to the touch-less interface design. There have been two prototype applications implemented based on the implementation of the touch-less interaction described in chapter 3.2.4. The first prototype is aimed at subjective tests of using the touch-less interactions for clicking, dragging and multi-touch gestures with different types of action triggers and physical interaction zones. The second prototype integrates the touch-less interactions with the Windows 8 operating system via the touch injection interface and allows testing the touch-less interface in a real case scenario.

3.3.1 Test Application

The first prototype application is aimed at subjective tests of using the touch-less interactions. The prototype is designed for evaluating a subjec
creen. In case of the touch interface, there is usually nothing drawn on the screen; the user's finger itself is used as the visual feedback. Although these inputs use different ways of dealing with the visual feedback, they both assure the user about the location where his or her action is intended to be performed. In this regard, the natural user interaction is similar to the mouse input. It doesn't have a straightforward mapping of the hand's position in physical space into the screen space, so we need to provide additional visual feedback on the screen.

(Figure 3.14: A concept of the user's cursor.)

The final look of the cursor should correspond to the nature of the interaction. In case we interact by using hands, the cursor should be illustrated by a hand shape. Also, the cursor's graphics should change during the action in order to show whether the cursor is or is not in the action state.

3.1.6 Action Triggering

Analogously to the mouse or touch input, we need to detect an action such as a click, drag, pan, zoom, etc. Because the touch-less interaction doesn't provide any action triggering using a button, like the mouse does, or any contact with the screen, like the touch input does, we need to detect the action in another way, one which doesn't require physical contact with the computer or any of its peripheries.

This chapter describes two different ways of action triggering which can be used with the touch-less interface.

3.1.6.1 Poi
The YUV format represents the color image as a 16-bit, gamma-corrected linear UYVY-formatted color bitmap, where the gamma correction in YUV space is equivalent to the standard RGB gamma in RGB space. Thanks to the 16-bit pixel representation, the YUV format uses less memory to hold bitmap data and allocates less buffer memory. The color image in YUV format is available only at the 640x480 resolution and only at 15 fps [19].

The Bayer format includes more green pixel values than blue or red, which makes it closer to the physiology of the human eye [20]. The format represents the color image as a 32-bit, linear X8R8G8B8-formatted color bitmap in the standard RGB color space. A color image in Bayer format is updated at 30 frames per second at 640x480 resolution and at 12 frames per second in high-definition 1280x960 resolution [19].

Since the SDK version 1.6, custom camera settings that allow optimizing the color camera for actual environmental conditions have been available. These settings can help in scenarios with a low-light or a brightly lit scene and allow adjusting hue, brightness or contrast in order to improve visual clarity.

Additionally, the color stream can be used as an infrared stream by setting the color image format to the Infrared format. It allows reading the Kinect's IR camera's image. The primary use for the IR stream is to improve external camera calibration using a test pattern observ
demonstrates the results of the whole integration process. Also, in order to create an industrially solid solution suitable for use in real deployment, the whole implementation part of this thesis has been rewritten and the resulting source code is licensed by the ICONICS Company.

The overview of the integration architecture is described by the block diagram in the Figure 3.51.

(Figure 3.51: Overview of the touch-less interface integration with the GraphWorX64 application; the touch-less visualization and the WPF touch-less input device on top of WPF and the Kinect for Windows SDK.)

3.5.3.1 Interactions

The GraphWorX64 application is based on more than one presentation framework. Although it is a Windows Forms application, the displays are based on WPF due to its vector-based graphics. This makes it possible for the integration to handle mouse and touch input simultaneously. It has been resolved by using the modified WPF touch input device implementation described in chapter 3.2.5. Thanks to its design based on the standard mouse input and the WPF touch input, it enables using the touch-less interactions for interacting with the standard Windows controls and with the WPF controls, including their touch interface.

Touch-less gestures such as the left and right swipe gesture were also integrated. The controls composed in the GraphWorX64 display have a pick action which says what is going to happen when a user clicks on
ding the cursor's current action.

The driver also registers the event for a recognized gesture. When the swipe gesture is recognized, the driver calls the corresponding keyboard shortcut. There are two sets of shortcuts. The first one is used for presentations, so it simulates a press of the Page Up key for the left swipe gesture and a press of the Page Down key for the right swipe gesture. The second set provides keyboard shortcuts for showing the start screen by using the left swipe gesture and closing an active application by the right swipe gesture.

3.2.7 Visualization

The visual feedback of the touch-less interface is implemented on WPF and it is divided into two parts: the Cursors and the Assistance visualization. The following chapters describe the implementation of both these parts and demonstrate their final graphical look.

3.2.7.1 Overlay Window

The visualization is implemented as an overlay window. In order to overlay the applications with the visualization window, the window is set to be the top-most one. Also, the applications must be visible through the overlay window, so the window is made borderless and transparent. In addition, the window is set as non-focusable and its focus is not set after startup, which disables any noticeable interaction with the window.

The WPF window enables overlaying the desktop with transparent graphics, which makes the visualization more attractive and also practical in that the visualization is not
e. [Cited: 04/17/2013.] http://www.designfloat.com/blog/2013/01/09/natural-user-interface/.

[13] APPLE. Multi-touch gestures. [Online] [Cited: 04/30/2013.] http://www.apple.com/osx/what-is/gestures.html.

[14] Mind Control: How EEG Devices Will Read Your Brain Waves And Change Your World. Huffington Post Tech. [Online] 11/20/2012. [Cited: 04/30/2013.] http://www.huffingtonpost.com/2012/11/20/mind-control-how-eeg-devices-read-brainwaves_n_2001431.html.

[15] Microsoft Surface 2.0 SDK. MSDN. [Online] MICROSOFT. [Cited: 04/30/2013.] http://msdn.microsoft.com/en-us/library/ff727815.aspx.

[16] Leap Motion. [Online] LEAP. [Cited: 04/18/2013.] https://www.leapmotion.com/.

[17] KinectInteraction Concepts. MSDN. [Online] [Cited: 04/30/2013.] http://msdn.microsoft.com/en-us/library/dn188673.aspx.

[18] Kinect Skeletal Tracking Modes. MSDN. [Online] MICROSOFT. [Cited: 04/26/2013.] http://msdn.microsoft.com/en-us/library/hh973077.aspx.

[19] Kinect Skeletal Tracking Joint Filtering. MSDN. [Online] MICROSOFT. [Cited: 04/26/2013.] http://msdn.microsoft.com/en-us/library/jj131024.aspx.

[20] Kinect Skeletal Tracking. MSDN. [Online] MICROSOFT. [Cited: 04/26/2013.] http://msdn.microsoft.com/en-us/library/hh973074.aspx.

[21] Kinect for Xbox 360 dashboard and navigation. Engadget. [Online] [Cited: 04/30/2013.] http://www.engadget.com/gallery/kinect-for-xbox-360-dashboard-and-navigation/3538766/.

[22] Kinect for Windows Sensor Components and Speci
e abstract class NuiTouchDevice provides the virtual methods OnDown(), OnUp() and OnMove(). These methods are intended to be called by a further implementation based on this class in order to handle the desired actions. The algorithm is described by the Figure 3.39.

(Figure 3.39: A block diagram of the WPF touch device implementation.)

The final touch-less input device is implemented by the class TouchlessInputDevice. The class inherits from the abstract class NuiTouchDevice described above. The input device represents one cursor on the screen, which means two instances of the input device are needed for creating a full two-handed touch-less solution. Each input device is identified by its ID. The ID should be unique, and that's why the constructor requires the type of the cursor as one of its parameters. As the other parameters, the constructor demands a presentation source, specifying on which graphical elements the input device is intended to be used, and an instance of the natural interaction interface described in chapter 3.2.4.2. The touch-less input device registers the update event of the interaction interface. When the cursors are updated, the input device stores the last state of the given cursor type, calculates the cursor's position on the screen and then calls the corresponding report methods regarding the cursor's current action.

3.2.6 Integration with Windows 8

The integration with the operating sys
e forearm is perpendicular to the rest of the arm. If the hand's position exceeds a certain threshold by moving either to the left or to the right, we consider this a segment of the gesture. The wave gesture is recognized when the hand oscillates multiple times between the segments; otherwise, it is an incomplete gesture. From this observation one can see that two tracked skeleton joints are needed for the recognition: the hand joint and the elbow joint. Figure 3.15 illustrates all three gesture states and their relationship.

(Figure 3.15: An illustration describing the joints of interest and their relative position for the wave gesture recognition [33].)

3.1.7.3 Swipe Gesture

Another basic gesture is the Swipe gesture. This gesture is commonly used for getting to something next or previous, such as the next or previous page, slide, etc. The gesture consists of the hand's horizontal movement from the right to the left or from the left to the right. Depending on the movement direction, two swipe gesture types are distinguished: the right swipe and the left swipe. Even though the direction of the movement is different, the gestures are recognized on the same principle.

Thanks to the simple movements of which the swipe gesture consists, it can be easily detected using an algorithmic approach. The user usually makes a swipe by a quick horizontal movement, which is, however, ambiguous, because it may be also one of the wave gesture's segm
e of each joint is checked in order to detect the gesture only if all joints are tracked. Then, the hand position is checked whether it is located within the area above the elbow. When the hand is within the area, the algorithm begins monitoring the hand's movement. For instance, the right swipe is initiated when the right hand's horizontal position exceeds a given threshold on the right side relative to the right shoulder. The gesture is detected if the hand horizontally exceeds the right shoulder position by a given threshold to the left; this threshold is usually greater than the previously mentioned one. If the gesture is not finished within a certain time, the gesture is cancelled and it is not detected.

The swipe recognizer is used for the implementation of the right and left swipe gestures. These gestures are represented by the classes RightSwipeGesture and LeftSwipeGesture implemented on the abstract class Gesture. Each gesture has its own instance of the recognizer. The recognizer is set for detecting the right or left hand through its constructor.

An object model of the swipe gesture implementation is described by the Figure 3.37.

(Figure 3.37: An object model of the swipe gestures; the LeftSwipeGesture and RightSwipeGesture classes, both derived from the abstract class Gesture (which implements IDisposable) with methods such as Dispose(), Initialize() and Recognize().)
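The initial checks described above can be sketched as follows (a simplified illustration against the official SDK skeleton types; the thresholds are illustrative, and the time-based cancellation is omitted):

using Microsoft.Kinect;

// Pre-conditions of a right swipe: all joints of interest are tracked,
// the hand is above the elbow, and the hand starts far enough to the
// right of the shoulder.
static bool CanInitiateRightSwipe(Skeleton skeleton)
{
    Joint hand = skeleton.Joints[JointType.HandRight];
    Joint elbow = skeleton.Joints[JointType.ElbowRight];
    Joint shoulder = skeleton.Joints[JointType.ShoulderRight];

    bool allTracked =
        hand.TrackingState == JointTrackingState.Tracked &&
        elbow.TrackingState == JointTrackingState.Tracked &&
        shoulder.TrackingState == JointTrackingState.Tracked;

    const float startThreshold = 0.15f; // meters, illustrative

    return allTracked
        && hand.Position.Y > elbow.Position.Y
        && hand.Position.X > shoulder.Position.X + startThreshold;
}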
e several Software Development Kits (SDK) available for enabling custom application development for the Kinect device. The first one is the libfreenect library, which was created as a result of the hacking effort in 2010, at the time when Microsoft had not published public drivers and held back from providing any development kits for the PC. The library includes Kinect drivers and supports reading the depth and color streams from the device. It also supports reading the accelerometer state and an interface for controlling the motorized tilt.

Another SDK available before the official one is OpenNI, released in 2010, a month after the launch of the Kinect for Xbox 360. The OpenNI library was published by the PrimeSense Company, the author of the depth sensing technology used by the Kinect. The SDK supports all standard inputs and in addition includes a Skeletal Tracking. Since its release, an OpenNI community has grown and developed a number of interesting projects, including 3D scanning and reconstruction or 3D finger tracking.

Microsoft's official SDK for the Kinect was unveiled in its beta version in July 2011 and its first release came in February 2012 as the Kinect for Windows SDK version 1.0. Currently, the newest available version of the SDK is 1.7. The evolution and features of the SDK are described in the following chapter.

2.3 Microsoft Kinect for Windows SDK

Microsoft published an official SDK after it had realized the Kinect's potenti
(Figure 3.35: An object model of the wave gestures.)

The wave recognizer is used for the implementation of the right and left wave gestures. These gestures are represented by the classes RightWaveGesture and LeftWaveGesture implemented on the abstract class Gesture. Each gesture has its own instance of the recognizer. The recognizer is set for detecting the desired right or left hand through its constructor.

An object model of the wave gesture implementation is described by the Figure 3.35.

3.2.4.8 Swipe Gesture Recognizer

Detection of the swipe gesture (described in chapter 3.1.7.3) is implemented by the class SwipeGestureRecognizer. The detection algorithm is implemented as a finite state machine [34] illustrated by the Figure 3.36.

(Figure 3.36: A state diagram of the swipe detection, with states and transitions such as: hand is within the detection zone; hand moved horizontally away from the body in a certain distance; hand moved horizontally in a certain distance; gesture detected; hand moved out of the detection zone; timeout.)

A detection algorithm is implemented inside the method TrackSwipe() that passes a tracked skeleton as its parameter and returns true in case of success and false in case of a non-detected gesture. The algorithm uses the hand, elbow and shoulder joints for detecting the swipe gesture. First of all, the tracking stat
ed from both the RGB and IR camera to more accurately determine how to map coordinates from one camera to another. Also, the IR data can be used for capturing an IR image in darkness, with a provided IR light source.

2.3.3 Skeletal Tracking

The crucial functionality provided by the Kinect for Windows SDK is the Skeletal Tracking. The skeletal tracking allows the Kinect to recognize people and follow their actions [21]. It can recognize up to six users in the field of view of the sensor, and of these, up to two users can be tracked as a skeleton consisting of 20 joints that represent locations of the key parts of the user's body (Figure 2.7). The joint locations are actually coordinates relative to the sensor, and the values of the X, Y, Z coordinates are in meters. The Figure 2.6 illustrates the skeleton space.

(Figure 2.6: An illustration of the skeleton space. Figure 2.7: Tracked skeleton joints overview, with joints such as Shoulder Center, Shoulder Left/Right, Elbow Left, Hip Center, Hip Left, Knee Left and Foot Left.)

The tracking algorithm is designed to recognize users facing the sensor, in a standing or sitting pose. Tracking sideways poses is challenging, as part of the user is not visible to the sensor. The users are recognized when they are in front of the sensor and their head and upper body are visible to the sensor. No specific pose or calibration action needs to be taken for a user to be tracked.
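Accessing the tracked skeletons through the official SDK is a matter of enabling the stream and filtering by tracking state; a minimal sketch:

using Microsoft.Kinect;

sensor.SkeletonStream.Enable(); // default (standing) tracking mode

sensor.SkeletonFrameReady += (s, e) =>
{
    using (SkeletonFrame frame = e.OpenSkeletonFrame())
    {
        if (frame == null) return;

        var skeletons = new Skeleton[frame.SkeletonArrayLength];
        frame.CopySkeletonDataTo(skeletons);

        foreach (Skeleton skeleton in skeletons)
        {
            if (skeleton.TrackingState != SkeletonTrackingState.Tracked)
                continue;

            // Joint positions are in meters, relative to the sensor.
            SkeletonPoint head = skeleton.Joints[JointType.Head].Position;
        }
    }
};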
ed usability was reported with the use of multi-touch gestures and the gesture for scrolling by users who were not familiar with using multi-touch gestures. Foremost, the test has shown that using the multi-touch gesture for zoom could be difficult due to the need of coordinating both hands on the screen.

3.4.3.3 The Level of Usability for the Real Case Scenario

Figure 3.48 shows the results of using the touch-less interface for controlling real Windows 8 applications. The test investigated eight types of interactions. The results have shown that the most intuitive and usable way of controlling the applications was by using the left and right swipe gestures. Generally, the touch-less interactions were also evaluated as a usable but quite fatiguing way of controlling Windows 8 applications. Some difficulties have been observed with using the maps and the web browser, where the users had to use multi-touch gestures, which, according to the usability test results in chapter 3.4.3.2, have been found unintuitive for touch-less interactions. A more difficult experience was reported when the users were trying to click on small buttons and items, due to the precision of the Skeletal Tracking and the Grip action detector in the case of users of lower height.

3.4.4 Tests Conclusion

The tests have shown that for the best level of comfort, the design of the Curved Physical Interaction Zone should be used, due to its more natural way of mapping the hand
en the joint is overlaid by another joint or its position is not possible to determine exactly, the tracking state has the value Inferred: although the position is tracked, it could be inaccurate. Otherwise, when the joint is not tracked, its tracking state is set to the value NotTracked.

A class diagram describing the SkeletonFrame object representation is illustrated by the Figure 3.20.

(Class diagram: the SkeletonFrame class, implementing IDisposable, with the TrackedSkeletons property and methods such as Clone(), Dispose(), GetSkeletonById() and a constructor taking the list of tracked skeletons; the Skeleton class with properties such as Id, Position, TimeStamp, UserIndex, ClippedEdges and Joints; the SkeletonTrackingState enumeration with the values NotTracked, PositionOnly and Tracked; and the SkeletonJoint class with its JointTrackingState enumeration (NotTracked, Inferred, Tracked).)
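The three states can be distinguished per joint before any gesture logic runs; a minimal sketch using the official SDK enumeration of the same name:

using Microsoft.Kinect;

static bool TryGetReliableHand(Skeleton skeleton, out SkeletonPoint position)
{
    Joint hand = skeleton.Joints[JointType.HandRight];
    position = hand.Position;

    switch (hand.TrackingState)
    {
        case JointTrackingState.Tracked:
            return true;                 // position is reliable
        case JointTrackingState.Inferred:
            return false;                // occluded: position may be inaccurate
        default:                         // JointTrackingState.NotTracked
            return false;                // no usable position
    }
}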
ents. The swipe gesture detecting algorithm should therefore be designed more strictly in order to make the gesture clearer and more reliable. For instance, the right swipe gesture will be recognized by the designed recognizer as the right hand's horizontal movement starting at a certain horizontal distance from the shoulder on the right side and moving along the body to a certain horizontal distance from the shoulder on the left side. Also, the swipe gesture will be detected only when the hand is above the elbow, in order to avoid detecting the swipe gesture in situations like when the user is relaxed. Based on the described design of the gesture, it can be seen that three tracked skeleton joints are needed for the recognition: the hand, the elbow and the shoulder. The swipe gesture design, and the relationship between the joints, is illustrated by Figure 3.16.

(Figure 3.16: An illustration describing the joints of interest and their relative position for the swipe gesture recognition; the green area indicates the horizontal movement range that is recognized as a swipe gesture.)

3.2 Implementation

This chapter describes the implementation of the Touch-less Interface for further use in the implementation of the prototypes (see also 3.3). The implementation of the touch-less interface consists of the data layer for the sensor's data processing and representation, the touch-less interactions using hands for moving the cursor and several kinds of ways of performing an action, the Gesture Interf
eople use independently of language and, moreover, of the knowledge of controlling computers. They can use them naturally and learn them very fast. Even though innate gestures may have different meanings in different parts of the world, computers can learn them and translate them to predefined actions correctly. For example, the most often used gesture is waving; its meaning is very understandable because people wave to get attention. Analogously, the wave gesture may be used for logging in to start an interaction with the computer. Another common gesture is the swipe, which people usually use in the meaning of getting to something next or previous. The mentioned wave and swipe gestures are quite simple to recognize, but there is an opportunity to teach computers even more difficult ones using, for instance, machine learning algorithms, and so teach computers to understand handwriting or the Sign Language [8].

Lately, the computing performance and electronics miniaturization gave birth to even more advanced types of touch-less interfaces. One of the most interesting projects is certainly the Gaze Interaction unveiled at CES 2013 by the company Tobii [9]. The gaze interaction uses eye tracking to enable naturally selecting an item on the screen without any need of using any peripheral device or even hands, only by looking at the item. Another interesting project is the Leap Motion [10]. This sensor is based on the depth sensing but
(Figure 3.17: A block diagram of the implementation architecture; the Presentation Layer and Application Layer on top of the Data Layer, which sits on the Kinect for Windows SDK.)

3.2.2 Data Structures

The implementation is based on the data structures which represent the data provided by the sensor. The Kinect for Windows SDK implements its own data structures, but these structures have a limited implementation. The data of these structures cannot be cloned and, due to their non-public constructors, it is not possible to instantiate them directly, and thus it is not possible to use them for custom data. These limitations have been overcome by encapsulating the data from the native structures into an own object data representation. The particular encapsulated data structures are described in the following chapters.

3.2.2.1 Depth Frame

A depth image is represented and implemented by the DepthFrame class. The class contains information about the depth image's format, the image's dimensions, the time of its capture and, above all, the depth pixel data and the user index data. The depth data contain the data of the depth image, represented as an array of 16-bit signed integer values where each value corresponds to a distance in physical space measured in millimeters. An invalid depth value of -1 means that the pixel is a part of a shadow or it is invisible to the sensor. The user index data are represented as a byte array of the same length as the depth data array. The user index data contain informa
er. The executable takes a path to the configuration XML file as its argument:

C:\Win8App\bin\KinectInteractionWin8App.exe config.xml

There are two prepared batch files: the first one for using the touch-less swipe gestures for controlling a presentation, the second one for using the touch-less swipe gestures for controlling the Windows 8 UI:

- 1_presenter.bat
- 2_windows8.bat

C Test Application Screenshots

(Figure C.1: Test application, a test scenario with a large button.)

(Screenshot overlay: T = 1.8396744 s, D = 707.765244007503 px, 0 attempts.)

(Figure C.2: Test application, a test scenario with a small button.)

(Screenshot overlay: T = 14.6374139 s, D = 329.206622047613 px, 1 attempt.)

(Figure C.3: Test application, a test scenario for dragging objects from the window's corners to the center of the screen.)

(Screenshot overlay: T = 5.8342372 s, D = 688.472221661847 px, 2 attempts.)

(Figure C.4: Test application, a test scenario for dragging objects from the center of the screen into the window's corners.)

(Screenshot overlay: "Click on the number 42".)

(Figure C.5: Test application, a test scenario with a list box.)

(Figure C.6: Test application, a test scenario with multi-touch maps.)

D Windows 8 Touch-less Application Screenshots

(Screenshot of the Windows 8 start screen controlled by the touch-less application: user "Petr Altman", a weather tile "Mostly Cloudy 15/3" and a news tile "Chicago Tribune: Debate over physicality clouds Bulls' faceoff with Heat".)
erty.

The face tracking is a very time-intensive operation. Due to this fact, the face source implements the face tracking processing using a parallel thread. This solution prevents dropping the sensor's data frames, because the time-intensive tracking operation is performed in the other thread independently and it doesn't block the main thread in which the other data are processed.

As the face tracking is a time-intensive operation, it is also CPU intensive. In some cases it is not required to track a face in each frame, so in order to optimize the performance, the face source implements a frame limiter that allows setting a required tracking frame rate. The lower the frame rate is, the less performance is needed, and the face tracking operation won't slow down the system. The target frame rate can be set by the Framerate property.

A class diagram describing the KinectFaceSource object representation is illustrated by the Figure 3.26.

(Figure 3.26: A class diagram; the KinectFaceSource class with properties Enabled, FaceTrackingTTL, Framerate, IsSupported, Mode, Sensor and SkeletonTrackingId, methods such as KinectFaceSource(), OnFaceFrameReady(), ProcessFaceData() and TrackFaces(), and the FaceFrameReady event.)
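The frame limiter idea can be sketched independently of the face-tracking code (a simplified illustration; the actual KinectFaceSource implementation may differ):

using System;

// Skips face tracking runs so that at most targetFps frames per second
// are processed; the remaining frames are passed over cheaply.
class FrameRateLimiter
{
    private readonly TimeSpan _minInterval;
    private DateTime _lastRun = DateTime.MinValue;

    public FrameRateLimiter(int targetFps)
    {
        _minInterval = TimeSpan.FromSeconds(1.0 / targetFps);
    }

    public bool ShouldProcess(DateTime now)
    {
        if (now - _lastRun < _minInterval)
            return false;              // too soon: skip this frame

        _lastRun = now;
        return true;                   // enough time elapsed: track this frame
    }
}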
fications. MSDN. [Online] MICROSOFT. [Cited: 04/30/2013.] http://msdn.microsoft.com/en-us/library/jj131033.aspx.

[23] MICROSOFT. Kinect for Windows: Human Interface Guide. 2012.

[24] Kinect for Windows. [Online] MICROSOFT. [Cited: 04/19/2013.] http://www.microsoft.com/en-us/kinectforwindows/.

[25] Kinect Face Tracking. MSDN. [Online] MICROSOFT. [Cited: 04/26/2013.] http://msdn.microsoft.com/en-us/library/jj130970.aspx.

[26] Kinect Coordinate Spaces. MSDN. [Online] MICROSOFT. [Cited: 04/25/2013.] http://msdn.microsoft.com/en-us/library/hh973078.aspx.

[27] Kinect Color Stream. MSDN. [Online] MICROSOFT. [Cited: 04/26/2013.] http://msdn.microsoft.com/en-us/library/jj131027.aspx.

[28] Kinect 3D Hand Tracking. [Online] [Cited: 04/17/2013.] http://cvrlcode.ics.forth.gr/handtracking/.

[29] MICROSOFT. Human Interface Guidelines v1.7.0. [PDF] 2013.

[30] GraphWorX64. ICONICS. [Online] ICONICS. [Cited: 04/30/2013.] http://iconics.com/Home/Products/HMI-SCADA-Software-Solutions/GENESIS64/GraphWorX64.aspx.

[31] MICROSOFT. Getting the Next Frame of Data by Polling or Using Events. MSDN. [Online] [Cited: 04/30/2013.] http://msdn.microsoft.com/en-us/library/hh973076.aspx.

[32] ICONICS. GENESIS64. ICONICS. [Online] [Cited: 04/30/2013.] http://iconics.com/Home/Products/HMI-SCADA-Software-Solutions/GENESIS64.aspx.

[33] Finite state machine. Wikipedia. [Online] [Cited: 04/30/2013.] https
The skeletal tracking can be used in both range modes of the depth camera (see also 2.3.1). In the default range mode, users are tracked at distances between 0.8 and 4.0 meters, but the practical range is between 1.2 and 3.5 meters due to the limited field of view. In the near range mode, the user can be tracked between 0.4 and 3.0 meters away, with a practical range of 0.8 to 2.5 meters.

The tracking algorithm provides two modes of tracking [22]. The default mode is designed for tracking all twenty skeletal joints of a user in a standing pose. The seated mode is intended for tracking the user in a seated pose and tracks only the ten joints of the upper body. Each of these modes uses a different pipeline for the tracking. The default mode detects the user based on the distance of the subject from the background. The seated mode uses movement to detect the user and distinguish him or her from the background, such as a couch or a chair. The seated mode uses more resources than the default mode and yields a lower throughput on the same scene. However, the seated mode provides the best way to recognize a skeleton when the depth camera is in the near range mode. In practice, only one tracking mode can be used at a time, so it is not possible to track one user in the seated mode and the other one in the default mode using one sensor.

The skeletal tracking joint information may be distorted due to noise a
(Class diagram, continued: the SkeletonJoint class with properties Id (JointType), Vector (Vector3D) and the X, Y, Z coordinates (double), and methods such as SkeletonJoint(JointType id, double x, double y, double z), SkeletonJoint(SkeletonJoint joint) and ToPoint3D().)

(Figure 3.20: A class diagram describing the architecture of the skeleton frame data structure.)

3.2.2.4 Face Frame

Information about all tracked faces is represented and implemented by the FaceFrame class. This class contains a property TrackedFaces, which is realized as a hash table where the tracked faces are stored under the related skeleton's id as a key. It allows getting the tracked face for a given skeleton conveniently, only by passing the skeleton's id as the key.

Every single tracked face is represented by an instance of the TrackedFace class. The class describes a tracked face by its rectangle in depth image coordinates (see also 2.3.1), by its rotation (described in chapter 2.3.4), its position in physical space and its 3D shape projected into the color image.

A class diagram describing the FaceFrame object representation is illustrated by the Figure 3.21.

(Figure 3.21: A class diagram; the FaceFrame class, implementing IDisposable, with the TrackedFaces property (Dictionary<int, TrackedFace>) and methods such as FaceFrame(), Clone() and Dispose(), and the TrackedFace class with properties such as FaceRect and Projected3DShape.)
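Since the faces are keyed by the skeleton id, a face lookup reduces to a dictionary access; a minimal usage sketch (the Id property name on the skeleton wrapper is taken from the class diagram above, the rest follows the descriptions of FaceFrame and TrackedFace):

// Fetch the tracked face belonging to a given skeleton, if any.
TrackedFace face;
if (faceFrame.TrackedFaces.TryGetValue(skeleton.Id, out face))
{
    Rect faceRect = face.FaceRect; // rectangle in depth image coordinates
    // The rotation, the physical-space position and the 3D shape projected
    // into the color image are available through the other properties.
}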
gram describing the InteractionInfo object representation is illustrated by the Figure 3.30.

The advices are evaluated on the worst quality found among all joints and on whether the face angle and body angle are within their ranges. When the user is not looking toward the sensor within the certain range of angle, the recognizer generates an advice saying that the user should look at the sensor. Similarly, when the user turns his or her body out of the certain angle from the sensor, the recognizer generates an advice notifying that the user should turn his body back toward the sensor. The interaction quality is used for evaluating to which side the user is approaching too close, and on the basis of this information the recognizer can generate an advice saying which way the user should move in order to stay within the sensor's field of view.

3.2.4.2 Touch-less Interactions Interface

The purpose of the touch-less interactions interface is to implement the logic of using the user's hands for moving the cursor on the screen, in order to allow basic operations similar to the multi-touch gestures (see also 2.1.1). The touch-less interaction interface is implemented by the TouchlessInteractionInterface class.

The touch-less interactions interface is based on the interaction recognizer, the gesture interface and a given action detector. The interaction recognition is used for the detection of the user's interaction quality. The gesture interface enables detecting a wave gesture, which is required for a login to the interaction.
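The advice evaluation described above can be sketched as a couple of range checks (the angle limits and the method name are illustrative assumptions, not the library's actual values):

using System;

// Returns a user-facing advice, or null when the pose needs no correction.
static string EvaluateAdvice(double faceAngleDeg, double bodyAngleDeg)
{
    const double maxFaceAngle = 30.0; // illustrative limit in degrees
    const double maxBodyAngle = 45.0; // illustrative limit in degrees

    if (Math.Abs(faceAngleDeg) > maxFaceAngle)
        return "Look at the sensor";

    if (Math.Abs(bodyAngleDeg) > maxBodyAngle)
        return "Turn your body toward the sensor";

    return null;
}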
her hand in a fist. It is very simple to use because, for instance, for clicking it is a natural hand gesture, and it is also easily understandable.

The grip action is able to perform click, drag and multi-touch gestures such as zoom and pan. Practically, the grip action may perform any multi-touch gesture using two hands.

Recognition of the grip action is based on computer vision and uses a depth frame and a tracked skeleton as its input. The recognition itself is a difficult problem because it works with noisy and unpredictable data which are affected by the actual user's pose. This means that the hand shape is not constant due to its pose facing the sensor, and in certain situations it could look the same for both states of the action.

The Kinect for Windows SDK v1.7 came up with the Interaction Toolkit (described in chapter 2.3.5) which provides, among other things, recognition of the grip action. The recognizer is based on machine learning algorithms which are able to learn and then identify whether the user's hand is clenched in a fist. The recognition is successful in most cases, but there can still be situations in which the action could be recognized wrongly. These situations can occur when the hand is rotated in such a way that it is invisible to the depth sensor. It happens, for example, when the users point their fingers toward the sensor.

3.1.7 Gestures

The natural user interface enables a new way of interacting by recogni
his plural point awareness implements advanced functionality such as pinch-to-zoom or evoking predefined actions [3].

Moreover, the multi-touch interface enables interaction via predefined motions, usually gestures. Gestures, for example, help the user intuitively tap on the screen in order to select or open an application, pan, zoom, drag objects, or flip between screens using a flick. Such a way of interaction is based on natural finger motions, and in conjunction with additional momentum and friction of graphical objects on the screen, the resulting behavior gives an increased natural feel to the final interaction.

Although the multi-touch interface refers to NUI, the interfaces for such technology are designed as a traditional GUI.

2.1.2 Touch-less Interface

The invention of sensors capable of depth sensing in real time enabled computers to see spatially, without the need for the complex visual analysis that is required for images captured by regular sensors. This advantage of additional depth information made computer vision easier and allowed the creation of algorithms such as Skeletal Tracking, Face Tracking or Hand Detection. Skeletal Tracking is able to track body motion and enables the recognition of body language. Face Tracking extends the body motion sensing by recognition and identification of facial mimics. Lastly, Hand Detection enables tracking fingers [4] or recognizing hand gestures [5].
ial mapping between the user's hand movements in physical space and the cursor on the screen. The first approach is based on defining a planar area in physical space. When the hand moves within this area, the hand's location in physical space is directly mapped into the boundaries of the screen using basic transformations. The other approach takes into account the fact that in physical space the user's hand moves around a central point along a circular path, which means that the depth of the physical interaction zone should be curved. Using this approach, the mapping of the hand's movements in physical space into the boundaries of the screen is more complicated.

For the following description of the physical interaction zone design and its mapping functions, we need to define into what space we want to transform the user's hand position. After mapping, we need to get 2-dimensional coordinates (x, y) which correspond to the boundaries of the screen. Considering the various resolutions of screens, the x and y values should be independent of the screen's resolution. The best way is to define the range of these values within the interval [0, 1], where position (0, 0) corresponds to the left top corner of the screen and position (1, 1) is equivalent to the right bottom corner of the screen.

3.1.4.1 Planar Interaction Zone

The design of the planar physical interaction zone, defined as a rectangular area, is based on its width and heigh
ilter to reduce jittery and jumpy behavior [14]. In most cases the application of a filter could result in increased lag. Lag can greatly compromise the experience, making it feel slow and unresponsive.

Figure 3.12: Cursor's position filtering and a potential lag [14].

As a filter for refining the cursor's position, one may use, for example, a simple low-pass filter [29] described by Equation 3.3. This filter makes the cursor's movement smoother and to a certain extent is able to eliminate undesired jittery and jumpy behavior, but at the cost of the resulting lag. The final behavior of the filter depends on the value of its weight w, which specifies the amount of the position increment. Finding a good weight for the balance between smoothness and lag can be tough.

$x_{new} = w \cdot x + (1 - w) \cdot x_{old}$
$y_{new} = w \cdot y + (1 - w) \cdot y_{old}$

Equation 3.3: Low-pass filter with two samples.

The final cursor's position can also be filtered in order to increase the accuracy when the cursor is getting closer to the desired position on the screen. We can modify the low-pass filter in order to filter the cursor's position depending on its acceleration. In other words, the position will be filtered only when it moves slowly. This is the scenario in which the user expects the most precise behavior of the cursor, with the intention of pointing at the desired place. We can even set up the filter so that it won't filter fast movements at all and will be ap
it. In order to integrate touch-less gestures with the display, it has been made possible to associate these gestures with any pick action. This enables the user, for instance, to change the actual view of the 3D scene by using only swipe gestures, without any need to hold a device in the hands or to have any contact with the computer.

3.5.3.2 Visualization

The touch-less interaction visualization has been implemented as an overlay window (see also 3.2.7.1). This solution enables using and visualizing the touch-less interaction over the whole screen. The visualization consists of the cursors visualization, described in chapter 3.2.7.2, and the assistance visualization, described in chapter 3.2.7.3. The resulting look of the touch-less interface inside the GraphWorX64 application is illustrated by the Figure 3.52.
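The association of a swipe gesture with a pick action can be imagined as a thin adapter between the gesture interface events and the display's command layer. The following C# sketch only illustrates the idea; LeftSwipeGesture, the e.Result property and ExecutePickAction are assumed names, not the actual GraphWorX64 hooks.

    // Hypothetical wiring of a swipe gesture to a pick action:
    gestureInterface.AddGesture(typeof(LeftSwipeGesture));   // assumed gesture type
    gestureInterface.GestureRecognized += (sender, e) =>
    {
        if (e.Result.Gesture is LeftSwipeGesture)            // e.Result assumed
            ExecutePickAction("RotateViewLeft");             // stand-in for the display's pick action
    };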
joints, in order to not detect the gesture for non-tracked or inferred joints. Then, the vertical position of both joints is compared, and when the hand is above the elbow, the algorithm starts to look for the neutral hand position and the left or right hand oscillation. The neutral position is detected when the hand joint is in the vertical line with the elbow and their horizontal relative position is within the tolerance given by a threshold. When the hand's position exceeds the threshold by a horizontal movement to the left or to the right, the recognizer increments the value of the current iteration. After a certain number of iterations within a certain timeout, the algorithm detects the wave gesture. The state of the gesture, provided by the property GestureState, indicates whether the hand is on the right or left side relative to the neutral position, or whether the gesture detection failed.

[Class diagram: the abstract Gesture class (IDisposable) with LeftWaveGesture and RightWaveGesture subclasses, each holding a WaveGestureRecognizer with CurrentState and TrackedHand properties and LoadSettings, Reset and TrackWave methods]

Figure 3.35: An object model of the wave gestures.
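The state machine above reduces to a few lines of bookkeeping. The following C# sketch mirrors the described TrackWave() logic under assumed thresholds and member names; it is not the thesis implementation, and the JointTrackingState enum and joint properties are assumed to match the SkeletonJoint type described earlier.

    // A condensed sketch of the wave detection described above (values assumed).
    // Assumes: using System; and the thesis SkeletonJoint with X, Y, TrackingState.
    public class WaveTracker
    {
        const double NeutralThreshold = 0.05;          // meters left/right of the elbow (assumed)
        const int RequiredIterations = 4;              // left-right swings to accept (assumed)
        static readonly TimeSpan Timeout = TimeSpan.FromSeconds(2);  // assumed

        int side;                                      // -1 left, 0 neutral, +1 right
        int iterations;
        DateTime started = DateTime.UtcNow;

        public bool TrackWave(SkeletonJoint hand, SkeletonJoint elbow)
        {
            // ignore non-tracked or inferred joints
            if (hand.TrackingState != JointTrackingState.Tracked ||
                elbow.TrackingState != JointTrackingState.Tracked) return false;

            // the hand must stay above the elbow for the gesture to count
            if (hand.Y < elbow.Y) { Reset(); return false; }

            // too much time elapsed: start counting from scratch
            if (DateTime.UtcNow - started > Timeout) Reset();

            double dx = hand.X - elbow.X;              // horizontal offset from the neutral line
            if (Math.Abs(dx) < NeutralThreshold)
            {
                side = 0;                              // back within the neutral zone
            }
            else if (Math.Sign(dx) != side)
            {
                side = Math.Sign(dx);                  // crossed to the other side
                iterations++;                          // one more oscillation counted
            }

            return iterations >= RequiredIterations;   // wave detected
        }

        void Reset() { side = 0; iterations = 0; started = DateTime.UtcNow; }
    }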
ly and accurately. For advanced and non-critical tasks, two-handed gestures should be used.

The gesture's design should also consider fatigue caused by performing the gesture repeatedly. If users get tired because of a gesture, they will have a bad experience and will probably quit. One possible way of reducing fatigue, in the case of a one-handed gesture, is to allow it to be used with both hands, so the user can switch hands.

For successful human-computer interaction, the requisite feedback is deemed essential [30]. Gestures are ephemeral and they don't leave any record of their path behind. This means that when the user makes a gesture and gets no response or a wrong response, it will be difficult for him or her to understand why the gesture was not accepted. This problem can be overcome by adding an interface for indicating crucial states of the current progress of the recognition.

Design and implementation of a gesture recognizer is not part of the Kinect for Windows SDK, and thus the programmer must design and implement his own recognition system. There are a couple of approaches used today, from bespoke algorithms to reusable recognition engines that can learn different gestures. The basic approach is based on algorithmic detection, where a gesture is recognized by a bespoke algorithm. Such an algorithm uses certain joints of the tracked skeleton, and based on their relative position and a given threshold it can detect the
mensions. As a result, we linearly transform the physical position of the user's hand into the screen space, as shown in the Figure 3.9.

Figure 3.9: An illustration of mapped coordinates into the planar mapped hand space.

3.1.4.2 Curved Interaction Zone

For the design of the curved physical interaction zone, we can use the shoulder position as a central point of the hand's movements. By using this point as the center of the hand's movement, we ensure its independence of the user's pose in physical space, because the hand always moves relative to this point. Once we have the central point chosen, we need to specify the boundaries of the physical interaction zone. Due to the curved nature of the physical interaction zone, the user's hand X and Y position in physical space is not mapped directly onto the screen; the mapping instead uses angles between the user's hand and the central point in physical space rather than spatial coordinates. Since we use angles instead of spatial coordinates for the mapping between the user's hand movement in physical space and the cursor on the screen, the area boundaries are defined by angles as well. We define two sets of these angles: the first set for the XZ plane and the second set for the YZ plane. Each set contains two angles. The first angle defines the size of the sector for the user's hand mapping in physical space, and the second angle specifies an offset about which the sector is rotated rel
movements onto the cursor's position. The hand Grip gesture is seen as the most intuitive solution for triggering actions. The tests showed that such a way of interaction is intuitive and usable in most cases.

A combination of the Curved Physical Interaction Zone and the Grip trigger action has been tested in the real case scenario by controlling a Windows 8 application. The tests have shown that such a touch-less interface design is usable and that users are able to use it immediately with minimal familiarization. The tests have also shown a disadvantage of the current design: users start to feel fatigue after about 15 minutes of using the touch-less interactions.

3.5 Touch-less Interface Integration with ICONICS GraphWorX64

In this chapter, the integration of the touch-less interface designed in chapter 3.1 with the ICONICS GraphWorX64 application will be described. The integration demonstrates a practical application of the touch-less interactions in the real case scenario.

3.5.1 About ICONICS GraphWorX64

GraphWorX64 is part of the bundle of the industrial automation software GENESIS64, developed by the ICONICS company. The bundle consists of many other tools such as AlarmWorX64, TrendWorX64, EarthWorX and others [37]. GraphWorX64 is a rich HMI and SCADA data visualization tool. It allows users to build scalable, vector-based graphics that do not lose detail when zoomed o
n. It allows users to build intuitive graphics that depict real-world locations and integrate TrendWorX64 viewers and AlarmWorX64 viewers to give a full picture of operations. It makes configuring all projects quick and easy. Users can reuse content through the GraphWorX64 Symbol Library, Galleries and Templates, as well as configure default settings to allow objects to be drawn as carbon copies of each other without additional styling.

GraphWorX64 allows creating rich, immersive displays with three dimensions. It makes it easy to create a 3D world with Windows Presentation Foundation (WPF) and get a true 360-degree view of the customer's assets. It makes it possible to combine 2D and 3D features using WPF, with annotations that move with 3D objects, or to create a 2D display that can overlay a 3D scene. It utilizes 2D vector graphics to create true-to-life depictions of the customer's operations, which can be viewed over the web through WPF.

GraphWorX64 is at the core of the GENESIS64 Product Suite. It brings in content and information from all over GENESIS64, such as native data from SNMP or BACnet [38] devices, AlarmWorX64 alarms and TrendWorX64 trends. Through a desire to have a consistent experience, all of GENESIS64 takes advantage of the familiar ribbon menus found throughout integrated applications such as Microsoft Office [39].

3.5.2 Requirements

The ICONICS Company required using the touch-less interactions with the
nd inaccuracies caused by physical limitations of the sensor. To minimize jittering and stabilize the joint positions over time, the skeletal tracking can be adjusted across different frames by setting the Smoothing Parameters. The skeletal tracking uses a smoothing filter based on the Holt double exponential smoothing method, used for statistical analysis of economic data. The filter provides smoothing with less latency than other smoothing filter algorithms [23]. The parameters and their effect on the tracking behavior are described in [24].

2.3.4 Face Tracking Toolkit

With the Kinect for Windows SDK, Microsoft released the Face Tracking Toolkit that enables creating applications that can track human faces. The face tracking engine analyzes input from a Kinect camera to deduce the head pose and facial expressions. The toolkit makes the tracking information available in real time.

The face tracking uses the same right-handed coordinate system as the skeletal tracking to output its 3D tracking results. The origin is located at the camera's optical center; the Z axis points toward the user and the Y axis points up. The measurement units are meters for translation and degrees for rotation angles [25]. The coordinate space is illustrated by the Figure 2.8.

Figure 2.8: An illustration of the face coordinate space.

The face tracking output contains information about 87 tracked 2D points, illustrated in the Figure 2.9, with
ndows 8 doesn't usually allow applications to overlay the Windows 8 UI. In particular, for the purpose of drawing graphical elements above the applications, the prototype application needed to be signed by a trusted certificate and its UI access changed to the highest level of execution. After these tweaks, the desired visualization of the touch-less interactions overlaying the Windows UI has been achieved. The appendix D illustrates a Windows 8 application overlaid by the touch-less interaction visualization.

The application enables using the touch-less interactions through the regular multi-touch input. In addition, it makes it possible to use swipe gestures for executing actions. The application implements two ways of using the gestures:

1. Swipe gestures for presentation: the user can flip between slides, pictures or pages via a right or left swipe.

2. Swipe gestures for system actions, such as showing the Start screen by using the left swipe and closing the current application by using the right swipe.

The prototype application is implemented as an executable application KinectInteractionWin8App.exe which takes a path to the configuration XML file as its argument. The XML file contains the entire set of configurable variables for the touch-less interaction, including a setting for the Windows 8 integration. The setting enables choosing the behavior of the integration between using gestures for a pre
nt and Wait

The first Kinect touch-less interface for Xbox 360 came up with a simple way to trigger the click action. It is based on the principle that the user points the cursor at a button he wants to click and then waits a few seconds until the click is performed. This principle may be called the Point and Wait interaction.

The point and wait interaction is able to detect primarily the hand's click and drag, and the multi-touch gestures zoom and pan. The click is the simplest one: when only the primary cursor is tracked and it stands still for a certain time, a click is performed at the cursor's position. In the case of both cursors being tracked, only a down event is raised on the primary cursor instead of the click, which allows the cursor to drag. The dragging ends when the primary cursor stands still for a certain time again. Multi-touch gestures are possible when both cursors are tracked: the primary cursor has to stand still for a certain time and then both cursors must move simultaneously.

Additionally, this kind of interaction requires a visualization of the progress of waiting, in order to inform the user about the state of the interaction. Also, it is important to choose a timing that doesn't frustrate users by forcing the interaction to be too slow [14].

3.1.6.2 Grip

One of the possible action triggers based on natural user behavior is the Grip action. The grip action is detected when the user clenches his or 
o provides a method Clone() for creating its deep copy.

A tracked skeleton is represented and implemented by the Skeleton class. The skeleton is identified by its ID, stored in the property Id. The association of the skeleton with the user's information in the depth image is realized by the property UserIndex, which identifies the depth pixels related to the tracked skeleton. The skeleton data are composed of 20 types of joints representing the user's body parts of interest. All 20 tracked joints are stored in the skeleton's collection Joints. In addition, the Skeleton class provides a property Position containing the position of the tracked user blob [33] in physical space. The property TrackingState contains information about the state of the skeleton's tracking: if the skeleton is tracked, the state is set to the value Tracked; when the skeleton is not tracked but the user's blob position is available, the state has the value PositionOnly; otherwise the skeleton is not tracked at all and the state is set to the value NotTracked. In case the user's blob is partially out of the sensor's field of view and is clipped, the property ClippedEdges indicates from which side the tracked user blob is clipped.

The joint is represented and implemented by the SkeletonJoint class. The class contains the position of the joint in physical space. The tracking state of the joint is stored in the property TrackingState. If the joint is tracked, the state is set to the value Tracked. Wh
Figure 3.33: An object model of the gesture interface.

3.2.4.7 Wave Gesture Recognizer

Detection of the wave gesture, described in chapter 3.1.7.2, is implemented by the class WaveGestureRecognizer. The detection algorithm is implemented as a finite state machine [34], illustrated by the Figure 3.34.

[State diagram: Neutral, Left and Right positions with transitions on hand movement in and out of the neutral zone; a timeout tick or the hand dropping under the elbow leads to Failure; the gesture is detected after a required number of iterations]

Figure 3.34: A state diagram of the wave gesture detection.

The detection algorithm is implemented inside the method TrackWave() that takes a tracked skeleton as its parameter and returns the value true in case of success and the value false in case of a non-detected gesture. The algorithm uses the hand and the elbow joints for detecting the wave. First of all, it checks the tracking state of both 
on zone has been found to be a more comfortable design than the planar physical interaction zone. The tested users were more comfortable with the possibility of relaxing the hand at their side and with the more natural movements of their hand during the interaction. The only difficulty that was reported was the variable cursor behavior when the user was approaching the left or the right side of the screen. It has been shown that such behavior is caused by the spherical design of the interaction zone.

3.4.3.2 The Level of Usability

The level of usability has been investigated for two types of action triggers: the Point and Wait action trigger (3.1.6.1) and the Grip action trigger (3.1.6.2). Figure 3.46 shows that the designed point and wait action trigger is usable for performing a click, but it was not fully intuitive for most of the tested users. The rest of them reported that such a kind of interaction requires becoming a habit. The results for other types of interactions, such as drag, scroll, pan and zoom, show that such a design of action triggering is not suitable for performing advanced interactions.

The Figure 3.47 shows that the Grip action trigger has been reported as a much more intuitive and usable solution for performing all kinds of the investigated interactions. For most of the tested users, without any knowledge of such a way of action triggering, the grip was the first gesture they used for clicking or dragging items. More complicat
own() method in order to raise a down event. Then, when the grip action is released, the OnCursorUp() method is called in order to notify about an up event. As a result, the Grip action detector provides functionality similar to traditional multi-touch and allows performing any multi-touch gesture that uses up to two touches.

3.2.4.6 Gesture Interface

A system for detection and classification of the user's gestures, described in chapter 3.1.7, is represented by the Gesture Interface. The architecture of the gesture interface consists of the class GestureInterface and the abstract class Gesture. The GestureInterface class contains a list of instantiated gestures and provides an interface for specifying which gestures the interface should use. The gestures for detection are specified by calling the method AddGesture() that takes the type of the gesture as its parameter. A gesture may be removed from the list using the method RemoveGesture() that takes the type of the gesture to remove as its parameter. When a gesture is detected successfully, the gesture interface raises an event GestureRecognized that passes an instance of the GestureResult class, containing an instance of the recognized gesture and the related skeleton for further association of the gesture with a specific tracked skeleton. The constructor of the gesture interface requires an instance of the skeleton source. The gesture interface registers the SkeletonFrameReady event of 
plied only for slow movements. This may be done by setting the filter's weight dynamically according to the actual cursor's acceleration. The function of the filter's weight is illustrated by Figure 3.13 and described by Equation 3.4, where acc is the cursor's acceleration, w is the weight, $w_{max}$ is the upper limit for the resulting weight and c is an acceleration threshold specifying from which value the weight is modified.

$x_{new} = w(acc) \cdot x + (1 - w(acc)) \cdot x_{old}$
$w(acc) = \min(w_{max},\; acc / c)$

Equation 3.4: A weight function for the modified low-pass filter.

[Figure 3.13 plot: the weight w(acc) grows with the acceleration up to the limit $w_{max}$]

Figure 3.13: A weight function for the modified low-pass filter dependent on the cursor's acceleration.

In the case of two cursors appearing simultaneously on the screen, a problem may occur, for example, when we move the right hand to the left side and the left hand to the right side of the interaction zone: we notice that the cursors are swapped on the screen. This may confuse the user and make the interaction inconvenient. In order to prevent such behavior, the mutual horizontal position of the cursors should be limited so the cursors won't swap.

The current input methods consider a visual feedback that tells the user at which position his or her intended action will be performed. For instance, for such visual feedback the mouse uses a cursor, usually represented by an arrow drawn on the s
r to enable the interactions to be used for common tasks.

In the second part of the thesis, subjective user tests were performed in order to investigate which approach to designing the touch-less interactions is the most intuitive and comfortable. The results have shown that the best design is a combination of the curved physical interaction zone (3.1.4.2), based on the natural movement of the human hand, and the grip action triggering (3.1.6.2).

The third part of the thesis dealt with the integration of the designed touch-less interface with the ICONICS GraphWorX64 application as a demonstration of using the touch-less interactions in the real case scenario. The final implementation of the touch-less interface for the application has been based on the results of the performed subjective user tests.

As a result of this thesis, a touch-less interface has been designed and implemented. The final implementation was based on the subjective user tests that evaluated the most natural approaches to the realization of the touch-less interactions. The resulting touch-less interface has been integrated with the ICONICS GraphWorX64 application as a demonstration of using the touch-less interactions in a real case scenario. According to these conclusions, all points of the thesis assignment have been accomplished.

List of Abbreviations

MS - Microsoft
PC - Personal Computer
WPF - Windows Presentation Foundation
NUI - Natural
has to be enabled. It can be done by setting the Enabled property to true, which initializes the sensor's depth data stream. By default the depth source is disabled, so the first step before performing any depth image data processing is its initialization by setting the Enabled property to true.

A class diagram describing the KinectDepthSource object representation is illustrated by the Figure 3.23.

[Class diagram: KinectDepthSource with the mapping methods (MapDepthFrameToColorFrame, MapDepthFromWorldPoint, MapDepthToColorPoint, MapDepthToWorldPoint), the ProcessDepthImage method and the DepthFrameReady event]

Figure 3.23: A class diagram describing an object model of the depth data source.

3.2.3.2 Color Source

A color image obtained from the sensor is processed into the KinectColorFrame data structure using logic implemented by the ColorSource class. The processing is handled by the method ProcessColorImage() that takes a native color image, represented by the Microsoft.Kinect.ColorImageFrame structure, as its parameter. Color pixel data are copied into an internal buffer which is then used for creating a new ColorFrame instance. Finally, the new instance of the ColorFrame is passed on by raising the event ColorFrameReady.

Before the color image data processing can be started, the color source has to be enabled.
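Following the enable-then-process pattern described above, consuming the color source might look like the following C# sketch. The ColorSource property on the Kinect source and the event-args property name are assumptions based on the naming conventions of the other sources; GetPixel follows the ColorFrame interface described later.

    // Enabling the color source and consuming processed ColorFrame instances:
    kinectSource.ColorSource.Enabled = true;         // initializes the color stream
    kinectSource.ColorSource.ColorFrameReady += (s, e) =>
    {
        ColorFrame frame = e.ColorFrame;             // assumed event-args property
        Color centre = frame.GetPixel(320, 240);     // pixel access per the ColorFrame interface
        // ... further processing of the frame
    };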
relatively to the sensor [28]. We specify a limit angle within which the interaction detector will detect that the user is intending to interact with the system. Additionally, when the head pose angle is getting closer to the limit angle, the interaction detector informs the user that the interaction might be interrupted. Beyond this angle, all user interactions will be ignored. The Figure 3.2 describes the usual scenario of the intended and unintended user interaction.

The facial observation tells us about the user's intention to interact, but in the concept of controlling the computer with the current NUI devices we need to take into account situations which are not suitable for the recognition of the natural user interaction. The main restriction is the NUI device's limitation to frontal capturing only [21]. It means that when the user is not standing facing the sensor, the precision of the user's pose recognition decreases. This leads us to avoid such situations by observing the horizontal angle between the user's body and the sensor. Similarly to the facial observation, we specify a user's body angle within which the interaction detector will be detecting the user's interaction intention; beyond this angle all user interactions will be ignored. The Figure 3.3 illustrates the issue of the user's body observation in order to avoid situations unsuitable for the recognition of the touch-less user interaction.
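The two observations above reduce to simple angle tests. The following C# sketch shows the shape of such a check; the limit angles are illustrative values, not the ones used by the thesis.

    // Intended-interaction test based on face and body angles (limits assumed):
    static bool IsIntendingToInteract(double faceYawDegrees, double bodyAngleDegrees)
    {
        const double FaceLimitAngle = 30.0;   // illustrative limit for the head pose
        const double BodyLimitAngle = 45.0;   // illustrative limit for the body rotation
        return Math.Abs(faceYawDegrees) <= FaceLimitAngle
            && Math.Abs(bodyAngleDegrees) <= BodyLimitAngle;
    }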
resented and implemented by the ColorFrame class. The class contains information about the color image format, image dimensions and image data. The image data are represented as a byte array. The color image is stored in ARGB format; it means the image pixels are stored as a sequence of four bytes in the order blue, green, red and alpha channel.

The ColorFrame class provides an interface for basic manipulation with pixel data, such as getting and setting the pixel color at given X and Y coordinates and flipping the image. The class also provides a method Clone() for creating its copy.

A class diagram describing the ColorFrame object representation is illustrated by the Figure 3.19.

[Class diagram: ColorFrame (IDisposable) with Data, Format, Height and Width properties and Clone, GetPixel and Rotate methods]

Figure 3.19: A class diagram describing the color frame data structure.

3.2.2.3 Skeleton Frame

Information about all tracked skeletons is represented and implemented by the SkeletonFrame class. This class contains an array of the currently tracked skeletons. The class provides a method for getting a skeleton by a given id and als
s based on the data layer implementation described in the previous chapters.

3.2.4.1 Interaction Recognizer

The interaction detection, quality determination and advice system for the user's interaction is implemented by the InteractionRecognizer class. The interaction recognition is done by calling the Recognize() method that takes a depth frame, a tracked skeleton and a tracked face as its parameters. The method returns an instance of the InteractionInfo class that contains information about the recognized interaction. The InteractionRecognizer class is illustrated by the Figure 3.29.

[Class diagram: InteractionRecognizer with a Source property and the AddAdvice, GetPositionQualityLevel, QualityFunction and Recognize methods]

Figure 3.29: A class diagram describing an object model of the interaction recognizer.

The recognizer detects the user's face angle using the given tracked face's yaw pose and evaluates whether the angle is within the specified range. Similarly, the recognizer determines the user's body angle using the given skeleton. The angle is measured between the shoulder joints around the Y axis. The recognizer also uses the distances of the user's position measured from each side of the sensor's field of view. On the basis of these distances and the equation described in chapter 3.1
s features.

2.1 Natural User Interface

The interaction between man and computer has always been a crucial object of development ever since computers were invented. Since the first computers, which provided interaction only through a complex interface consisting of buttons and systems of lights as the only feedback to the user, human-computer interaction has gone through a significant evolution. At the beginning, the computer was seen as a machine that is supposed to execute a command or a sequence of commands. The first human-computer interface, which enabled users to interact with computers more comfortably by entering commands using a keyboard, is the Command Line Interface (CLI). But the need to make work with computers more intuitive led to the invention of the Graphical User Interface (GUI), helping users to use complicated applications through exploration and graphical metaphors. The GUI gave birth to the mouse device, which allowed pointing at any place in the graphical user interface and executing the required command. We still use this way of human-computer interaction today, but in recent years the development of human-computer interaction has been directed toward a more natural way of using computers, which is called the Natural User Interface (NUI).

The desire to enable communication with computers in an intuitive manner, such as we use when we interact with other people, has roots in the 1960s, the decade when computer science noticed a significant ad
s position by continuously calling the OnCursorDown() and OnCursorUp() methods. In the case of both cursors being tracked, only the OnCursorDown() method is called in order to allow dragging. The dragging finishes by calling the OnCursorUp() method when the primary cursor stands still for a certain time again. A multi-touch gesture is possible when both cursors are tracked: the primary cursor has to stand still for a certain time and then both cursors must move simultaneously. Then, the OnCursorDown() method is called for both cursors. The multi-touch gesture finishes when the primary cursor stands still for a certain time, and then the OnCursorUp() method is called for both cursors.

3.2.4.5 Grip Action Detector

An action detector for the Grip action, described in chapter 3.1.6.2, is implemented by the class GripActionDetector. The class is based on the abstract class ActionDetectorBase and implements the logic for detecting the action by overriding the abstract method OnDetect().

The implementation is based on the Microsoft Kinect Interaction Toolkit (see also 2.3.5), which is used for the grip action recognition implemented by the class GripActionRecognizer. The implementation continuously processes the current data for detecting the grip press and release actions. The current state of the grip action can be acquired by calling the method Recognize() on an instance of the class.

When the grip action is recognized, the action detector calls the OnCursorD
sentation or for the system actions.

3.4 User Usability Tests

This chapter describes the methodology of the subjective usability tests and evaluates their results in order to find out which approaches are more and less suitable for the realization of the touch-less interactions.

3.4.1 Test Methodology

Given the nature of the NUI, there is no particular test methodology for an objective evaluation of the level of usability of the touch-less interactions concept. The concept could be usable for some people, but for other people it might be very difficult to use. It means that conclusive results can be collected by conducting usability tests which evaluate the subjective level of usability and the level of comfort for each tested user.

For the evaluation of the usability of the NUI designed and implemented in this thesis, a test was designed and aimed at the user experience of using the touch-less interactions for common actions such as click, drag, scroll, pan and zoom. The test uses a setup with a large 60-inch LCD panel with the Kinect for Windows sensor placed under the panel. The tested user is standing in front of the LCD panel at a distance of about one and a half meters. The setup is illustrated by the Figure 3.43.

[Figure 3.43 setup sketch: LCD panel with the Kinect sensor mounted at 0.8 m, user standing about 1.5 m away]

Figure 3.43: A setup for the user usability tests.

The test investigates the user's subjective experience in the level of comfort and the level of usability.
ss diagram describing an object model of the face data source, illustrated by the Figure 3.26.

3.2.3.5 Kinect Source

All data sources, described in the previous chapters, are composed into the KinectSource class. This class implements the logic for controlling the sensor and also handles all events of the sensor. The Kinect source provides an interface for enabling and disabling the sensor by calling the methods Initialize() and Uninitialize().

The most important task of the Kinect source is handling of the sensor's AllFramesReady event. All data are processed in the handler method using the corresponding data sources. After the data processing of all sources is finished, the Kinect source passes on all processed data by raising its own AllFramesReady event.

According to the described implementation, it is not possible to run the particular data sources individually without using the Kinect source. In practice, an instance of the KinectSource is created and the particular data sources are accessed through the interface of this instance.

A class diagram describing the KinectSource object representation is illustrated by the Figure 3.27.

[Figure 3.27 class diagram: KinectSource (IDisposable) with IsRunning, Sensor and DepthSource properties, composing the KinectSensor, a KinectStatus enum and the particular data sources such as KinectDepthSource]
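In code, the described life cycle of the composed source might look like this C# sketch; it assumes that the other data sources expose the same Enabled switch as the depth source, and the event-handler body is illustrative.

    // Minimal usage sketch of the composed Kinect source:
    var kinect = new KinectSource();
    kinect.Initialize();                        // enables the sensor
    kinect.DepthSource.Enabled = true;          // switch on the streams of interest
    kinect.SkeletonSource.Enabled = true;       // assumed to mirror the depth source
    kinect.AllFramesReady += (s, e) =>
    {
        // all processed frames (depth, color, skeleton, face) arrive together here
    };
    // ... on shutdown:
    kinect.Uninitialize();                      // disables the sensor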
t, a central point in physical space, a distance of the area's plane from the central point along the Z axis, and the offset of the area from the central point. As the central point we can use the position of the center of the shoulders, which lies on the user's central axis. Then we can define the offset from this point to the center of the rectangular area. The design is illustrated by Figure 3.8. As seen from the diagram, we get a rectangular area at a certain distance from the user's body, located around the user's hand relative to the user's central axis. The last thing is to specify the dimensions of the rectangular area. The width and height values of the area are given in meters, i.e. in physical units. In practice, the dimensions should be given as values relative to the physical size of the user.

[Figure 3.8 sketch: the shoulder center, the offset X/Y/Z and the distance defining the rectangular interaction zone]

Figure 3.8: A planar physical interaction zone design (green area).

The following mapping of the user's hand position into the boundaries of the screen is very straightforward. When the user's hand is within the boundaries of the physical interaction zone, we take the physical X and Y coordinates of the user's hand, move them relative to the central point and transform their values into the range from 0 to 1 by using the rectangular area di
Figure 3.21: A class diagram describing the architecture of the face frame data structure.

3.2.3 Data Sources

The logic for processing the data obtained from the sensor is implemented by data sources. There are four types of data sources: DepthSource, ColorSource, SkeletonSource and FaceSource, which are additionally composed into the KinectSource that handles the logic for obtaining data from the sensor. Each data source implements logic for handling a given data input and processes the obtained data into the corresponding data structure. When the data processing is finished, the data source forwards the resulting data structure through its event-based interface for further data processing.

[Figure 3.22 block diagram: the Kinect for Windows SDK and Kinect Toolkit feeding the data sources, which generate the output data structures]

Figure 3.22: A block diagram describing the data sources architecture and their output data.

3.2.3.1 Depth Source

A depth image obtained from the sensor is processed into the DepthFrame data structure using logic implemented by the KinectDepthSource class. The processing is handled by the method ProcessDepthImage() that takes a native depth image, represented by the Microsoft.Kinect.DepthImage
tem is done via the Touch Injection API, available only in Windows 8. It enables any application to generate touch messages and inject them into the system's message loop. Microsoft doesn't provide any .NET wrapper for this API, and that is why the wrapper has been implemented as a part of this thesis. The wrapper is implemented as the C++ project TouchDLL and the C# class TouchOperationsWin8.

The touch injection is initialized by calling the method InitializeTouch() that takes the maximal number of touches as its parameter. A particular touch is represented by the structure POINTER_TOUCH_INFO, containing information about its position, state, pressure, etc. When any touch is intended to be injected, an instance of this structure is created, all its attributes are specified, and then it is passed into the system's message loop by calling the method SendTouchInput().

The integration of the touch-less interactions is implemented by the class TouchlessDriverWin8. The implementation combines touch injection for multi-touch interaction with the system using the touch-less cursors, and integration of calling keyboard shortcuts using swipe gestures. The driver's constructor requires an instance of the natural interaction interface, described in chapter 3.2.4.2. The driver registers the update event of the interaction interface. When cursors are updated, the driver calculates the cursor's position on the screen and then injects touches with a state regar
teractions in the real case scenario by using it for controlling the Windows 8 UI and applications. As the testing application, the Touch-less Interface for Windows 8, described in chapter 3.3.2, is used. The level of usability in the real case scenario is defined by a rating scale divided into eight degrees. The rating 7 represents a comfortable and intuitive experience without any noticeable fatigue, and the rating 0 represents a physically and practically challenging experience. The rating scale is described by the Table 3.

  Intuitive and comfortable | Usable, no fatigue | Usable, fatigue | Challenging
            7  6            |        5  4        |       3  2      |     1  0

Table 3: The level of usability rating scale for the real case scenario.

The test conducts the user's subjective level of usability for the following Windows 8 applications and common actions:

o Using swipe gestures for presentation
o Using a swipe gesture for showing the Start screen
o Using a swipe gesture for closing an active application
o Launching a Windows 8 application
o Selecting items in a Windows 8 application
o Targeting and selecting small items
o Using maps
o Using web browsers

A test form used for conducting the user's subjective level of usability, level of comfort and level of experience in using the touch-less interactions for controlling the Windows 8 UI is attached in the appendix E.

3.4.2 Tests Results

This chapter shows the results of the subjective usabili
the skeleton source, and in its handler method it handles the recognition for all gestures in the list.

The abstract class Gesture represents a gesture's detection logic. A gesture implemented on the basis of this class must implement all its abstract methods. The method Initialize() implements the initialization code for the gesture and provides an instance of the current skeleton source. The method Recognize() implements the bespoke algorithm for gesture detection using the passed tracked skeleton, and returns its result in an instance of the GestureResult class. For resetting the gesture's state there is the method Reset() that sets all variables to their default values. The class provides the GestureState property informing about the current detection state. When the gesture detection state has changed, the StateChanged event is raised.

An object representation of the gesture interface is illustrated by the Figure 3.33.

[Figure 3.33 class diagram: GestureInterface (IDisposable) with the AddGesture, RemoveGesture, Recognize and Dispose methods and the GestureRecognized and GestureStateChanged events, holding the instantiated Gesture objects]
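A new gesture therefore only has to fill in the abstract contract. The following C# sketch shows the skeleton of such a class; PushGesture is a made-up example, and returning null for "not recognized" is an assumption, not the thesis convention.

    // Hypothetical gesture built on the described abstract Gesture class:
    public class PushGesture : Gesture
    {
        public override void Initialize(KinectSkeletonSource source)
        {
            // remember the source if the detection needs it
        }

        public override GestureResult Recognize(Skeleton skeleton)
        {
            // the bespoke detection algorithm goes here; null is assumed
            // to mean that the gesture has not been recognized yet
            return null;
        }

        public override void Reset()
        {
            // return all internal variables to their default values
        }

        public override void Dispose()
        {
            // release any resources held by the detection
        }
    }

    // Registration with the gesture interface, per the described API:
    gestureInterface.AddGesture(typeof(PushGesture));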
tion about which pixel is related to which tracked user.

The DepthFrame class provides an interface for basic manipulation with the depth and user index data, such as getting and setting a depth or user index value at given X and Y coordinates, and flipping and cropping the depth image. The class also provides a method Clone() for creating its copy.

A class diagram describing the DepthFrame object representation is illustrated by the Figure 3.18.

[Class diagram: DepthFrame (IDisposable) with Data, Format, Height, Width, Offset, RawData, TimeStamp and UserIndexData properties and Clone, Crop, GetDepth, GetUserIndex, Rotate, SetDepth and SetUserIndex methods]

Figure 3.18: A class diagram describing the depth frame data structure.

3.2.2.2 Color Frame

A color image is rep
tive experience in using the touch-less interface for common actions like clicking, dragging and multi-touch gestures. The multi-touch integration with the prototype application is done by implementing a custom WPF input device, described in chapter 3.2.5. In order to evaluate the mentioned subjective experience, the prototype consists of the following six scenarios (an illustration of each scenario is attached as the appendix B):

1. A scenario with a large button that is moved across the screen, where the user is asked to click on it. The positions of the button are chosen with regard to evaluating the user's subjective experience of clicking the button in standard and extreme situations, such as a button located at the corners of the screen or buttons located too near to each other.

2. A scenario similar to the first one, but with a small button instead of the large one, in order to evaluate the user's subjective experience of clicking and pointing at very small objects.

3. A scenario aimed at the user's experience of dragging objects onto a desired place. The test is designed in such a way that the objects must be dragged from the extreme positions, which are the corners of the screen, into the middle of the screen. In order to avoid the user's first-time confusion and help him or her get easily oriented in the task, visual leads are added showing which object is supposed to be moved to which place.

4. A scenario evaluating
to discern and track movements of the human body.

Starting with the Microsoft Kinect for Xbox 360, introduced in November 2010, the new touch-less interaction has unleashed a wave of innovative solutions in the fields of entertainment, shopping, advertising, industry and medicine. The new interaction revealed a world of new possibilities, so far known only from sci-fi movies like Minority Report.

The goal of this thesis is to design and implement the touch-less interface using the Microsoft Kinect for Windows device and to investigate the usability of various approaches to different designs of the touch-less interactions by conducting subjective user tests. Finally, on the basis of the results of the performed user tests, the most intuitive and comfortable design of the touch-less interface is integrated with the ICONICS GraphWorX64 application as a demonstration of using the touch-less interactions with a real application.

2 Theoretical Part

This chapter introduces the theoretical basis for the related terminology, technology and software linked to the subject of this thesis. In the first chapter, the Natural User Interface terminology, history and practical application are described. The following chapter describes the Microsoft Kinect sensor: its components, features, limitations and the available Software Development Kits for its programming. The last chapter introduces the official Microsoft Kinect for Windows SDK and describes it
to the position on which the cursor was located before an action was detected. This functionality enables easily performing a click action.

The ActionDetectorBase class provides the events CursorDown and CursorUp, notifying about whether a down or up action happened. For setting the current action, the class provides the internal methods OnCursorDown() and OnCursorUp() handling the logic of these actions. These methods are supposed to be called in the further implementation of the particular action detector. The detection of the action is supposed to be done by implementing the abstract method Detect(), which takes a collection of cursors, the skeleton of interest and the current depth frame as its parameters. These parameters are used in the further implementation of the method for detecting the final action.

A class diagram describing the ActionDetectorBase object representation is illustrated by the Figure 3.32.

[Class diagram: ActionDetectorBase (IDisposable) with AllowPostionLimitation, Enabled and TouchlessInterface properties, the Detect, HandleCursorSnapping, OnCursorDown and OnCursorUp methods, and the CursorDown and CursorUp events]

Figure 3.32: A class diagram describing an object model of the action detector.
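By analogy with the grip detector, a bespoke action detector only overrides the detection step and reports through the provided internal methods. The C# sketch below paraphrases the Detect() signature from the text above, so the parameter types are assumptions, and the trigger condition is a stand-in; a real detector would also track state so the down and up actions are raised only on transitions.

    // Hypothetical detector on top of ActionDetectorBase (signature paraphrased):
    public class DwellActionDetector : ActionDetectorBase
    {
        protected override void Detect(IList<TouchlessCursor> cursors,
                                       Skeleton skeletonOfInterest,
                                       DepthFrame depthFrame)
        {
            if (IsTriggerConditionMet(cursors))   // stand-in for the real condition
                OnCursorDown();                   // raises the CursorDown event
            else
                OnCursorUp();                     // raises the CursorUp event
        }

        bool IsTriggerConditionMet(IList<TouchlessCursor> cursors)
        {
            return false;                         // placeholder logic
        }
    }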
too invasive and doesn't distract the user. But there is one problem to solve in order to make the window invisible to mouse and touch events and allow them to be handled by the applications, not by the overlay window. This problem has been resolved using the Windows API by setting the flag WS_EX_TRANSPARENT in the window's extended style. This makes the window transparent to any input events. The method for creating the transparent window is implemented as a window extension method SetInputEventTransparent() in the static class WindowExtensions.

The cursors and the assistant visualization are implemented in their own separate overlay windows. The reason for their separation into their own windows is the resulting performance: rendering of the visualization graphics is time-consuming and slows down the rendering, and the resulting effect is less smooth and could make using the touch-less interface uncomfortable.

3.2.7.2 Cursors Visualization

The cursors visualization shows the position of the user's hands mapped into the screen space. The position is visualized by drawing graphics at the position of the cursor. The cursor's graphical representation is a hand-shaped image that is made at least 70% transparent, with regard to keeping the controls under the cursor visible. In an inactive state, when the cursor has no action and is only moving across the screen in order to point at a desired place, an opened palm hand shape
Figure 3.50: A chart showing the results of the level of usability for using Windows 8 applications (legend: launching a Windows 8 application, using Windows 8 Maps, using the Windows 8 web browser).

3.4.3 Tests Evaluation

This chapter evaluates the results of the usability tests shown in the previous chapter 3.4.2. The level of comfort, the level of usability and the level of usability in the real case scenario are evaluated. Based on these results, the most comfortable and usable design will be chosen for the final implementation in chapter 3.5.

3.4.3.1 The Level of Comfort

The level of comfort has been investigated for two types of the physical interaction zone: the planar (3.1.4.1) and the curved (3.1.4.2). Figure 3.44 shows that the planar physical interaction zone was fatiguing for most of the tested users. Keeping the hands at a certain distance from the sensor was sometimes fatiguing, especially for users of low height. They often reported that they were getting tired and missed the possibility of resting their arms at their sides to relax their hands.

Figure 3.45 shows that the curved physical interaction zone
Figure 3.37: An object model of the swipe gestures (class SwipeGestureRecognizer with the property CurrentState and the methods LoadSettings(), Recognize(), Reset(), TrackSwipe() and UpdatePosition()).

3.2.4.9 Iterative NUI Development and Tweaking

The development of a NUI is not as straightforward as the development of other UIs. The reason lies primarily in the unpredictability of each user's individual approach to using the natural gestures and movements. In other words, not everybody considers a gesture made in one way as natural as the same gesture made by another person. Although the meaning of the gesture is the same, the movements are different, either because of the varying height of the persons or because of differences in their innate gesticulation. As a result, the final setup of the natural interactions has an influence on the resulting user experience and also on the fatigue and comfort of the natural interaction.

Figure 3.38: An iteration process of the NUI development.

Finding the optimal setup for the natural interactions is the most important and circuitous process of the NUI development. Usually it is based on an iterative process that starts with a new design of the setup. Then, the setup is tested by the
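The internal logic of SwipeGestureRecognizer is not listed; the following is a minimal sketch of how a recognizer with the members from Figure 3.37 could track the hand position and detect a horizontal swipe. The thresholds, the time window and the use of Point3D for the hand position are illustrative assumptions.

    using System;
    using System.Windows.Media.Media3D;   // Point3D

    public class SwipeGestureRecognizer
    {
        // Illustrative tweaking parameters (see the iterative process above).
        private double swipeMinLength = 0.25;                 // horizontal travel in meters
        private double swipeMaxHeight = 0.10;                 // allowed vertical drift
        private TimeSpan swipeMaxDuration = TimeSpan.FromMilliseconds(500);

        private Point3D startPosition;
        private DateTime startTime;
        private bool tracking;

        public int CurrentState { get; private set; }

        public void Reset()
        {
            tracking = false;
            CurrentState = 0;
        }

        // Called for every skeleton frame; restarts tracking whenever the hand
        // drifts vertically or the time window is exceeded.
        public void UpdatePosition(Point3D hand, DateTime time)
        {
            if (!tracking
                || Math.Abs(hand.Y - startPosition.Y) > swipeMaxHeight
                || time - startTime > swipeMaxDuration)
            {
                startPosition = hand;
                startTime = time;
                tracking = true;
            }
        }

        // Returns true when the hand has traveled far enough horizontally
        // within the time window; the caller is expected to Reset() afterwards.
        public bool TrackSwipe(Point3D hand, DateTime time)
        {
            UpdatePosition(hand, time);
            return tracking && Math.Abs(hand.X - startPosition.X) >= swipeMinLength;
        }
    }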
The tests have been conducted by testing 15 users of various heights and with varying experience with human-computer interaction. The following charts visualize the results in the particular aspects of the designed test described in chapter 3.4.1.

• The level of comfort for the Planar Physical Interaction Zone:

Figure 3.44: A chart showing the results of the level of comfort for the Planar Physical Interaction Zone (rating scale from 5 = comfortable to 0 = challenging).

• The level of comfort for the Curved Physical Interaction Zone:

Figure 3.45: A chart showing the results of the level of comfort for the Curved Physical Interaction Zone (rating scale from 5 = comfortable to 0 = challenging).

• The level of usability for the Point and Wait action trigger:

Figure 3.46: A chart showing the results of the level of usability for the Point and Wait action trigger.
Figure 3.31: An object model of the touch-less interaction interface (class TouchlessInteractionInterface with the events CursorActivated, CursorDeactivated, CursorDown, CursorsUpdated, CursorUp, StateChanged and TrackedUserChanged; class TouchlessCursor with properties such as Quality (PositionQualityInfo), PreviousPosition, WorldPosition and TTL (Time2Live) and the method GetViewportPosition(); enumerations TouchlessInteractionInterfaceState {Idle, Tracking}, TouchlessCursorType {Right, Left} and TouchlessCursorActions {None, Down, Move, Up}).

3.2.4.3 Action Detector

The action detector is implemented by the abstract class ActionDetectorBase and is used for detecting an action which is analogous to the mouse button down and up events. The action can be detected in different ways; there are two approaches to action detection implemented: Point and Wait (see also 3.2.4.4) and Grip (see also 3.2.4.5).

The action detector provides the base logic for performing down and up events on cursors. The logic also implements a cursor position snapping
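The enumerations visible in Figure 3.31 can be written out directly; the member order follows the diagram. An action detector then sets, for example, cursor.Action = TouchlessCursorActions.Down when it recognizes a down action.

    // Enumerations reconstructed from Figure 3.31.
    public enum TouchlessInteractionInterfaceState { Idle, Tracking }
    public enum TouchlessCursorType { Right, Left }
    public enum TouchlessCursorActions { None, Down, Move, Up }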
source uninitialization: e.Source.Uninitialize(); break;

Currently instantiated and connected Kinect sources can be enumerated by using the collection's indexer.

A class diagram describing the KinectSourceCollection object representation is illustrated by Figure 3.28.

Figure 3.28: A class diagram describing an object model of the Kinect source collection (class KinectSourceCollection implementing IEnumerable<KinectSource> and IDisposable; properties Count and KinectDeviceInstalled; methods Add(), Contains(), CopyTo(), CreateKinectSource(), Dispose(), GetEnumerator(), Remove() and UpdateSensorCollection(); event KinectSourceStatusChanged of type EventHandler<KinectSourceStatusEventArgs>).

3.2.4 Touch-less Interface

This chapter describes an implementation of the Touch-less Interface designed in chapter 3.1 and its integration with a WPF application and the Windows 8 operating system. The Touch-less Interface is implemented as a part of the application layer and it is
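A hypothetical usage sketch of the collection (the static KinectSource.Sources entry point is an assumption suggested by Figure 3.28; the int indexer follows the text above):

    // Hypothetical usage of KinectSourceCollection, based on Figure 3.28.
    KinectSourceCollection sources = KinectSource.Sources;   // assumed entry point

    if (sources.KinectDeviceInstalled)
    {
        // Access a connected source through the collection's indexer...
        KinectSource first = sources[0];

        // ...or enumerate all currently instantiated and connected sources.
        foreach (KinectSource source in sources)
        {
            // work with the source here
        }
    }

    // React to sensors being plugged in or unplugged.
    sources.KinectSourceStatusChanged += (sender, e) =>
    {
        // e.g. initialize a newly connected source or uninitialize a removed one
    };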
user for a better experience.

3.1.3 Interaction Quality

The essential purpose of the Natural Interaction is the creation of a natural experience in controlling a computer. If we had an ideal sensor capturing the user's pose from all angles of view, and if its optics had an infinite field of view, we would be able to track the user's movements in all of his or her poses regardless of his or her position and angle. Unfortunately, the current parameters of NUI devices, including the Kinect device, are still very far from the ideal parameters. These limitations may affect the precision of the user's movement tracking, which could result in the incorrect recognition of the user's interaction. For instance, such an undesired situation could happen when the user moves out of the sensor's field of view or is too far from or too close to the sensor.

Figure 3.5: An illustration of an example of a problematic scenario of touch-less interaction.

In order to evaluate how precise an interaction the user should expect, we define a variable with a value in the range from 0 to 1 and call it the Interaction Quality. When the interaction quality value is equal to 1, it means that the current capturing conditions are the best for the user's interaction. Conversely, if the value is equal to 0, we should expect the undesired behavior caused by inconvenient capturing conditions.

In the real scenario the interaction quality is dependent on the
user's distance from the borders of the sensor's field of view and the user's distance from the sensor. In other words, we get the best interaction quality if the user is within the sensor's field of view and within a certain range of distance from the sensor. When the user is within the sensor's field of view, the resulting interaction quality value is constantly equal to 1, but when the user approaches the borders of the sensor's field of view within a certain distance d, the interaction quality starts to decrease towards zero. The situation and the interaction quality function are described by Figure 3.6.

Figure 3.6: An illustration of the sensor's field of view (FOV) with the inner border and the interaction quality function q(d).

It means that when any part of the user is out of the sensor's field of view, the interaction quality is zero. The distance beyond which the interaction quality starts to decrease is not fixed; it is given by a tolerance set in the interaction recognizer. The calculation of the quality is described by Equation 3.1, where q is the quality, d is the distance from the FOV border, d_offset specifies at which distance from the FOV the quality is set to zero, and threshold defines the range within which the quality value changes between 1 and 0.

q(d) = min(1, max(0, d − d_offset) / threshold)

Equation 3.1: An equation of the interaction quality function.

Most of the current devices
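Equation 3.1 translates directly into code. A minimal sketch (the method name is illustrative; the parameter names follow the equation):

    // Interaction quality according to Equation 3.1: 1 well inside the FOV,
    // decreasing linearly to 0 as the user approaches the FOV border.
    public static double InteractionQuality(double d, double dOffset, double threshold)
    {
        return Math.Min(1.0, Math.Max(0.0, d - dOffset) / threshold);
    }

For example, with dOffset = 0.1 m and threshold = 0.2 m, a user whose nearest body part is 0.2 m from the FOV border gets q = min(1, max(0, 0.2 − 0.1) / 0.2) = 0.5.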
advancement. Since then, the potential of computers has inspired many sci-fi movies and books in which the authors predicted futuristic machines with artificial intelligence able to understand speech, mimics and body language. Such a natural way of human-computer interaction remained only a topic for sci-fi for the next 40 years. However, over time, exploratory work at universities and in government and corporate research has made great progress in computer vision, speech recognition and machine learning. In conjunction with the increasing performance of microprocessors, the technological advancement allowed creating sensors that are capable of seeing, feeling and hearing better than before. A vision of a real NUI was no longer just a far-fetched idea; its creation became only a matter of time. During the research of NUI a number of various approaches evolved, starting with speech recognition and touch interfaces and ending with more unconventional experiments like the Microsoft Skinput project [1], the muscle-computer interface [1] or mind reading using electroencephalography (EEG) [2].

The touch interface and its successor, the multi-touch interface, are considered the first real applications of NUI. They let users interact with controls and applications more intuitively than a cursor-based interface because the interaction is more direct: instead of moving a cursor to select an item and clicking to open it, the user intuitively touches its graphical
• The level of usability for the Grip action trigger:

Figure 3.47: A chart showing the results of the level of usability for the Grip action trigger (legend: Click, Drag, Zoom).

• The level of usability for the real case scenario:

Figure 3.48: A chart showing the results of the level of usability for swipe gestures (legend: swipe gestures for presentation, left swipe gesture for showing the Start screen, right swipe gesture for closing an application).

• The level of usability for targeting and selecting items:

Figure 3.49: A chart showing the results of the level of usability for targeting and selecting items (legend: targeting and selecting large items, targeting and selecting small items).
recognizing patterns in the user's movement that match a specific gesture. The gestures allow executing predefined actions very quickly and naturally, but the quality of the resulting user experience critically depends on the gesture's design. If the gesture is not reliable, the application will feel unresponsive and difficult to use. There are many factors and situations which must be considered for a reliable and responsive gesture design in order to avoid the users' frustration [14].

3.1.7.1 Designing a Gesture

A reliable gesture design considers its variability depending on the user's interpretation of a gesture, which could be completely different from that of other users. Also, the design must take into account that once the user has engaged with the system, the sensor is always monitoring and looking for patterns that match a gesture. It means that the design should be able to distinguish intentional gestures and ignore other movements such as touching the face, adjusting glasses, drinking, etc.

Another influence on the gesture's practicability is the choice between one-handed and two-handed gestures. A one-handed gesture is more intuitive and easier to perform than a two-handed one. Also, when a two-handed gesture is designed, it should be symmetrical, which is more intuitive and comfortable for the user. The target usage of both gesture types should also be considered: one-handed gestures should be used for critical and frequent tasks so the user can do them quickly
    