Virtual Mouse


AI Virtual Mouse with Hand Gesture Detection


**Date:** 2022-06-20

Author: Hee-Yun Kang, Woo-Ju So

GitHub: hyKangHGU/VirtualMouse

Demo Video: AI Virtual Mouse with Hand Gesture Detection


1. Introduction

In this project, the mouse cursor on the monitor is controlled by recognizing hand gestures on a webcam. This shows that a PC can be controlled without a mouse or touchpad; for example, a lecturer could give a presentation in front of the camera using only hand movements.

Goal

The goal of this project is to perform all basic mouse operations with hand gestures alone. The basic mouse operations are:

  • Left Click

  • Right Click

  • Double Click

  • Drag & Drop

  • Scroll

Hands can be recognized using MediaPipe's hand landmark model. As shown in the figure below, landmark information is available for each joint of the hand.
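For reference, the landmark model can be queried directly with a few lines of MediaPipe. This minimal sketch (not part of the project code) reads one webcam frame and prints the 21 landmark positions in pixels:

import cv2 as cv
import mediapipe as mp

cap = cv.VideoCapture(0)
success, img = cap.read()
cap.release()

if success:
    with mp.solutions.hands.Hands(max_num_hands=1) as hands:
        results = hands.process(cv.cvtColor(img, cv.COLOR_BGR2RGB))
    if results.multi_hand_landmarks:
        h, w, _ = img.shape
        for idx, lm in enumerate(results.multi_hand_landmarks[0].landmark):
            # Landmarks are normalized to [0, 1]; scale by the image size for pixel coordinates.
            print(idx, int(lm.x * w), int(lm.y * h))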


2. Requirement

Hardware

  • Webcam

Software Installation

  • Python 3.7.x or Python 3.8.x

  • opencv-python

  • mediapipe

  • autopy

  • pyautogui


3. Tutorial Procedure

Setting up

Anaconda Install

Anaconda - DLIP (gitbook.io)

Create virtual environment

Run the Anaconda prompt in administrator mode.

Python version == 3.7.13 or 3.8.13

conda create -n virtual_mouse python=3.7.13
conda activate virtual_mouse

Download Files

hyKangHGU/VirtualMouse (github.com)

Install Libraries

pip install opencv-python
pip install mediapipe
pip install autopy
pip install pyautogui

Solution for error

After downloading the files and installing the libraries, you may encounter an execution error. This error means the protobuf package should be downgraded to 3.20.x.

pip uninstall protobuf
pip install protobuf==3.20.1

Code Description

  • VirtualMouse.py : Main program.

  • HandTrackingModule.py : Defines the handDetector class and functions that recognize hands using MediaPipe (adapted from the YouTube tutorial listed in the references); a simplified skeleton is sketched after this list.

  • Defines.py : Defines constants related to the program settings.

  • MouseOperation.py : Defines functions related to mouse operation.
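For context, the sketch below shows a simplified skeleton of HandTrackingModule.py. The method names are taken from the calls in VirtualMouse.py; the bodies are an illustrative assumption rather than the original implementation, and fingersUp() and findDistance() are omitted:

import cv2 as cv
import mediapipe as mp

class handDetector:
    def __init__(self, maxHands=2, detectionCon=0.5):
        # Wrap MediaPipe Hands; detectionCon is an assumed parameter name.
        self.hands = mp.solutions.hands.Hands(max_num_hands=maxHands,
                                              min_detection_confidence=detectionCon)
        self.results = None
        self.lmList = []

    def findHands(self, img):
        # Run MediaPipe on the RGB frame and keep the result for later queries.
        self.results = self.hands.process(cv.cvtColor(img, cv.COLOR_BGR2RGB))
        return img

    def findPosition(self, img):
        # Convert normalized landmarks to pixel coordinates and compute a bounding box.
        self.lmList, bbox = [], []
        if self.results and self.results.multi_hand_landmarks:
            h, w, _ = img.shape
            for idx, lm in enumerate(self.results.multi_hand_landmarks[0].landmark):
                self.lmList.append([idx, int(lm.x * w), int(lm.y * h)])
            xs = [p[1] for p in self.lmList]
            ys = [p[2] for p in self.lmList]
            bbox = (min(xs), min(ys), max(xs), max(ys))
        return self.lmList, bbox

    def tipsPosition(self, idx):
        # Return the (x, y) pixel position of landmark idx.
        return self.lmList[idx][1], self.lmList[idx][2]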

The overall flow of the program is in the order of initialization, hand tracking, mode update, mouse operation, and image display.
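As a rough skeleton (simplified from the snippets below; the exit key is an assumption), the per-frame loop of VirtualMouse.py looks like this:

# Simplified per-frame loop of VirtualMouse.py (initialization code comes before it).
while True:
    # 2. Hand tracking
    success, img = cap.read()
    img = detector.findHands(img)
    lmList, bbox = detector.findPosition(img)

    if len(lmList) != 0:
        # 3. Mode update: finger states and landmark-distance ratios decide cMode
        fingers = detector.fingersUp()
        # ... compute distance ratios, set cMode ...

        # 4. Mouse operation: move, click, drag, or scroll depending on cMode
        # ... autopy / pyautogui calls ...

    # 5. Image display
    img = cv.flip(img, 1)
    cv.imshow("Image", img)
    if cv.waitKey(1) & 0xFF == 27:   # ESC to quit (assumed exit key)
        break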

Initialization

In the first step, the necessary modules are imported, and the webcam and the handDetector class are initialized. The handDetector class is defined in HandTrackingModule.py. The number of hands detected on the webcam is determined by 'maxHands'.

import cv2 as cv
from Defines import *
from MouseOperation import *
from HandTrackingModule import *
import autopy as ap
import pyautogui as pg    # used later for scrolling (pg.scroll)

cap = cv.VideoCapture(0)
cap.set(3, wCam)    # 3: cv.CAP_PROP_FRAME_WIDTH
cap.set(4, hCam)    # 4: cv.CAP_PROP_FRAME_HEIGHT
detector = handDetector(maxHands=1)
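The constants wCam, hCam, and the various mode labels come from Defines.py. The names below are the ones used in the snippets; the values are illustrative assumptions:

# Defines.py (illustrative sketch; only the names are taken from the project code)
import autopy as ap

wCam, hCam   = 640, 480             # webcam capture resolution (assumed values)
W_SCR, H_SCR = ap.screen.size()     # monitor resolution reported by autopy

PINK           = (203, 192, 255)    # BGR color for the fingertip marker (assumed value)
MIN_CNT_SCROLL = 3                  # frames between scroll events (assumed value)

# Mode labels used by VirtualMouse.py (numeric values are arbitrary)
NO_MODE            = 0
MOUSE_MOVE         = 1
MOUSE_L_CLICK_WAIT = 2
MOUSE_R_CLICK_WAIT = 3
MOUSE_DRAG_DOWN    = 4
MOUSE_DRAG_UP      = 5
SCROLL_WAIT        = 6
SCROLL_MOVE        = 7
MIDDLE_FINGER_UP   = 8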

Hand Tracking

From here on, the processing runs for every frame of the webcam. A hand is detected in the image obtained from the webcam, and the landmarks and bounding box of the hand are obtained. Then, it is checked whether the hand landmarks were actually detected.

# =================== 2. Hand Tracking ================== #
# 2-1. Find hand Landmarks
success, img = cap.read()
img = detector.findHands(img)
lmList, bbox = detector.findPosition(img)

# 2-2. Check the hand detected
if len(lmList) != 0:

Mode Update

If a hand is detected, the mode is updated. The hand landmark information is used for mode changes. The 'fingers' variable indicates whether each finger is stretched or folded; for example, fingers[0] = 1 means the thumb is up. The location of each landmark is obtained through the 'tipsPosition' function and is used for the detailed conditions of mode changes. Some modes also require the distance between landmarks, which is calculated from this location information. The landmark numbering is described in the image in the introduction. In short, the modes are determined by each finger's state and the distances between landmarks. The flowchart for mode selection is shown in the following figure.

# =================== 3. Mode update ================== #
# 3-1. Check which fingers are up
fingers = detector.fingersUp()

# 3-2. Check distance between landmarks
x0 , y0  = detector.tipsPosition(0)
x3 , y3  = detector.tipsPosition(3)
x4 , y4  = detector.tipsPosition(4)
x5 , y5  = detector.tipsPosition(5)
x6 , y6  = detector.tipsPosition(6)
x8 , y8  = detector.tipsPosition(8)
x9 , y9  = detector.tipsPosition(9)
x12, y12 = detector.tipsPosition(12)

xdist_48 = cal_1Ddist(x4, x8)
xdist_49 = cal_1Ddist(x4, x9)
xdist_412 = cal_1Ddist(x4, x12)
xdist_812 = cal_1Ddist(x8, x12)
ydist_812 = cal_1Ddist(y8, y12)
dist_412 = cal_2Ddist((x4, y4), (x12, y12))
dist_812 = cal_2Ddist((x8, y8), (x12, y12))
xdist_box = bbox[2] - bbox[0] # xmax - xmin
ydist_box = bbox[3] - bbox[1] # ymax - ymin

xdist_ratio_48  = xdist_48/xdist_box
xdist_ratio_49  = xdist_49/xdist_box
xdist_ratio_412 = xdist_412/xdist_box
xdist_ratio_812 = xdist_812/xdist_box
ydist_ratio_812 = ydist_812/ydist_box

# 3-3. Mode Update
if    fingers[0] == 1 and fingers[1] == 1 and fingers[2] == 1 and fingers[3] == 1             :    cMode = NO_MODE
elif  fingers[0] == 1 and fingers[1] == 1 and fingers[2] == 0 and xdist_ratio_48 >= 0.25      :    cMode = MOUSE_L_CLICK_WAIT
elif  fingers[0] == 1 and fingers[1] == 1 and fingers[2] == 1 and xdist_ratio_412 >= 0.87     :    cMode = MOUSE_DRAG_UP
elif  fingers[1] == 1 and fingers[2] == 1 and xdist_ratio_49 >= 0.55                          :    cMode = SCROLL_WAIT
elif  fingers[1] == 1 and fingers[2] == 1 and xdist_ratio_49 < 0.55                           :    cMode = SCROLL_MOVE
elif  fingers[0] == 0 and fingers[1] == 1 and fingers[4] == 1                                 :    cMode = MOUSE_R_CLICK_WAIT
elif  xdist_ratio_48 < 0.15 and xdist_ratio_412 < 0.32 and xdist_ratio_812 < 0.32 and ydist_ratio_812 < 0.25 :    cMode = MOUSE_DRAG_DOWN
elif  fingers[1] == 0 and fingers[2] == 1 and fingers[3] == 0                                 :    cMode = MIDDLE_FINGER_UP
elif  fingers[0] == 0 and fingers[1] == 1                                                     :    cMode = MOUSE_MOVE
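The helpers cal_1Ddist and cal_2Ddist are defined in MouseOperation.py. A plausible sketch, consistent with how they are called above (absolute difference and Euclidean distance), assuming these bodies:

import math

def cal_1Ddist(a, b):
    # 1-D distance: absolute difference between two coordinates.
    return abs(a - b)

def cal_2Ddist(p1, p2):
    # 2-D distance: Euclidean distance between two (x, y) points.
    return math.hypot(p1[0] - p2[0], p1[1] - p2[1])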

Mouse Operation

  • Mouse cursor moving

Moving Mode

Drag Down Mode

The mouse cursor can be moved in both the pre-click and the click-down state. As shown in the figure above, when only the second finger is up, the cursor moves without clicking. In drag-down mode, on the other hand, the cursor moves in the click-down state and continues until drag-up mode executes the click-up.

# 4-1. Mouse cursor is moving
if cMode == MOUSE_MOVE or cMode == MOUSE_DRAG_DOWN:
    # 1) Check whether mouse cursor is down.
    if pMode != MOUSE_DRAG_DOWN and cMode == MOUSE_DRAG_DOWN:
        ap.mouse.toggle(down=True)

    # 2) Get the location of mouse cursor.
    clocX, clocY, cconX, cconY = get_current_location(x0, y0, plocX, plocY, pconX, pconY, fingers_prev)

    # 3) Move mouse cursor.
    ap.mouse.move(W_SCR - clocX, clocY)
    cv.circle(img, (x8, y8), 15, PINK, cv.FILLED)

    # 4) Update the states.
    plocX, plocY = clocX, clocY
    pconX, pconY = cconX, cconY
    fingers_prev = True
    pMode        = cMode
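get_current_location is defined in MouseOperation.py and maps the hand position in the webcam frame to a screen position. The sketch below is only an assumption of how it might work, using numpy interpolation and smoothing; FRAME_R and SMOOTHENING are made-up names and values, and wCam, hCam, W_SCR, H_SCR come from Defines.py:

import numpy as np

FRAME_R     = 100   # margin of the webcam frame mapped to the full screen (assumed)
SMOOTHENING = 5     # larger value -> smoother but slower cursor motion (assumed)

def get_current_location(x, y, plocX, plocY, pconX, pconY, fingers_prev):
    # Map the landmark position inside the reduced webcam frame to screen coordinates.
    cconX = np.interp(x, (FRAME_R, wCam - FRAME_R), (0, W_SCR))
    cconY = np.interp(y, (FRAME_R, hCam - FRAME_R), (0, H_SCR))

    if not fingers_prev:
        # The gesture just started: jump directly to the mapped position.
        clocX, clocY = cconX, cconY
    else:
        # Otherwise move relative to the previous frame, damped for stability.
        clocX = plocX + (cconX - pconX) / SMOOTHENING
        clocY = plocY + (cconY - pconY) / SMOOTHENING
    return clocX, clocY, cconX, cconY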
  • Left Click

Left-Click Standby Mode

Left-Click Mode

Left-click standby mode stops the mouse movement before executing a left click. To detect the left-click mode, a distance condition between the joints of the second finger is used. In addition, if the condition stayed satisfied across frames, the left click would be executed repeatedly; this is prevented by recording the previous frame's state in the variable 'pMode'.

# 4-2. Left Click
elif cMode == MOUSE_L_CLICK_WAIT:
    ap.mouse.toggle(down=False) # (mouse up)

    # 1) Find distances between landmarks (5-8, 5-6)
    dist_58, img, lineInfo = detector.findDistance(5, 8, img)
    dist_56 = cal_2Ddist((x5, y5), (x6, y6))

    # 2) Click mouse if distance condition is satisfied
    pMode = click_mouse_left(pMode, cMode, img, lineInfo, dist_56/ydist_box, dist_58/ydist_box)

    # 3) Update the states.
    fingers_prev = False
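The actual clicks are issued inside click_mouse_left (and the analogous right-click function) in MouseOperation.py; whether they use autopy or pyautogui is not shown here, but for reference the corresponding autopy calls are:

import autopy as ap

ap.mouse.click(ap.mouse.Button.LEFT)    # single left click
ap.mouse.click(ap.mouse.Button.RIGHT)   # single right click
ap.mouse.toggle(down=True)              # press and hold the left button (used for drag)
ap.mouse.toggle(down=False)             # release the button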
  • Right Click

Right-Click Standby Mode

Right-Click Mode

The right-click mode is implemented in the same way as the left-click mode, except for the shape of the fingers.

# 4-3. Right Click
elif cMode == MOUSE_R_CLICK_WAIT:
    ap.mouse.toggle(down=False) # (mouse up)

    # 1) Find distances between landmarks (5-8, 5-6)
    dist_58, img, lineInfo = detector.findDistance(5, 8, img)
    dist_56 = cal_2Ddist((x5, y5), (x6, y6))

    # 2) Click mouse if distance condition is satisfied.
    pMode = click_mouse_right(pMode, cMode, img, lineInfo, dist_56/ydist_box, dist_58/ydist_box)

    # 3) Update the states.
    fingers_prev = False
  • Drag Up

Drag Down Mode

Drag Up Mode

Drag-up mode is used to exit drag mode. In drag mode, the cursor moves in the click-down state and stops in the click-up state; drag-up mode executes the click-up.

# 4-4. Drag Up
elif pMode != MOUSE_DRAG_UP and cMode == MOUSE_DRAG_UP:
    # 1) Mouse up.
    ap.mouse.toggle(down=False) 

    # 2) Update the states.
    pMode = cMode
  • Scroll

Scroll Standby Mode

Scroll

In the scroll standby state, the current mouse cursor position is saved. Then, in scroll mode, the screen scrolls by the amount the hand moves in the y-axis direction. The scroll amount is the difference between the current hand position and the cursor position saved in the scroll standby state.

# 4-5. Wait the Scroll mode.
elif cMode == SCROLL_WAIT:
    # 1) Get initial position.
    pscroll_x = plocX
    pscroll_y = plocY

    # 2) Update the states.
    pMode = cMode

# 4-6. Scroll on the screen
elif pMode == SCROLL_WAIT and cMode == SCROLL_MOVE:
    ap.mouse.toggle(down=False) # (mouse up)

    # 1) Get the location of mouse cursor.
    clocX, clocY, cconX, cconY = get_current_location(x0, y0, plocX, plocY, pconX, pconY, fingers_prev)

    # 2) Scroll on the screen.
    if cnt_scroll >= MIN_CNT_SCROLL:
        pg.scroll( int(pscroll_y - clocY) )   # scroll relative to the initial position
        cnt_scroll = 0

    # 3) Update the states.
    cnt_scroll += 1
    plocX, plocY = clocX, clocY
    pconX, pconY = cconX, cconY
    fingers_prev = True
  • Middle Finger Filter

Since raising the middle finger at someone can be offensive, blur processing is applied to the hand region to prevent this.

# 4-7. Middle Finger Up... -> Don't say swear words!!
elif cMode == MIDDLE_FINGER_UP:
    # 1) Get the bounding box around the hand, clamped to the image size
    xmin, ymin = bbox[0]-30, bbox[1]-30
    xmax, ymax = bbox[2]+30, bbox[3]+30
    if xmin < 1: xmin = 1
    if ymin < 1: ymin = 1
    if xmax > wCam: xmax = wCam
    if ymax > hCam: ymax = hCam

    roi = img[ymin:ymax, xmin:xmax]

    # 2) Blur the bounding box
    roi_blur = cv.blur(roi, (25, 25))
    img[ymin:ymax, xmin:xmax] = roi_blur
  • None

This is the idle mode. It prevents the mouse from moving and performs no action.

# 4-8. None
else:
    ap.mouse.toggle(down=False)
    fingers_prev          = False
    pMode                 = cMode

Image Display

# 5. Image Display
img   = cv.flip(img, 1)             # 1) Flip image

print_mode(cMode, img)              # 2) Print current mode on the image

pTime = check_show_time(img, pTime) # 3) Frame Rate

cv.imshow("Image", img)             # 4) Display

4. Results and Analysis

CPS Test

CPS means clicks per second and can be measured on this website. We compared the click speed of a general mouse and the virtual mouse. The test counts the number of clicks in 5 seconds; the results of five trials are as follows.

![general_mouse_CPS TEST](./images/general_mouse_CPS%20TEST.png)![virtual_mouse CPS Test](./images/virtual_mouse%20CPS%20Test.png)

General Mouse

Virtual Mouse

The average CPS of the general mouse is 5.56, while the average CPS of the virtual mouse is 4.16. This means the virtual mouse clicks at about 75% of the speed of a general mouse.

Mole Game

The mole game was developed to practice mouse movements and clicks. Using this game, we compared a general mouse and a virtual mouse.

![mole game](./images/mole%20game.png)

![general_mouse mole_game](./images/general_mouse%20mole_game.png)![virtual_mouse mole_game](./images/virtual_mouse%20mole_game.png)

General Mouse

Virtual Mouse

With the general mouse, a maximum score of 609 was obtained; with the virtual mouse, a maximum of 200. This means the movement and click accuracy of the virtual mouse is roughly one third that of a general mouse.

Evaluation by Function

  • Left / Right / Double click

    It works well without any functional problems, and unwanted multiple clicks are prevented.

  • Scroll

​ It works well, but it lowers the FPS.

  • Drag & Drop

​ It works well.

Further Work

  • Wheel click

​ A general mouse has a wheel-click function. However, that function has not yet been implemented in this project.

  • FPS down problem at Scroll

​ Although the scroll function works well, it lowers the FPS.

  • Stopping Error

    A stopping error occurs when the title bar of the image window is clicked by the virtual mouse. We have not tried it, but a possible solution is to split the program into two independent processes: one that controls the virtual mouse from the image processing, and another that displays the image in real time (see the sketch after this list).

  • Robustness for Each Operation

    Specific actions are performed with predefined hand gestures. However, even if a gesture looks the same to a human, it may be recognized as a different gesture; for example, while in cursor-moving mode, a change in posture may cause it to be recognized as left-click standby mode. Additional conditions should be added to minimize the interference between operations.

    Moving Mode

    Fault Recognition

    Left Click Waiting Mode

  • Brick Breaker Game

    Among games played with mouse movements, there is a brick breaker game. Linking this game with the virtual mouse would make interesting content.

  • Two Hands Recognition

    More functions can be implemented by recognizing both hands. If the recognition of two-hand movements is developed further, it would even be possible to build a model that interprets sign language.
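For the stopping-error item above, a hedged sketch of the proposed two-process split (not implemented in this project), using Python's multiprocessing to pass frames from the control process to a separate display process:

import multiprocessing as mproc
import cv2 as cv

def mouse_process(frame_queue):
    # Webcam capture, hand tracking, and mouse control would run here;
    # only the frame hand-off to the display process is shown.
    cap = cv.VideoCapture(0)
    while True:
        success, img = cap.read()
        if not success:
            break
        # ... hand tracking, mode update, mouse operation ...
        if not frame_queue.full():
            frame_queue.put(img)

def display_process(frame_queue):
    # Only displays frames, so clicking this window cannot stop the control loop.
    while True:
        img = frame_queue.get()
        cv.imshow("Image", cv.flip(img, 1))
        if cv.waitKey(1) & 0xFF == 27:   # ESC to close the viewer
            break

if __name__ == "__main__":
    q = mproc.Queue(maxsize=2)
    mproc.Process(target=mouse_process, args=(q,), daemon=True).start()
    display_process(q)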


5. References

Hands - MediaPipe (google.github.io)

AI Virtual Mouse | OpenCV Python | Computer Vision - YouTube
