AI Virtual Mouse with Hand Gesture Detection


Date: 2022-06-20

Author: Hee-Yun Kang, Woo-Ju So

Github: hyKangHGU/VirtualMouse (github.com)

Demo Video:


1. Introduction

In this project, the mouse cursor on the monitor is controlled by recognizing hand gestures from a webcam. This shows that a PC can be operated without a mouse or touchpad. In addition, a lecturer can give a presentation in front of the camera using only hand movements.

Goal

The goal of this project is to perform all of the basic mouse operations with hand gestures alone. The basic operations are:

  • Left Click

  • Right Click

  • Double Click

  • Drag & Drop

  • Scroll

The hand is recognized using MediaPipe's hand landmark model. As shown in the figure below, landmark information for each joint of the hand is available.
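
As a minimal illustration of how these landmarks can be accessed with the MediaPipe Python API (a sketch only; the project wraps this logic in its own HandTrackingModule.py):

# Minimal MediaPipe Hands sketch (illustrative; not the project's HandTrackingModule.py)
import cv2 as cv
import mediapipe as mp

hands = mp.solutions.hands.Hands(max_num_hands=1)
cap = cv.VideoCapture(0)

success, img = cap.read()
results = hands.process(cv.cvtColor(img, cv.COLOR_BGR2RGB))  # MediaPipe expects RGB input
if results.multi_hand_landmarks:
    for idx, lm in enumerate(results.multi_hand_landmarks[0].landmark):
        print(idx, lm.x, lm.y)  # landmark index and normalized [0, 1] coordinates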


2. Requirement

Hardware

  • Webcam

Software Installation

  • Python 3.7.x or 3.8.x

  • opencv-python

  • mediapipe

  • autopy

  • pyautogui


3. Tutorial Procedure

Setting up

Anaconda Install

Create virtual environment

Run the Anaconda Prompt in administrator mode.

Use Python version 3.7.13 or 3.8.13.

conda create -n virtual_mouse python=3.7.13
conda activate virtual_mouse

Download Files

Install Libraries

pip install opencv-python
pip install mediapipe
pip install autopy
pip install pyautogui

Solution for error

After downloading the files and installing the libraries, you may encounter an execution error. The error means that the protobuf package should be downgraded to 3.20.x:

pip uninstall protobuf
pip install protobuf==3.20.1

Code Description

  • VirtualMouse.py : Main program.

  • Defines.py : Constant variables related to settings are defined.

  • MouseOperation.py : Functions related to mouse operation are defined.

  • HandTrackingModule.py : A class and functions that recognize hands using MediaPipe are defined (adapted from the tutorial listed in the references).
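
The snippets below also use constants such as wCam, hCam, W_SCR, H_SCR, PINK, MIN_CNT_SCROLL, and the mode identifiers. A hedged sketch of how Defines.py might declare them (the names come from the code below; the values here are assumptions, not the project's actual settings):

# Defines.py sketch (values are illustrative assumptions)
import autopy as ap

wCam, hCam = 640, 480              # webcam frame size
W_SCR, H_SCR = ap.screen.size()    # monitor resolution reported by autopy
PINK  = (203, 192, 255)            # BGR drawing color
GREEN = (0, 255, 0)                # BGR drawing color
MIN_CNT_SCROLL = 5                 # frames to wait between scroll updates

# Mode identifiers (MOUSE_L_CLICK and MOUSE_R_CLICK are assumed internal states)
(NO_MODE, MOUSE_MOVE, MOUSE_L_CLICK_WAIT, MOUSE_L_CLICK, MOUSE_R_CLICK_WAIT,
 MOUSE_R_CLICK, MOUSE_DRAG_DOWN, MOUSE_DRAG_UP, SCROLL_WAIT, SCROLL_MOVE,
 MIDDLE_FINGER_UP) = range(11)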

The overall flow of the program is: initialization, hand tracking, mode update, mouse operation, and image display, as sketched below.
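
The snippets in the following subsections are assumed to run inside the main webcam loop, roughly like this sketch:

# Assumed main-loop structure of VirtualMouse.py (sketch)
while True:
    # 2. Hand tracking: read a frame and detect hand landmarks
    # 3. Mode update: decide the current mode from the finger states
    # 4. Mouse operation: act according to the current mode
    # 5. Image display: draw overlays and show the frame
    if cv.waitKey(1) & 0xFF == ord('q'):  # assumed exit key
        break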

Initialization

In the first step, the necessary modules are imported, and the webcam and the handDetector class are initialized. The 'handDetector' class is defined in HandTrackingModule.py. The number of hands to detect on the webcam is set by 'maxHands'.

import cv2 as cv
from Defines import *
from MouseOperation import *
from HandTrackingModule import *
import autopy as ap
import pyautogui as pg   # used later for scrolling (pg.scroll)

cap = cv.VideoCapture(0)
cap.set(3, wCam)   # 3 = cv.CAP_PROP_FRAME_WIDTH
cap.set(4, hCam)   # 4 = cv.CAP_PROP_FRAME_HEIGHT
detector = handDetector(maxHands=1)

Hand Tracking

From here on, the steps below are executed for every webcam frame. A hand is detected in the image obtained from the webcam, and the landmarks and bounding box of the hand are extracted. Then it is checked whether any hand landmarks were detected.

# =================== 2. Hand Tracking ================== #
# 2-1. Find hand Landmarks
success, img = cap.read()
img = detector.findHands(img)
lmList, bbox = detector.findPosition(img)

# 2-2. Check the hand detected
if len(lmList) != 0:

Mode Update

If the hand is detected, the mode is updated. The hand landmark information is used for mode changes. The 'fingers' variable indicates whether each finger is stretched or folded; for example, fingers[0] = 1 means the thumb is up. The location of each landmark is obtained through the 'tipsPosition' function and is used for the detailed conditions of mode changes. Some modes also require the distance between landmarks, which is calculated from these locations. The landmark numbering is shown in the image in the introduction. In summary, the modes are determined by each finger's state and the distances between landmarks. The flowchart for mode selection is shown in the following figure.
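
The distance helpers cal_1Ddist and cal_2Ddist used below are defined elsewhere in the project; a plausible minimal sketch:

import math

# Assumed helper definitions (the project defines these elsewhere, e.g. MouseOperation.py)
def cal_1Ddist(a, b):
    return abs(a - b)                                # distance along a single axis

def cal_2Ddist(p1, p2):
    return math.hypot(p2[0] - p1[0], p2[1] - p1[1])  # Euclidean distance between two points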

# =================== 3. Mode update ================== #
# 3-1. Check which fingers are up
fingers = detector.fingersUp()

# 3-2. Check distance between landmarks
x0 , y0  = detector.tipsPosition(0)
x3 , y3  = detector.tipsPosition(3)
x4 , y4  = detector.tipsPosition(4)
x5 , y5  = detector.tipsPosition(5)
x6 , y6  = detector.tipsPosition(6)
x8 , y8  = detector.tipsPosition(8)
x9 , y9  = detector.tipsPosition(9)
x12, y12 = detector.tipsPosition(12)

xdist_48 = cal_1Ddist(x4, x8)
xdist_49 = cal_1Ddist(x4, x9)
xdist_412 = cal_1Ddist(x4, x12)
xdist_812 = cal_1Ddist(x8, x12)
ydist_812 = cal_1Ddist(y8, y12)
dist_412 = cal_2Ddist((x4, y4), (x12, y12))
dist_812 = cal_2Ddist((x8, y8), (x12, y12))
xdist_box = bbox[2] - bbox[0] # xmax - xmin
ydist_box = bbox[3] - bbox[1] # ymax - ymin

xdist_ratio_48  = xdist_48/xdist_box
xdist_ratio_49  = xdist_49/xdist_box
xdist_ratio_412 = xdist_412/xdist_box
xdist_ratio_812 = xdist_812/xdist_box
ydist_ratio_812 = ydist_812/ydist_box

# 3-3. Mode Update
if    fingers[0] == 1 and fingers[1] == 1 and fingers[2] == 1 and fingers[3] == 1             :    cMode = NO_MODE
elif  fingers[0] == 1 and fingers[1] == 1 and fingers[2] == 0 and xdist_ratio_48 >= 0.25      :    cMode = MOUSE_L_CLICK_WAIT
elif  fingers[0] == 1 and fingers[1] == 1 and fingers[2] == 1 and xdist_ratio_412 >= 0.87     :    cMode = MOUSE_DRAG_UP
elif  fingers[1] == 1 and fingers[2] == 1 and xdist_ratio_49 >= 0.55                          :    cMode = SCROLL_WAIT
elif  fingers[1] == 1 and fingers[2] == 1 and xdist_ratio_49 < 0.55                           :    cMode = SCROLL_MOVE
elif  fingers[0] == 0 and fingers[1] == 1 and fingers[4] == 1                                 :    cMode = MOUSE_R_CLICK_WAIT
elif  xdist_ratio_48 < 0.15 and xdist_ratio_412 < 0.32 \
      and xdist_ratio_812 < 0.32 and ydist_ratio_812 < 0.25                           :    cMode = MOUSE_DRAG_DOWN
elif  fingers[1] == 0 and fingers[2] == 1 and fingers[3] == 0                                 :    cMode = MIDDLE_FINGER_UP
elif  fingers[0] == 0 and fingers[1] == 1                                                     :    cMode = MOUSE_MOVE

Mouse Operation

  • Mouse cursor moving

Moving Mode

Drag Down Mode

The mouse cursor can be moved in both the pre-click and the click-down state. As shown in the figure above, when only the second finger is up, the mouse moves without clicking. Drag-down mode, on the other hand, moves the cursor in the click-down state until drop mode is executed with a click-up.
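
get_current_location is defined in MouseOperation.py and is not shown on this page. A hypothetical sketch of the idea, assuming it maps camera coordinates to screen coordinates and smooths the motion (FRAME_R, a frame margin, is an assumed constant):

import numpy as np

# Hypothetical sketch of get_current_location (the real one is in MouseOperation.py)
def get_current_location(x, y, plocX, plocY, pconX, pconY, fingers_prev, smooth=5):
    # Map the hand position from camera coordinates to screen coordinates.
    cconX = np.interp(x, (FRAME_R, wCam - FRAME_R), (0, W_SCR))
    cconY = np.interp(y, (FRAME_R, hCam - FRAME_R), (0, H_SCR))
    if not fingers_prev:                       # no smoothing history yet
        return cconX, cconY, cconX, cconY
    clocX = plocX + (cconX - pconX) / smooth   # smooth the cursor motion
    clocY = plocY + (cconY - pconY) / smooth
    return clocX, clocY, cconX, cconY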

# 4-1. Mouse cursor is moving
if cMode == MOUSE_MOVE or cMode == MOUSE_DRAG_DOWN:
    # 1) Press the mouse button down when entering the drag-down state.
    if pMode != MOUSE_DRAG_DOWN and cMode == MOUSE_DRAG_DOWN:
        ap.mouse.toggle(down=True)

    # 2) Get the location of mouse cursor.
    clocX, clocY, cconX, cconY = get_current_location(x0, y0, plocX, plocY, pconX, pconY, fingers_prev)

    # 3) Move mouse cursor (x is mirrored because the displayed image is flipped).
    ap.mouse.move(W_SCR - clocX, clocY)
    cv.circle(img, (x8, y8), 15, PINK, cv.FILLED)

    # 4) Update the states.
    plocX, plocY = clocX, clocY
    pconX, pconY = cconX, cconY
    fingers_prev = True
    pMode        = cMode
  • Left Click

Left-Click Standby Mode

Left-Click Mode

Left-click standby mode stops the mouse movement before executing a left-click. To detect the left-click motion, distance conditions between the joints of the second finger are used. Without extra care, the left-click would be executed repeatedly as long as the condition stays satisfied; this is prevented by recording the previous frame's state in the variable 'pMode'.
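
click_mouse_left is defined in MouseOperation.py and is not shown here. A hypothetical sketch of the click-on-transition idea described above (the 0.9 threshold and the lineInfo layout [x1, y1, x2, y2, cx, cy] are assumptions):

# Hypothetical sketch of click_mouse_left (the real one is in MouseOperation.py)
def click_mouse_left(pMode, cMode, img, lineInfo, ratio_56, ratio_58):
    clicked = ratio_58 < ratio_56 * 0.9            # index fingertip folded toward the palm
    if clicked and pMode != MOUSE_L_CLICK:         # fire once per fold, not every frame
        ap.mouse.click()
        cv.circle(img, (lineInfo[4], lineInfo[5]), 15, GREEN, cv.FILLED)
        return MOUSE_L_CLICK                       # remember that the click already fired
    return MOUSE_L_CLICK if clicked else cMode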

# 4-2. Left Click
elif cMode == MOUSE_L_CLICK_WAIT:
    ap.mouse.toggle(down=False) # (mouse up)

    # 1) Find distances between landmark pairs (5,8) and (5,6)
    dist_58, img, lineInfo = detector.findDistance(5, 8, img)
    dist_56 = cal_2Ddist((x5, y5), (x6, y6))

    # 2) Click mouse if distance condition is satisfied
    pMode = click_mouse_left(pMode, cMode, img, lineInfo, dist_56/ydist_box, dist_58/ydist_box)

    # 3) Update the states.
    fingers_prev = False
  • Right Click

Right-Click Standby Mode

Right-Click Mode

The right-click mode is implemented in the same way as the left-click mode, except for the finger shape that triggers it.

# 4-3. Right Click
elif cMode == MOUSE_R_CLICK_WAIT:
    ap.mouse.toggle(down=False) # (mouse up)

    # 1) Find distances between landmark pairs (5,8) and (5,6)
    dist_58, img, lineInfo = detector.findDistance(5, 8, img)
    dist_56 = cal_2Ddist((x5, y5), (x6, y6))

    # 2) Click mouse if distance condition is satisfied.
    pMode = click_mouse_right(pMode, cMode, img, lineInfo, dist_56/ydist_box, dist_58/ydist_box)

    # 3) Update the states.
    fingers_prev = False
  • Drag Up

Drag Down Mode

Drag Up Mode

Drag-up mode is the function for escaping from drag mode. In drag mode, the cursor moves in the click-down state and stops in the click-up state; drag-up mode executes the click-up.

# 4-4. Drag Up
elif pMode != MOUSE_DRAG_UP and cMode == MOUSE_DRAG_UP:
    # 1) Mouse up.
    ap.mouse.toggle(down=False) 

    # 2) Update the states.
    pMode = cMode
  • Scroll

Scroll Standby Mode

Scroll

In the scroll standby state, the current mouse cursor position is saved. Then, in scroll mode, the screen scrolls as much as the hand moves in the y-axis direction. The scroll amount is the difference between the current hand position and the cursor position saved in the scroll standby state.

# 4-5. Wait the Scroll mode.
elif cMode == SCROLL_WAIT:
    # 1) Get initial position.
    pscroll_x = plocX
    pscroll_y = plocY

    # 2) Update the states.
    pMode = cMode

# 4-6. Scroll on the screen
elif pMode == SCROLL_WAIT and cMode == SCROLL_MOVE:
    ap.mouse.toggle(down=False) # (mouse up)

    # 1) Get the location of mouse cursor.
    clocX, clocY, cconX, cconY = get_current_location(x0, y0, plocX, plocY, pconX, pconY, fingers_prev)

    # 2) Scroll on the screen.
    if cnt_scroll >= MIN_CNT_SCROLL:
        pg.scroll( int(pscroll_y - clocY) ) # scroll by the offset from the initial position
        cnt_scroll = 0

    # 3) Update the states.
    cnt_scroll += 1
    plocX, plocY = clocX, clocY
    pconX, pconY = cconX, cconY
    fingers_prev = True
  • Middle Finger Filter

Since someone could offend another person with an obscene finger gesture, blur processing is applied to the hand region to prevent this.

# 4-7. Middle Finger Up... -> Don't say swear words!!
elif cMode == MIDDLE_FINGER_UP:
    # 1) Get the bounding box, padded by 30 px and clamped to the image size.
    xmin, ymin = bbox[0]-30, bbox[1]-30
    xmax, ymax = bbox[2]+30, bbox[3]+30
    if xmin < 1: xmin = 1
    if ymin < 1: ymin = 1
    if xmax > wCam: xmax = wCam
    if ymax > hCam: ymax = hCam

    roi = img[ymin:ymax, xmin:xmax]

    # 2) Blur the bounding box
    roi_blur = cv.blur(roi, (25,25))
    img[ymin:ymax, xmin:xmax] = roi_blur
  • None

This is the idle mode: the mouse button is released, the cursor does not move, and no action is performed.

# 4-8. None
else:
    ap.mouse.toggle(down=False)
    fingers_prev          = False
    pMode                 = cMode

Image Display

# 5. Image Display
img   = cv.flip(img, 1)             # 1) Flip image

print_mode(cMode, img)              # 2) Print current mode on the image

pTime = check_show_time(img, pTime) # 3) Frame Rate

cv.imshow("Image", img)             # 4) Display
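
check_show_time is a small helper defined elsewhere in the project; a plausible sketch that measures the frame rate and draws it on the image:

import time

# Plausible sketch of check_show_time (the real helper is defined elsewhere in the project)
def check_show_time(img, pTime):
    cTime = time.time()
    fps = 1.0 / (cTime - pTime) if cTime > pTime else 0.0
    cv.putText(img, f"FPS: {int(fps)}", (20, 50),
               cv.FONT_HERSHEY_PLAIN, 2, (255, 0, 255), 2)
    return cTime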

4. Results and Analysis

CPS Test

CPS (clicks per second) was measured on a CPS test website to compare the click speed of a general mouse and the virtual mouse. The test counts the number of clicks in 5 seconds, and each device was tested 5 times. The results are as follows.

![general_mouse_CPS TEST](./images/general_mouse_CPS%20TEST.png)
![virtual_mouse CPS Test](./images/virtual_mouse%20CPS%20Test.png)

General Mouse

Virtual Mouse

The average CPS of the general mouse is 5.56, while the average CPS of the virtual mouse is 4.16. This means the virtual mouse can click at about 75% of the speed of a general mouse.

Using the mole game, which is for practicing mouse movement and clicking, we compared a general mouse and the virtual mouse.

![mole game](./images/mole%20game.png)

![general_mouse mole_game](./images/general_mouse%20mole_game.png)
![virtual_mouse mole_game](./images/virtual_mouse%20mole_game.png)

General Mouse

Virtual Mouse

With the general mouse, a maximum score of 609 was obtained; with the virtual mouse, a maximum of 200. This means the movement and click accuracy of the virtual mouse is about 3 times lower than that of a general mouse.

Evaluation by Function

  • Left / Right / Double click

    It works well without any functional problems, and unwanted multiple clicks are prevented.

  • Scroll

    It works well, but it lowers the FPS.

  • Drag & Drop

    It works well.

Further Work

  • Wheel click

    A general mouse has a wheel-click function; it has not yet been implemented in this project.

  • FPS drop problem at Scroll

    Although the scroll function is well implemented, it lowers the FPS.

  • Stopping Error

    A stopping error occurs when the title bar of the image window is clicked by the virtual mouse. We have not tried it, but a possible solution is to split the program into two independent processes: one that controls the virtual mouse from the image processing, and the other that displays the image in real time.

  • Robustness for Each Operation

    Specific actions are performed with predefined hand gestures. Even if a gesture looks the same to a human, it may be recognized as a different gesture; for example, cursor-moving mode may be recognized as left-click standby mode after a small change in posture. Additional conditions should be added to minimize the interference between operations.

    Moving Mode

    Fault Recognition

    Left Click Waiting Mode

  • Brick Breaker Game

    Among games played with mouse movement, there is a brick breaker game. Linking that game with the virtual mouse would make interesting content.

  • Two Hands Recognition

    More functions could be implemented by recognizing both hands. If two-hand recognition is developed further, it may even be possible to create a model that interprets sign language.


5. References

  • hyKangHGU/VirtualMouse (github.com): project repository and file downloads

  • Hands - mediapipe (google.github.io): MediaPipe hand landmark model documentation

  • AI Virtual Mouse | OpenCV Python | Computer Vision (YouTube): tutorial referenced for HandTrackingModule.py

  • Anaconda - DLIP (gitbook.io): Anaconda installation guide

  • CPS test website: used for the click-speed measurement

  • Mole Game: used for the accuracy comparison