Artificial Intelligence Code: Action Recognition from Video Using Deep Learning Models in Python

Action recognition is a computer vision problem in which the goal is to recognize the human actions performed in a video clip.

Automatically recognizing human actions from video is useful for many applications, such as surveillance and video search, and it is a trending topic in computer vision. A model that can take a video as input and recognize the actions performed by the people in it would therefore be very beneficial. The most popular method for action recognition from video is to use deep learning models. These are neural networks with many layers that learn important features using the gradient descent algorithm, an iterative algorithm that reduces the error over multiple iterations. The whole process is automatic, except for creating a labeled dataset.
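To make the idea of "reducing the error over multiple iterations" concrete, here is a minimal sketch of gradient descent in plain Python. It fits a straight line to toy data rather than training a deep network on video, and the function names, data, learning rate, and iteration count are all assumptions chosen for illustration.

```python
# Minimal gradient descent sketch: fit y = w * x + b to toy data.
# Every name and value here is an illustrative assumption, not the
# training code of any particular action recognition model.

def mean_squared_error(w, b, xs, ys):
    """Average squared difference between predictions and targets."""
    return sum((w * x + b - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


def gradient_descent(xs, ys, learning_rate=0.05, iterations=300):
    """Return the fitted (w, b) and the error recorded before each update."""
    w, b = 0.0, 0.0
    n = len(xs)
    errors = []
    for _ in range(iterations):
        errors.append(mean_squared_error(w, b, xs, ys))
        # Gradients of the mean squared error with respect to w and b.
        grad_w = sum(2 * (w * x + b - y) * x for x, y in zip(xs, ys)) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in zip(xs, ys)) / n
        # Step against the gradient so the error shrinks.
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b, errors


xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]  # toy targets generated by y = 2x + 1
w, b, errors = gradient_descent(xs, ys)
print(f"w={w:.2f}, b={b:.2f}")
print(f"first error={errors[0]:.2f}, last error={errors[-1]:.6f}")
```

A deep learning model follows the same loop, only with many more parameters and with the gradients computed automatically by the framework.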

Several factors make action recognition from video hard, such as occlusion (the view of some object is blocked), interclass similarity (two or more different classes look alike), and camera motion or changes in camera viewpoint.

Human Actions

Humans perform many different kinds of actions, from small movements to group activities. Some of these actions are labeled, while others go unmentioned most of the time.

Human actions can be grouped into several types:

⦁ Gestures

Gestures are small movements of a body part, such as waving an arm or a leg.

⦁ Actions

An action is a combination of gestures that together perform an activity such as walking, running, or dancing.

⦁ Human-Object Interaction

Video clips in which a human interacts with an object, such as opening a door or moving a chair.

⦁ Interaction between two people

Video clips in which a human interacts with another human, such as shaking hands, hugging, or fighting.

⦁ Group Activities

Video clips in which a group of people performs an activity together, such as playing a football match.

Real-World Application of Action Recognition

Action recognition from video is in high demand in many areas, such as surveillance, entertainment, content-based video retrieval, human-computer interaction, and robotics.

⦁ Video Surveillance

All over the world, people have installed surveillance cameras around their homes and businesses, but constantly monitoring these cameras is not easy, since a human cannot stay focused on video feeds all the time. In practice, the footage is normally used as a fact-checking source after an incident has occurred. An automatic monitoring system for video would make surveillance easier and considerably more efficient.
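As a rough sketch of what automatic monitoring could look like, the snippet below assumes that some action recognition model has already produced one predicted label per short clip, and raises an alert when a watched action appears often enough in a sliding window. The function `find_alerts`, its parameters, and the sample labels are hypothetical, chosen only for illustration.

```python
from collections import deque


def find_alerts(clip_labels, watch_for, window=5, min_hits=3):
    """Return the clip indices at which an alert starts.

    clip_labels: predicted action labels, one per short clip (assumed to
    come from an action recognition model that is not shown here).
    An alert starts when at least `min_hits` of the last `window` labels
    equal `watch_for`, which filters out isolated misclassifications.
    """
    recent = deque(maxlen=window)  # keeps only the latest `window` labels
    alerts = []
    alerting = False
    for index, label in enumerate(clip_labels):
        recent.append(label)
        hits = sum(1 for seen in recent if seen == watch_for)
        if hits >= min_hits:
            if not alerting:
                alerts.append(index)  # report only the start of an alert
            alerting = True
        else:
            alerting = False
    return alerts


# Hypothetical per-clip predictions from a camera feed.
labels = ["walking", "walking", "fighting", "walking", "fighting",
          "fighting", "walking", "walking", "walking", "walking"]
print(find_alerts(labels, watch_for="fighting"))  # prints [5]
```

Requiring several hits inside the window is a simple design choice that trades a short delay for fewer false alarms.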

⦁ Content-based video retrieval

Many platforms host billions of videos, and several thousand more are uploaded every day, but these videos can only be searched by their title, tags, description, or the user who uploaded them. Sometimes this information is misleading, or the title is clickbait. If we could understand what is actually in the videos, we could improve search by basing it on their content.
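One simple way to picture content-based search is an inverted index that maps each recognized action to the videos in which it occurs. The sketch below assumes the action labels have already been predicted by a model. The video IDs, labels, and function names are hypothetical.

```python
from collections import defaultdict

# Hypothetical model output: video ID -> action labels predicted for it.
predicted_actions = {
    "video_001": ["running", "jumping"],
    "video_002": ["shaking hands"],
    "video_003": ["running", "playing football"],
}


def build_action_index(predictions):
    """Map each action label to the sorted list of videos that contain it."""
    index = defaultdict(set)
    for video_id, actions in predictions.items():
        for action in actions:
            index[action.lower()].add(video_id)
    return {action: sorted(ids) for action, ids in index.items()}


def search_by_action(index, query):
    """Return the videos whose predicted actions include the query."""
    return index.get(query.strip().lower(), [])


action_index = build_action_index(predicted_actions)
print(search_by_action(action_index, "Running"))  # ['video_001', 'video_003']
```

With such an index, a search for "running" returns videos by what happens in them, regardless of what their titles or tags claim.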

⦁ Entertainment

Action recognition can be used in gaming to take input from the user, or to transfer a person's whole-body movement into a game, for example in dance or sports games. This can make gaming much more interactive.

⦁ Human-Computer Interaction

Recognized actions can serve as input to a computer or to computer-driven technology such as robots or self-driving cars. Robots can observe us and help when needed, or detect a person's intention and then act accordingly.

Colab Notebook

https://colab.research.google.com/drive/18D1hWvHAonuUm-G3OFEiDQ7B5hR4eqjH?usp=sharing

4 comments:

David, March 8, 2021 at 1:49 PM
hi, where i can find the source code ? did you have a course step by step doing this in phyton keras ??

Faheem Ali, April 18, 2021 at 2:45 AM
Here you can find complete code

https://colab.research.google.com/drive/18D1hWvHAonuUm-G3OFEiDQ7B5hR4eqjH?usp=sharing

Mahela Wijekoon, May 27, 2021 at 12:47 PM
Hello, I'm getting this error. Do you have any suggestions on this ?

TypeError: Image data of dtype object cannot be converted to float

Unknown, June 25, 2021 at 3:43 AM
Thank you very much
