Action recognition is a computer vision problem in which the goal is to recognize the human actions performed in a video clip.
Recognizing human actions in video automatically is useful for many applications, such as surveillance and video search, and it is an active topic in computer vision. A model that takes a video as input and recognizes the actions performed by the people in it would therefore be very beneficial. The most popular approach to action recognition from video is deep learning: neural network models with many layers that learn important features using the gradient descent algorithm. Gradient descent is an iterative algorithm that reduces the error over multiple iterations. The whole process is automatic, except for creating a labeled dataset.
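To make the idea of "reducing the error over multiple iterations" concrete, here is a minimal sketch of gradient descent on a toy problem: fitting a straight line to five points. It is not the model from the notebook; the function name `gradient_descent`, the learning rate, and the data are assumptions chosen only for illustration. A deep network applies the same update rule to millions of weights instead of two.

```python
def gradient_descent(xs, ys, lr=0.05, steps=500):
    """Fit y ~ w * x + b by minimizing the mean squared error.

    Hypothetical helper for illustration only; lr and steps are
    arbitrary choices that happen to converge on this toy data.
    """
    w, b = 0.0, 0.0
    n = len(xs)
    losses = []
    for _ in range(steps):
        # Prediction error for every sample with the current parameters.
        errors = [(w * x + b) - y for x, y in zip(xs, ys)]
        # Record the mean squared error before the update.
        losses.append(sum(e * e for e in errors) / n)
        # Gradients of the mean squared error with respect to w and b.
        grad_w = 2.0 / n * sum(e * x for e, x in zip(errors, xs))
        grad_b = 2.0 / n * sum(errors)
        # Step against the gradient to reduce the error.
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b, losses


# Toy data generated from y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]

w, b, losses = gradient_descent(xs, ys)
print(f"w={w:.3f}, b={b:.3f}")
print(f"first loss={losses[0]:.3f}, last loss={losses[-1]:.6f}")
```

The recorded losses shrink from one iteration to the next, which is the behaviour the paragraph above describes.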
Several factors make action recognition from videos hard, such as occlusion (the view of some object is blocked), inter-class similarity (similarity between two or more different classes), and camera motion or changes in camera viewpoint.
Human Actions
Humans perform many different kinds of actions, from small movements to group activities. Some of these actions are labeled, while others usually go unmentioned.
Human actions are of different types such as:
⦁ Gestures
Gestures include small body part movements, such as waving an arm or leg.
⦁ Actions
An action is a combination of gestures that accomplishes a certain activity, such as walking, running, or dancing.
⦁ Human-Object Interaction
Video clips in which a human interacts with an object, such as opening a door or moving a chair.
⦁ Interaction between two people
Video clips in which one human interacts with another, such as shaking hands, hugging, or fighting.
⦁ Group Activities
Video clips in which a group of people carries out an activity together, such as playing a football match.
Real-World Application of Action Recognition
Action recognition from video is in high demand in many areas, such as surveillance, entertainment, content-based video retrieval, human-computer interaction, and robotics.
⦁ Video Surveillance
All over the world, people have installed surveillance cameras around their homes and businesses, but constantly monitoring these cameras is not easy, since a human cannot stay focused on video feeds all the time. As a result, the footage is normally used only to check facts after an incident has occurred. An automatic monitoring system would make surveillance easier and considerably more efficient.
⦁ Content-based video retrieval
Many platforms host billions of videos, and several thousand more are uploaded every day, but these videos can only be searched by their title, tags, description, or the user who uploaded them. Sometimes this metadata is misleading, or the title is clickbait. If we could understand what actually happens in the videos, we could improve search by basing it on their content.
⦁ Entertainment
Action recognition can be used in gaming to take input from the user, or to transfer a person's whole-body movement into a game, for example in dance or sports games. This can make gaming much more interactive.
⦁ Human-Computer Interaction
Action recognition can serve as an input for a computer or other computing technology, such as robots or self-driving cars. Robots could observe us and help when needed, or detect a person's intention and act accordingly.
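As a rough illustration of the content-based video retrieval idea above, the sketch below assumes an action recognition model has already produced a list of action labels for each video, and builds a small inverted index so videos can be searched by what happens in them rather than by their title. The video IDs, labels, and the helper names `build_action_index` and `search` are all hypothetical.

```python
from collections import defaultdict


def build_action_index(predictions):
    """Map each action label to the set of video IDs it was predicted for."""
    index = defaultdict(set)
    for video_id, labels in predictions.items():
        for label in labels:
            index[label.lower()].add(video_id)
    return index


def search(index, *actions):
    """Return the sorted video IDs that contain every requested action."""
    if not actions:
        return []
    matches = [index.get(action.lower(), set()) for action in actions]
    return sorted(set.intersection(*matches))


# Hypothetical model output: video ID -> actions recognized in that video.
predictions = {
    "video_001": ["dancing", "hugging"],
    "video_002": ["running", "playing football"],
    "video_003": ["dancing", "shaking hands"],
}

index = build_action_index(predictions)
print(search(index, "dancing"))
print(search(index, "dancing", "hugging"))
```

A real system would also need confidence scores and timestamps for each prediction; this sketch leaves them out to keep the lookup logic visible.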
Colab Notebook
https://colab.research.google.com/drive/18D1hWvHAonuUm-G3OFEiDQ7B5hR4eqjH?usp=sharing
Hi, where can I find the source code? Do you have a step-by-step course on doing this in Python with Keras?
Here you can find the complete code:
https://colab.research.google.com/drive/18D1hWvHAonuUm-G3OFEiDQ7B5hR4eqjH?usp=sharing
Hello, I'm getting this error. Do you have any suggestions?
TypeError: Image data of dtype object cannot be converted to float
Thank you very much