| dc.description.abstract |
One of the key detection methods that offers benefits in a variety of fields,
including video surveillance, video captioning, security, content censorship, and
military applications, is human action recognition. The main goal of this suggested
research is to efficiently categorise many actions from a video in order to create the
action label that each action represents. Convolution neural networks (CNNs) are a
subset of deep learning models that can operate directly on raw data; as a result, we
may utilise this subset of models to extract the appropriate relevant features from
the data and transform them to generate a series of frames with comparable actions.
Later, these frames can be used as a sequential input to feed the LSTM network for
action prediction. In this effort, a network termed the LRCN (Long-Term
Recurrent Convolutional Network) was created by combining these two networks.
In order to accomplish the project's objective, various networks are leveraged, and
their performance is assessed. Numerous publicly accessible datasets, including
UCF50, are used to assess the paper. These datasets' results indicate that the
suggested approach produces superior results in terms of overall accuracy. |
en_US |