Enhancement of RGB-Based Human Action Recognition via Event-Inspired Multitask Learning

Jihwan Won, Hanwoong Ryu, Junghwan Lee, Danilo P. Mandic, Youngho Cho, Cheolsoo Park

June 2026

Abstract

Conventional human action recognition (HAR) with RGB cameras is often limited by lighting variations and motion blur. Event cameras offer a promising alternative thanks to their high temporal resolution, but they lack textural detail and are constrained by the scarcity of large-scale datasets. To address these issues, this paper proposes a multitask learning paradigm that uses event data exclusively as auxiliary supervision during training. Rather than replacing existing recognition architectures, the proposed approach introduces an auxiliary task that transforms RGB data into event-like representations, guiding a shared encoder to learn motion-sensitive features. Because the auxiliary branch is discarded entirely after training, the model operates on RGB input alone at inference, enabling deployment without event sensors. Training uses a loss annealing strategy that gradually shifts focus from the auxiliary task to the primary HAR task. Experiments on five diverse backbones spanning the CNN and transformer families show that the proposed framework improves on the RGB-only baseline for every backbone tested, with the largest gains observed on transformer-based models in this setting. For select backbones, performance approaches or slightly exceeds that of models trained on real event data.
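
The loss annealing idea described in the abstract can be illustrated with a minimal sketch. The abstract does not give the actual schedule, so everything here is an assumption made for illustration: the linear decay, the convex combination of the two losses, the start and end weights, and the hypothetical names aux_weight and combined_loss.

```python
def aux_weight(epoch: int, total_epochs: int,
               start: float = 0.9, end: float = 0.0) -> float:
    """Weight of the auxiliary (RGB-to-event-like) loss at a given epoch.

    Assumed linear schedule: decays from `start` at epoch 0 to `end` at the
    final epoch. The paper's real schedule may differ.
    """
    if total_epochs <= 1:
        return end
    # Fraction of training completed, clamped to [0, 1].
    progress = min(max(epoch / (total_epochs - 1), 0.0), 1.0)
    return start + (end - start) * progress


def combined_loss(har_loss: float, aux_loss: float,
                  epoch: int, total_epochs: int) -> float:
    """Blend the primary HAR loss with the auxiliary loss.

    Early epochs emphasize the auxiliary task; later epochs emphasize HAR.
    """
    lam = aux_weight(epoch, total_epochs)
    return (1.0 - lam) * har_loss + lam * aux_loss


if __name__ == "__main__":
    # Print how the auxiliary weight evolves over a 10-epoch run.
    for epoch in range(10):
        print(epoch, round(aux_weight(epoch, 10), 3))
```

Under these assumptions the auxiliary weight reaches zero at the last epoch, so the final updates are driven by the HAR loss alone, consistent with dropping the auxiliary branch at inference.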

Type

Journal article

Publication

IEEE Access

Jihwan Won

Ph.D. Student

Hanwoong Ryu

Researcher, Suresoft Technologies Inc.

Junghwan Lee

Ph.D. Student

Cheolsoo Park

Professor