VireoJD-MM @ TRECVID 2019: Activities in extended video (ACTEV)
Format: text
Language: English
Published: Institutional Knowledge at Singapore Management University, 2019
Online Access: https://ink.library.smu.edu.sg/sis_research/6492 ; https://ink.library.smu.edu.sg/context/sis_research/article/7495/viewcontent/VireoJD_mm_actev.pdf
Institution: Singapore Management University
Summary: In this paper, we describe the system developed for the Activities in Extended Video (ActEV) task at TRECVID 2019 [1] and the results achieved. The goal of ActEV is to spatially and temporally localize action instances in a surveillance setting. We participated in the previous ActEV prize challenge, and since the only difference between the two challenges is the evaluation metric, we retain our previous pipeline [2]. The pipeline has three stages: object detection, tubelet generation, and temporal action localization. This time we extend the system in two directions separately: better object detection and advanced two-stream action classification. We submitted two runs, summarised below.

- VireoJD-MM Pipeline1: This run achieves Partial AUDC = 0.6012 using advanced two-stream action classification. Many papers [3, 4] have shown that a two-stream structure improves action recognition performance. In our prize-challenge model, we used only RGB frames as input. For this submission, we extend the action classification stage into an advanced two-stream action classification module.
- VireoJD-MM SecondarySystem: This run achieves Partial AUDC = 0.6936 using a better object detection model. The CMU team released the ground truth of object bounding boxes provided by Kitware, as well as their object detection and tracking code based on the VIRAT dataset. They built a system to detect and track small objects in outdoor surveillance scenes. For this submission, we replace our object detection and tracking code with theirs and keep the remaining stages of tubelet generation and temporal action localization.
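The two-stream extension described in the summary combines an RGB stream with a second (typically optical-flow) stream. The paper does not give the fusion details; the sketch below shows one common choice, weighted late fusion of per-class scores, with the function name, class count, and fusion weight `alpha` all being illustrative assumptions rather than the authors' actual implementation.

```python
import numpy as np

def two_stream_fusion(rgb_scores, flow_scores, alpha=0.5):
    """Late-fuse per-class action scores from an RGB stream and a motion
    stream by weighted averaging; alpha weights the RGB stream."""
    rgb = np.asarray(rgb_scores, dtype=float)
    flow = np.asarray(flow_scores, dtype=float)
    return alpha * rgb + (1.0 - alpha) * flow

# Toy per-class scores for one tubelet (3 hypothetical action classes).
rgb = [0.7, 0.2, 0.1]
flow = [0.5, 0.4, 0.1]
fused = two_stream_fusion(rgb, flow)
print(int(fused.argmax()))  # index of the predicted action class
```

With equal weighting, a class that both streams score highly wins even when neither stream is individually confident, which is the usual motivation for adding the motion stream.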