Cite this article
  • Jiang Miao, Li Min, Ren Junxing, Yang Yang, Bai Ruwen, Zhao Shixian, Huang Weiqing. Survey of Deep Learning-Based Human-Object Interaction Detection [J]. Journal of Cyber Security (信息安全学报), accepted.

Survey of Deep Learning-Based Human-Object Interaction Detection
Jiang Miao, Li Min, Ren Junxing, Yang Yang, Bai Ruwen, Zhao Shixian, Huang Weiqing
(Institute of Information Engineering, Chinese Academy of Sciences)
Abstract:
Human-object interaction detection aims to identify the humans and objects in an image and the interactions between them. In human-centered scenarios, it serves as the foundation for higher-level semantic understanding and plays an important role in computer vision tasks such as behavior analysis, scene understanding, and video structuring. It also has high application value in areas of social life such as public safety and enterprise management. In recent years, the development of deep learning and the availability of large-scale datasets have driven progress in human-object interaction detection. However, existing surveys of the field are few and not comprehensive. This paper aims to provide a comprehensive overview of deep learning-based human-object interaction detection methods, treating the task as consisting of three parts: object detection, human-object pair association, and interaction prediction. It focuses on methods for human-object pair association and interaction prediction, and summarizes them from the perspectives of detection framework, feature information, and interaction region definition. Specifically, this paper first introduces the background of human-object interaction detection, then outlines the framework of deep learning-based human-object interaction detection and discusses the human-object pair association module and the interaction prediction module in both sequential and parallel detection. It next introduces the datasets and evaluation metrics for human-object interaction detection and compares the performance of different methods on two commonly used datasets. It then points out the long-tail distribution problem in human-object interaction datasets and discusses solutions to it. Finally, it summarizes the field and discusses future research directions. By reviewing the current state of research on deep learning-based human-object interaction detection, this paper is intended to offer useful ideas for future work in the field.
Key words:  human-object interaction detection  deep learning  computer vision
DOI: 10.19363/J.cnki.cn10-1380/tn.2024.08.15
Submitted: 2023-04-19  Revised: 2023-06-19
Funding: National Key Research and Development Program of China