  • LIU Jiayong, HAN Jiaxuan, HUANG Cheng. Vulnerability Detection in Source Code Using Static Analysis[J]. Journal of Cyber Security, 2022, 7(4): 100-113
Vulnerability Detection in Source Code Using Static Analysis
LIU Jiayong, HAN Jiaxuan, HUANG Cheng
(School of Cyber Science and Engineering, Sichuan University, Chengdu 610207, China)
The concept of a software vulnerability has evolved over several decades alongside the field of computer software. Since the first software vulnerability was publicly disclosed, security researchers and engineers have been exploring methods for discovering and analyzing vulnerabilities. Static analysis of source code vulnerabilities can be applied throughout the software development life cycle, helps developers find vulnerabilities early, and is widely used in industry. However, as software grows in size and complexity, how to represent and model source code remains a difficult problem. In addition, in recent years researchers have tended to combine static analysis with machine learning, aiming to improve the accuracy of vulnerability discovery by introducing machine learning models; how to select and build a suitable model is a core issue in this research direction. This paper focuses on static analysis of source code vulnerabilities (hereinafter referred to as static analysis) and reviews related work in the field. Research on static analysis is divided into two directions: traditional static analysis and learning-based static analysis. Traditional static analysis mainly uses software analysis techniques such as data flow analysis and taint analysis to model and analyze source code. Learning-based static analysis represents source code in numerical form and feeds it to a learning model, which then mines deep representation features and correlations in the source code. This paper first explains the basic concepts of software vulnerability analysis and compares the advantages and disadvantages of static and dynamic analysis. It then describes methods for representing source code. Finally, it summarizes the general steps of traditional and learning-based static analysis, systematically reviews typical research results in both directions, summarizes their technical characteristics and workflows, identifies open problems in current static analysis techniques, and outlines directions for future research.
Key words: source code vulnerability; static analysis; data flow analysis; taint analysis; machine learning
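
To make the taint analysis mentioned in the abstract concrete, the following is a minimal sketch of the idea, not the method of any tool surveyed in the paper. It assumes straight-line Python code as the analysis target, and the source set (`input`) and sink set (`eval`, `exec`) are hypothetical choices made only for illustration. Taint is propagated through simple assignments, and a finding is reported when a tainted value is passed to a sink.

```python
import ast

# Hypothetical configuration for illustration only.
SOURCES = {"input"}        # calls whose result is attacker-controlled
SINKS = {"eval", "exec"}   # calls that must not receive tainted data


def _call_name(node):
    """Return the simple function name of a call node, or None."""
    if isinstance(node, ast.Call) and isinstance(node.func, ast.Name):
        return node.func.id
    return None


def _is_tainted(expr, tainted):
    """An expression is tainted if it reads a tainted variable or calls a source."""
    for node in ast.walk(expr):
        if isinstance(node, ast.Name) and node.id in tainted:
            return True
        if _call_name(node) in SOURCES:
            return True
    return False


def find_taint_flows(code):
    """Return (line, sink) pairs where tainted data reaches a sink.

    Only top-level statements are analyzed, in order. Branches, loops,
    function bodies and sanitizers are deliberately not modeled.
    """
    tainted = set()
    findings = []
    for stmt in ast.parse(code).body:
        # Check sinks using the taint state before this statement takes effect.
        for node in ast.walk(stmt):
            name = _call_name(node)
            if name in SINKS and any(_is_tainted(a, tainted) for a in node.args):
                findings.append((node.lineno, name))
        # Propagate taint through simple assignments of the form x = <expr>.
        if isinstance(stmt, ast.Assign):
            value_tainted = _is_tainted(stmt.value, tainted)
            for target in stmt.targets:
                if isinstance(target, ast.Name):
                    if value_tainted:
                        tainted.add(target.id)
                    else:
                        tainted.discard(target.id)  # overwritten with clean data
    return findings
```

For example, calling `find_taint_flows("a = input()\nb = a + '1'\neval(b)\n")` reports the `eval` call on line 3, because `b` is derived from the tainted variable `a`. A practical analyzer would additionally need control flow, inter-procedural propagation and sanitizer modeling, which this sketch omits.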