Cite this article: |
- 周建华, 许丽丽, 李丰, 刘益铭, 霍玮 (Zhou Jianhua, Xu Lili, Li Feng, Liu Yiming, Huo Wei). A Survey of Semantic Information Recovery in Binary Programs[J]. Journal of Cyber Security (信息安全学报), accepted.
|
|
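Abstract: |
Semantic information, typified by variable type information, program control-flow information, and functional information, is the foundation of binary program analysis and understanding, and is critical to improving the precision of techniques such as software vulnerability detection and malicious code detection. However, owing to compiler optimization and symbol stripping, as well as differences among programming languages, compilers, operating systems, and target architectures, recovering semantic information from binary programs faces many technical challenges. This paper groups the binary semantic information recovery techniques that currently attract broad research attention into three categories: type inference based on program data, program structure recognition based on code instructions, and program functionality recovery in support of code understanding. It describes representative techniques from the past ten years and statistically analyzes their trends and shortcomings in the choice of experimental datasets, the choice of underlying analysis platforms, and architecture support. Finally, it outlines directions for future research. |
Keywords: binary program; type inference; control flow; semantic information recovery |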
DOI:10.19363/J.cnki.cn10-1380/tn.2023.08.30 |
Received: 2021-05-17; Revised: 2021-07-26 |
Funding: National Natural Science Foundation of China (General Program, Key Program, Major Program) |
|
A Survey of Semantic Information Recovery in Binary Programs |
Zhou Jianhua, Xu Lili, Li Feng, Liu Yiming, Huo Wei
|
(Institute of Information Engineering, Chinese Academy of Sciences) |
Abstract: |
Semantic information of binary programs, such as variable types, control flow, and functionality, is the basis of binary program analysis and is essential for improving the accuracy of software vulnerability detection and malicious code detection. However, because of compiler optimization and symbol stripping, and because of differences in programming languages, compilers, operating systems, and target architectures, recovering this semantic information is a challenging task. This paper surveys the semantic information recovery techniques that currently receive the most research attention and groups them into three categories: type inference based on program data, program structure recognition based on code instructions, and program functionality recovery in support of code understanding. Representative techniques from the past ten years are presented for each category. The trends and deficiencies of these techniques in the benchmarks used, the analysis platforms selected, and the architectures supported are analyzed statistically. Finally, directions for future research are outlined. |
Key words: binary program, type inference, control flow, semantic information recovery |