Jiang Ming
Personal website: http://ranger.uta.edu/~ming/
Jiang Ming is an assistant professor in the CSE Department. His research focuses on security, especially software security and malware defense, including secure information flow analysis, software plagiarism detection, malicious binary code analysis, and software analysis for security issues. He has extensive academic and industry experience in computer security. His work has been published in prestigious security and software engineering conferences (USENIX Security, CCS, Euro S&P, FSE, and ASE). More recently, he has been working on automated symbolic security analysis of malicious binary code.
Malware, or malicious software intended to compromise computer systems, is one of the major challenges facing the Internet. Over the past years, the malware ecosystem has evolved dramatically from “for-fun” activities to a profit-driven underground market, where cyber-criminals can simply purchase access to tens of thousands of malware-infected hosts for nefarious purposes. By the end of 2015, nearly half a billion malicious programs were in circulation. To protect computer systems, security-relevant software analysis techniques have therefore been developed to detect intrusions, fix vulnerabilities, and analyze malware.
Many security-relevant software analysis tools work on binary code instead of source code. Why is binary code analysis attractive for security? First, in many cases, people do not have access to the source code of the program they care about, as with commercial off-the-shelf software and malicious software. The binary code is then the only resource available for analysis. Second, binary analysis provides the ground truth about program behavior, since computers execute binaries, not source code.
One class of binary code security analysis monitors an application's execution and performs security analysis at runtime. Compared to pre-execution tools, runtime security analysis has the advantage of observing the actual dynamic state and is more resilient to obfuscation methods. Unfortunately, most useful runtime security analysis tools are extremely expensive. The main reason is that the security analysis code is injected into the original application binary code, so the execution context has to switch frequently between application execution and security analysis code. Dr. Ming’s research work (USENIX Security’15, ASE’16, and SCAM’16) addresses the performance bottleneck of dynamic taint analysis, a very useful runtime security analysis that traces information flow along program execution. This technique has been widely applied in software attack detection, information flow control, data leak detection, and malware analysis. However, dynamic taint analysis suffers from a high performance penalty. The slowdown introduced by conventional dynamic taint analysis tools can easily reach 30X. This high overhead has limited its adoption in production systems.
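To make the idea of tracing information flow concrete, the following is a minimal sketch of taint propagation over a toy straight-line trace. It is an illustration only, not the implementation of any tool mentioned here: the instruction format `(dest, op, sources...)`, the register names, and the function `propagate_taint` are all hypothetical.

```python
# Minimal sketch of dynamic taint propagation (illustrative assumptions only).
# Each toy instruction is (dest, op, *sources); "const" has no sources.

def propagate_taint(instructions, tainted_sources):
    """Return the set of tainted locations after processing the trace."""
    tainted = set(tainted_sources)
    for dest, op, *srcs in instructions:
        if any(s in tainted for s in srcs):
            tainted.add(dest)        # taint flows from any tainted source
        else:
            tainted.discard(dest)    # overwritten with untainted data
    return tainted

trace = [
    ("eax", "mov", "input"),         # eax receives attacker-controlled input
    ("ebx", "const"),                # ebx is loaded with a constant
    ("ecx", "add", "eax", "ebx"),    # ecx depends on tainted eax
    ("eax", "const"),                # eax is overwritten, taint is cleared
    ("edx", "mov", "ebx"),           # edx depends only on untainted ebx
]

result = propagate_taint(trace, {"input"})
```

Even in this toy form, the analysis does bookkeeping for every instruction the application executes, which hints at why the overhead of a real instrumentation-based tool is so high.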
Inspired by hardware pipelining, Dr. Ming et al. developed TaintPipe, which parallelizes taint analysis in a pipeline style to take advantage of pervasive multi-core platforms. The execution of expensive dynamic taint analysis is decoupled into multiple pipeline stages running in parallel. TaintPipe has two concurrently running parts: 1) the instrumented application thread, which performs lightweight online logging and acts as the source of the pipeline; and 2) multiple worker threads, which act as different stages of the pipeline and perform concrete/symbolic taint analysis in parallel. A novelty of TaintPipe is that each worker thread starts running symbolic taint analysis very early, even without knowing the explicit taint tags. TaintPipe represents the unknown taint tags as symbolic variables and calculates the taint state symbolically, much like symbolic execution on straight-line code. When the first worker thread completes its concrete taint analysis, TaintPipe updates the taint state of the second worker thread by replacing the relevant symbolic taint tags with real taint tags or concrete values. After that, the second worker thread switches to concrete taint analysis and continues processing the remaining code segment. In addition to pipeline-style parallelism, TaintPipe has a number of novel features, including precision, performance, the ability to handle incomplete inputs, conditional tainting, and multi-tag taint analysis. With these techniques, TaintPipe significantly improves the performance of taint analysis and advances the state of the art, enabling broader adoption of information tracking technology.
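The symbolic-then-concrete handoff described above can be sketched roughly as follows. This is a simplified illustration under stated assumptions, not TaintPipe's actual code: the toy instruction format `(dest, op, sources...)`, the helper names `symbolic_taint` and `resolve`, and the tag `"t1"` are hypothetical, and the sketch summarizes the whole segment symbolically rather than switching to concrete analysis partway through.

```python
# Sketch: a later pipeline stage analyzes its segment before the incoming
# taint state is known, then resolves the symbols once it arrives.
# All names and the instruction format are illustrative assumptions.

def symbolic_taint(segment):
    """Map each written location to the incoming locations whose
    (still unknown) taint tags flow into it."""
    state = {}
    for dest, op, *srcs in segment:
        deps = set()
        for s in srcs:
            # A location not yet written in this segment stands for its own
            # unknown incoming taint, i.e. a symbolic taint variable.
            deps |= state.get(s, {s})
        state[dest] = deps
    return state

def resolve(summary, incoming):
    """Replace symbolic taint variables with the concrete tag sets
    produced by the previous stage."""
    out = {loc: set(tags) for loc, tags in incoming.items()}
    for dest, deps in summary.items():
        tags = set()
        for d in deps:
            tags |= incoming.get(d, set())
        out[dest] = tags
    return out

# Segment handled by the second worker, analyzed before stage one finishes.
segment2 = [
    ("ecx", "add", "eax", "ebx"),
    ("eax", "const"),
    ("edx", "mov", "ecx"),
]
summary = symbolic_taint(segment2)

# Concrete taint state later delivered by the first worker (multi-tag sets).
incoming = {"eax": {"t1"}, "ebx": set()}
resolved = resolve(summary, incoming)
```

Because `symbolic_taint` never consults concrete tags, it can run in parallel with the earlier stage; only the cheap `resolve` step has to wait for that stage's result.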
In addition to secure information flow analysis, Dr. Ming also works on methods for obfuscated binary code analysis based on formal program semantics (FSE’14, CCS’15, TRE’16, and TSE’17). Relentless malware developers typically apply various obfuscation schemes (e.g., packers, polymorphism, and metamorphism) to camouflage distinguishing features, circumvent malware detection, and impede reverse engineering attempts; software plagiarists transform stolen code in various ways to hide its appearance and logic. An obfuscation-resilient binary code analysis method is therefore greatly needed. Dr. Ming et al. have developed a set of principled, flexible techniques for analyzing obfuscated binary code. They make an effective argument that symbolic execution and SMT solvers offer significant potential for analyzing obfuscated code. The proposed method is based on strong principles of program semantics and logic. Compared to existing approaches, it is more resilient to automatic obfuscation schemes.
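As a rough illustration of why semantics-based comparison resists obfuscation, the sketch below checks whether two syntactically different code fragments compute the same function. It is not the method from the cited papers: an exhaustive check over 8-bit inputs stands in for the SMT solver query, and the functions `plain`, `obfuscated`, `different`, and `equivalent` are hypothetical.

```python
# Sketch: semantic equivalence of two fragments, checked by brute force
# over an assumed 8-bit register width (a stand-in for an SMT query).

MASK = 0xFF  # assumption: 8-bit values, so exhaustive checking is feasible

def plain(x, y):
    return (x + y) & MASK

def obfuscated(x, y):
    # Rewrites x + y using the identity x + y == (x ^ y) + 2 * (x & y)
    return ((x ^ y) + 2 * (x & y)) & MASK

def different(x, y):
    return (x | y) & MASK

def equivalent(f, g, bits=8):
    """Return True if f and g agree on every pair of inputs."""
    n = 1 << bits
    return all(f(x, y) == g(x, y) for x in range(n) for y in range(n))

same = equivalent(plain, obfuscated)
differs = equivalent(plain, different)
```

A syntactic comparison would treat `plain` and `obfuscated` as unrelated, while the semantic check treats them as the same computation. For realistic word sizes, exhaustive enumeration is infeasible, which is where a solver becomes necessary.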
At UT-Arlington, Dr. Ming has begun collaborations with researchers in the areas of machine learning, operating systems, and software engineering to study Internet of Things (IoT) security. IoT devices have become ubiquitous in our daily lives and serve critical functions. Dr. Ming is interested in developing techniques to reason about the security properties of software running on IoT devices. One feature of IoT systems is that many critical functions, such as communication, are handled by software that is already embedded in the device (termed “firmware”). Most firmware is available only as binary executable code, not source code. Dr. Ming plans to build a binary analysis platform that combines static and dynamic approaches to reason about the security properties of firmware. In particular, Dr. Ming will develop a set of firmware binary code analysis techniques to automate tedious reverse engineering work, help people find firmware vulnerabilities (e.g., memory corruption flaws, command injection vulnerabilities, and application logic flaws), and defeat malicious attacks.