This is the accompanying code for the research report that Brandon Wu and I wrote for CMU's 15-745, the graduate Optimizing Compilers course. We investigate Andersen analysis (Program Analysis and Specialization for the C Programming Language) on production-sized, widely used C/C++ programs such as Redis Server, Bcrypt, Bzip2, and Gzip. We evaluated our approach using LLVM and specifically targeted C/C++ programs, since we have noticed that compilers research has focused very heavily on Java.
We used GraalVM to compile these large programs, since it stores the generated bitcode in the ".llvmbc" section of the output binary and lets us swap in compilers using only environment flags.
We augment Andersen analysis with a context-sensitive, flow-insensitive approach using "call-strings", as described in the paper Evaluating the impact of context-sensitivity on Andersen's algorithm for Java programs. We further investigate applying this context-sensitive approach to Semantic Models, as described in Scalable Pointer Analysis of Data Structures using Semantic Models.
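To illustrate the idea, here is a minimal Python sketch of an Andersen-style inclusion-constraint solver combined with k-limited call strings. It is not the code in this repository, which works on LLVM bitcode. The constraint encoding, the names `solve`, `push`, and `id_example`, and the toy `id` program are all assumptions made for illustration.

```python
from collections import defaultdict

def solve(constraints):
    """Naive fixed-point solver for Andersen-style inclusion constraints.

    Each constraint is a (kind, dst, src) triple (hypothetical encoding):
      ("addr",  p, a)  models  p = &a
      ("copy",  p, q)  models  p = q
      ("load",  p, q)  models  p = *q
      ("store", p, q)  models  *p = q
    """
    pts = defaultdict(set)  # variable -> set of abstract objects
    changed = True
    while changed:
        changed = False
        for kind, dst, src in constraints:
            if kind == "addr":
                new, targets = {src}, [dst]
            elif kind == "copy":
                new, targets = set(pts[src]), [dst]
            elif kind == "load":
                # Union the points-to sets of everything src may point to.
                new = set().union(*(pts[o] for o in pts[src]))
                targets = [dst]
            elif kind == "store":
                # Every object dst may point to receives pts(src).
                new, targets = set(pts[src]), list(pts[dst])
            else:
                raise ValueError(f"unknown constraint kind: {kind}")
            for t in targets:
                if not new <= pts[t]:
                    pts[t] |= new
                    changed = True
    return pts

def push(ctx, site, k):
    """Append a call site to a call string, keeping the last k entries."""
    return (ctx + (site,))[-k:] if k > 0 else ()

def id_example(k):
    """Analyze a toy program with call strings of length at most k.

    Hypothetical program:
        void *id(void *x) { return x; }
        a = id(&o1);   // call site "cs1"
        b = id(&o2);   // call site "cs2"
    """
    cons = [("addr", "t1", "o1"), ("addr", "t2", "o2")]
    for site, arg, ret in [("cs1", "t1", "a"), ("cs2", "t2", "b")]:
        ctx = push((), site, k)
        x = ("id.x", ctx)              # parameter x, cloned per context
        cons.append(("copy", x, arg))  # bind the argument to the parameter
        cons.append(("copy", ret, x))  # bind the return value to the caller
    return solve(cons)

if __name__ == "__main__":
    for k in (0, 1):
        result = id_example(k)
        print(k, sorted(result["a"]), sorted(result["b"]))
```

With `k = 0` both calls share a single copy of the parameter `x`, so `a` and `b` each appear to point to both `o1` and `o2`. With `k = 1` the parameter is cloned per call site and the two points-to sets stay separate. This is the precision gain that context sensitivity aims for, at the cost of more variables to track.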
We report results on the accuracy, performance, and size of the analysis for each program, and we attempt to extend Semantic Models by evaluating use cases in C/C++.