- As a
- software developer using GitHub code scanning
- I want to
- get quick feedback about security vulnerabilities on my PRs
- the current code scanning analyses often take too long to deliver updated feedback after a code change.
- Incremental CodeQL helps by
- delivering an update time that is proportional to the size of a code change
- the current code scanning analyses which re-analyze the entire code base from scratch irrespective of the change size.
GitHub code scanning is powered by the CodeQL analysis engine. The strength of code scanning lies in the sophisticated open source CodeQL analyses that engineers at GitHub and experts from the security community write to find security vulnerabilities. CodeQL offers unmatched support for deep and precise static analysis for a number of popular programming languages.
The security vulnerability was found by a CodeQL analysis that automatically runs after a push to the PR. This is a severe issue and even the corresponding CWE numbers are displayed.
While this kind of automated feedback is important to eliminate potential security vulnerabilities, it is only really helpful if it comes in a timely manner. Analysis results should ideally be available within (at most) a minute or two after a code change, otherwise developers may switch context and start working on something else, which quickly decreases the usefulness of the automated analysis.
If we investigate the logs, we can see that the analysis took around 6 minutes on the code base:
Unfortunately, this run time cost must be paid every time the analysis runs, even if only a very small part of the code base changes. As an extreme example, even if the developer just fixes a typo in a comment they still have to wait 6 minutes before updated (and unchanged!) analysis results become available. This is because all analysis runs are entirely independent from each other, and previous results are not reused in any way.
In the incremental CodeQL project, we investigate how we can speed up CodeQL analyses with incrementalization. Instead of repeated re-analysis from scratch on the entire code base, an incremental analysis keeps around the previously computed result and updates that based on the changed code parts. Prior research shows that this approach can lead to significant performance improvements because a small program change typically leads to small changes in the analysis result. In turn, developers can expect to see much quicker feedback on their PRs (seconds instead of minutes) because the runtime becomes proportional to the size of the code change and not the size of the entire code base.
While the idea of incrementalization is promising when it comes to static analyses, sophisticated static analyses are not necessarily compatible with incrementalization simply because of their nature. For example, analyzing across functions is typically required for good precision, but such analyses come with a high computational cost. When considering the transitive effects of called functions, it is easy to introduce program changes that have a large impact on previously computed results, making it difficult to gain anything with incremental behavior. To this end, we first investigate if it is really the case that typical program changes have a small impact on already computed results. We take real-world projects with their commit histories together with sophisticated CodeQL analyses and we look for empirical evidence that incrementality can actually be efficient.
We already carried out concrete experiments as part of this phase. We took a concrete CodeQL analysis for Ruby which identifies potential SQL injections in Ruby source code. The analysis is representative in terms of its complexity and practical relevance because it exhibits features and building blocks that industry-strength static analyses typically employ: inter-procedural reasoning with flow-sensitivity based on SSA representation of the input program. We measured the impact of program changes using a technique introduced in prior research in the context of incremental static analyses. Concretely, we took real-world Ruby projects with their commit histories. For each commit, we considered the analysis result that got computed for the immediately preceding commit, and we measured how much of the previous result needs to be changed to get to the result representative for the new commit. Below we present a plot for one of the Ruby projects we benchmarked:
The x axis shows the size of the commits arranged into buckets. The number range shows how many lines were inserted/deleted by a particular commit. The y axis shows the change rate of the previous analysis result. The plot shows the trend that small commits lead to small changes in the analysis result, and we only see >10 % change rates for considerably large commits (affecting more than a 1000 lines). We observed a similar trend for other Ruby projects, as well. All in all, these results show that incrementalization has potential when it comes to speeding up static analyses.
Phase 2: Can we create a prototype evaluator that exploits the incrementalizability of CodeQL analyses?
The goal of this phase is to show that we can actually exploit that typical program changes have a small impact on previously computed analysis results. To investigate this, we create a prototype incremental evaluator for CodeQL. Instead of changing the production evaluator (which is a complex, heavily optimized piece of software), we take an existing incremental evaluator that is close to the existing CodeQL evaluator in terms of its capabilities. Then, we take representative CodeQL analyses, and we transform those into a format that the other incremental solver can understand.
For benchmarking of run times, we use the same real-world projects and their commit histories that we used in the previous phase. Primarily we benchmark run times, but we pay careful attention to correctness, as well: We automatically verify that the results computed by the other solver are exactly the same as what CodeQL would compute if we would run it from scratch.
We are focused on relative performance improvements and not so much on the absolute numbers: We want to see that an incremental update, on average, takes only a small fraction of a complete re-analysis. We do not plan to replace the production CodeQL evaluator at this phase of the project, this is why the absolute performance is not that important.
We are currently working on this phase, and we do not yet have a production-ready implementation for the public. Stay tuned for a demo though!
This phase is about creating a production-ready incremental evaluator for CodeQL. Given that the existing evaluator comes with a large number of optimizations that are crucial for good performance even in a non-incremental setting, it seems obvious that we should modify the existing evaluator and make that incremental. Given that the existing evaluator is a highly complex piece of software, there remain numerous research challenges for this phase.