Improving Modern Code Review Leveraging Contextual and Structural Information from Source Code
Date
2023-08-28
Authors
Shuvo, Ohiduzzaman
Abstract
Review comments are a major building block of modern code reviews. Ensuring the quality of code review comments is essential, but manually writing high-quality review comments is technically challenging and time-consuming. Over the years, there have been numerous attempts to automatically assess and recommend code review comments, but they can be limited in several respects. First, according to existing evidence, development practices, including code reviews, can differ drastically between open-source and closed-source systems. However, little research has examined how existing techniques perform when assessing code reviews from these two types of systems. Second, existing techniques that recommend or generate code review comments often suffer from a lack of scalability (e.g., requirements of specialized hardware by Deep Learning models) and generalizability (e.g., use of only one programming language).
In this thesis, we (a) conduct an empirical study to better understand the challenges of existing techniques for code review assessment and (b) propose a novel, scalable technique for review comment recommendation. First, we empirically investigate how existing techniques perform in assessing code reviews from open-source and closed-source systems. We find that the performance of existing techniques differs significantly when assessing code reviews from these two types of systems. Our findings also suggest that less experienced developers submit more non-useful review comments to both types of systems, which warrants automated support for writing code reviews. Second, to help developers write better review comments, we propose a novel technique, RevCom, that recommends relevant review comments by leveraging various code-level changes with structured information retrieval. Our technique outperforms both IR-based and DL-based baselines while being lightweight and scalable, and it has the potential to reduce the cognitive effort and time of reviewers.
Keywords
Software Engineering, Modern Code Review, Code Changes, Structured Information Retrieval, Review Quality Assessment, Code Review Comments