Many software developers now host their projects in open software repositories, which have become valuable resources for software engineering research. Many emerging research efforts analyze and mine these repositories, or are evaluated on software projects collected from them. However, a major obstacle to leveraging these repositories effectively is that most of them contain only source code, while a large portion of program analysis techniques target object code only. As a result, most software engineering researchers must either manually build a small number of software projects to enable object-code analysis, or restrict themselves to source-code analysis techniques. In this paper, we propose to bridge this gap by automatically compiling and building the software projects in software repositories. Specifically, we present a fully automatic build tool called AutoBuilder, which detects build configuration files, downloads dependencies, and resolves conflicts and incompatibilities. To evaluate our technique, we apply it to a randomly collected set of 1,000 software projects from GitHub, totaling 23 million lines of code. The experimental results show that our approach successfully builds 58.0% of the projects and compiles 59.0% of the code (in lines), compared with 32.6% of the projects and 32.3% of the code (in lines) for a baseline approach.
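As a rough illustration of the first step named above, detecting build configuration files, the following minimal sketch walks a project tree and reports files whose names match well-known build tools. It is not AutoBuilder's implementation: the function name `detect_build_files`, the `BUILD_FILES` table, and the choice of Maven, Gradle, and Ant file names are all assumptions made for this example.

```python
import os

# Hypothetical mapping from well-known build file names to build tools.
# The set of files that AutoBuilder actually recognizes is not stated here.
BUILD_FILES = {
    "pom.xml": "maven",
    "build.gradle": "gradle",
    "build.xml": "ant",
}


def detect_build_files(project_root):
    """Return (relative_path, tool) pairs for build files, shallowest first."""
    found = []
    for dirpath, dirnames, filenames in os.walk(project_root):
        # Prune version-control metadata so it is never descended into.
        dirnames[:] = [d for d in dirnames if d != ".git"]
        for name in filenames:
            tool = BUILD_FILES.get(name)
            if tool is not None:
                full_path = os.path.join(dirpath, name)
                found.append((os.path.relpath(full_path, project_root), tool))
    # Sort by directory depth, then by path, so root-level files come first.
    found.sort(key=lambda item: (item[0].count(os.sep), item[0]))
    return found
```

Sorting by depth reflects one plausible heuristic, namely that a build file at the project root usually drives the whole build; a real tool would also need to handle projects that mix several build systems.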
xiaoyin.wang AT utsa.edu
Download AutoBuilder 0.1
Automatically building software projects to support analysis of software repositories
Preliminary Study Details
Up-to-date Set Evaluation Details
Historical Set Evaluation Details
Study of Build Correctness