Publications Related to Boa
If you have used Boa in a research paper, please contact us and let us know so we can include the publication in this list!
Initial Publication (please cite this)
Robert Dyer, Hoan Anh Nguyen, Hridesh Rajan, and Tien N. Nguyen, "Boa: A Language and Infrastructure for Analyzing Ultra-Large-Scale Software Repositories", In the proceedings of the 35th International Conference on Software Engineering (ICSE 2013), May 22, 2013. San Francisco, CA. [paper] [slides]
Publications about Boa
- Che Shian Hung, and Robert Dyer, "Boa Views: Easy Modularization and Sharing of MSR Analyses", In the proceedings of the 17th International Conference on Mining Software Repositories (MSR 2020), October 5, 2020. Seoul, Korea. [paper]
- Ramanathan Ramu, Ganesha Upadhyaya, Hoan Anh Nguyen, and Hridesh Rajan, "BCFA: Bespoke Control Flow Analysis for CFA at Scale", In the International Conference on Software Engineering (ICSE 2020), May 2020. Seoul, South Korea. [paper]
- Ganesha Upadhyaya, and Hridesh Rajan, "Collective Program Analysis", In the International Conference on Software Engineering (ICSE 2018), May 2018. Gothenburg, Sweden. [paper]
- Robert Dyer, Hoan Anh Nguyen, Hridesh Rajan, and Tien N. Nguyen, "Boa: an Enabling Language and Infrastructure for Ultra-large Scale MSR Studies", A book chapter in The Art and Science of Analyzing Software Data, Sep. 15, 2015. Morgan Kaufmann.
- Robert Dyer, Hoan Anh Nguyen, Hridesh Rajan, and Tien N. Nguyen, "Boa: Ultra-Large-Scale Software Repository and Source Code Mining", In ACM Transactions on Software Engineering and Methodology (TOSEM), Vol. 25, No. 1, Article 7. [paper]
- Robert Dyer, Hridesh Rajan, and Tien N. Nguyen, "Declarative Visitors to Ease Fine-grained Source Code Mining with Full History on Billions of AST Nodes", In the proceedings of the 12th International Conference on Generative Programming: Concepts & Experiences (GPCE 2013), Oct 27, 2013. Indianapolis, IN. [paper] [slides]
- Robert Dyer, "Task Fusion: Improving Utilization of Multi-user Clusters", Student Research Competition at the 4th International Conference on Systems, Programming, Languages, and Applications: Software for Humanity (SPLASH 2013), Oct 31, 2013. Indianapolis, IN. [paper] [poster] [slides]
Examples of publications that have used Boa
- Robert Dyer, Hridesh Rajan, Hoan Anh Nguyen, and Tien N. Nguyen, "Mining Billions of AST Nodes to Study Actual and Potential Usage of Java Language Features", In the proceedings of the 36th International Conference on Software Engineering (ICSE 2014), June, 2014. Hyderabad, India. [paper] [slides] [supplement]
- Samuel W. Flint, A. M. Keshk, and Robert Dyer, "How Do Developers Use Type Inference: An Exploratory Study in Kotlin", In Empirical Software Engineering (EMSE), Vol. 30, No. 55, 2025. [paper]
- R. Capilla, V. Salamanca, A. Valdezate, and G. Robles, "Can instability variations warn developers when open-source projects boost?", In Empirical Software Engineering (EMSE), Vol. 29, No. 4, 2024. [paper]
- Zhongxing Yu, Matias Martinez, Zimin Chen, Tegawendé F. Bissyandé, and Martin Monperrus, "Learning the Relation Between Code Features and Code Transforms With Structured Prediction", In IEEE Transactions on Software Engineering (TSE), Vol. 49, No. 7, 2023. [paper]
- Robert Dyer, and Jigyasa Chauhan, "An Exploratory Study on the Predominant Programming Paradigms in Python Code", In the proceedings of the ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering (ESEC/FSE 2022), November, 2022. Singapore. [paper] [supplement]
- Samuel W. Flint, Jigyasa Chauhan, and Robert Dyer, "Pitfalls and Guidelines for Using Time-Based Git Data", In Empirical Software Engineering (EMSE), October, 2022.
- Sumon Biswas, Mohammad Wardat, and Hridesh Rajan, "The Art and Practice of Data Science Pipelines: A Comprehensive Study of Data Science Pipelines In Theory, In-The-Small, and In-The-Large", In the proceedings of the 44th International Conference on Software Engineering (ICSE 2022), May, 2022. Pittsburgh, PA. [paper]
- Hushuang Zeng, Jingxin Chen, Beijun Shen, and Hao Zhong, "Mining API Constraints from Library and Client to Detect API Misuses", In the proceedings of the 28th Asia-Pacific Software Engineering Conference (APSEC 2021), December 2021. Taipei, Taiwan.
- Samuel W. Flint, Jigyasa Chauhan, and Robert Dyer, "Escaping the Time Pit: Pitfalls and Guidelines for Using Time-Based Git Data", A distinguished paper in the proceedings of the International Conference on Mining Software Repositories (MSR'21), May, 2021. Madrid, Spain. [paper] [supplement]
- Kathryn Cunningham, Barbara J. Ericson, Rahul Agrawal Bejarano, and Mark Guzdial, "Avoiding the Turing Tarpit: Learning Conversational Programming by Starting from Code's Purpose", In the proceedings of the 2021 CHI Conference on Human Factors in Computing Systems (CHI 2021), May 2021. Yokohama, Japan.
- Victor Hugo Santiago C. Pinto, Alberto Luiz Oliveira Tavares de Souza, Yuri Matheus Barboza de Oliveira, and Danilo Monteiro Ribeiro, "Cognitive-Driven Development: Preliminary Results on Software Refactorings", In the proceedings of the 16th International Conference on Evaluation of Novel Approaches to Software Engineering (ENASE 2021), April 2021. Online.
- Gustavo Pereira, and Andre Hora, "Assessing Mock Classes: An Empirical Study", In the proceedings of the 36th International Conference on Software Maintenance and Evolution (ICSME 2020), September 27, 2020. Adelaide, Australia.
- M. Köhler, and G. Salvaneschi, "Automated Refactoring to Reactive Programming", In the proceedings of the 34th International Conference on Automated Software Engineering (ASE 2019), November 14, 2019. San Diego, CA.
- C. Lima, and A. Hora, "What Are the Characteristics of Popular APIs? A Large Scale Study on Java, Android, and 165 Libraries", In Software Quality Journal, Vol. 1, pages 1-38, 2019.
- V. Bezerra, L. Rocha, J. Filho, and F. Trinta, "An Empirical Study on Inter-Component Exception Notification in Android Platform", In the proceedings of the XXXIII Brazilian Symposium on Software Engineering (SBES 2019), September 23, 2019. Salvador, Brazil.
- A. Matos, J. Filho, and L. Rocha, "Splitting APIs: an exploratory study of software unbundling", In the proceedings of the 16th International Conference on Mining Software Repositories (MSR 2019), May 26, 2019. Montreal, Canada.
- S. Amann, H. A. Nguyen, S. Nadi, T. N. Nguyen, and M. Mezini, "Investigating Next Steps in Static API-Misuse Detection", In the proceedings of the 16th International Conference on Mining Software Repositories (MSR 2019), May 26, 2019. Montreal, Canada.
- C. Chen, Z. Xing, Y. Liu, and K.L.X. Ong, "Mining Likely Analogical APIs across Third-Party Libraries via Large-Scale Unsupervised API Semantics Embedding", In IEEE Transactions on Software Engineering (TSE), January 2019.
- Jackson Maddox, Yuheng Long, and Hridesh Rajan, "Large-scale Study of Substitutability in the Presence of Effects", In the proceedings of the ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering (ESEC/FSE 2018), November 7, 2018. Orlando, Florida.
- Fernando Lopez de la Mora, and Sarah Nadi, "An Empirical Study of Metric-based Comparisons of Software Libraries", In the 14th International Conference on Predictive Models and Data Analytics in Software Engineering (PROMISE 2018), October 2018. Oulu, Finland.
- Fernando Lopez de la Mora, and Sarah Nadi, "Which library should I use? A metric-based comparison of software libraries", In the International Conference on Software Engineering New Ideas and Emerging Results Track (ICSE NIER 2018), May 2018. Gothenburg, Sweden.
- Tianyi Zhang, Ganesha Upadhyaya, Anastasia Reinhardt, Hridesh Rajan, and Miryung Kim, "Are Code Examples on an Online Q&A Forum Reliable? A Study of API Misuse on Stack Overflow", In the International Conference on Software Engineering (ICSE 2018), May 2018. Gothenburg, Sweden.
- Elena Sherman, and Matthew B. Dwyer, "Structurally Defined Conditional Data-Flow Static Analysis", In International Conference on Tools and Algorithms for the Construction and Analysis of Systems (TACAS 2018), April 2018. Thessaloniki, Greece.
- Elena L. Glassman, Tianyi Zhang, Björn Hartmann, and Miryung Kim, "Visualizing API Usage Examples at Scale", In Conference on Human Factors in Computing Systems (CHI 2018), April 2018. Montreal, Canada.
- Eduardo C. Campos, and Marcelo A. Maia, "On the actual use of inheritance and interface in Java projects: evolution and implications", In Proceedings of the 27th Annual International Conference on Computer Science and Software Engineering (CASCON'17), November 2017.
- Eduardo C. Campos, and Marcelo A. Maia, "Common bug-fix patterns: a large-scale observational study", In Proceedings of the 11th ACM/IEEE International Symposium on Empirical Software Engineering and Measurement (ESEM'17), November 2017.
- Michele Tufano, Fabio Palomba, Gabriele Bavota, Massimiliano Di Penta, Rocco Oliveto, Andrea De Lucia, and Denys Poshyvanyk, "There and Back Again: Can you Compile that Snapshot?", In Journal of Software: Evolution and Process, Vol. 29, No. 4, April 2017.
- Neil C. Borle, Meysam Feghhi, Eleni Stroulia, Russell Greiner, and Abram Hindle, "Analyzing the effects of test driven development in GitHub", In Empirical Software Engineering, 2017.
- Jafar M. Al-Kofahi, Suresh Kothari, and Christian Kästner, "Four languages and lots of macros: analyzing autotools build systems", In the proceedings of the 16th ACM SIGPLAN International Conference on Generative Programming: Concepts and Experiences (GPCE 2017), October 23, 2017. Vancouver, BC, Canada.
- Benjamin Holland, Ganesh Ram Santhanam, and Suresh Kothari, "Transferring state-of-the-art immutability analyses: An experimentation toolbox and accuracy benchmark", In the proceedings of the 10th IEEE International Conference on Software Testing, Verification and Validation (ICST 2017), March 16, 2017. Tokyo, Japan.
- Laerte Xavier, Andre Hora, and Marco Tulio Valente, "Historical and Impact Analysis of API Breaking Changes: A Large Scale Study", In the proceedings of the 24th International Conference on Software Analysis, Evolution and Reengineering (SANER 2017), February 22, 2017. Klagenfurt, Austria.
- Andre Hora, Marco Tulio Valente, Romain Robbes, and Nicolas Anquetil, "When Should Internal Interfaces be Promoted to Public?", In the proceedings of the 24th International Symposium on the Foundations of Software Engineering (FSE 2016), November 16, 2016. Seattle, Washington.
- Maurício Aniche, Christoph Treude, Andy Zaidman, Arie van Deursen, and Marco Aurélio Gerosa, "SATT: Tailoring Code Metric Thresholds for Different Software Architectures", In the proceedings of the 16th IEEE International Working Conference on Source Code Analysis and Manipulation (SCAM 2016), October 2, 2016. Raleigh, North Carolina.
- Abbas Shakiba, Robert Green, and Robert Dyer, "FourD: "Do Developers Discuss Design?" Revisited", In the proceedings of the 2nd Workshop on Software Analytics (SWAN 2016), November 13, 2016. Seattle, Washington.
- Maurício Aniche, Gabriele Bavota, Christoph Treude, Arie van Deursen, and Marco Aurélio Gerosa, "A Validated Set of Smells in Model-View-Controller Architectures", In the proceedings of the 32nd International Conference on Software Maintenance and Evolution (ICSME 2016), October 8, 2016. Raleigh, North Carolina.
- Thomas Shippey, Tracy Hall, Steve Counsell, and David Bowes, "So You Need More Method Level Datasets for Your Software Defect Prediction? Voilà!", In the proceedings of the 10th ACM/IEEE International Symposium on Empirical Software Engineering and Measurement (ESEM 2016), September 8, 2016. Ciudad Real, Spain.
- Md Rakibul Islam, and Minhaz F. Zibran, "Towards understanding and exploiting developers' emotional variations in software engineering", In the proceedings of the 14th IEEE International Conference on Software Engineering Research, Management and Applications (SERA 2016), June 8, 2016. Towson, Maryland.
- Mary Beth Kery, Claire Le Goues, and Brad Myers, "Examining Programmer Practices for Locally Handling Exceptions", In the proceedings of the 13th International Conference on Mining Software Repositories (MSR 2016), May 14, 2016. Austin, Texas.
- Themistoklis Diamantopoulos, Klearchos Thomopoulos, and Andreas Symeonidis, "QualBoa: Reusability-aware Recommendations of Source Code Components", In the proceedings of the 13th International Conference on Mining Software Repositories (MSR 2016), May 14, 2016. Austin, Texas.
- Casimir Desarmeaux, Andrea Pecatikov, and Shane McIntosh, "The Dispersion of Build Maintenance Activity across Maven Lifecycle Phases", In the proceedings of the 13th International Conference on Mining Software Repositories (MSR 2016), May 14, 2016. Austin, Texas.
- Jacob Barnett, Charles Gathuru, Luke Soldano, and Shane McIntosh, "The Relationship between Commit Message Detail and Defect Proneness in Java Projects on GitHub", In the proceedings of the 13th International Conference on Mining Software Repositories (MSR 2016), May 14, 2016. Austin, Texas.
- Suman Nakshatri, Maithri Hegde, and Sahithi Thandra, "Analysis of Exception Handling Patterns in Java Projects: An Empirical Study", In the proceedings of the 13th International Conference on Mining Software Repositories (MSR 2016), May 14, 2016. Austin, Texas.
- Eddie Antonio Santos, and Abram Hindle, "Judging a commit by its cover: Correlating commit message entropy with build status on Travis-CI", In the proceedings of the 13th International Conference on Mining Software Repositories (MSR 2016), May 14, 2016. Austin, Texas.
- Shaiful Chowdhury, and Abram Hindle, "Characterizing Energy-Aware Software Projects: Are They Different?", In the proceedings of the 13th International Conference on Mining Software Repositories (MSR 2016), May 14, 2016. Austin, Texas.
- Mauricio Soto, Ferdian Thung, Chu-Pan Wong, Claire Le Goues, and David Lo, "A deeper look into bug fixes: Patterns, replacements, deletions, and additions", In the proceedings of the 13th International Conference on Mining Software Repositories (MSR 2016), May 14, 2016. Austin, Texas.
- Muhammad Asaduzzaman, Muhammad Ahasanuzzaman, Chanchal K. Roy, and Kevin Schneider, "How Developers Use Exception Handling in Java?", In the proceedings of the 13th International Conference on Mining Software Repositories (MSR 2016), May 14, 2016. Austin, Texas.
- Vinayak Sinha, Alina Lazar, and Bonita Sharif, "Analyzing Developer Sentiment in Commit Logs", In the proceedings of the 13th International Conference on Mining Software Repositories (MSR 2016), May 14, 2016. Austin, Texas.
- Rijnard van Tonder, and Claire Le Goues, "Defending Against the Attack of the Micro-clones", In the proceedings of the 24th IEEE International Conference on Program Comprehension (ICPC 2016), May 16, 2016. Austin, Texas.
- Elena Sherman, and Matthew B. Dwyer, "Exploiting Domain and Program Structures to Synthesize Efficient and Precise Data Flow Analyses", In the proceedings of the 30th International Conference on Automated Software Engineering (ASE 2015), November 9, 2015. Lincoln, Nebraska.
- Christopher Vendome, Mario Linares-Vasquez, Gabriele Bavota, Massimiliano Di Penta, Daniel German, and Denys Poshyvanyk, "When and Why Developers Adopt and Change Software Licenses", In the proceedings of the 31st IEEE International Conference on Software Maintenance and Evolution (ICSME 2015), Sep 30, 2015. Bremen, Germany.
- Christopher Vendome, Mario Linares-Vasquez, Gabriele Bavota, Massimiliano Di Penta, Daniel German, and Denys Poshyvanyk, "License Usage and Changes: A Large-Scale Study of Java Projects on GitHub", In the proceedings of the 23rd IEEE International Conference on Program Comprehension (ICPC 2015), May 18, 2015. Florence, Italy.
- Vahid Amintabar, Abbas Heydarnoori, and Mohammad Ghafari, "ExceptionTracer: A Solution Recommender for Exceptions in an Integrated Development Environment", In the proceedings of the 23rd IEEE International Conference on Program Comprehension (ICPC 2015), May 18, 2015. Florence, Italy.
- Ishtiaque Hussain, Christoph Csallner, Mark Grechanik, Qing Xie, Sangmin Park, Kunal Taneja, and B.M. Mainul Hossain, "RUGRAT: Evaluating program analysis and testing tools and compilers with large generated random benchmark applications", In Software—Practice & Experience, 2014.
- Arnout Roemers, Kardelen Hatun, and Christoph Bockisch, "An Adapter-aware, Non-intrusive Dependency Injection Framework for Java", In the proceedings of the 2013 International Conference on Principles and Practices of Programming on the Java Platform: Virtual Machines, Languages, and Tools (PPPJ 2013), Sep 11, 2013. Stuttgart, Germany.
Theses/Dissertations using Boa
- Brent van Bladel, "Test Code: a New Frontier in Code Cloning Research", Ph.D. dissertation, Universiteit Antwerpen, 2023. [paper]
- Jigyasa Chauhan, "An Empirical Study on the Classification of Python Language Features Using Eye-Tracking", Masters thesis, School of Computing, University of Nebraska-Lincoln, 2022. [paper]
- Sumon Biswas, "Understanding and Reasoning Fairness in Machine Learning Pipelines", Ph.D. dissertation, Dept. of Computer Science, Iowa State University, 2022. [paper]
- Che Shian Hung, "Boa Views: Enabling Modularization and Sharing of Boa Queries", Masters thesis, Dept. of Computer Science, Bowling Green State University, 2019. [paper]
- Hamid Bagheri, "Domain-specific language and infrastructure for genomics", Masters thesis, Dept. of Computer Science, Iowa State University, 2019. [paper]
- Md Johirul Islam, "BoaT: A domain specific language and shared data science infrastructure for large scale transportation data analysis", Masters thesis, Dept. of Computer Science, Iowa State University, 2019. [paper]
- Tianyi Zhang, "Leveraging Program Commonalities and Variations for Systematic Software Development and Maintenance", Ph.D. dissertation, University of California, Los Angeles, 2019. [paper]
- Mohd Arafat, "An Investigation of Routine Repetitiveness in Open-Source Projects", Masters thesis, Dept. of Computer Science, Bowling Green State University, 2018. [paper]
- Benjamin Robert Holland, "Computing Homomorphic Program Invariants", Ph.D. dissertation, Dept. of Computer Engineering, Iowa State University, 2018. [paper]
- Nitin Tiwari, "The design and implementation of Candoia: A platform for building and sharing mining software repositories tools as apps", Masters thesis, Dept. of Computer Science, Iowa State University, 2017. [paper]
- Ramanathan Ramu, "A hybrid approach for selecting and optimizing graph traversal strategy for analyzing big code", Masters thesis, Dept. of Computer Science, Iowa State University, 2017. [paper]
- Mehdi Bagherzadeh, "Toward a Concurrent Programming Model with Modular Reasoning", Ph.D. dissertation, Dept. of Computer Science, Iowa State University, 2016. [paper]
- Yuheng Long, "Formal Foundations for Hybrid Effect Analysis", Ph.D. dissertation, Dept. of Computer Science, Iowa State University, 2016. [paper]
- Vinayak Sinha, "Sentiment Analysis on Java Source Code in Large Software Repositories", Masters thesis, Dept. of Computer Science and Information Systems, Youngstown State University, 2016. [paper]
- Robert Dyer, "Bringing ultra-large-scale software repository mining to the masses with Boa", Ph.D. dissertation, Dept. of Computer Science, Iowa State University, 2013. [paper]
Tutorials, demonstrations, and other publications about Boa
- Brian Sigurdson, Samuel W. Flint, and Robert Dyer, "Boidae: Your Personal Mining Platform", In the International Conference on Software Engineering (ICSE 2024), April 2024. Lisbon, Portugal. [paper]
- Ganesha Upadhyaya, Robert Dyer, Hridesh Rajan, and Tien N. Nguyen, "Program Analysis on Thousands of Projects", Tutorial at the 32nd International Conference on Automated Software Engineering (ASE 2017), Oct 31, 2017. Urbana-Champaign, IL. [slides]
- Robert Dyer, Hridesh Rajan, Tien N. Nguyen, and Hoan Anh Nguyen, "Mining Programming Language Usage with Boa", Tutorial at the 6th International Conference on Systems, Programming, Languages, and Applications: Software for Humanity (SPLASH 2015), Oct 29, 2015. Indianapolis, IN. [slides]
- Robert Dyer, Hridesh Rajan, Tien N. Nguyen, and Hoan Anh Nguyen, "Demonstrating Programming Language Feature Mining Using Boa", Demonstration at the 6th International Conference on Systems, Programming, Languages, and Applications: Software for Humanity (SPLASH 2015), Oct 28, 2015. Indianapolis, IN. [paper] [slides]
- Robert Dyer, Hoan Anh Nguyen, Hridesh Rajan, and Tien N. Nguyen, "Efficiently Mining Source Code with Boa", Tutorial at the 36th International Conference on Software Engineering (ICSE 2014), June, 2014. Hyderabad, India. [slides]
- Robert Dyer, Hoan Anh Nguyen, Hridesh Rajan, and Tien N. Nguyen, "Mining Source Code Repositories with Boa", Demonstration at the 4th International Conference on Systems, Programming, Languages, and Applications: Software for Humanity (SPLASH 2013), Oct 31, 2013. Indianapolis, IN. [paper] [slides]
- Robert Dyer, and Hridesh Rajan, "Mining Ultra-Large-Scale Software Repositories with Boa", Tech Talk at the 4th International Conference on Systems, Programming, Languages, and Applications: Software for Humanity (SPLASH 2013), Oct 30, 2013. Indianapolis, IN. [slides]
- Robert Dyer, Hoan Anh Nguyen, Hridesh Rajan, and Tien N. Nguyen, "Analyzing Ultra-Large-Scale Code Corpus with Boa", Demonstration at the 3rd International Conference on Systems, Programming, Languages, and Applications: Software for Humanity (SPLASH 2012), Oct 23-25, 2012. Tucson, AZ. [paper] [slides]
- Robert Dyer, Hoan Anh Nguyen, Hridesh Rajan, and Tien N. Nguyen, "Boa: Analyzing Ultra-Large-Scale Code Corpus", Poster presentation at the 3rd International Conference on Systems, Programming, Languages, and Applications: Software for Humanity (SPLASH 2012), Oct 22, 2012. Tucson, AZ. [paper] [poster]