Rice University

Events at Rice

Thesis Defense

Graduate and Postdoctoral Studies
Computer Science

Speaker: Rohan Mukherjee
Master's Candidate

Mining Natural APIs from Large Code Corpora using a Mixture of Hidden Markov Models

Monday, May 1, 2017
10:00 AM to 12:15 PM

DH 1049, Duncan Hall

A Natural API is a collection of API methods that tend to be used according to discernible statistical patterns in real-world code. In this thesis, I present a method for learning an interpretable statistical model of such Natural APIs. The model is trained on sequences of API calls extracted from large software repositories through program analysis. Once trained, it can recognize complex temporal dependencies between methods, including methods that technically belong to different APIs, and can serve as a proxy for formal correctness specifications. In the experiments, the model is trained on sequences of method calls generated from over 150 million lines of Android code. The learned model is evaluated on three tasks: measuring the accuracy of the specifications learned from the corpus, completing code with missing API calls, and searching for code that uses APIs in a way that matches a query. The encouraging results indicate that statistical models of API calls learned from large code corpora can have broad value in software engineering.
