Research Datasets


Dataset Description: A dataset of digits in identifier names


Dataset Description: Training set for the SCANL ensemble part-of-speech tagger


Dataset Description: A dataset of refactoring discussions on Stack Overflow


Dataset Description: A manually annotated dataset of part-of-speech tags in test method names


Dataset Description: A dataset of “simple stupid bugs” (SStuBs) in test and non-test (i.e., production) files in popular open-source Java Maven projects


Dataset Description: A dataset of 1,335 manually POS-tagged identifier names


Dataset Description: A dataset of rename refactorings and their related context


Dataset Description: A dataset of 861 abbreviations and their corresponding expansions