ISSTA 2018 Artifact Evaluation
The ISSTA submission calls our technique JDoctor to keep our identity confidential. The artifact instead uses the real name of the project, Toradocu. What we refer to as OldToradocu in this document is what the paper submission calls Toradocu; in essence, it is the state of the project at the time of the ISSTA 2016 paper.
Toradocu requires Java JDK 1.8 and Python 2.7+. It has been tested on Ubuntu and macOS.
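You can quickly verify that both prerequisites are available before building; the exact version strings will vary across systems:

```
# Check the prerequisites: a Java 8 JDK and Python 2.7+.
java -version     # should report version 1.8.x
javac -version    # confirms a JDK (not just a JRE) is installed
python --version  # should report version 2.7+
```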
NOTE: When you build Toradocu for the first time, the build file downloads the GloVe models from our repository. This takes some time, as the models amount to approximately 1 GB.
The following steps run the experiments described in Section 5 and produce Table 2 of the paper.
- Clone Toradocu and move to its folder:

  ```
  git clone https://github.com/albertogoffi/toradocu.git
  cd toradocu
  ```

- Run the experiments with Toradocu and produce its result file:

  ```
  ./stats/precision_recall_summary.sh toradocu_semantics
  ```

  This takes about 16 minutes and creates the file `results_semantics.csv`.

- Run the experiments with @tComment and produce its result file:

  ```
  ./stats/precision_recall_summary.sh tcomment
  ```

  This takes about 5 minutes and creates the file `results_tcomment.csv`. Some of the tests fail; this is expected and does not affect the precision/recall numbers.

- Run the experiments with OldToradocu and produce its result file:

  ```
  git checkout version0.1
  ./precision_recall_summary.sh
  ```

  This takes about 4.5 minutes and creates the file `results_toradocu-1.0.csv`.
Once all the CSV files with results are created, go back to the master branch and run the script that produces the result table:

```
git checkout master
./stats/latex.sh paper
```
The script takes about 10 minutes to complete, since it creates a fat jar containing Toradocu and all its dependencies in `toradocu/build/libs/toradocu-1.0-all.jar`.
Once it completes, you can inspect the file `accuracy-table.tex` in the `latex` folder to see the results of Table 2 of the paper.
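For example, you can view the generated table directly from the repository root (assuming the output location described above):

```
less latex/accuracy-table.tex
```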
P.S. Notice that OldToradocu has slightly worse precision and recall than what we reported in our submission. We found a minor bug in our older script; we will update the results when preparing the camera-ready version.
These instructions allow you to reproduce the results reported in Section 6. Keep in mind, however, that producing these results took us several weeks of manual effort. We provide links to our repositories mainly so that you can assess how we ran the evaluation process:
https://gitlab.cs.washington.edu/randoop/toradocu-manual-evaluation-may-2017.git
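To browse the evaluation materials locally, you can simply clone the repository (this only downloads the data; it does not re-run any part of the evaluation):

```
git clone https://gitlab.cs.washington.edu/randoop/toradocu-manual-evaluation-may-2017.git
```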
A much easier way to verify how the integration between Randoop and Toradocu works is to run the tools on a toy example:
https://github.com/ariannab/toyproject.git
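A minimal way to get started with the toy example is to clone it and then follow the instructions in that repository's own README:

```
git clone https://github.com/ariannab/toyproject.git
cd toyproject
```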
You may wish to try Toradocu on a particular class of your choice. If you haven't already reproduced the accuracy results, you need to clone the Toradocu repository and build the fat jar with Gradle:

```
git clone https://github.com/albertogoffi/toradocu.git
cd toradocu
./gradlew shadowJar
```
A typical invocation of Toradocu on a class `MyClass` of a certain project looks like this:

```
java -jar jdoctor-1.0-all.jar \
  --target-class mypackage.MyClass \
  --source-dir project/src \
  --class-dir project/bin
```
For example:

```
java -jar build/libs/jdoctor-1.0-all.jar \
  --target-class org.apache.commons.collections4.map.LRUMap \
  --source-dir src/test/resources/src/commons-collections4-4.1-src/src/main/java \
  --class-dir src/test/resources/bin/commons-collections4-4.1.jar
```
The terminal shows the output in a few seconds. It is formatted as JSON and contains the produced conditions for every method in the class. For each category of tag that the method's Javadoc declares (throwsTags, paramTags, returnTag), you find the comment (field "comment") and the corresponding translation produced by Toradocu (field "condition").