readme file in directory SQLFeatureExtraction is updated
clarknerd committed Mar 6, 2017
1 parent 8388f9a commit 8951c20
Showing 2 changed files with 35 additions and 7 deletions.
28 changes: 28 additions & 0 deletions SQLFeatureExtraction/README.md
@@ -3,3 +3,31 @@ In order to produce the executable file *SQLComparison.jar* from Java source cod
mvn clean install

This command builds and installs the Maven project and creates an executable JAR file named *SQLComparison.jar* in the directory *SQLFeatureExtraction/target*.

Running *SQLComparison.jar* with no options reproduces the feature vectors generated by:

(1) Each of the three similarity metrics (aligon, makiyama and aouiche);

(2) Under three different SQL log-like inputs (UB student exam, Bombay IIT and pocketdata-google+);

(3) With no further feature engineering;

(4) With each feature engineering module activated individually;

(5) With all feature engineering modules activated.
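The list above can be read as a cross product of inputs, metrics, and regularization settings. The following is a minimal sketch under that assumption; the class `DefaultRunsSketch` and its methods are hypothetical and are not taken from the project's source. It also assumes four regularization modules with IDs 1 to 4.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration: enumerates the configurations a default run
// would cover if it is the full cross product described in the list above.
final class DefaultRunsSketch {
    static final String[] INPUTS = {"ub", "bombay", "googleplus"};
    static final String[] METRICS = {"aligon", "makiyama", "aouiche"};
    static final int MODULE_COUNT = 4; // assumption: module IDs 1..4

    // Regularization settings: none, each module on its own, then all modules.
    static List<String> moduleSettings() {
        List<String> settings = new ArrayList<>();
        settings.add("none");
        for (int id = 1; id <= MODULE_COUNT; id++) {
            settings.add(String.valueOf(id));
        }
        settings.add("all");
        return settings;
    }

    // One entry per (input, metric, regularization setting) combination.
    static List<String> enumerate() {
        List<String> runs = new ArrayList<>();
        for (String input : INPUTS) {
            for (String metric : METRICS) {
                for (String modules : moduleSettings()) {
                    runs.add(input + " / " + metric + " / modules=" + modules);
                }
            }
        }
        return runs;
    }

    public static void main(String[] args) {
        List<String> runs = enumerate();
        runs.forEach(System.out::println);
        System.out.println(runs.size() + " configurations");
    }
}
```

Under these assumptions the enumeration yields 3 inputs × 3 metrics × 6 settings = 54 configurations.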

Options:

(1) The user can specify a single input data set with the -input option. The available values are: (1) ub; (2) bombay; (3) googleplus. (e.g. java -jar SQLComparison.jar -input ub)

(2) The user can specify a single similarity metric with the -metric option. The available values are: (1) aligon; (2) makiyama; (3) aouiche. (e.g. java -jar SQLComparison.jar -metric aligon)

(3) The user can specify which regularization modules to activate with the -modules option. The available modules are: ID=1, Naming; ID=2, Expression Regularization; ID=3, From-Nesting Flattening; ID=4, UNION pull-out. Specify the corresponding IDs, joining multiple modules with the symbol "&". Because "&" is a control character in most shells, quote the argument. (e.g. java -jar SQLComparison.jar -modules "1&2&3&4")
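To make the option format concrete, here is a minimal sketch of how `-option value` pairs and an "&"-separated module list could be read. It is not the parsing code in Main.java, which this README does not show; the class `CliOptionsSketch` and its methods are hypothetical, and the ID range 1 to 4 is taken from the module list above.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not taken from the project's Main.java.
final class CliOptionsSketch {

    // Reads "-name value" pairs into a map keyed by the option name without the dash.
    static Map<String, String> parse(String[] args) {
        Map<String, String> options = new LinkedHashMap<>();
        for (int i = 0; i < args.length; i += 2) {
            if (!args[i].startsWith("-") || i + 1 >= args.length) {
                throw new IllegalArgumentException("expected '-option value' at: " + args[i]);
            }
            options.put(args[i].substring(1), args[i + 1]);
        }
        return options;
    }

    // Splits a module spec such as "1&2&3&4" into IDs, accepting only 1..4.
    static List<Integer> parseModules(String spec) {
        List<Integer> ids = new ArrayList<>();
        for (String part : spec.split("&")) {
            int id = Integer.parseInt(part.trim());
            if (id < 1 || id > 4) {
                throw new IllegalArgumentException("unknown module ID: " + id);
            }
            ids.add(id);
        }
        return ids;
    }

    public static void main(String[] args) {
        Map<String, String> options = parse(args);
        // A missing option is treated here as "run all", mirroring the no-option default.
        System.out.println("input   = " + options.getOrDefault("input", "(all)"));
        System.out.println("metric  = " + options.getOrDefault("metric", "(all)"));
        if (options.containsKey("modules")) {
            System.out.println("modules = " + parseModules(options.get("modules")));
        }
    }
}
```

The sketch rejects unknown module IDs early so that a typo fails before any queries are parsed; whether the real tool does the same is not stated here.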

The results of running *SQLComparison.jar* are saved to the directory *SQLFeatureExtraction/data*. The result file naming format is: [input]+"_"+[metric]+".csv". If regularization is applied, the naming format is: [input]+"_"+[metric]+"_regularization_"+[module IDs]+".csv". If all regularization modules are applied, the naming format is simplified to: [input]+"_"+[metric]+"_regularization.csv".
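The naming rules can be sketched as a small function. This is an illustration only: the class `ResultFileNameSketch` is hypothetical, and because the text does not say how multiple module IDs are joined in the file name, the sketch concatenates them with no separator, which is an assumption.

```java
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical illustration of the result file naming rules.
final class ResultFileNameSketch {
    static final List<Integer> ALL_MODULES = List.of(1, 2, 3, 4);

    static String resultFileName(String input, String metric, List<Integer> modules) {
        String base = input + "_" + metric;
        if (modules.isEmpty()) {
            // No regularization applied.
            return base + ".csv";
        }
        if (modules.containsAll(ALL_MODULES)) {
            // All modules applied: the simplified form without IDs.
            return base + "_regularization.csv";
        }
        // Assumption: IDs are concatenated directly, e.g. modules 1 and 3 give "13".
        String ids = modules.stream().map(String::valueOf).collect(Collectors.joining());
        return base + "_regularization_" + ids + ".csv";
    }

    public static void main(String[] args) {
        System.out.println(resultFileName("ub", "aligon", List.of()));
        System.out.println(resultFileName("ub", "aligon", List.of(1, 3)));
        System.out.println(resultFileName("ub", "aligon", List.of(1, 2, 3, 4)));
    }
}
```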






14 changes: 7 additions & 7 deletions SQLFeatureExtraction/src/main/java/Main.java
@@ -25,7 +25,7 @@ public static void main(String[] args) {
File f=new File(current);
String parent=f.getParent();
datapath=parent+"/data/";
-System.out.println(datapath);
+System.out.println("data path : "+datapath);
} catch (IOException e) {
e.printStackTrace();
}
@@ -79,7 +79,7 @@ else if (name.contains("makiyama"))
ArrayList<ArrayList<String> > queryLists=new ArrayList<ArrayList<String>> ();
System.out.println("---begin query retrieval------");
System.out.println();
-long start=System.nanoTime();
+//long start=System.nanoTime();

for(int i=0;i<data.length;i++){
System.out.println("data set using is "+data[i]);
@@ -95,7 +95,7 @@ else if (name.contains("makiyama"))
queryLists.add(queryList);
System.out.println();
}
-long end=System.nanoTime();
+//long end=System.nanoTime();
System.out.println("query parsing finished.");
System.out.println();

@@ -150,26 +150,26 @@ public static void queryComparison(String data,String method,ArrayList<String> q
//do a pass of regularization
if(regu){
System.out.println("begin regularization");
-long start=System.nanoTime();
+//long start=System.nanoTime();
for(int k=0;k<statementList1.size();k++){
Select s=(Select) statementList1.get(k);
statementList.add(CombinedRegularizer.regularize(s,modules));
}
-long end=System.nanoTime();
+//long end=System.nanoTime();
System.out.println("regularization ended.");
System.out.println();
}

System.out.println("begin query comparison and clustering. ");
-long start=System.nanoTime();
+//long start=System.nanoTime();
// method name can be either aouiche, makiyama or aligon
double[][] matrix;
if(regu)
matrix = Utility.createDistanceMatrix(method, statementList);
else
matrix = Utility.createDistanceMatrix(method, statementList1);

-long end=System.nanoTime();
+//long end=System.nanoTime();
// write distance matrix to file
String outputpath;
if(!regu)