
Skyhook cephmods #72

Open
wants to merge 11 commits into
base: skyhook-luminous

KDahlgren
Contributor

Edited the copy-from code to call the transform function on the source object and append the results to the target object. The transform is currently hard-coded to convert FBX to Arrow.
The source schema is currently hard-coded to lineitem.
Project-cols is currently hard-coded to orderkey.
Removing the hard-coded values will require parameterizing do_copy_get in PrimaryLogPG.cc.

@KDahlgren KDahlgren requested a review from jlefevre September 21, 2019 02:43
Contributor Author

KDahlgren commented Sep 21, 2019

Tested the build against the FBX, ARROW, FBU_Rows, and FBU_Cols queries in vcluster. Everything works fine.
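A minimal sketch of how the result of each test command could be recorded rather than checked by eye. The `run_checked` helper and the `failures` counter are hypothetical names, not part of this PR. The sketch assumes each command exits non-zero on failure, which is an assumption about `bin/run-query` and the other tools, not something confirmed here.

```shell
# Hypothetical helper, not part of this PR: run a command and count failures.
failures=0

run_checked() {
    rc=0
    "$@" || rc=$?          # capture the exit status without aborting under `set -e`
    if [ "$rc" -eq 0 ]; then
        echo "PASS: $*"
    else
        echo "FAIL (exit $rc): $*"
        failures=$((failures + 1))
    fi
    return 0
}

# Example use (commented out so the block is safe to source without a cluster):
# run_checked bin/run-query --num-objs 2 --pool tpchdata --wthreads 1 --qdepth 10 --query flatbuf --table-name "lineitem" --quiet
# [ "$failures" -eq 0 ] || echo "$failures command(s) failed"
```

Call `run_checked` directly rather than inside `$(...)`, since a command substitution runs in a subshell and the updated `failures` count would be lost.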

Here are the tests I used:

# PREP FOR SKYHOOK-CEPHMODS BRANCH PR

# --------------------------------------------------------------------------- #
# COPY FROM TRANSFORM APPEND TESTS
../src/stop.sh; make -j12 vstart; ../src/stop.sh; ../src/vstart.sh -d -n -x; bin/rados mkpool tpchflatbuf ; bin/ceph osd pool set tpchflatbuf size 1 ; bin/rados mkpool tpchdata ; bin/ceph osd pool set tpchdata size 1 ; bin/rados mkpool paper_exps ; bin/ceph osd pool set paper_exps size 1 ;

bin/transform-copyfrom-merge --pool paper_exps ;

# --------------------------------------------------------------------------- #
# FBX TESTS
../src/stop.sh; make -j12 vstart; ../src/stop.sh; ../src/vstart.sh -d -n -x; bin/rados mkpool tpchflatbuf ; bin/ceph osd pool set tpchflatbuf size 1 ; bin/rados mkpool tpchdata ; bin/ceph osd pool set tpchdata size 1 ; bin/rados mkpool paper_exps ; bin/ceph osd pool set paper_exps size 1 ;

OBJ_TYPE=SFT_FLATBUF_FLEX_ROW;  # choose one of {SFT_FLATBUF_FLEX_ROW, SFT_ARROW, SFT_JSON}
OBJ_BASE_NAME=skyhook.${OBJ_TYPE}.lineitem;
for i in {0..1}; do 
    rm -rf ${OBJ_BASE_NAME}.$i;   # remove the old test objects
    wget https://users.soe.ucsc.edu/~jlefevre/skyhookdb/testdata/${OBJ_BASE_NAME}.$i;
done;
yes | PATH=$PATH:bin ../src/progly/rados-store-glob.sh tpchdata ${OBJ_BASE_NAME}.*;

bin/run-query --num-objs 2 --pool tpchdata --wthreads 1 --qdepth 10 --query flatbuf --table-name "lineitem" --quiet;
bin/run-query --num-objs 2 --pool tpchdata --wthreads 1 --qdepth 10 --query flatbuf --table-name "lineitem"  --select-preds "orderkey,lt,3"  --project-cols "orderkey,tax,comment,linenumber,returnflag" --quiet;
bin/run-query --num-objs 2 --pool tpchdata --wthreads 1 --qdepth 10 --query flatbuf --table-name "lineitem"  --select-preds "orderkey,geq,3"  --project-cols "orderkey,tax,comment,linenumber,returnflag" --quiet ;

bin/run-query --num-objs 2 --pool tpchdata --wthreads 1 --qdepth 10 --query flatbuf --table-name "lineitem" --quiet --use-cls;
bin/run-query --num-objs 2 --pool tpchdata --wthreads 1 --qdepth 10 --query flatbuf --table-name "lineitem"  --select-preds "orderkey,lt,3"  --project-cols "orderkey,tax,comment,linenumber,returnflag" --quiet --use-cls;
bin/run-query --num-objs 2 --pool tpchdata --wthreads 1 --qdepth 10 --query flatbuf --table-name "lineitem"  --select-preds "orderkey,geq,3"  --project-cols "orderkey,tax,comment,linenumber,returnflag" --quiet --use-cls;
 
# --------------------------------------------------------------------------- #
# ARROW TESTS
../src/stop.sh; make -j12 vstart; ../src/stop.sh; ../src/vstart.sh -d -n -x; bin/rados mkpool tpchflatbuf ; bin/ceph osd pool set tpchflatbuf size 1 ; bin/rados mkpool tpchdata ; bin/ceph osd pool set tpchdata size 1 ; bin/rados mkpool paper_exps ; bin/ceph osd pool set paper_exps size 1 ;

OBJ_TYPE=SFT_ARROW
OBJ_BASE_NAME=skyhook.${OBJ_TYPE}.lineitem;
for i in {0..1}; do 
    rm -rf ${OBJ_BASE_NAME}.$i;   # remove the old test objects
    wget https://users.soe.ucsc.edu/~jlefevre/skyhookdb/testdata/${OBJ_BASE_NAME}.$i;
done;
yes | PATH=$PATH:bin ../src/progly/rados-store-glob.sh tpchdata ${OBJ_BASE_NAME}.*;

bin/run-query --num-objs 2 --pool tpchdata --wthreads 1 --qdepth 10 --query flatbuf --table-name "lineitem" --quiet;
bin/run-query --num-objs 2 --pool tpchdata --wthreads 1 --qdepth 10 --query flatbuf --table-name "lineitem"  --select-preds "orderkey,lt,3"  --project-cols "orderkey,tax,comment,linenumber,returnflag" --quiet;
bin/run-query --num-objs 2 --pool tpchdata --wthreads 1 --qdepth 10 --query flatbuf --table-name "lineitem"  --select-preds "orderkey,geq,3"  --project-cols "orderkey,tax,comment,linenumber,returnflag" --quiet ;

bin/run-query --num-objs 2 --pool tpchdata --wthreads 1 --qdepth 10 --query flatbuf --table-name "lineitem" --quiet --use-cls;
bin/run-query --num-objs 2 --pool tpchdata --wthreads 1 --qdepth 10 --query flatbuf --table-name "lineitem"  --select-preds "orderkey,lt,3"  --project-cols "orderkey,tax,comment,linenumber,returnflag" --quiet --use-cls;
bin/run-query --num-objs 2 --pool tpchdata --wthreads 1 --qdepth 10 --query flatbuf --table-name "lineitem"  --select-preds "orderkey,geq,3"  --project-cols "orderkey,tax,comment,linenumber,returnflag" --quiet --use-cls;

# --------------------------------------------------------------------------- #
# FBU_ROWS TESTS
../src/stop.sh; make -j12 vstart; ../src/stop.sh; ../src/vstart.sh -d -n -x; bin/rados mkpool tpchflatbuf ; bin/ceph osd pool set tpchflatbuf size 1 ;

bin/fbwriter_fbu --filename testdata.txt --write_type rows --debug yes --schema_datatypes int,float,string,int --schema_attnames att0,att1,att2,att3 --table_name atable --nrows 6 --ncols 4 --targetoid obj --targetpool tpchflatbuf --targetformat SFT_FLATBUF_UNION_ROW --writeto ceph --schema_iskey 0,0,0,0 --schema_isnullable 0,0,0,0 --savedir . --rid_start_value 2 --rid_end_value 4 --obj_counter 0 ;

bin/run-query --num-objs 1 --pool tpchflatbuf --wthreads 1 --qdepth 10 --query flatbuf --select "*" ;

bin/run-query --num-objs 1 --pool tpchflatbuf --wthreads 1 --qdepth 10 --query flatbuf --select "att0,lt,25;att1,lt,25.0;"  --project-cols att0,att1,att2,att3 --data-schema "0 7 0 0 ATT0 ; 1 12 0 0 ATT1 ; 2 15 0 0 ATT2 ; 3 7 0 0 ATT3" ;

bin/run-query --num-objs 1 --pool tpchflatbuf --wthreads 1 --qdepth 10 --query flatbuf --select "att0,lt,25;att1,lt,25.0;"  --project-cols att0,att2 --data-schema "0 7 0 0 ATT0 ; 1 12 0 0 ATT1 ; 2 15 0 0 ATT2 ;3 7 0 0 ATT3" ;

bin/run-query --num-objs 1 --pool tpchflatbuf --wthreads 1 --qdepth 10 --query flatbuf --select "att0,lt,40;att3,lt,215;"  --project-cols att0,att1,att2,att3 --data-schema "0 7 0 0 ATT0 ; 1 12 0 0 ATT1 ; 2 15 0 0 ATT2 ; 3 7 0 0 ATT3" ;

bin/run-query --num-objs 1 --pool tpchflatbuf --wthreads 1 --qdepth 10 --query flatbuf --select-preds ";att0,sum,0;" --table-name "atable" --data-schema "0 7 0 0 ATT0 ; 1 12 0 0 ATT1 ; 2 15 0 0 ATT2 ;3 7 0 0 ATT3" ;

# ++++ use-cls ++++ 
bin/run-query --num-objs 1 --pool tpchflatbuf --wthreads 1 --qdepth 10 --query flatbuf --select "*" ;

bin/run-query --num-objs 1 --pool tpchflatbuf --wthreads 1 --qdepth 10 --query flatbuf --select "att0,lt,25;att1,lt,25.0;"  --project-cols att0,att1,att2,att3 --data-schema "0 7 0 0 ATT0 ; 1 12 0 0 ATT1 ; 2 15 0 0 ATT2 ; 3 7 0 0 ATT3" --use-cls ;

bin/run-query --num-objs 1 --pool tpchflatbuf --wthreads 1 --qdepth 10 --query flatbuf --select "att0,lt,25;att1,lt,25.0;"  --project-cols att0,att2 --data-schema "0 7 0 0 ATT0 ; 1 12 0 0 ATT1 ; 2 15 0 0 ATT2 ;3 7 0 0 ATT3" --use-cls ;

bin/run-query --num-objs 1 --pool tpchflatbuf --wthreads 1 --qdepth 10 --query flatbuf --select "att0,lt,40;att3,lt,215;"  --project-cols att0,att1,att2,att3 --data-schema "0 7 0 0 ATT0 ; 1 12 0 0 ATT1 ; 2 15 0 0 ATT2 ; 3 7 0 0 ATT3" --use-cls ;

bin/run-query --num-objs 1 --pool tpchflatbuf --wthreads 1 --qdepth 10 --query flatbuf --select-preds ";att0,sum,0;" --table-name "atable" --data-schema "0 7 0 0 ATT0 ; 1 12 0 0 ATT1 ; 2 15 0 0 ATT2 ;3 7 0 0 ATT3" --use-cls ;

# --------------------------------------------------------------------------- #
# FBU_COLS 1 TESTS
../src/stop.sh; make -j12 vstart; ../src/stop.sh; ../src/vstart.sh -d -n -x; bin/rados mkpool tpchflatbuf ; bin/ceph osd pool set tpchflatbuf size 1 ;

bin/fbwriter_fbu --filename testdata.txt --write_type cols --debug yes --schema_datatypes int,float,string,int --schema_attnames att0,att1,att2,att3 --table_name atable --nrows 6 --ncols 4 --targetoid obj --targetpool tpchflatbuf --targetformat SFT_FLATBUF_UNION_COL --writeto ceph --cols_per_fb 1 --schema_iskey 0,0,0,0 --schema_isnullable 0,0,0,0 --savedir . --rid_start_value 2 --rid_end_value 4 --obj_counter 0 ;

bin/run-query --num-objs 1 --pool tpchflatbuf --wthreads 1 --qdepth 10 --query flatbuf --select "*" ;

bin/run-query --num-objs 1 --pool tpchflatbuf --wthreads 1 --qdepth 10 --query flatbuf --select "att0,lt,25;att1,lt,25.0;"  --project-cols att0,att1,att2,att3 --data-schema "0 7 0 0 ATT0 ; 1 12 0 0 ATT1 ; 2 15 0 0 ATT2 ; 3 7 0 0 ATT3" ;

bin/run-query --num-objs 1 --pool tpchflatbuf --wthreads 1 --qdepth 10 --query flatbuf --select "att0,lt,25;att1,lt,25.0;"  --project-cols att0,att2 --data-schema "0 7 0 0 ATT0 ; 1 12 0 0 ATT1 ; 2 15 0 0 ATT2 ;3 7 0 0 ATT3" ;

bin/run-query --num-objs 1 --pool tpchflatbuf --wthreads 1 --qdepth 10 --query flatbuf --select "att0,lt,40;att3,lt,215;"  --project-cols att0,att1,att2,att3 --data-schema "0 7 0 0 ATT0 ; 1 12 0 0 ATT1 ; 2 15 0 0 ATT2 ; 3 7 0 0 ATT3" ;

bin/run-query --num-objs 1 --pool tpchflatbuf --wthreads 1 --qdepth 10 --query flatbuf --select-preds ";att0,sum,0;" --table-name "atable" --data-schema "0 7 0 0 ATT0 ; 1 12 0 0 ATT1 ; 2 15 0 0 ATT2 ;3 7 0 0 ATT3" ;

# ++++ use-cls ++++ 
bin/run-query --num-objs 1 --pool tpchflatbuf --wthreads 1 --qdepth 10 --query flatbuf --select "*" ;

bin/run-query --num-objs 1 --pool tpchflatbuf --wthreads 1 --qdepth 10 --query flatbuf --select "att0,lt,25;att1,lt,25.0;"  --project-cols att0,att1,att2,att3 --data-schema "0 7 0 0 ATT0 ; 1 12 0 0 ATT1 ; 2 15 0 0 ATT2 ; 3 7 0 0 ATT3" --use-cls ;

bin/run-query --num-objs 1 --pool tpchflatbuf --wthreads 1 --qdepth 10 --query flatbuf --select "att0,lt,25;att1,lt,25.0;"  --project-cols att0,att2 --data-schema "0 7 0 0 ATT0 ; 1 12 0 0 ATT1 ; 2 15 0 0 ATT2 ;3 7 0 0 ATT3" --use-cls ;

bin/run-query --num-objs 1 --pool tpchflatbuf --wthreads 1 --qdepth 10 --query flatbuf --select "att0,lt,40;att3,lt,215;"  --project-cols att0,att1,att2,att3 --data-schema "0 7 0 0 ATT0 ; 1 12 0 0 ATT1 ; 2 15 0 0 ATT2 ; 3 7 0 0 ATT3" --use-cls ;

bin/run-query --num-objs 1 --pool tpchflatbuf --wthreads 1 --qdepth 10 --query flatbuf --select-preds ";att0,sum,0;" --table-name "atable" --data-schema "0 7 0 0 ATT0 ; 1 12 0 0 ATT1 ; 2 15 0 0 ATT2 ;3 7 0 0 ATT3" --use-cls ;

# --------------------------------------------------------------------------- #
# FBU_COLS 2 TESTS
../src/stop.sh; make -j12 vstart; ../src/stop.sh; ../src/vstart.sh -d -n -x; bin/rados mkpool tpchflatbuf ; bin/ceph osd pool set tpchflatbuf size 1 ;

bin/fbwriter_fbu --filename testdata.txt --write_type cols --debug yes --schema_datatypes int,float,string,int --schema_attnames att0,att1,att2,att3 --table_name atable --nrows 6 --ncols 4 --targetoid obj --targetpool tpchflatbuf --targetformat SFT_FLATBUF_UNION_COL --writeto ceph --cols_per_fb 2 --schema_iskey 0,0,0,0 --schema_isnullable 0,0,0,0 --savedir . --rid_start_value 2 --rid_end_value 4 --obj_counter 0 ;

bin/run-query --num-objs 1 --pool tpchflatbuf --wthreads 1 --qdepth 10 --query flatbuf --select "*" ;

bin/run-query --num-objs 1 --pool tpchflatbuf --wthreads 1 --qdepth 10 --query flatbuf --select "att0,lt,25;att1,lt,25.0;"  --project-cols att0,att1,att2,att3 --data-schema "0 7 0 0 ATT0 ; 1 12 0 0 ATT1 ; 2 15 0 0 ATT2 ; 3 7 0 0 ATT3" ;

bin/run-query --num-objs 1 --pool tpchflatbuf --wthreads 1 --qdepth 10 --query flatbuf --select "att0,lt,25;att1,lt,25.0;"  --project-cols att0,att2 --data-schema "0 7 0 0 ATT0 ; 1 12 0 0 ATT1 ; 2 15 0 0 ATT2 ;3 7 0 0 ATT3" ;

bin/run-query --num-objs 1 --pool tpchflatbuf --wthreads 1 --qdepth 10 --query flatbuf --select "att0,lt,40;att3,lt,215;"  --project-cols att0,att1,att2,att3 --data-schema "0 7 0 0 ATT0 ; 1 12 0 0 ATT1 ; 2 15 0 0 ATT2 ; 3 7 0 0 ATT3" ;

bin/run-query --num-objs 1 --pool tpchflatbuf --wthreads 1 --qdepth 10 --query flatbuf --select-preds ";att0,sum,0;" --table-name "atable" --data-schema "0 7 0 0 ATT0 ; 1 12 0 0 ATT1 ; 2 15 0 0 ATT2 ;3 7 0 0 ATT3" ;

# ++++ use-cls ++++ 
bin/run-query --num-objs 1 --pool tpchflatbuf --wthreads 1 --qdepth 10 --query flatbuf --select "*" ;

bin/run-query --num-objs 1 --pool tpchflatbuf --wthreads 1 --qdepth 10 --query flatbuf --select "att0,lt,25;att1,lt,25.0;"  --project-cols att0,att1,att2,att3 --data-schema "0 7 0 0 ATT0 ; 1 12 0 0 ATT1 ; 2 15 0 0 ATT2 ; 3 7 0 0 ATT3" --use-cls ;

bin/run-query --num-objs 1 --pool tpchflatbuf --wthreads 1 --qdepth 10 --query flatbuf --select "att0,lt,25;att1,lt,25.0;"  --project-cols att0,att2 --data-schema "0 7 0 0 ATT0 ; 1 12 0 0 ATT1 ; 2 15 0 0 ATT2 ;3 7 0 0 ATT3" --use-cls ;

bin/run-query --num-objs 1 --pool tpchflatbuf --wthreads 1 --qdepth 10 --query flatbuf --select "att0,lt,40;att3,lt,215;"  --project-cols att0,att1,att2,att3 --data-schema "0 7 0 0 ATT0 ; 1 12 0 0 ATT1 ; 2 15 0 0 ATT2 ; 3 7 0 0 ATT3" --use-cls ;

bin/run-query --num-objs 1 --pool tpchflatbuf --wthreads 1 --qdepth 10 --query flatbuf --select-preds ";att0,sum,0;" --table-name "atable" --data-schema "0 7 0 0 ATT0 ; 1 12 0 0 ATT1 ; 2 15 0 0 ATT2 ;3 7 0 0 ATT3" --use-cls ;

# --------------------------------------------------------------------------- #
# FBU_COLS 4 TESTS
../src/stop.sh; make -j12 vstart; ../src/stop.sh; ../src/vstart.sh -d -n -x; bin/rados mkpool tpchflatbuf ; bin/ceph osd pool set tpchflatbuf size 1 ;

bin/fbwriter_fbu --filename testdata.txt --write_type cols --debug yes --schema_datatypes int,float,string,int --schema_attnames att0,att1,att2,att3 --table_name atable --nrows 6 --ncols 4 --targetoid obj --targetpool tpchflatbuf --targetformat SFT_FLATBUF_UNION_COL --writeto ceph --cols_per_fb 4 --schema_iskey 0,0,0,0 --schema_isnullable 0,0,0,0 --savedir . --rid_start_value 1 --rid_end_value 6 --obj_counter 0 ;

bin/run-query --num-objs 1 --pool tpchflatbuf --wthreads 1 --qdepth 10 --query flatbuf --select "*" ;

bin/run-query --num-objs 1 --pool tpchflatbuf --wthreads 1 --qdepth 10 --query flatbuf --select "att0,lt,25;att1,lt,25.0;"  --project-cols att0,att1,att2,att3 --data-schema "0 7 0 0 ATT0 ; 1 12 0 0 ATT1 ; 2 15 0 0 ATT2 ; 3 7 0 0 ATT3" ;

bin/run-query --num-objs 1 --pool tpchflatbuf --wthreads 1 --qdepth 10 --query flatbuf --select "att0,lt,25;att1,lt,25.0;"  --project-cols att0,att2 --data-schema "0 7 0 0 ATT0 ; 1 12 0 0 ATT1 ; 2 15 0 0 ATT2 ;3 7 0 0 ATT3" ;

bin/run-query --num-objs 1 --pool tpchflatbuf --wthreads 1 --qdepth 10 --query flatbuf --select "att0,lt,40;att3,lt,215;"  --project-cols att0,att1,att2,att3 --data-schema "0 7 0 0 ATT0 ; 1 12 0 0 ATT1 ; 2 15 0 0 ATT2 ; 3 7 0 0 ATT3" ;

bin/run-query --num-objs 1 --pool tpchflatbuf --wthreads 1 --qdepth 10 --query flatbuf --select-preds ";att0,sum,0;" --table-name "atable" --data-schema "0 7 0 0 ATT0 ; 1 12 0 0 ATT1 ; 2 15 0 0 ATT2 ;3 7 0 0 ATT3" ;

# ++++ use-cls ++++ 
bin/run-query --num-objs 1 --pool tpchflatbuf --wthreads 1 --qdepth 10 --query flatbuf --select "*" ;

bin/run-query --num-objs 1 --pool tpchflatbuf --wthreads 1 --qdepth 10 --query flatbuf --select "att0,lt,25;att1,lt,25.0;"  --project-cols att0,att1,att2,att3 --data-schema "0 7 0 0 ATT0 ; 1 12 0 0 ATT1 ; 2 15 0 0 ATT2 ; 3 7 0 0 ATT3" --use-cls ;

bin/run-query --num-objs 1 --pool tpchflatbuf --wthreads 1 --qdepth 10 --query flatbuf --select "att0,lt,25;att1,lt,25.0;"  --project-cols att0,att2 --data-schema "0 7 0 0 ATT0 ; 1 12 0 0 ATT1 ; 2 15 0 0 ATT2 ;3 7 0 0 ATT3" --use-cls ;

bin/run-query --num-objs 1 --pool tpchflatbuf --wthreads 1 --qdepth 10 --query flatbuf --select "att0,lt,40;att3,lt,215;"  --project-cols att0,att1,att2,att3 --data-schema "0 7 0 0 ATT0 ; 1 12 0 0 ATT1 ; 2 15 0 0 ATT2 ; 3 7 0 0 ATT3" --use-cls ;

bin/run-query --num-objs 1 --pool tpchflatbuf --wthreads 1 --qdepth 10 --query flatbuf --select-preds ";att0,sum,0;" --table-name "atable" --data-schema "0 7 0 0 ATT0 ; 1 12 0 0 ATT1 ; 2 15 0 0 ATT2 ;3 7 0 0 ATT3" --use-cls ;
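The script above repeats the same pool setup in every section and lists each query twice, once without and once with `--use-cls`. The following is a rough sketch of how that repetition could be factored out. The helper names `run_cmd`, `make_pools`, and `query_both_paths`, and the `DRY_RUN` and `RUN_QUERY` variables, are illustrative assumptions and do not exist in the PR; only the `bin/rados`, `bin/ceph`, and `bin/run-query` invocations are taken from the script.

```shell
# Hypothetical helpers (names are illustrative, not from the PR).
# RUN_QUERY may be overridden; DRY_RUN=1 prints commands instead of running them.
RUN_QUERY=${RUN_QUERY:-bin/run-query}

run_cmd() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        # Display only: argument quoting is not preserved in the printed line.
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# Create each named pool and set its replication size to 1,
# mirroring the mkpool / "osd pool set ... size 1" pairs in the script.
make_pools() {
    for pool in "$@"; do
        run_cmd bin/rados mkpool "$pool" &&
        run_cmd bin/ceph osd pool set "$pool" size 1 || return 1
    done
}

# Run one query client-side, then again with --use-cls,
# so both paths receive exactly the same arguments.
query_both_paths() {
    run_cmd "$RUN_QUERY" "$@" &&
    run_cmd "$RUN_QUERY" "$@" --use-cls
}

# Example use (commented out so the block is safe to source without a cluster):
# make_pools tpchflatbuf tpchdata paper_exps
# query_both_paths --num-objs 1 --pool tpchflatbuf --wthreads 1 --qdepth 10 --query flatbuf --select "*"
```

Passing the arguments through a single function keeps the two variants from drifting apart, for example a `--use-cls` section whose first query is missing the flag.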
