Make -q :: faster by calculating interactions on the fly instead of pre calculating them #2807

Merged: 57 commits, Feb 25, 2021
2f1a208
make -q :: quicker
olgavrou Feb 8, 2021
799287b
don't add constant everywhere
olgavrou Feb 8, 2021
0440cd0
don't add namespace when adding constant feature
olgavrou Feb 8, 2021
67da6be
some cleanup
olgavrou Feb 8, 2021
24762e1
make it work for ccb
olgavrou Feb 10, 2021
4877f12
working for ccb and cb
olgavrou Feb 10, 2021
60a5dcd
simplify
olgavrou Feb 10, 2021
edadd10
working for reduction tests
olgavrou Feb 11, 2021
0cd3336
cleanup
olgavrou Feb 11, 2021
1225655
make it work for flatbuffers too
olgavrou Feb 11, 2021
b62aa63
cleanup
olgavrou Feb 11, 2021
98dcc89
cleanup
olgavrou Feb 11, 2021
c1e2420
cleanup and fixup
olgavrou Feb 11, 2021
648b5a5
add warning
olgavrou Feb 11, 2021
0830bcb
enable test
olgavrou Feb 11, 2021
fb12612
add another test
olgavrou Feb 12, 2021
2ff5fc0
fix
olgavrou Feb 12, 2021
f1935dd
cleanup
olgavrou Feb 12, 2021
7a0ede4
enable unit test
olgavrou Feb 12, 2021
a01303d
remove temp test
olgavrou Feb 12, 2021
040f962
merge master into branch
olgavrou Feb 12, 2021
78a89a6
cleanup
olgavrou Feb 12, 2021
96b3b69
super simplify and remove some un-needed code
olgavrou Feb 13, 2021
a0bfe9e
fix formatting
olgavrou Feb 15, 2021
5d67c9f
fix slim
olgavrou Feb 15, 2021
cbbf449
interactions need to be sortedexpand_interactions
olgavrou Feb 15, 2021
40516df
rename
olgavrou Feb 15, 2021
4f106b7
better sorting
olgavrou Feb 16, 2021
edaaa47
add lock
olgavrou Feb 16, 2021
83be7ac
do not include extended ascii
olgavrou Feb 16, 2021
b043947
add another test
olgavrou Feb 16, 2021
62b235b
fix formatting and add comments
olgavrou Feb 16, 2021
eba64ed
fix windows build
olgavrou Feb 16, 2021
449c2bb
cleanup
olgavrou Feb 16, 2021
6dea35b
fix typo and pr comments on interaction expansion
olgavrou Feb 17, 2021
5aa1a22
set example interactions in setup_example
olgavrou Feb 17, 2021
f422951
Merge branch 'master' into quadratics
olgavrou Feb 18, 2021
57e35bc
merge origin master into 'quadratics' branch
olgavrou Feb 23, 2021
3bb9c24
Merge branch 'quadratics' of github.com:olgavrou/vowpal_wabbit into q…
olgavrou Feb 23, 2021
edb84be
typos and better naming
olgavrou Feb 23, 2021
8ecf998
run_tests.py open file with utf8 encoding
olgavrou Feb 23, 2021
334e94d
run_tests.py open file with utf8 encoding
olgavrou Feb 23, 2021
df1af13
fix bug in sort for interactions
olgavrou Feb 23, 2021
e88af2a
fix formatting and add tests
olgavrou Feb 23, 2021
c2c94a5
fix formatting
olgavrou Feb 23, 2021
12707f8
make smaller tests so that they don't time out
olgavrou Feb 23, 2021
2afdfae
Merge branch 'master' into quadratics
olgavrou Feb 23, 2021
32bcd85
make user provided order of interactions not affect the interaction o…
olgavrou Feb 25, 2021
3dd98ce
Merge branch 'master' into quadratics
olgavrou Feb 25, 2021
29c862f
increase run_test timeout because tests in macos ci are timeing out
olgavrou Feb 25, 2021
cae790e
check macos ci w bigger timeout for pytests
olgavrou Feb 25, 2021
0e0c729
new tests to use smaller dataset
olgavrou Feb 25, 2021
a47b0ae
test without run_test.py
olgavrou Feb 25, 2021
8722914
restore tests
olgavrou Feb 25, 2021
122210f
test wout sorting all interactions
olgavrou Feb 25, 2021
e3f3f12
stable sort
olgavrou Feb 25, 2021
7e121e6
correct comparator
olgavrou Feb 25, 2021
cleanup and fixup
olgavrou committed Feb 11, 2021
commit c1e2420a6057fedef1a47eaca241d96f64d9a7c2
18 changes: 15 additions & 3 deletions vowpalwabbit/conditional_contextual_bandit.cc
@@ -42,6 +42,7 @@ struct ccb
  std::vector<uint32_t> origin_index;
  CB::cb_class cb_label, default_cb_label;
  std::vector<bool> exclude_list, include_list;
+ namsepace_interactions generated_interactions;
  namsepace_interactions* original_interactions;
  std::vector<CCB::label> stored_labels;
  size_t action_with_label;
@@ -405,9 +406,20 @@ void learn_or_predict(ccb& data, multi_learner& base, multi_ex& examples)
  if (should_augment_with_slot_info)
  {
  // Namespace crossing for slot features.
- calculate_and_insert_interactions(data.shared, data.actions, *data.original_interactions);
- data.shared->interactions = data.original_interactions;
- for (auto* ex : data.actions) { ex->interactions = data.original_interactions; }
+ data.generated_interactions.interactions.clear();
+ std::copy(data.original_interactions->interactions.begin(), data.original_interactions->interactions.end(),
+ std::back_inserter(data.generated_interactions.interactions));
+ data.generated_interactions.quadraditcs_wildcard_expansion =
+ data.original_interactions->quadraditcs_wildcard_expansion;
+ data.generated_interactions.active_interactions = data.original_interactions->active_interactions;
+ data.generated_interactions.all_example_namespaces = data.original_interactions->all_example_namespaces;
+ data.generated_interactions.extra_interactions = data.original_interactions->extra_interactions;
+ data.generated_interactions.leave_duplicate_interactions =
+ data.original_interactions->leave_duplicate_interactions;
+
+ calculate_and_insert_interactions(data.shared, data.actions, data.generated_interactions);
+ data.shared->interactions = &data.generated_interactions;
+ for (auto* ex : data.actions) { ex->interactions = &data.generated_interactions; }
  }

  data.include_list.clear();
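
In the hunk above, `calculate_and_insert_interactions` is pointed at `data.generated_interactions`, a copy of the original settings, rather than at `*data.original_interactions`. A minimal sketch of that copy-then-extend pattern follows. `interaction_settings`, `ns_index`, and `with_slot_interaction` are hypothetical stand-ins rather than VW names, the struct is reduced to a few fields, and using a plain copy in place of the field-by-field assignment is an assumption about the real struct.

```cpp
#include <cassert>
#include <set>
#include <unordered_set>
#include <vector>

// Hypothetical, reduced stand-in for the interaction settings in the diff.
// The namespace index is assumed to be a single byte.
using ns_index = unsigned char;

struct interaction_settings
{
  std::set<ns_index> all_example_namespaces;
  std::vector<std::vector<ns_index>> interactions;
  bool wildcard_expansion = false;
  bool leave_duplicate_interactions = false;
  std::unordered_set<ns_index> extra_interactions;
};

// Build a working copy and append one slot-specific interaction to it.
// `original` is taken by const reference, so the shared settings cannot change.
inline interaction_settings with_slot_interaction(
    const interaction_settings& original, ns_index slot_ns, ns_index action_ns)
{
  interaction_settings generated = original;  // member-wise copy of every field
  generated.interactions.push_back({slot_ns, action_ns});
  generated.extra_interactions.insert(slot_ns);
  // Postcondition: exactly one interaction was added to the copy.
  assert(generated.interactions.size() == original.interactions.size() + 1);
  return generated;
}
```

Examples would then point at the returned copy for the duration of the call, leaving the shared settings as the user supplied them.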
2 changes: 1 addition & 1 deletion vowpalwabbit/example_predict.cc
@@ -51,7 +51,7 @@ void example_predict::set_namespace(const namespace_index& ns, bool interact)
  indices.push_back(ns);
  // keep active namespaces if we are doing wildcard expansion for interactions
  // skip if interact is false, for example if constant_feature
- if (interact && (interactions != nullptr) && interactions->wild_card_expansion)
+ if (interact && (interactions != nullptr) && interactions->quadraditcs_wildcard_expansion)
  { interactions->all_example_namespaces.insert(ns); }
  }
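
`set_namespace` records a namespace in `all_example_namespaces` only when interaction is requested, an interaction settings object is attached, and the wildcard flag is on. A reduced sketch of that guard, with hypothetical types (`example_sketch`, `interaction_settings`, `ns_index`) standing in for the VW ones:

```cpp
#include <cassert>
#include <set>
#include <vector>

using ns_index = unsigned char;  // assumed single-byte namespace index

// Hypothetical, reduced settings object shared between examples.
struct interaction_settings
{
  std::set<ns_index> all_example_namespaces;
  bool wildcard_expansion = false;
};

// Hypothetical, reduced example type.
struct example_sketch
{
  std::vector<ns_index> indices;
  interaction_settings* interactions = nullptr;

  // Always record the namespace on the example itself; record it in the
  // shared set only if it should take part in wildcard expansion.
  void set_namespace(ns_index ns, bool interact = true)
  {
    indices.push_back(ns);
    if (interact && interactions != nullptr && interactions->wildcard_expansion)
    { interactions->all_example_namespaces.insert(ns); }
  }
};
```

Passing `interact = false` corresponds to the constant-feature case mentioned in the comment: the namespace is still added to the example but does not take part in the expansion.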

2 changes: 1 addition & 1 deletion vowpalwabbit/example_predict.h
@@ -21,7 +21,7 @@ struct namsepace_interactions
  std::set<std::vector<namespace_index>> active_interactions; // TODO maybe remove
  std::set<namespace_index> all_example_namespaces; // TODO maybe ordered vector
  std::vector<std::vector<namespace_index>> interactions;
- bool wild_card_expansion = false;
+ bool quadraditcs_wildcard_expansion = false;
  bool leave_duplicate_interactions = false;
  std::unordered_set<namespace_index> extra_interactions; // e.g. ccb_id_namespace from conditional_contextual_bandits
  };
1 change: 0 additions & 1 deletion vowpalwabbit/interactions.h
@@ -60,7 +60,6 @@ inline void generate_interactions(vw& all, example_predict& ec, R& dat)
  template <class R, class S, void (*T)(R&, float, S)>
  inline void generate_interactions(vw& all, example_predict& ec, R& dat)
  {
- std::cout << "HERE" << std::endl;
  if (all.weights.sparse)
  generate_interactions<R, S, T, sparse_parameters>(
  all.interactions, all.permutations, ec, dat, all.weights.sparse_weights);
5 changes: 2 additions & 3 deletions vowpalwabbit/interactions_predict.h
@@ -92,7 +92,7 @@ inline void inner_kernel(R& dat, features::iterator_all& begin, features::iterat
  }
  }

- inline void expand_wildcard_interactions(namsepace_interactions& interactions, example_predict& ec)
+ inline void expand_quadratics_wildcard_interactions(namsepace_interactions& interactions)
  {
  auto set_interactions = interactions.all_example_namespaces;
  std::vector<std::vector<namespace_index>> active_interactions;
@@ -172,8 +172,7 @@ inline void generate_interactions(namsepace_interactions& interactions, bool per
  empty_ns_data.loop_end = 0;
  empty_ns_data.self_interaction = false;

- // loop throw the set of possible interactions
- if (interactions.wild_card_expansion) { expand_wildcard_interactions(interactions, ec); }
+ if (interactions.quadraditcs_wildcard_expansion) { expand_quadratics_wildcard_interactions(interactions); }

  for (auto& ns : interactions.interactions)
  { // current list of namespaces to interact.
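
After this change `expand_quadratics_wildcard_interactions` takes only the interaction settings and starts from `all_example_namespaces`; the rest of its body is not shown in this hunk. The following is a rough sketch of what expanding `::` over the seen namespaces could look like, assuming each unordered pair (self-pairs included) should appear exactly once. `expand_quadratic_pairs` and `ns_index` are hypothetical names, not VW's.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <vector>

using ns_index = unsigned char;  // assumed single-byte namespace index

// Enumerate every unordered pair of seen namespaces once, in sorted order.
inline std::vector<std::vector<ns_index>> expand_quadratic_pairs(const std::set<ns_index>& seen)
{
  std::vector<std::vector<ns_index>> pairs;
  for (auto first = seen.begin(); first != seen.end(); ++first)
  {
    // Start the inner loop at `first` so (a, b) is emitted but (b, a) is not.
    for (auto second = first; second != seen.end(); ++second) { pairs.push_back({*first, *second}); }
  }
  // n namespaces give n * (n + 1) / 2 pairs when self-pairs are included.
  const std::size_t n = seen.size();
  assert(pairs.size() == n * (n + 1) / 2);
  return pairs;
}
```

Because `std::set` iterates in sorted order, the result does not depend on the order in which namespaces were first seen.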
1 change: 0 additions & 1 deletion vowpalwabbit/mwt.cc
@@ -76,7 +76,6 @@ void value_policy(mwt& c, float val, uint64_t index) // estimate the value of a
  template <bool learn, bool exclude, bool is_learn>
  void predict_or_learn(mwt& c, single_learner& base, example& ec)
  {
- // TODO is this OK?
  c.observation = get_observed_cost(ec.l.cb);

  if (c.observation != nullptr)
7 changes: 6 additions & 1 deletion vowpalwabbit/parse_args.cc
@@ -761,7 +761,12 @@ void parse_feature_tweaks(

  if (options.was_supplied("leave_duplicate_interactions")) { all.interactions.leave_duplicate_interactions = true; }

- if (new_quadratics[0][0] == ':' && new_quadratics[0][1] == ':') { all.interactions.wild_card_expansion = true; }
+ if (new_quadratics[0][0] == ':' && new_quadratics[0][1] == ':')
+ {
+ all.interactions.quadraditcs_wildcard_expansion = true;
+ all.trace_message << "WARNING: any duplicate namespace interactions will be removed" << endl
+ << "You can use --leave_duplicate_interactions to disable this behaviour." << endl;
+ }
  else
  {
  expanded_interactions =
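
The new branch treats a leading `::` in the first quadratic argument as the wildcard and warns that duplicate interactions will be removed. The check can be sketched in isolation as below. `is_quadratic_wildcard` is a hypothetical helper, not part of VW, and the emptiness and length guards are additions for the sketch, since the diff does not show whether `new_quadratics` is validated before it is indexed.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Returns true when the first quadratic argument starts with "::".
// Only the first argument is inspected, mirroring the condition in the diff.
inline bool is_quadratic_wildcard(const std::vector<std::string>& quadratics)
{
  if (quadratics.empty()) { return false; }
  const std::string& first = quadratics.front();
  return first.size() >= 2 && first[0] == ':' && first[1] == ':';
}
```

With this helper, a list whose first entry is `::` selects the wildcard branch, while an empty list or a first entry such as `ab` or `:` falls through to the explicit-expansion branch.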
7 changes: 3 additions & 4 deletions vowpalwabbit/parser.cc
@@ -625,6 +625,7 @@ example& get_unused_example(vw* all)
  parser* p = all->example_parser;
  auto ex = p->example_pool.get_object();
  p->begin_parsed_examples++;
+ // Set the interactions for this example to the global set.
  ex->interactions = &all->interactions;
  VW_WARNING_STATE_PUSH
  VW_WARNING_DISABLE_DEPRECATED_USAGE
@@ -700,10 +701,8 @@ void setup_example(vw& all, example* ae)
  ae->total_sum_feat_sq += fs.sum_feat_sq;
  }

- if (all.interactions.wild_card_expansion) { INTERACTIONS::expand_wildcard_interactions(all.interactions, *ae); }
-
- // Set the interactions for this example to the global set.
- ae->interactions = &all.interactions;
+ if (all.interactions.quadraditcs_wildcard_expansion)
+ { INTERACTIONS::expand_quadratics_wildcard_interactions(all.interactions); }

  size_t new_features_cnt;
  float new_features_sum_feat_sq;