WIP: User preference prediction openday #53

Open
wants to merge 42 commits into
base: master
from

Conversation

Baschdl
Member

@Baschdl Baschdl commented Mar 13, 2018

@GittiHab What do we have to do to make this mergeable into our master?

@GittiHab
Member

> What do we have to do to make this mergeable into our master?

@Baschdl what do you mean?

@@ -112,3 +114,44 @@ def _extract_data_labels(self, metapath_graph: MetaPathRatingGraph) -> (List[Tup
metapath_labels.append(LARGER) # >

return metapath_pairs, metapath_labels

def _test_score(self, x_test, y_test):
    print('Test accuracy is {}'.format(self.classifier.score(X=self._preprocess(x_test), y=y_test)))
Member Author

This doesn't always yield the accuracy, as we saw with the RandomForestRegressor (a regressor's score() returns R², not accuracy).

Member

You mean that this depends on the predictor?

Member Author

Yes, either we don't call self.classifier.score and instead predict the labels with our classifier and evaluate them with scikit-learn's general accuracy_score() function, or we print "Test score is...", but that's more or less useless if you don't know which score this is.
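
The first option in this comment could look roughly like the sketch below. It computes the fraction of correctly predicted labels from `predict()` output, so the printed number is an accuracy no matter what the predictor's own `score()` returns. This is the same quantity that `sklearn.metrics.accuracy_score` computes with its default settings; the sketch avoids the import only to stay self-contained. The names `explicit_accuracy` and `ThresholdPredictor` are hypothetical and not part of this PR, and the preprocessing step from `_test_score` is left out.

```python
from typing import Any, Sequence


def explicit_accuracy(predictor: Any, x_test: Sequence, y_test: Sequence) -> float:
    """Return the fraction of test samples whose predicted label equals the true label."""
    y_pred = list(predictor.predict(x_test))
    y_true = list(y_test)
    if len(y_pred) != len(y_true):
        raise ValueError("predictions and labels differ in length")
    if not y_true:
        raise ValueError("cannot compute accuracy on an empty test set")
    correct = sum(1 for p, t in zip(y_pred, y_true) if p == t)
    return correct / len(y_true)


class ThresholdPredictor:
    """Hypothetical stand-in for self.classifier: predicts 1 for x >= 0, else 0."""

    def predict(self, xs):
        return [1 if x >= 0 else 0 for x in xs]


# Toy data (assumption): three of the four predictions match the labels.
accuracy = explicit_accuracy(ThresholdPredictor(), [-2, -1, 1, 3], [0, 1, 1, 1])
print('Test accuracy is {}'.format(accuracy))
```

With a regressor, the predictions would first have to be mapped to discrete labels before an exact-match comparison like this makes sense.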

@Baschdl
Member Author

Baschdl commented Mar 13, 2018

> @Baschdl what do you mean?

Like clean up things etc.
One thing is that I renamed the dataset so I could easily import it in the notebook; we have to fix this. Either we remove the space in all dataset filenames or we escape it in the notebook. But spaces in filenames are ugly anyway.
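
A minimal sketch of the "remove the space in all dataset filenames" option, assuming the datasets sit in a single flat directory. The function name, the underscore replacement, and the example filenames are assumptions for illustration; the real dataset paths are not shown in this thread.

```python
import tempfile
from pathlib import Path


def remove_spaces_from_filenames(directory: Path, replacement: str = "_") -> list:
    """Rename every file in `directory` whose name contains a space.

    Returns the new names of the files that were renamed.
    """
    renamed = []
    # sorted() materializes the listing before any file is renamed.
    for path in sorted(directory.iterdir()):
        if path.is_file() and " " in path.name:
            target = path.with_name(path.name.replace(" ", replacement))
            if target.exists():
                raise FileExistsError("refusing to overwrite {}".format(target))
            path.rename(target)
            renamed.append(target.name)
    return renamed


# Demonstration in a temporary directory with hypothetical filenames.
with tempfile.TemporaryDirectory() as tmp:
    data_dir = Path(tmp)
    (data_dir / "rated dataset.json").write_text("{}")
    (data_dir / "already_clean.json").write_text("{}")
    renamed = remove_spaces_from_filenames(data_dir)
    remaining = sorted(p.name for p in data_dir.iterdir())

print(renamed)
print(remaining)
```

Any notebook or test that refers to the old names would need the same change, which is the consistency issue raised in this comment.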

@coveralls

coveralls commented Mar 13, 2018

Pull Request Test Coverage Report for Build 440

  • 13 of 30 (43.33%) changed or added relevant lines in 1 file are covered.
  • No unchanged relevant lines lost coverage.
  • Overall coverage decreased (-0.5%) to 60.606%

| Changes Missing Coverage | Covered Lines | Changed/Added Lines | % |
| domain_scoring/domain_scoring.py | 13 | 30 | 43.33% |

| Totals | Coverage Status |
| Change from base Build 438: | -0.5% |
| Covered Lines: | 640 |
| Relevant Lines: | 1056 |
💛 - Coveralls

@Baschdl
Member Author

Baschdl commented Mar 15, 2018

broken build: We have to use a consistent filename for the datasets

@GittiHab
Member

GittiHab commented Mar 19, 2018

The cryptography Python module could not be installed, so the requirements could not be fully installed.

(Same error occurs locally)

@Baschdl
Member Author

Baschdl commented Mar 19, 2018

cryptography version 2.1.4 works; the new 2.2 is failing. I'll pin the version to 2.1.4 in requirements.txt.
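
In pip's requirements format, an exact pin is written as `cryptography==2.1.4`. The sketch below shows one way such a pin could be applied to a list of requirement lines. The helper `pin_requirement` and the example requirement lines are hypothetical; the actual contents of this repository's requirements.txt are not shown in this thread.

```python
import re


def pin_requirement(lines, package, version):
    """Return the requirement lines with `package` pinned to exactly `version`."""
    pinned = "{}=={}".format(package, version)
    # Match the bare package name, optionally followed by a version specifier.
    pattern = re.compile(
        r"^\s*{}\s*(?:[=<>!~].*)?$".format(re.escape(package)), re.IGNORECASE
    )
    result, found = [], False
    for line in lines:
        if pattern.match(line):
            result.append(pinned)
            found = True
        else:
            result.append(line)
    if not found:
        # The package was not listed explicitly, so add the pin as a new line.
        result.append(pinned)
    return result


# Hypothetical requirement lines, for illustration only.
requirements = ["numpy", "cryptography", "scikit-learn>=0.19"]
pinned_requirements = pin_requirement(requirements, "cryptography", "2.1.4")
print("\n".join(pinned_requirements))
```

Appending the pin when the package is not listed covers the case where cryptography is only pulled in as a dependency of another requirement.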

@Baschdl
Member Author

Baschdl commented Mar 19, 2018

Fixed in #59

GittiHab and others added 2 commits March 19, 2018 13:46
…m/KDD-OpenSource/32de-python into userpreference-prediction-openday

* 'userpreference-prediction-openday' of https://github.com/KDD-OpenSource/32de-python: (40 commits)
  Set cryptography to working version
  Updated dataset paths in notebooks.
  Restructured rated datasets
  Add new structured rnn notebook.
  Add rnn notebook with high score.
  Add notebook for evaluating the ground truth from the KDD day
  Parametrize the algorithm used by the oracle
  Add prediction based on past mean of ratings to random sampler
  Remove plotting capabilities
  Fix use mean of squared error
  Add FlexibleOracle for arbitrary rating methods
  Fix use of only recently used ratings for GPR fitting
  Add simple experiments on gaussian process regressors
  Remove kernel optimizer
  Remove plotting from test for travis
  Add tests for exmperimental setups
  Refactor meta-path constructor
  Add tfidf vectorization
  Remove comments.
  Add dirty fix for CORS
  ...
3 participants