Docs for word2vec.py forwarding functions and one more #1251

Merged (2 commits) on Apr 10, 2017
92 changes: 92 additions & 0 deletions gensim/models/word2vec.py
@@ -355,6 +355,9 @@ class Word2Vec(utils.SaveLoad):
"""
Class for training, using and evaluating neural networks described in https://code.google.com/p/word2vec/

If you're finished training a model (=no more updates, only querying)
then switch to the :mod:`gensim.models.KeyedVectors` instance in wv

The model can be stored/loaded via its `save()` and `load()` methods, or stored/loaded in a format
compatible with the original word2vec implementation via `wv.save_word2vec_format()` and `KeyedVectors.load_word2vec_format()`.

@@ -1076,6 +1079,11 @@ def worker_loop():
return sentence_scores[:sentence_count]

def clear_sims(self):
"""
Removes all L2-normalized vectors for words from the model.
You will have to recompute them using init_sims method.
"""

self.wv.syn0norm = None

def update_weights(self):
@@ -1181,33 +1189,103 @@ def intersect_word2vec_format(self, fname, lockf=0.0, binary=False, encoding='ut
logger.info("merged %d vectors into %s matrix from %s" % (overlap_count, self.wv.syn0.shape, fname))

def most_similar(self, positive=[], negative=[], topn=10, restrict_vocab=None, indexer=None):
"""
Please refer to the documentation for
`gensim.models.KeyedVectors.most_similar`
This is just a forwarding function.
In the future please use the `gensim.models.KeyedVectors` instance in wv
Owner

Strange line breaks + docstring a bit obscure. I'm not sure reading this would help me if I didn't already know what it's talking about.

How about simply `Deprecated; use self.wv.most_similar() instead.`?

Same with all the other "forwarding" functions.

Contributor Author

Hmm, the strange line breaks were added by me in the "fix pep8" commit! I'm new to following PEP8 all the time; do you think I overdid it there?

Should a deprecation warning be added to these methods? Will they become unavailable in a later version? If so, this should be changed as soon as possible, and I'll send a new PR for it.

Owner

@piskvorky Apr 11, 2017

It's a question for @tmylk, but I think we want to nudge users toward using the new API, yes.

The line breaks are definitely not good; they hurt readability.

Contributor Author

Ok, changes made as suggested. #1274

Thanks for the tip about line breaks 👍. I will break lines only when comments run well past the PEP8 line-length limit!
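
For illustration, here is a minimal sketch of what a deprecated forwarding method with the one-line docstring suggested above could look like. The `Vectors` and `Model` classes are hypothetical stand-ins, not gensim's actual classes, and emitting a `DeprecationWarning` is an assumption: the thread leaves open whether a warning should be added at all.

```python
import warnings


class Vectors:
    """Hypothetical stand-in for a KeyedVectors-like object."""

    def __init__(self, table):
        # Maps a word to a precomputed list of (neighbour, score) pairs.
        self.table = table

    def most_similar(self, word, topn=10):
        # Toy lookup: return the first `topn` precomputed neighbours.
        return self.table[word][:topn]


class Model:
    """Hypothetical stand-in for a model that keeps its vectors in `wv`."""

    def __init__(self, table):
        self.wv = Vectors(table)

    def most_similar(self, word, topn=10):
        """Deprecated; use self.wv.most_similar() instead."""
        # Assumption: warn on every call, then forward to the new API.
        warnings.warn(
            "Model.most_similar() is deprecated; use Model.wv.most_similar() instead.",
            DeprecationWarning,
            stacklevel=2,
        )
        return self.wv.most_similar(word, topn=topn)


model = Model({"king": [("queen", 0.9), ("prince", 0.8)]})

# Record warnings so the deprecation notice can be inspected.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    result = model.most_similar("king", topn=1)
```

With `stacklevel=2` the warning is attributed to the caller of `Model.most_similar()`, which is usually the line a user needs to change.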

"""

return self.wv.most_similar(positive, negative, topn, restrict_vocab, indexer)

def wmdistance(self, document1, document2):
"""
Please refer to the documentation for
`gensim.models.KeyedVectors.wmdistance`
This is just a forwarding function.
In the future please use the `gensim.models.KeyedVectors` instance in wv
"""

return self.wv.wmdistance(document1, document2)

def most_similar_cosmul(self, positive=[], negative=[], topn=10):
"""
Please refer to the documentation for
`gensim.models.KeyedVectors.most_similar_cosmul`
This is just a forwarding function.
In the future please use the `gensim.models.KeyedVectors` instance in wv
"""

return self.wv.most_similar_cosmul(positive, negative, topn)

def similar_by_word(self, word, topn=10, restrict_vocab=None):
"""
Please refer to the documentation for
`gensim.models.KeyedVectors.similar_by_word`
This is just a forwarding function.
In the future please use the `gensim.models.KeyedVectors` instance in wv
"""

return self.wv.similar_by_word(word, topn, restrict_vocab)

def similar_by_vector(self, vector, topn=10, restrict_vocab=None):
"""
Please refer to the documentation for
`gensim.models.KeyedVectors.similar_by_vector`
This is just a forwarding function.
In the future please use the `gensim.models.KeyedVectors` instance in wv
"""

return self.wv.similar_by_vector(vector, topn, restrict_vocab)

def doesnt_match(self, words):
"""
Please refer to the documentation for
`gensim.models.KeyedVectors.doesnt_match`
This is just a forwarding function.
In the future please use the `gensim.models.KeyedVectors` instance in wv
"""

return self.wv.doesnt_match(words)

def __getitem__(self, words):
"""
Please refer to the documentation for
`gensim.models.KeyedVectors.__getitem__`
This is just a forwarding function.
In the future please use the `gensim.models.KeyedVectors` instance in wv
"""

return self.wv.__getitem__(words)

def __contains__(self, word):
"""
Please refer to the documentation for
`gensim.models.KeyedVectors.__contains__`
This is just a forwarding function.
In the future please use the `gensim.models.KeyedVectors` instance in wv
"""

return self.wv.__contains__(word)

def similarity(self, w1, w2):
"""
Please refer to the documentation for
`gensim.models.KeyedVectors.similarity`
This is just a forwarding function.
In the future please use the `gensim.models.KeyedVectors` instance in wv
"""

return self.wv.similarity(w1, w2)

def n_similarity(self, ws1, ws2):
"""
Please refer to the documentation for
`gensim.models.KeyedVectors.n_similarity`
This is just a forwarding function.
In the future please use the `gensim.models.KeyedVectors` instance in wv
"""

return self.wv.n_similarity(ws1, ws2)

def predict_output_word(self, context_words_list, topn=10):
@@ -1270,9 +1348,23 @@ def accuracy(self, questions, restrict_vocab=30000, most_similar=None, case_inse

@staticmethod
def log_evaluate_word_pairs(pearson, spearman, oov, pairs):
"""
Please refer to the documentation for
`gensim.models.KeyedVectors.log_evaluate_word_pairs`
This is just a forwarding function.
In the future please use the `gensim.models.KeyedVectors` instance in wv
"""

return KeyedVectors.log_evaluate_word_pairs(pearson, spearman, oov, pairs)

def evaluate_word_pairs(self, pairs, delimiter='\t', restrict_vocab=300000, case_insensitive=True, dummy4unknown=False):
"""
Please refer to the documentation for
`gensim.models.KeyedVectors.evaluate_word_pairs`
This is just a forwarding function.
In the future please use the `gensim.models.KeyedVectors` instance in wv
"""

return self.wv.evaluate_word_pairs(pairs, delimiter, restrict_vocab, case_insensitive, dummy4unknown)

def __str__(self):