Get word from vector #381

Closed
sheavalentine opened this issue Jul 2, 2015 · 11 comments
Labels
difficulty easy (Easy issue: required small fix), wishlist (Feature request)

Comments

@sheavalentine

sheavalentine commented Jul 2, 2015

It would be useful to have a method which returns the closest word when the model is given a vector.

@piskvorky
Owner

What model?

@sheavalentine
Author

Sorry, I meant word2vec. E.g.:

vectors = Word2Vec.load_word2vec_format(file)
vectors.word_from_vec(vectors['cat']) == 'cat'

@gojomo
Collaborator

gojomo commented Jul 2, 2015

You can already more-or-less do this by passing a single positive example to most_similar(), like so:

vectors.most_similar(positive=[vectors['cat']], topn=1)

I suppose as a convenience, it could detect a single ndarray (like it detects a single string) and just assume this is what the caller wants...
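
To illustrate what such a lookup amounts to, here is a minimal sketch in plain Python. It is not gensim's implementation: the `word_from_vec` name is the hypothetical one proposed earlier in this thread, the toy embeddings are made up, and cosine similarity is assumed as the ranking measure.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length sequences of floats."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def word_from_vec(vectors, vec):
    """Return the vocabulary word whose vector is most similar to `vec`.

    `vectors` is a plain dict mapping word -> list of floats
    (a stand-in for a trained model, assumed for illustration).
    """
    return max(vectors, key=lambda word: cosine(vectors[word], vec))


# Hypothetical toy embeddings, invented for this example only.
toy_vectors = {
    "cat": [0.9, 0.1, 0.0],
    "dog": [0.8, 0.3, 0.1],
    "car": [0.0, 0.2, 0.9],
}

# Looking up a word's own vector should give that word back.
print(word_from_vec(toy_vectors, toy_vectors["cat"]))
# An arbitrary vector maps to whichever word points in the closest direction.
print(word_from_vec(toy_vectors, [0.1, 0.1, 1.0]))
```

With a real model, the `most_similar(positive=[vector], topn=1)` call shown above plays the same role and returns the word together with its similarity score.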

@sheavalentine
Author

Oh! That's not entirely obvious. I expected that positive needed to be a list of strings.

@gojomo
Collaborator

gojomo commented Jul 3, 2015

Yes, the doc-comment should be clearer, and any future or revised tutorials should include a few examples of the different argument styles it accepts.

@sheavalentine
Author

Thank you, and thanks also for making this tool.

@piskvorky
Owner

Good point.

We'll probably split this API into two distinct methods (similar_by_word and similar_by_vector, or something similar) if we want to promote it. Overloading arguments ultimately brings too much trouble and confusion; this was originally supposed to be just an internal quick hack.

@sheavalentine can you do it, and open a pull request?

I assume the by_word version will be just a thin wrapper around the vector version, which is more general.
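
The split described above could look roughly like the sketch below. It is a plain-Python illustration under stated assumptions, not gensim code: the function names follow the suggestion in this comment, `vectors` is a hypothetical dict of word to vector, and the `exclude` parameter (used so that a word does not come back as its own nearest neighbour) is a design choice made for this example.

```python
import math


def _cosine(a, b):
    """Cosine similarity between two equal-length sequences of floats."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def similar_by_vector(vectors, vec, topn=10, exclude=()):
    """General version: rank every word by similarity to a raw vector."""
    scored = [
        (word, _cosine(word_vec, vec))
        for word, word_vec in vectors.items()
        if word not in exclude
    ]
    # Highest similarity first, then keep only the top `topn` entries.
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:topn]


def similar_by_word(vectors, word, topn=10):
    """Thin wrapper: look up the word's vector and delegate.

    The query word itself is excluded from the results (an assumption
    of this sketch, not something the thread specifies).
    """
    return similar_by_vector(vectors, vectors[word], topn=topn, exclude=(word,))


# Hypothetical toy embeddings, invented for this example only.
toy_vectors = {
    "cat": [0.9, 0.1, 0.0],
    "dog": [0.8, 0.3, 0.1],
    "car": [0.0, 0.2, 0.9],
}

print(similar_by_vector(toy_vectors, toy_vectors["cat"], topn=1))
print(similar_by_word(toy_vectors, "cat", topn=1))
```

Keeping the vector version as the single place where ranking happens means each method accepts exactly one kind of argument, which avoids the overloading confusion raised earlier in the thread.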

@sheavalentine
Author

I can do it, but it'll take me a few days to get the bandwidth. :)

@piskvorky
Owner

No problem.

We plan to make a new release this weekend, so this change will be in the release after that, then.

@tmylk added the "wishlist" (Feature request) and "difficulty easy" (Easy issue: required small fix) labels Jan 10, 2016
@tmylk
Contributor

tmylk commented Jan 10, 2016

@sheavalentine Is this feature still needed? Do you think we could include your pull request in the January release?

@tmylk
Contributor

tmylk commented Oct 18, 2016

Closing as abandoned

@tmylk closed this as completed Oct 18, 2016