Classification use-case #16

wbecker · 2017-06-11T14:15:05Z

This project doesn't currently support predicting the type (class) of an input, because there is no way to know which type an input value maps to.

Normally, using a classifier is a two-stage process:
1 - fit(X, y), using training input and output data
2 - predict(X), using unknown data, and returning the estimated type

It would be good if this project presented a similar interface.

I would suggest creating a class, wmd_classifier, which implements these two methods.

fit, which would:

take in an array of documents and break them down into bows
create a WMD instance
cache centroids

predict, which would:

take in a document
break it into a bow
calculate its centroid
call nearest_neighbours
calculate the output type, based on the k nearest neighbours, weighted by their closeness
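
To make the proposal concrete, here is a rough sketch of the fit/predict shape and of the closeness-weighted vote in the last step. Everything in it is an assumption for illustration: `WmdClassifierSketch`, `weighted_vote` and `jaccard_neighbours` are hypothetical names, not part of wmd-relax, and the Jaccard distance is only a stand-in so the example runs without embeddings. A real version would build a WMD instance in fit and call its nearest-neighbour search instead.

```python
from collections import defaultdict


def weighted_vote(neighbours, labels, eps=1e-9):
    """Choose a label from (train_index, distance) pairs.

    Each neighbour votes for its own label with weight 1 / distance,
    so closer neighbours count more. eps avoids division by zero.
    """
    scores = defaultdict(float)
    for index, distance in neighbours:
        scores[labels[index]] += 1.0 / (distance + eps)
    return max(scores, key=scores.get)


def jaccard_neighbours(train_docs, query, k):
    """Stand-in neighbour search (NOT WMD): Jaccard distance on token sets."""
    query_tokens = set(query.lower().split())
    scored = []
    for index, doc in enumerate(train_docs):
        doc_tokens = set(doc.lower().split())
        union = query_tokens | doc_tokens
        overlap = len(query_tokens & doc_tokens)
        distance = 1.0 - overlap / len(union) if union else 0.0
        scored.append((index, distance))
    scored.sort(key=lambda pair: pair[1])
    return scored[:k]


class WmdClassifierSketch:
    """Hypothetical sklearn-like wrapper around a neighbour search."""

    def __init__(self, neighbour_search, k=3):
        # neighbour_search(train_docs, query, k) -> [(train_index, distance), ...]
        self.neighbour_search = neighbour_search
        self.k = k

    def fit(self, X, y):
        if len(X) != len(y):
            raise ValueError("X and y must have the same length")
        # A real implementation would build nBOWs and cache centroids here.
        self.docs_ = list(X)
        self.labels_ = list(y)
        return self

    def predict(self, X):
        predictions = []
        for doc in X:
            neighbours = self.neighbour_search(self.docs_, doc, self.k)
            predictions.append(weighted_vote(neighbours, self.labels_))
        return predictions
```

Usage would follow the usual pattern: `WmdClassifierSketch(jaccard_neighbours, k=2).fit(docs, types).predict(new_docs)`. Keeping the neighbour search as a constructor argument is just one way to keep the voting logic testable on its own.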


wbecker · 2017-06-11T14:15:51Z

I'd be happy to contribute something like this!

vmarkovtsev · 2017-06-13T21:12:07Z

@wbecker This is 👍
sklearn-like interface would be really useful. Feel free to PR.

My only suggestion is to abstract the way a document is transformed into nBOW. E.g. provide a function in __init__ and let the documents be "objects", with nice defaults for spacy/strings.

And let's name it WmdClassifier. I have just stated the contribution guidelines in https://github.com/src-d/wmd-relax/wiki/Contributions
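
For illustration, the idea of passing the document-to-nBOW transformation into `__init__` might look roughly like the sketch below. The names `string_to_nbow` and `NbowTransformer` are hypothetical, and the plain word-to-weight dict is an assumed representation chosen for brevity; it is not claimed to match the nBOW format that wmd-relax expects.

```python
from collections import Counter


def string_to_nbow(document):
    """Default transform for plain strings: lowercase, split on whitespace,
    and normalise word counts so the weights sum to 1."""
    counts = Counter(document.lower().split())
    total = sum(counts.values())
    if total == 0:
        return {}
    return {word: count / total for word, count in counts.items()}


class NbowTransformer:
    """Hypothetical holder for a pluggable document -> nBOW function."""

    def __init__(self, to_nbow=string_to_nbow):
        # Documents are opaque objects; only to_nbow needs to understand them.
        self.to_nbow = to_nbow

    def transform(self, documents):
        return [self.to_nbow(document) for document in documents]


def tokens_to_nbow(tokens):
    """Example custom transform for pre-tokenised documents (lists of str)."""
    counts = Counter(tokens)
    return {token: count / len(tokens) for token, count in counts.items()}
```

With this shape, `NbowTransformer()` handles strings out of the box, while `NbowTransformer(to_nbow=tokens_to_nbow)` accepts token lists; a spaCy-based default could be added the same way.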

wbecker mentioned this issue Jun 11, 2017

Build/document a way to tune nearest_neighbour settings #18

Open

vmarkovtsev added enhancement help wanted labels Jun 13, 2017

vmarkovtsev assigned wbecker Jun 13, 2017
