SOBotics/UniStack

UniStack

An experimental duplicate question search algorithm for Stack Exchange.

Current Algorithm

As this is an experimental project, everything is subject to change.

Extract the question sentence and the context words, then model the remaining words.

  • Question sentence: the words from an MD- or Wxx-tagged word (a modal or a wh-word, e.g. WDT, WP, WRB) through the end of the sentence.
  • Context words: any NN tagged word.
  • Model: PoS tagged bag-of-words (unigram).
  • Weighting: TF-IDF.
  • Similarity function: cosine.
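
The steps listed above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code. It assumes the input is already PoS-tagged with Penn Treebank style tags, that "NN tagged" covers every tag starting with NN, that the words kept by the extraction step are the ones that get modeled, and that IDF is smoothed. The function names (extract_terms, tfidf_vectors, cosine) are hypothetical.

```python
import math
from collections import Counter


def extract_terms(tagged_sentences):
    """Keep question-span words and NN context words as 'word/TAG' terms.

    tagged_sentences: list of sentences, each a list of (word, tag) pairs.
    """
    terms = []
    for sentence in tagged_sentences:
        in_question = False
        for word, tag in sentence:
            # Assumption: the question span opens at an MD or W* tag
            # and runs to the end of the same sentence.
            if tag == "MD" or tag.startswith("W"):
                in_question = True
            # Assumption: "NN tagged" means any tag with the NN prefix.
            if in_question or tag.startswith("NN"):
                terms.append(f"{word.lower()}/{tag}")
    return terms


def tfidf_vectors(docs):
    """Build sparse TF-IDF vectors (dicts) from lists of terms."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        # Assumption: smoothed IDF, so no term gets a zero weight.
        vectors.append(
            {t: c * (math.log((1 + n) / (1 + df[t])) + 1) for t, c in tf.items()}
        )
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


# Hand-tagged toy posts; a real run would use a PoS tagger instead.
post_a = [[("How", "WRB"), ("can", "MD"), ("I", "PRP"), ("sort", "VB"),
           ("a", "DT"), ("list", "NN")]]
post_b = [[("The", "DT"), ("list", "NN"), ("is", "VBZ"), ("unsorted", "JJ")],
          [("How", "WRB"), ("do", "VBP"), ("I", "PRP"), ("sort", "VB"),
           ("it", "PRP")]]

docs = [extract_terms(post_a), extract_terms(post_b)]
vec_a, vec_b = tfidf_vectors(docs)
print(cosine(vec_a, vec_b))
```

Keeping the tag in each term ("list/NN" rather than "list") is what makes the bag-of-words PoS-aware: the same surface word under different tags counts as a different term.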
