Fix errors in the doc2vec-lee notebook (#1841)
This change fixes a couple of coding errors in the code
snippets of the doc2vec-lee tutorial: selecting an array
index that can fall out of bounds, and passing the whole test
document, rather than its words, to vector inference.
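
The out-of-bounds selection arises because `random.randint(a, b)` includes its upper bound `b`. A minimal sketch of the difference, assuming a hypothetical three-item list in place of the tutorial's corpora:

```python
import random

# Hypothetical stand-in for train_corpus / test_corpus (not the tutorial data).
corpus = ["doc a", "doc b", "doc c"]
n = len(corpus)

rng = random.Random(0)  # seeded only to make the sketch repeatable

# randint(0, n) can return n itself, which is one past the last valid index.
buggy = {rng.randint(0, n) for _ in range(1000)}

# randint(0, n - 1), as in this commit, only yields valid indices.
fixed = {rng.randint(0, n - 1) for _ in range(1000)}

# randrange(n) excludes n and is an equivalent alternative (not used by the commit).
alt = {rng.randrange(n) for _ in range(1000)}

print(sorted(buggy), sorted(fixed), sorted(alt))
```

With the old bound, an `IndexError` would only appear on the occasional run where `n` happens to be drawn, which is why the bug was easy to miss.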
PeterHamilton authored and menshikh-iv committed Jan 16, 2018
1 parent 2c3c91c commit 1fbc8b4
Showing 1 changed file with 4 additions and 4 deletions.
8 changes: 4 additions & 4 deletions docs/notebooks/doc2vec-lee.ipynb
@@ -553,7 +553,7 @@
],
"source": [
"# Pick a random document from the test corpus and infer a vector from the model\n",
-"doc_id = random.randint(0, len(train_corpus))\n",
+"doc_id = random.randint(0, len(train_corpus) - 1)\n",
"\n",
"# Compare and print the most/median/least similar documents from the train corpus\n",
"print('Train Document ({}): «{}»\\n'.format(doc_id, ' '.join(train_corpus[doc_id].words)))\n",
@@ -609,12 +609,12 @@
],
"source": [
"# Pick a random document from the test corpus and infer a vector from the model\n",
-"doc_id = random.randint(0, len(test_corpus))\n",
-"inferred_vector = model.infer_vector(test_corpus[doc_id])\n",
+"doc_id = random.randint(0, len(test_corpus) - 1)\n",
+"inferred_vector = model.infer_vector(test_corpus[doc_id].words)\n",
"sims = model.docvecs.most_similar([inferred_vector], topn=len(model.docvecs))\n",
"\n",
"# Compare and print the most/median/least similar documents from the train corpus\n",
-"print('Test Document ({}): «{}»\\n'.format(doc_id, ' '.join(test_corpus[doc_id])))\n",
+"print('Test Document ({}): «{}»\\n'.format(doc_id, ' '.join(test_corpus[doc_id].words)))\n",
"print(u'SIMILAR/DISSIMILAR DOCS PER MODEL %s:\\n' % model)\n",
"for label, index in [('MOST', 0), ('MEDIAN', len(sims)//2), ('LEAST', len(sims) - 1)]:\n",
" print(u'%s %s: «%s»\\n' % (label, sims[index], ' '.join(train_corpus[sims[index][0]].words)))"