Commit

Merge branch 'master' into patch-2
ajaykarpur authored Oct 29, 2020
2 parents c2c7bef + bc70715 commit cd987ff
Showing 1 changed file with 1 addition and 1 deletion.
@@ -279,7 +279,7 @@
"print('Tokenizing and counting, this may take a few minutes...')\n",
"start_time = time.time()\n",
"vectorizer = CountVectorizer(input='content', analyzer='word', stop_words='english',\n",
-" tokenizer=LemmaTokenizer(), max_features=vocab_size, max_df=0.95, min_df=2)\n",
+" tokenizer=LemmaTokenizer(), max_features=vocab_size, max_df=0.95, min_df=0.2)\n",
"vectors = vectorizer.fit_transform(data)\n",
"vocab_list = vectorizer.get_feature_names()\n",
"print('vocab size:', len(vocab_list))\n",
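
Note on the changed line: the two versions differ only in `min_df` (2 versus 0.2). In scikit-learn's `CountVectorizer`, an integer `min_df` is read as an absolute document count, while a float is read as a fraction of the corpus, so the two values can select very different vocabularies. The sketch below is a plain-Python stand-in for that thresholding rule, not the notebook's code; `filter_vocab`, `document_frequencies`, and both toy corpora are hypothetical names and data introduced only for illustration, and tokenization is simplified to whitespace splitting.

```python
from collections import Counter


def document_frequencies(docs):
    """Count, for each term, the number of documents that contain it."""
    df = Counter()
    for doc in docs:
        # set() so a term repeated inside one document is counted once
        df.update(set(doc.lower().split()))
    return df


def filter_vocab(docs, min_df=1, max_df=1.0):
    """Keep terms whose document frequency lies within [min_df, max_df].

    Mirrors the int-vs-float convention: an int is an absolute document
    count, a float is a proportion of the number of documents.
    """
    n_docs = len(docs)
    low = min_df if isinstance(min_df, int) else min_df * n_docs
    high = max_df if isinstance(max_df, int) else max_df * n_docs
    df = document_frequencies(docs)
    return sorted(term for term, count in df.items() if low <= count <= high)


# Hypothetical 5-document corpus: 0.2 * 5 = 1 document, so the float is laxer.
small = [
    "neural topic model",
    "topic model training",
    "neural network training",
    "sparse count matrix",
    "count vectorizer vocabulary",
]
small_int = filter_vocab(small, min_df=2)      # terms in at least 2 documents
small_float = filter_vocab(small, min_df=0.2)  # terms in at least 1 document

# Hypothetical 100-document corpus: 0.2 * 100 = 20 documents, so the float is stricter.
large = ["topic model"] * 95 + ["topic rare"] * 5
large_int = filter_vocab(large, min_df=2)
large_float = filter_vocab(large, min_df=0.2)

print(len(small_int), len(small_float))
print(large_int, large_float)
```

On a corpus of realistic size, a fractional `min_df` of 0.2 requires a term to appear in a fifth of all documents, which is usually a much stronger filter than an absolute count of 2.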