Commit

docs: fixed tutorial to not download model
KennethEnevoldsen committed Jan 18, 2023
1 parent c224580 commit 42589f6
Showing 1 changed file with 24 additions and 10 deletions.
34 changes: 24 additions & 10 deletions docs/tutorials/introductory_tutorial.ipynb
@@ -19,14 +19,15 @@
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"## Using TextDescriptives\n",
"In this tutorial we'll use TextDescriptives to get a quick overview of the [SMS Spam Collection Data Set](https://archive.ics.uci.edu/ml/datasets/SMS+Spam+Collection).\n",
"The dataset contains 5572 SMS messages categorized as ham or spam. \n",
"\n",
"To start, let's load a dataset and get a bit familiar with it.\n"
"To start, let's load a dataset and get a bit familiar with it."
]
},
{
@@ -38,10 +39,7 @@
"try:\n",
" import textdescriptives\n",
"except:\n",
" !pip install \"textdescriptives[tutorials]\"\n",
"\n",
"# download spaCy model\n",
"!python -m spacy download en_core_web_sm"
" !pip install \"textdescriptives[tutorials]\""
]
},
{
@@ -419,12 +417,29 @@
]
},
{
"attachments": {},
"cell_type": "markdown",
"metadata": {},
"source": [
"### The configurable way: Add pipes to spaCy\n",
"\n",
"If you want to use the component as part of a larger analysis with spaCy, simply add the pipeline components to an already initialized spaCy pipeline and use `extract_df` on the `Doc` object."
"If you want to use the component as part of a larger analysis with spaCy, simply add the pipeline components to an already initialized spaCy pipeline and use `extract_df` on the `Doc` object.\n",
"\n",
"To do this we will use the small English pipeline. If you haven't downloaded it yet, it is simple to do:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"try:\n",
" nlp = spacy.load(\"en_core_web_sm\")\n",
"except:\n",
" # if not downloaded, download the model\n",
" !python -m spacy download en_core_web_sm\n",
" nlp = spacy.load(\"en_core_web_sm\")"
]
},
{
@@ -452,7 +467,6 @@
}
],
"source": [
"nlp = spacy.load(\"en_core_web_sm\")\n",
"\n",
"nlp.add_pipe(\"textdescriptives/readability\")\n",
"nlp.add_pipe(\"textdescriptives/dependency_distance\")"
@@ -882,7 +896,7 @@
],
"metadata": {
"kernelspec": {
"display_name": "Python 3.10.9 ('.venv': venv)",
"display_name": "textdescriptives",
"language": "python",
"name": "python3"
},
@@ -896,12 +910,12 @@
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.9"
"version": "3.8.15 (default, Oct 11 2022, 21:31:25) \n[Clang 14.0.0 (clang-1400.0.29.102)]"
},
"orig_nbformat": 4,
"vscode": {
"interpreter": {
"hash": "1fec3abd59d8d4e793464ce299b69082c8b9c618d555ba6df7044c7d7b4183f8"
"hash": "31387647799921bb85032eec7bb02e281325ae7f8ffa6f9cd7cdead815b36c88"
}
}
},

0 comments on commit 42589f6
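
Note on the new download cell: it relies on the IPython `!` shell escape and a bare `except:`, so it only runs inside a notebook and will also swallow unrelated errors. Below is a minimal plain-Python sketch of the same "download only if missing" idea. It assumes the pipeline is installed as a regular Python package under its own name (as `en_core_web_sm` is), and the helper names `ensure_model` and `_spacy_download` are hypothetical, not part of spaCy or TextDescriptives.

```python
import importlib.util
import subprocess
import sys


def _spacy_download(model_name):
    # Same command the notebook cell runs through the `!` shell escape,
    # executed with the interpreter that is running this code.
    subprocess.run(
        [sys.executable, "-m", "spacy", "download", model_name],
        check=True,
    )


def ensure_model(model_name, download=_spacy_download):
    """Download `model_name` only if it is not importable.

    Returns True if a download was triggered, False otherwise.
    `download` is injectable so the check can be exercised without network access.
    """
    if importlib.util.find_spec(model_name) is not None:
        return False  # already installed, nothing to do
    download(model_name)
    # Let the import system notice the freshly installed package.
    importlib.invalidate_caches()
    return True
```

In the tutorial this would be used as `ensure_model("en_core_web_sm")` followed by `nlp = spacy.load("en_core_web_sm")`. Checking for the package up front avoids catching every exception, at the cost of assuming the model name matches its package name.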
