Add evaluation and document conversion to tutorial 15 #2325
Check out this pull request on ReviewNB. See visual diffs & provide feedback on Jupyter Notebooks.
@@ -95,7 +95,7 @@ Just as text passages, tables are represented as `Document` objects in Haystack.
 from haystack.utils import fetch_archive_from_http

 doc_dir = "data"
-s3_url = "https://s3.eu-central-1.amazonaws.com/deepset.ai-farm-qa/datasets/documents/ottqa_sample.zip"
+s3_url = "https://s3.eu-central-1.amazonaws.com/deepset.ai-farm-qa/datasets/documents/table_text_dataset.zip"
@MichelBartels Once the dataset is uploaded, please replace the line in the dictionary here:
haystack/haystack/telemetry.py
Line 205 in 8cd73a9
"https://s3.eu-central-1.amazonaws.com/deepset.ai-farm-qa/datasets/documents/ottqa_tables_sample.json.zip": "15",
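For illustration, the requested change amounts to swapping one key in a URL-to-tutorial mapping. The sketch below is not the code in `haystack/telemetry.py`: the names `dataset_to_tutorial`, `tutorial_for_dataset`, and `replace_dataset_url` are hypothetical, and it assumes the replacement URL is the `table_text_dataset.zip` one from the diff above.

```python
from typing import Dict, Optional

# Old entry quoted in the review comment, and the assumed replacement URL.
OLD_URL = "https://s3.eu-central-1.amazonaws.com/deepset.ai-farm-qa/datasets/documents/ottqa_tables_sample.json.zip"
NEW_URL = "https://s3.eu-central-1.amazonaws.com/deepset.ai-farm-qa/datasets/documents/table_text_dataset.zip"

# Hypothetical stand-in for the dictionary in telemetry.py.
dataset_to_tutorial: Dict[str, str] = {OLD_URL: "15"}


def tutorial_for_dataset(url: str, mapping: Dict[str, str]) -> Optional[str]:
    """Return the tutorial ID registered for a dataset URL, or None if unknown."""
    return mapping.get(url)


def replace_dataset_url(mapping: Dict[str, str], old_url: str, new_url: str) -> Dict[str, str]:
    """Return a copy of the mapping with old_url swapped for new_url.

    Raises KeyError if old_url is not registered, so a stale entry is noticed.
    """
    updated = dict(mapping)
    updated[new_url] = updated.pop(old_url)
    return updated
```

After `replace_dataset_url(dataset_to_tutorial, OLD_URL, NEW_URL)`, a lookup of the new URL yields tutorial "15" and the old URL is no longer recognized.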
@@ -88,7 +88,7 @@
},
Done
…aystack into table_eval_tutorial
Thanks for working on this, it already looks really good to me. Before merging, it would be nice to see the following points addressed:
- Please remove the KeyboardInterrupt output from the notebook.
- It would be nice to have the outputs of the print methods in the notebook. That way, users don't need to run the notebook to understand the outcome of each section.
- The Evaluation section is missing in the `.py` file.
- Please add labels to the PR.
Thanks @bogdankostic, I have made your suggested changes.
LGTM
Proposed changes:
This adds an evaluation segment and a document conversion segment to tutorial 15. Evaluation requires ground-truth answers, so the dataset was replaced with one that includes labels.
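To illustrate why evaluation needs a labeled dataset, here is a minimal sketch of exact-match scoring in plain Python. It is not the tutorial's evaluation code and does not use the Haystack API; the function names and the normalization rule (lowercasing and collapsing whitespace) are assumptions chosen for the example.

```python
from typing import List


def normalize(answer: str) -> str:
    """Lowercase and collapse whitespace so trivial differences do not count as misses."""
    return " ".join(answer.lower().split())


def exact_match(predictions: List[str], labels: List[str]) -> float:
    """Fraction of predictions that equal their gold label after normalization."""
    if len(predictions) != len(labels):
        raise ValueError("predictions and labels must have the same length")
    if not labels:
        return 0.0
    hits = sum(normalize(p) == normalize(g) for p, g in zip(predictions, labels))
    return hits / len(labels)
```

Without gold labels there is nothing to compare predictions against, which is why an unlabeled sample dataset cannot support a score like this.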