Evaluating a pipeline consisting only of a reader node #2132

julian-risch · 2022-02-07T10:12:42Z

Proposed changes:

pipeline.eval() gets an additional parameter pass_documents_as_input that enables passing the gold documents specified in the labels to the first node in the pipeline as input. It's an alternative way to evaluate the reader node only, with the advantage that no retriever needs to be specified in the pipeline.

Status (please check what you already did):

First draft (up for discussions & feedback)
Final code
Added tests
Updated documentation

tstadel · 2022-02-07T21:58:17Z

I'm sure I'm missing something. But doesn't this do exactly the same as add_isolated_node_eval besides that the results are populated with eval_mode="integrated" rather than "isolated"?

julian-risch · 2022-02-08T10:05:19Z

> I'm sure I'm missing something. But doesn't this do exactly the same as add_isolated_node_eval besides that the results are populated with eval_mode="integrated" rather than "isolated"?

The effect is the same, yes. The difference is that the PR here also works with a pipeline that consists only of a reader node. For some use cases you might not have a retriever. Thus, it would be unintuitive to create an ExtractiveQAPipeline and then run add_isolated_node_eval to evaluate only a reader.

ju-gu · 2022-02-08T13:05:22Z

    def run(self, query: str, documents: Optional[List[Document]] = None, top_k: Optional[int] = None, labels: Optional[MultiLabel] = None, add_isolated_node_eval: bool = False):  # type: ignore
        self.query_count += 1
        # Wrap predict once so that both the integrated and the isolated call are timed
        predict = self.timing(self.predict, "query_time")
        if documents:
            results = predict(query=query, documents=documents, top_k=top_k)
        else:
            # No input documents, e.g. in a pipeline that consists only of a reader
            results = {"answers": []}

        # Add corresponding document_name and more meta data, if an answer contains the document_id
        results["answers"] = [
            BaseReader.add_doc_meta_data_to_answer(documents=documents, answer=answer) for answer in results["answers"]
        ]

        # run evaluation with labels as node inputs
        if add_isolated_node_eval and labels is not None:
            # Deduplicate the gold documents by id; several labels may point to the same document
            unique_docs = {label.document.id: label.document for label in labels.labels}
            relevant_documents = list(unique_docs.values())
            results_label_input = predict(query=query, documents=relevant_documents, top_k=top_k)

            # Add corresponding document_name and more meta data, if an answer contains the document_id
            results["answers_isolated"] = [
                BaseReader.add_doc_meta_data_to_answer(documents=relevant_documents, answer=answer)
                for answer in results_label_input["answers"]
            ]

        return results, "output_1"
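
The deduplication step in the isolated branch can be tried out on its own. The following is only a minimal sketch: `Document`, `Label`, `MultiLabel` and `gold_documents` are hypothetical stand-ins defined here for illustration, not the Haystack classes, and only the fields the snippet above touches are modelled.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Document:  # hypothetical stand-in, not haystack's Document
    id: str
    content: str


@dataclass
class Label:  # hypothetical stand-in for a single gold label
    query: str
    document: Document


@dataclass
class MultiLabel:  # hypothetical stand-in: all labels for one query
    labels: List[Label]


def gold_documents(multilabel: MultiLabel) -> List[Document]:
    """Return each labelled document once, in first-seen order."""
    # Keying by document id collapses several labels on the same document.
    unique_docs = {label.document.id: label.document for label in multilabel.labels}
    # Materialise the dict view so callers get a real list.
    return list(unique_docs.values())


doc_a = Document(id="a", content="Berlin is the capital of Germany.")
doc_b = Document(id="b", content="Paris is the capital of France.")
labels = MultiLabel(
    labels=[
        Label(query="capital?", document=doc_a),
        Label(query="capital?", document=doc_a),  # second annotation on the same document
        Label(query="capital?", document=doc_b),
    ]
)
docs = gold_documents(labels)
```

Three labels go in and two documents come out, so the reader is not asked to predict on the same gold document twice.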

tstadel · 2022-02-08T13:07:54Z

> I'm sure I'm missing something. But doesn't this do exactly the same as add_isolated_node_eval besides that the results are populated with eval_mode="integrated" rather than "isolated"?

> The effect is the same, yes. The difference is that the PR here also works with a pipeline that consists only of a reader node. For some use cases you might not have a retriever. Thus, it would be unintuitive to create an ExtractiveQAPipeline and then run add_isolated_node_eval to evaluate only a reader.

@ju-gu and I just found out that we can achieve the same with isolated_node_eval (see @ju-gu's changes above).
All we need to do is change BaseReader.run() a bit:

- make documents param optional
- fix add_doc_meta_data_to_answer's documents param: should be relevant_documents instead
- move predict = self.timing(self.predict, "query_time") before if documents block

The only advantage I can see for pass_documents_as_input is that it works with any node, not only Readers. But this is something we have deferred on purpose. In addition, we might want to move the isolated_node_eval code from BaseReader.run() to BaseComponent._dispatch_run() in order to support all nodes. So I would opt for the isolated_node_eval solution here.
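
To illustrate why an optional documents param is enough for a reader-only pipeline, here is a rough sketch of the control flow. `ToyReader` and its `gold_documents` argument are hypothetical and stand in for the reader and for the documents taken from the labels; nothing here is Haystack code.

```python
from typing import Any, Dict, List, Optional, Tuple


class ToyReader:
    """Hypothetical stand-in for a reader node; not Haystack's BaseReader."""

    def predict(self, query: str, documents: List[str], top_k: Optional[int] = None) -> Dict[str, Any]:
        # Pretend each document yields one answer: its first word.
        answers = [doc.split()[0] for doc in documents]
        return {"answers": answers[:top_k]}

    def run(
        self,
        query: str,
        documents: Optional[List[str]] = None,
        top_k: Optional[int] = None,
        gold_documents: Optional[List[str]] = None,
        add_isolated_node_eval: bool = False,
    ) -> Tuple[Dict[str, Any], str]:
        # Integrated mode: without a retriever there are no input documents.
        if documents:
            results = self.predict(query, documents, top_k)
        else:
            results = {"answers": []}
        # Isolated mode: feed the gold documents straight to the reader.
        if add_isolated_node_eval and gold_documents is not None:
            results["answers_isolated"] = self.predict(query, gold_documents, top_k)["answers"]
        return results, "output_1"


reader = ToyReader()
output, edge = reader.run(
    query="Which city is the capital?",
    gold_documents=["Berlin is the capital of Germany.", "Paris is the capital of France."],
    add_isolated_node_eval=True,
)
```

With no retriever in front of the reader, `output["answers"]` stays empty and only `output["answers_isolated"]` is filled, which is the part an isolated evaluation would look at.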

tstadel

LGTM!

julian-risch added 3 commits February 7, 2022 10:53

pass documents as extra param to eval

bf240c7

pass documents via labels to eval

dcc51e4

rename param in docs

2f4c2ec

julian-risch requested a review from tstadel February 7, 2022 10:12

Update Documentation & Code Style

1da7043

julian-risch changed the title ~~Reader eval only~~ Evaluating a pipeline consisting only of a reader node Feb 7, 2022

julian-risch added topic:eval topic:reader labels Feb 7, 2022

julian-risch and others added 5 commits February 8, 2022 15:17

Revert "rename param in docs"

2a931be

This reverts commit 2f4c2ec.

Revert "pass documents via labels to eval"

f0a203b

This reverts commit dcc51e4.

simplify iterating through labels and docs

e0b2017

Merge branch 'reader_eval_only' of github.com:deepset-ai/haystack int…

bfc5b71

…o reader_eval_only

Update Documentation & Code Style

08eadf3

tstadel approved these changes Feb 8, 2022

julian-risch merged commit 7fab027 into master Feb 9, 2022

julian-risch deleted the reader_eval_only branch February 9, 2022 08:18
