Add support for Multi-Hop Dense Retrieval #2571
Conversation
Hello @deutschmn, great to see this PR so quickly! 🤩 I'm going to have a look at the diff soon. Is there anything specific you have doubts about, or that we should focus our attention on?
Not yet, thanks! I'm not sure whether using …
Training, saving, and loading work now. I'll soon start a proper training run with more data to check whether I get reasonable results 😊
@@ -485,3 +485,67 @@ def convert_from_transformers(
        bi_adaptive_model.connect_heads_with_processor(processor.tasks)  # type: ignore

        return bi_adaptive_model


class BiAdaptiveSharedModel(BiAdaptiveModel):
As far as I understand this, you are using the same language model as an encoder. So instead of the original DPR architecture, where we have a question and a document encoder, you are using the same encoder for both.
I'd argue that this is the same architecture as used by sentence transformers (https://sbert.net/docs/training/overview.html), and maybe we should consider whether it makes sense to re-implement all of the functionality already provided there, or to use sentence transformers under the hood.
> As far as I understand this, you are using the same language model as an encoder. So instead of the original DPR architecture, where we have a question and a document encoder, you are using the same encoder for both.

Exactly. But, most importantly, MDR also uses a different retrieval mechanism that iteratively adds context documents (see https://ar5iv.labs.arxiv.org/html/2009.12756#S2.SS2.SSS0.Px1).

> I'd argue that this is the same architecture as used by sentence transformers (https://sbert.net/docs/training/overview.html), and maybe we should consider whether it makes sense to re-implement all of the functionality already provided there, or to use sentence transformers under the hood.

Yes, I agree that it's quite similar to sentence transformers, except for the multi-hop part, which they don't support AFAIK.

For now, I based my implementation on Haystack's `DensePassageRetriever`, which uses Hugging Face's DPR and then calls the document store's `query_by_embedding`. It took only a few changes to adapt this for MDR. I'm not sure there is much to be gained from using sentence transformers; what you see in my changes now is all there was to do. But please let me know if you find parts that can be simplified by using sentence transformers 😊
I think the `BiAdaptiveSharedModel` would not be needed if an `EmbeddingRetriever` were used inside the MDR: the `EmbeddingRetriever` would only create the embeddings, and the retrieval can then be custom, as required by MDR. I'd also argue that having the `train` method on the MDR is not intuitive, because there isn't anything specific to MDR in the training code; it is just the same training as used in sentence transformers.
I think this would reduce the code that is needed quite a bit. At the same time, maybe someone from the team could chime in (@julian-risch?) on whether it makes sense to replicate sentence-transformers within the FARM architecture.
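To make the proposal concrete, a sketch of that design (an assumption about how it could look, not the merged code; the model name is a placeholder and `document_store` is assumed to exist):

```python
from haystack.nodes import EmbeddingRetriever

# EmbeddingRetriever only produces embeddings; the hop logic stays custom.
retriever = EmbeddingRetriever(
    document_store=document_store,  # assumed to exist and contain documents
    embedding_model="sentence-transformers/multi-qa-mpnet-base-dot-v1",  # placeholder
)

query = "When was the director of the film Inception born?"
query_emb = retriever.embed_queries([query])[0]               # hop 1: bare query
first_hop = document_store.query_by_embedding(query_emb, top_k=10)

# hop 2: re-embed query + best first-hop passage and query the store again
hop2_emb = retriever.embed_queries([query + " " + first_hop[0].content])[0]
second_hop = document_store.query_by_embedding(hop2_emb, top_k=10)
```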
Oh, that's a good point, thanks. Frankly, I didn't look at `EmbeddingRetriever`, since I thought the closest thing to MDR would be DPR. Therefore, I duplicated and adapted its code, Hugging Face style, and built `BiAdaptiveSharedModel` so that MDR doesn't differ from DPR too much.
However, if the goal is to have less duplicated code, deriving from `EmbeddingRetriever` and overriding `retrieve`/`retrieve_batch` might be a good idea. Where do you propose we then implement the training? It would probably make sense to have that not only for MDR but for every `EmbeddingRetriever`.
Short update on the training: The current implementation does DPR-style training, which, as @mathislucka correctly remarked, is equivalent to what sentence transformers do if the encoder is shared. However, in the MDR paper and reference implementation, the authors use a different approach that accounts for the multi-hop retrieval method by incorporating both hops into the loss function: https://github.com/facebookresearch/multihop_dense_retrieval/blob/62eb2427e36a648a927c6e39bb4c748796f7b366/mdr/retrieval/criterions.py#L114. I've checked the results of the current method, and they seem suboptimal; I believe the retriever focuses too much on the first context passage if it isn't explicitly taught not to. Apart from that, if we integrate MDR training into Haystack, it should be what the original authors presented, right? My next steps will be to a) try to use Facebook's checkpoints or b) train a model with the reference implementation, push the weights to Hugging Face, and load it into Haystack. If this is successful, we can move on to adding training support. What do you think? 😊
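For context, a hedged sketch of what the two-hop objective in the linked `criterions.py` roughly boils down to. This is a simplification (the reference implementation additionally masks certain in-batch negatives), and all names here are mine:

```python
import torch
import torch.nn.functional as F

def two_hop_loss(q1, q2, p1, p2):
    """
    q1, q2: query embeddings before hop 1 / hop 2, shape (batch, dim)
    p1, p2: gold passage embeddings for hop 1 / hop 2, shape (batch, dim)
    """
    all_passages = torch.cat([p1, p2], dim=0)   # in-batch negatives from both hops
    scores_hop1 = q1 @ all_passages.t()         # (batch, 2 * batch)
    scores_hop2 = q2 @ all_passages.t()
    targets_hop1 = torch.arange(q1.size(0))     # gold hop-1 passage for query i is row i
    targets_hop2 = torch.arange(q2.size(0)) + p1.size(0)  # hop-2 golds follow the p1 block
    # Summing both NLL terms trains the retriever on the second hop as well,
    # so it cannot focus only on the first context passage.
    return F.cross_entropy(scores_hop1, targets_hop1) + F.cross_entropy(scores_hop2, targets_hop2)
```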
I've now successfully loaded one of Facebook's checkpoints into Haystack, pushed it to the HF Hub, and made it the default model for MDR. I've removed the DPR-style training code and decided to wait for your feedback before continuing; maybe it would be a good idea to add training in another PR, since it's likely not completely trivial to implement. That's it for my first draft of this PR. It would be great if some of the maintainers could look into what I've done so far (@ZanSara, @julian-risch, @bogdankostic) 😊 Maybe you could also weigh in on the discussion regarding …
Hey @deutschmn, sorry for the late reply! 🙃 I will look at the details of your PR and provide some feedback tomorrow.
Hi @deutschmn, I am really interested in this PR and want to test it as well. Could you please briefly describe which steps are necessary after merging the PR locally? Kind regards
Thanks for this PR, looks very clean!
Not sure whether we should introduce the `BiAdaptiveSharedModel` or try using the `EmbeddingRetriever` instead (as @mathislucka proposed). What do you think, @julian-risch?
haystack/nodes/retriever/dense.py (outdated)

    def retrieve_batch(
        self,
        queries: Union[str, List[str]],
In #2575, we changed the type of the `queries` param in the batch methods to `List[str]`. I would prefer re-using …
Thanks for your feedback, @bogdankostic and @julian-risch! I'll look into it in the next couple of days. @JulianGerhard21 Happy to hear that you're interested! Here's a notebook showing how you can use it.
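Since the notebook link isn't reproduced here, a minimal usage sketch reflecting the PR's final state (the default model name is my assumption about the converted Facebook checkpoint mentioned above):

```python
from haystack.document_stores import InMemoryDocumentStore
from haystack.nodes import MultihopEmbeddingRetriever

document_store = InMemoryDocumentStore(similarity="dot_product")
# ... write documents into the store first, e.g. document_store.write_documents(docs)

retriever = MultihopEmbeddingRetriever(
    document_store=document_store,
    embedding_model="deutschmann/mdr_roberta_q_encoder",  # assumed: converted MDR checkpoint
)
document_store.update_embeddings(retriever)

results = retriever.retrieve(query="Who directed the film that ...?", top_k=5)
```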
Hi @deutschmn, how is it going? 🙂 We are planning a Haystack release in about ten days (first week of July). It would be great to have Multi-Hop Dense Retrieval in there as a feature! Do you think you can incorporate the change requests by then so that we can merge your PR?
Hey @julian-risch! Sorry, I was pretty busy the last couple of days. I'll start incorporating your feedback today or tomorrow so the feature can make the cut for the release 😊
Alright, I also …
Looking already pretty good to me! I just left a few comments regarding naming, docstrings, and filters, and one question regarding the changed signature of the `forward` method.
haystack/nodes/retriever/dense.py (outdated)

@@ -1898,3 +1898,303 @@ def save(self, save_dir: Union[Path, str]) -> None:
        :type save_dir: Union[Path, str]
        """
        self.embedding_encoder.save(save_dir=save_dir)


class MultihopDenseRetriever(EmbeddingRetriever):
I'd propose renaming this to `MultihopEmbeddingRetriever`; this might make it clearer that it is an extension of the `EmbeddingRetriever`.
Done
haystack/nodes/retriever/dense.py (outdated)

        embed_meta_fields: List[str] = [],
    ):
        """
        Same parameters as `EmbeddingRetriever` except
I'd like to see descriptions for all the parameters that can be set in the init method. Feel free to just copy them from the `EmbeddingRetriever`.
Done
haystack/nodes/retriever/dense.py (outdated)

                " as queries or a single filter that will be applied to each query."
            )
        else:
            filters = [{}] * len(queries)
This needs to be changed. If the user provides a single filter dict, it will be overwritten by an empty filter.
Fixed
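For reference, the fixed normalization presumably looks something like this (a sketch, not a verbatim copy of the merged code):

```python
from haystack.errors import HaystackError

# Normalize `filters` for retrieve_batch; `queries` is a List[str].
if filters is None:
    filters = [{}] * len(queries)        # no filtering for any query
elif isinstance(filters, dict):
    filters = [filters] * len(queries)   # broadcast a single filter, don't discard it
elif len(filters) != len(queries):
    raise HaystackError(
        "Number of filters does not match number of queries. Please provide as many filters"
        " as queries or a single filter that will be applied to each query."
    )
```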
@@ -971,7 +971,9 @@ def get_similarity_function(self):
            f"The similarity function can only be 'dot_product' or 'cosine', not '{self.similarity_function}'"
        )

-    def forward(self, query_vectors: torch.Tensor, passage_vectors: torch.Tensor) -> Tuple[torch.Tensor, torch.Tensor]:
+    def forward(
+        self, query_vectors: torch.Tensor, passage_vectors: Optional[torch.Tensor] = None
Can you maybe explain why this change is needed?
I made this change because I discovered that if a `BiAdaptiveModel` is instantiated with a `TextSimilarityHead` as one of its `prediction_heads`, the head can be called with `None` as the second forward parameter here:

    embedding1, embedding2 = head(output1, output2)

I added the default parameter (`= None`) because it used to be necessary for my `BiAdaptiveSharedModel`. However, as it doesn't really have anything to do with this PR anymore, I've reverted the change now 😊
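For context, the reverted change boiled down to this pattern (illustrative only; the class below is a simplified stand-in for the real `TextSimilarityHead`, and the pass-through return is my assumption):

```python
from typing import Optional, Tuple
import torch

class TextSimilarityHeadSketch:
    """Simplified stand-in for TextSimilarityHead, for illustration only."""

    def forward(
        self, query_vectors: torch.Tensor, passage_vectors: Optional[torch.Tensor] = None
    ) -> Tuple[torch.Tensor, Optional[torch.Tensor]]:
        # With a shared encoder, queries and passages may be embedded in separate
        # passes, so the second argument can legitimately be None.
        return query_vectors, passage_vectors
```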
Thanks for implementing the requested changes so fast! Could you merge the current master into your branch? Over the last few days, we made a few fixes regarding our CI and deprecations from other libraries we use in Haystack.
@bogdankostic Done 😊
Looking good to me! Thank you so much for your contribution to Haystack :)
* Implement MDR
* Adapt conftest to new MDR signature
* Update Documentation & Code Style
* Change signature of queries param in batch methods of MDR like in deepset-ai#2575
* Update Documentation & Code Style
* Rename MultihopDenseRetriever to MultihopEmbeddingRetriever
* Fix filters in retrieve_batch
* Add docstring for MultihopEmbeddingRetriever.__init__
* Update Documentation & Code Style
* Revert forward signature of TextSimilarityHead

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
@deutschmn @bogdankostic Hi! Is there any documentation I can refer to about the MultihopEmbeddingRetriever and how to use it? Thanks!
Hi @Ssingh1997! We have a short passage about the MultihopEmbeddingRetriever in our documentation: https://docs.haystack.deepset.ai/docs/retriever#multihop-embedding-retriever
@bogdankostic Is there documentation for the MultihopEmbeddingRetriever's parameters, and how do I set the number of iterations?
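For later readers, the number of hops is controlled by an init parameter. The example below assumes the `num_iterations` parameter and default model name from the Haystack source at the time of writing; both may have changed since:

```python
from haystack.nodes import MultihopEmbeddingRetriever

retriever = MultihopEmbeddingRetriever(
    document_store=document_store,  # assumed to exist and contain documents
    embedding_model="deutschmann/mdr_roberta_q_encoder",
    num_iterations=2,  # number of retrieval hops; 2 matches the paper's two-hop setup
)
```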
Proposed changes:

As discussed in #2555, this PR adds a new node for Multi-Hop Dense Retrieval (MDR): Answering Complex Open-Domain Questions with Multi-Hop Dense Retrieval.

Status / TODOs:

* `MultihopDenseRetriever` that can retrieve
* `BiAdaptiveSharedModel` that bridges the gap between `MultihopDenseRetriever` and `Roberta` (might need refactoring for training)
* `save` and `load`
* (Maybe) add an equivalent for `DPRQuestionEncoder`/`DPRContextEncoder`
* ~~Implement and test training~~ → no training support yet

@ZanSara @julian-risch @bogdankostic I'm still actively working on this, but if you already have some comments, I appreciate your feedback 😊