Pipeline components reconnection issue #6359

Closed
1 task done
bilgeyucel opened this issue Nov 20, 2023 · 4 comments · Fixed by #6530
Labels: 2.x (Related to Haystack v2.0), topic:pipeline, type:bug (Something isn't working)

Comments

@bilgeyucel
Contributor

Describe the bug
Connecting components again raises an error.

Error message

Cannot connect 'builder.prompt' with 'generator.prompt': generator.prompt is already connected to ['builder'].

Expected behavior
As a user, I expect this error to disappear when re-initializing the pipeline object. However, I need to reinitialize every component in the pipeline to resolve this error. Possible solutions:

  • Do not raise any error
  • Improve the error message so that users understand that re-initializing the pipeline instance is not enough and that they need to re-initialize all components in the pipeline

Additional context
This is a problem when users make a connection mistake in their pipeline I/O and need to call .connect() again, or when they want to use the same component in different pipelines, e.g., using MetadataRouter in both an indexing and a query pipeline.

To Reproduce
Colab notebook: https://colab.research.google.com/drive/1XiG5UWR9dDsu46zHv7o9pXJ_Uxih60sY?usp=sharing
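
For readers who cannot open the notebook, here is a minimal sketch of the reported behavior. It does not use Haystack: `ToyComponent` and `ToyPipeline` are hypothetical stand-ins, and the assumption that connection state is stored on the component instances (rather than on the pipeline) is an illustration of the symptom, not a description of Haystack's internals.

```python
class PipelineConnectError(Exception):
    """Stand-in for the error quoted in this issue."""


class ToyComponent:
    """Hypothetical component that records its own input connections."""

    def __init__(self):
        # input socket name -> list of sender component names
        self.input_senders = {}


class ToyPipeline:
    """Hypothetical pipeline; connection state lives on the components."""

    def __init__(self):
        self.components = {}

    def add_component(self, name, instance):
        self.components[name] = instance

    def connect(self, sender, receiver, socket):
        senders = self.components[receiver].input_senders.setdefault(socket, [])
        if senders:
            raise PipelineConnectError(
                f"Cannot connect '{sender}.{socket}' with '{receiver}.{socket}': "
                f"{receiver}.{socket} is already connected to {senders}."
            )
        senders.append(sender)


builder, generator = ToyComponent(), ToyComponent()

first = ToyPipeline()
first.add_component("builder", builder)
first.add_component("generator", generator)
first.connect("builder", "generator", "prompt")

# Re-creating only the pipeline, as happens when a notebook cell is re-run
# without re-creating the components it uses.
second = ToyPipeline()
second.add_component("builder", builder)
second.add_component("generator", generator)
try:
    second.connect("builder", "generator", "prompt")
    error_message = None
except PipelineConnectError as exc:
    error_message = str(exc)

print(error_message)
```

In this sketch the second `connect()` call fails even though `second` is a brand-new pipeline object, because the `generator` instance still remembers its earlier connection. Creating fresh `ToyComponent` instances before building `second` avoids the error.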

@bilgeyucel bilgeyucel added type:bug Something isn't working topic:pipeline 2.x Related to Haystack v2.0 labels Nov 20, 2023
@julian-risch
Member

We could decide not to raise an error if a user adds a connection that already exists in exactly that form. For example, when the user calls query_pipeline.connect("text_embedder", "retriever") multiple times, we could decide not to raise:
PipelineConnectError: Cannot connect 'text_embedder.embedding' with 'retriever.query_embedding': retriever.query_embedding is already connected to ['text_embedder'].
What I don't agree with is the part about "re-initializing the pipeline object". We're not really re-initializing here. We initialize a completely separate, second pipeline, and reusing components in multiple pipelines is not intended. For that reason, users need to initialize new components for the second pipeline. We could update the error message shown when users try to reuse components across multiple pipelines so that it says something like:
PipelineConnectError: Cannot connect 'text_embedder.embedding' with 'retriever.query_embedding': retriever.query_embedding is already connected to ['text_embedder'] in a different pipeline.
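
The two behaviors proposed in this comment could be sketched roughly as follows. This is a toy model, not Haystack code: `ToyComponent`, `ToyPipeline`, and the explicit socket arguments to `connect()` are hypothetical, and tracking the owning pipeline on each input socket is just one assumed way to tell "same pipeline" apart from "different pipeline".

```python
class PipelineConnectError(Exception):
    """Stand-in for the error discussed in this issue."""


class ToyComponent:
    """Hypothetical component that remembers who feeds each input socket."""

    def __init__(self):
        # input socket name -> (owning pipeline, sender name, sender socket)
        self.input_senders = {}


class ToyPipeline:
    """Hypothetical pipeline with an idempotent connect()."""

    def __init__(self):
        self.components = {}

    def add_component(self, name, instance):
        self.components[name] = instance

    def connect(self, sender, sender_socket, receiver, receiver_socket):
        sockets = self.components[receiver].input_senders
        existing = sockets.get(receiver_socket)
        if existing is None:
            sockets[receiver_socket] = (self, sender, sender_socket)
            return
        owner, old_sender, old_socket = existing
        if owner is self and (old_sender, old_socket) == (sender, sender_socket):
            # Exactly the same connection in the same pipeline: do nothing.
            return
        # Only mention another pipeline when that is really the cause.
        suffix = "" if owner is self else " in a different pipeline"
        raise PipelineConnectError(
            f"Cannot connect '{sender}.{sender_socket}' with "
            f"'{receiver}.{receiver_socket}': {receiver}.{receiver_socket} "
            f"is already connected to ['{old_sender}']{suffix}."
        )


text_embedder, retriever = ToyComponent(), ToyComponent()

query_pipeline = ToyPipeline()
query_pipeline.add_component("text_embedder", text_embedder)
query_pipeline.add_component("retriever", retriever)
query_pipeline.connect("text_embedder", "embedding", "retriever", "query_embedding")
# Repeating the identical call is a no-op instead of an error.
query_pipeline.connect("text_embedder", "embedding", "retriever", "query_embedding")

# Reusing the same instances in a second pipeline gets the clearer message.
other_pipeline = ToyPipeline()
other_pipeline.add_component("text_embedder", text_embedder)
other_pipeline.add_component("retriever", retriever)
try:
    other_pipeline.connect("text_embedder", "embedding", "retriever", "query_embedding")
    message = None
except PipelineConnectError as exc:
    message = str(exc)

print(message)
```

The sketch keeps the existing error for genuinely conflicting connections within one pipeline and only appends "in a different pipeline" when the stored owner differs from the pipeline being connected.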

@bilgeyucel
Contributor Author

I understand better now why re-initializing the pipeline doesn't work, thanks @julian-risch 🙌
In this case, I believe the ideal solution is not to raise the following error when users try to connect the same components in the same pipeline again, that is, when they call query_pipeline.connect("text_embedder", "retriever") twice:
PipelineConnectError: Cannot connect 'text_embedder.embedding' with 'retriever.query_embedding': retriever.query_embedding is already connected to ['text_embedder'].

However, if reusing components in different pipelines is not possible, the better error message would be the one you shared, explaining that these components are already connected in a different pipeline.

@TuanaCelik
Contributor

Is this issue being worked on? The challenges will see people try out code cells and re-run them. It's not ideal to be stuck with an error and have to restart the runtime.

@Timoeller
Contributor

From the first user interview it became very clear that developing connections in Colab is incredibly hard with this issue.
Please prioritize a fix.
