Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Utilize processing server proxy to mets servers #1220

Merged
merged 41 commits into from
Jun 7, 2024
Merged

Conversation

MehmedGIT
Copy link
Contributor

@MehmedGIT MehmedGIT commented May 3, 2024

Allow the Processing Server to accept general mets server-related TCP requests, and translate them to UDS requests. This feature is useful for cases when the worker is located on a remote host and wants to communicate with a UDS Mets Server. One main benefit of this approach is to avoid allocating separate ports for different mets servers.

This PR is still a draft and the implementation is still on a conceptual level. Please feel free to suggest any ideas.

Copy link
Collaborator

@bertsky bertsky left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I still don't fully understand: if the METS Server user (here: tcp_mets caller) is remote (relative to the workspace), then how is this useful? All the files that the METS updates will relate to will also be remote, and we concluded it's not a good design to squeeze file contents through the METS Server earlier.

src/ocrd_network/runtime_data/deployer.py Outdated Show resolved Hide resolved
@MehmedGIT
Copy link
Contributor Author

MehmedGIT commented May 17, 2024

I still don't fully understand: if the METS Server user (here: tcp_mets caller) is remote (relative to the workspace), then how is this useful? All the files that the METS updates will relate to will also be remote, and we #966 (review) earlier.

The idea is that the processing workers could send requests to any Mets server through the Processing Server without allocating separate ports per Mets server on the host where the Processing Server is running.

You are right that it will not work when the Processing Server (Mets Servers) and the Processing Workers are on different hosts with the current setup. You are also right that it has been decided to not transfer files over to the Mets Server as discussed in #966.

The forwarding through the Processing Server as a proxy is supposed to be used when:

  • there is a Network File System that is accessible by both the Mets Servers and Processing Workers
  • when running docker containers on the same host - to avoid extra allocation of ports for the Mets Servers

@joschrew, is there anything more to add that I have missed?

@joschrew
Copy link
Contributor

I imagine a setup where workers (and processing server) are on different vm's. The workspace is shared through NFS so that every processor has access to the same files. Currently this would not work with the Mets-Server as the unix-domain-sockets cannot be shared through NFS. With this PR it should be possible for workers on different vms to make requests to the Mets-Server.

@bertsky
Copy link
Collaborator

bertsky commented May 17, 2024

Thanks @MehmedGIT @joschrew for the explanation. Absolutely makes sense now – fantastic idea!

@joschrew
Copy link
Contributor

I did some commits. It is now working for me with processors in docker-containers. I still want to change little things and do additional tests. After that I would change the pr's status for reviews.

@MehmedGIT
Copy link
Contributor Author

I did some commits. It is now working for me with processors in docker-containers. I still want to change little things and do additional tests. After that I would change the pr's status for reviews.

Thanks.

@MehmedGIT
Copy link
Contributor Author

It is still unclear to me why one of the processors sent the request to /tcp_mets/workspace_path instead of /tcp_mets, leading to 404 errors. However, the error I was getting was related to an outdated ocrd_all local installation. All redirections work as expected with the latest version of ocrd_all and core.

@MehmedGIT MehmedGIT marked this pull request as ready for review June 4, 2024 12:05
@MehmedGIT MehmedGIT requested review from kba, joschrew and bertsky June 4, 2024 12:05
Copy link
Member

@kba kba left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, very thorough, ready for release

@kba kba merged commit 69c1ce6 into master Jun 7, 2024
21 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants