Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make dora cli connect to remote coordinator #513

Merged
merged 6 commits into from
May 30, 2024

Conversation

Gege-Wang
Copy link
Collaborator

@Gege-Wang Gege-Wang commented May 27, 2024

To make CLI coordinator connection configurable, such as
dora start dataflow.yaml --coordinator-addr IP
Allow users to pass a parameter when dora up, dora start, and dora list --coordinator-address to specify the address or the default address control_socket_addr().

To test this PR:

# In a remote terminal
dora up
# In a local terminal
dora list --coordinator-addr REMOTE_IP
dora start dataflow.yaml --coordinator-addr REMOTE_IP
dora logs [DATAFLOW] <NODE> --coordinator-addr REMOTE_IP
dora destroy --coordinator-addr REMOTE_IP
dora check --coordinator-addr REMOTE_IP

now, there are no CLI CI test to test it.

@Gege-Wang Gege-Wang force-pushed the multiple-daemon branch 2 times, most recently from 54169dd to af2798f Compare May 27, 2024 16:27
@Gege-Wang Gege-Wang changed the title Make dora cli connect to remote coordinatorol_nodes Make dora cli connect to remote coordinator May 28, 2024
@Gege-Wang Gege-Wang requested a review from XxChang May 28, 2024 09:36
Signed-off-by: Gege-Wang <[email protected]>

fix: localhost to control_socket_addr()
Copy link
Collaborator

@phil-opp phil-opp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! Looks good to me overall, just one small suggestion.

@XxChang Thanks for reviewing!

binaries/cli/src/main.rs Outdated Show resolved Hide resolved
@haixuanTao
Copy link
Collaborator

Before merging, can we add some CI test so that we know that this features works.

Something in the likes of adding a bash script within https://github.com/dora-rs/dora/blob/ea47a556a77f6c64f29ab07939d747edc8573671/.github/workflows/ci.yml with something like:

export IP="$(curl ipinfo.io/ip)"     
dora coordinator --addr $IP

dora list --coordinator-addr $IP

# ...

Thanks!

@haixuanTao
Copy link
Collaborator

Also, there is some Clippy warnings worth checking: https://github.com/dora-rs/dora/pull/513/files

@Gege-Wang
Copy link
Collaborator Author

Gege-Wang commented May 28, 2024

Before merging, can we add some CI test so that we know that this features works.

Something in the likes of adding a bash script within https://github.com/dora-rs/dora/blob/ea47a556a77f6c64f29ab07939d747edc8573671/.github/workflows/ci.yml with something like:

export IP="$(curl ipinfo.io/ip)"     
dora coordinator --addr $IP

dora list --coordinator-addr $IP

# ...

Thanks!

ok, I will add it!

@haixuanTao
Copy link
Collaborator

Could we try to find what need to be changed for the CI to work with external IP?

@XxChang
Copy link
Collaborator

XxChang commented May 29, 2024

Could we try to find what need to be changed for the CI to work with external IP?

How about creating a docker image for testing coordinator services?

@haixuanTao
Copy link
Collaborator

haixuanTao commented May 29, 2024

I think that before trying docker, we need to be able to run with the same machine, same environment, just with the external IP instead of localhost.

We can have docker container testing as a follow up PR.

Does it work on your local machine?

You can try with a docker container but it's probably not going to work.

Copy link
Collaborator

@phil-opp phil-opp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me, thanks a lot for your work!

Could we try to find what need to be changed for the CI to work with external IP?

@haixuanTao You mean substituting the 127.0.0.1 in the CI job with the external IP of the runner? We might be able to do that using the third-party haythem/public-ip action.

@haixuanTao
Copy link
Collaborator

If i try:

export IP="0.0.0.0:6013"                                                                                                          (base)  ✘ 130 remotes/gege/multiple
dora coordinator --addr $IP &

dora list --coordinator-addr $IP

I'm having this warning issue:

  2024-05-29T14:20:56.596707Z  WARN dora_coordinator::listener: failed to deserialize node message

Caused by:
    invalid type: string "List", expected struct Timestamped at line 1 column 6

Location:
    /home/peter/Documents/work/dora/binaries/coordinator/src/listener.rs:38:48
    at binaries/coordinator/src/listener.rs:41

@phil-opp
Copy link
Collaborator

Could we try to find what need to be changed for the CI to work with external IP?

@haixuanTao You mean substituting the 127.0.0.1 in the CI job with the external IP of the runner? We might be able to do that using the third-party haythem/public-ip action.

The public IP is probably not reachable this way because of firewalls, NATs, etc. So we decided to postpone the creation of a proper CI test for now.

@Gege-Wang
Copy link
Collaborator Author

  2024-05-29T14:20:56.596707Z  WARN dora_coordinator::listener: failed to deserialize node message

Caused by:
    invalid type: string "List", expected struct Timestamped at line 1 column 6

Location:
    /home/peter/Documents/work/dora/binaries/coordinator/src/listener.rs:38:48
    at binaries/coordinator/src/listener.rs:41

dora coordinator --addr $IP1 &
dora list --coordinator-addr $IP2
I think $IP1 is not same as $IP2, $IP1 is the coordinator port, which is default DORA_COORDINATOR_PORT_DEFAULT. IP2 is the coordinator control port(default control_socket_addr()), almost all of the cli command create session using control port, but the coordinator task using the hardcode control_socket_addr(), we haven't make it configurable.

dora/binaries/coordinator/src/lib.rs
    let control_events = control::control_events(control_socket_addr(), tasks)
        .await
        .wrap_err("failed to create control events")?;

@haixuanTao
Copy link
Collaborator

haixuanTao commented May 29, 2024

I see. So, basically the port is not configurable yet. In that case, can we only specify IP address in the cli configuration. To avoid people using the PORT and realising it can´t be used.

@Gege-Wang
Copy link
Collaborator Author

I have tried in EC2 as the following:

#  EC2
ubuntu@ip-172-31-30-172:~/dora/test_rust_project$ dora up
started dora coordinator
started dora daemon

and local

# local
# check the EC2.   (right)
dora check --coordinator-addr 54.234.111.197:6012
Dora Coordinator: ok
Dora Daemon: ok

# check the default coordinator-adds.   (right)
$ dora check                                              1 ↵
Dora Coordinator: not running
Dora Daemon: not running

Environment check failed.

now, the control_socket_addr() is still 127.0.0.1:6012, but we can still connect to remote coordinator. I cannot understand, but it works, I think it should be 0.0.0.0:6012, but not.

dora/binaries/coordinator/src/lib.rs
    let control_events = control::control_events(control_socket_addr(), tasks)
        .await
        .wrap_err("failed to create control events")?;

@XxChang , yes, there are also some problem about dora start.

dora start dataflow.yaml --coordinator-addr 54.234.111.197:6012
Failed to read yaml dataflow: failed to open given file: No such file or directory (os error 2)

@Gege-Wang
Copy link
Collaborator Author

I see. So, basically the port is not configurable yet. In that case, can we only specify IP address in the cli configuration. To avoid people using the PORT and realising it can´t be used.

ok, I have done this in ce2b02f

@phil-opp
Copy link
Collaborator

dora coordinator --addr $IP1 & dora list --coordinator-addr $IP2 I think $IP1 is not same as $IP2, $IP1 is the coordinator port, which is default DORA_COORDINATOR_PORT_DEFAULT. IP2 is the coordinator control port(default control_socket_addr()), almost all of the cli command create session using control port, but the coordinator task using the hardcode control_socket_addr(), we haven't make it configurable.

Yes, exactly. The coordinator listens on two different ports:

  • On DORA_COORDINATOR_PORT_DEFAULT (0xD02A = 53290) for daemon<->coordinator connections.
    • This port be configured using dora coordinator --addr 0.0.0.0:<port>
    • The daemon needs to be started with the remote IP and port of the coordinator: dora list --coordinator-addr <ip:port>
  • On control_socket_addr (6012) for cli<->coordinator connections.
    • This port is not configurable yet on the coordinator side.
    • This PR adds a argument to the CLI commands to specify how the CLI can reach the coordinator.

@phil-opp
Copy link
Collaborator

I cannot understand, but it works, I think it should be 0.0.0.0:6012, but not.

Yes, the coordinator should listen on 0.0.0.0:6012 instead of 127.0.0.1:6012. The daemon should still connect to 127.0.0.1:6012 by default.

@haixuanTao haixuanTao merged commit f50478b into dora-rs:main May 30, 2024
16 of 17 checks passed
@haixuanTao
Copy link
Collaborator

Thanks! Great work!

@phil-opp
Copy link
Collaborator

phil-opp commented May 30, 2024

Yes, the coordinator should listen on 0.0.0.0:6012 instead of 127.0.0.1:6012. The daemon should still connect to 127.0.0.1:6012 by default.

I implemented this in #520, among some other improvements.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants