Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

coordinator crashes on bad logs access #648

Closed
Hennzau opened this issue Sep 9, 2024 · 2 comments · Fixed by #650
Closed

coordinator crashes on bad logs access #648

Hennzau opened this issue Sep 9, 2024 · 2 comments · Fixed by #650
Assignees
Labels
bug Something isn't working coordinator

Comments

@Hennzau
Copy link
Collaborator

Hennzau commented Sep 9, 2024

Describe the bug

When I try to view logs using dora logs, the coordinator crashes if I input incorrect information (e.g., an invalid UUID or name). The same happens when there is an error, such as a wrong node ID or when multiple dataflows have the same name.

Environment (please complete the following information):

  • System: macOS
  • Dora version: tag v0.3.6
@Hennzau Hennzau added bug Something isn't working coordinator labels Sep 9, 2024
@Hennzau
Copy link
Collaborator Author

Hennzau commented Sep 10, 2024

The same issue occurs with some other commands (dora stop [uuid] with an invalid UUID). This is due to the coordinator control event loop.

ControlRequest::Logs { uuid, name, node } => {
    let dataflow_uuid = if let Some(uuid) = uuid {
        uuid
    } else if let Some(name) = name {
        resolve_name(name, &running_dataflows, &archived_dataflows)?
    } else {
        bail!("No UUID provided")
    };

    let reply = retrieve_logs(
        &running_dataflows,
        &archived_dataflows,
        dataflow_uuid,
        node.into(),
        &mut daemon_connections,
        clock.new_timestamp(),
    )
    .await
    .map(ControlRequestReply::Logs);
    let _ = reply_sender.send(reply);
}

I don't think we should stop the coordinator if we can't resolve the dataflow UUID. I'll wrap it so it just sends an error reply instead.

Now it looks like that:

ControlRequest::Logs { uuid, name, node } => {
let dataflow_uuid = if let Some(uuid) = uuid {
    Ok(uuid)
} else if let Some(name) = name {
    resolve_name(name, &running_dataflows, &archived_dataflows)
} else {
    Err(eyre!("No uuid"))
};

match dataflow_uuid {
    Ok(uuid) => {
        let reply = retrieve_logs(
            &running_dataflows,
            &archived_dataflows,
            uuid,
            node.into(),
            &mut daemon_connections,
            clock.new_timestamp(),
        )
        .await
        .map(ControlRequestReply::Logs);
        let _ = reply_sender.send(reply);
    }
    Err(err) => {
        let _ = reply_sender.send(Err(err));
    }
}

And in case of bad logs name/uuid it looks like that:

[ERROR]
unexpected reply to daemon logs: Error("no dataflow with name `test`")

@haixuanTao
Copy link
Collaborator

Yes. I agree that this should not happened. And this looks good!

Could you open a PR?

Thanks in advance :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working coordinator
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants