
chore: log enhancement for message reliability analysis #2640

Merged
Ivansete-status merged 10 commits into master from enhance-logs on May 1, 2024

Conversation

Ivansete-status
Collaborator

@Ivansete-status Ivansete-status commented Apr 28, 2024

Description

This PR touches the following modules:

  • waku_node.nim
  • archive.nim
  • waku_filter_v2/protocol.nim
  • waku_lightpush/protocol.nim
  • waku_relay/protocol.nim

Issue

Add logging of hashes to all nodes - #2474


github-actions bot commented Apr 28, 2024

You can find the image built from this PR at

quay.io/wakuorg/nwaku-pr:2640-rln-v1

Built from c1793ad


github-actions bot commented Apr 28, 2024

You can find the image built from this PR at

quay.io/wakuorg/nwaku-pr:2640-rln-v2

Built from c1793ad

@Ivansete-status Ivansete-status marked this pull request as ready for review April 28, 2024 19:04
Contributor

@gabrielmer gabrielmer left a comment


Uff amazing! This will help a lot!
Thanks so much 😍

waku/waku_relay/protocol.nim (outdated, resolved)
Contributor

@NagyZoltanPeter NagyZoltanPeter left a comment


LGTM thank you for it.

-    hash = messagePush.pubsubTopic.computeMessageHash(messagePush.wakuMessage).to0xHex()
+    target_peer_ids = peers.mapIt(shortLog(it)),
+    msg_hash =
+      messagePush.pubsubTopic.computeMessageHash(messagePush.wakuMessage).to0xHex()
Contributor


The hash computation might be called twice per log statement. There is a processing flow in chronicles that evaluates the message arguments twice. It is better to do it as in line 214 of this file.
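
To illustrate the concern, here is a minimal sketch in plain Nim. It does not use chronicles: `logTwice` is a hypothetical template standing in for any logging macro that expands an argument expression more than once, and `computeHash` is a hypothetical stand-in for the real hash call. Binding the hash to a `let` first means the work is done once, however the macro expands.

```nim
var calls = 0 # counts how often the hash is computed

proc computeHash(payload: string): string =
  ## Hypothetical stand-in for computeMessageHash(...).to0xHex().
  inc calls
  "0x" & $(payload.len)

template logTwice(key: string, value: untyped) =
  ## Hypothetical logging template (not chronicles): `value` is pasted at
  ## every use site, so the expression runs once per use.
  let first = value
  let second = value
  echo key, "=", first, " (same on both uses: ", first == second, ")"

# Passing the call directly: the hash is computed once per expansion.
logTwice("msg_hash", computeHash("payload"))
let callsInline = calls

# Binding it first: the hash is computed once, then only the value is reused.
calls = 0
let msgHash = computeHash("payload")
logTwice("msg_hash", msgHash)
let callsBound = calls
```

Whether chronicles really expands arguments this way is taken from the comment above; the sketch only shows why a precomputed `let` is the safer pattern in that case.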

@@ -228,16 +232,20 @@ proc handleMessage*(
   if not await wf.pushToPeers(subscribedPeers, messagePush).withTimeout(
     MessagePushTimeout
   ):
-    debug "timed out pushing message to peers",
+    info "timed out pushing message to peers",
Contributor


Isn't it an error? Or do we just want to notice this?

Collaborator Author


Isn't it an error? Or do we just want to notice this?

ah yes good point! I will change it to error

@SionoiS
Contributor

SionoiS commented Apr 29, 2024

I understand the request, since Nwaku logs are unusable, BUT this "fix" will make it worse. We need to improve the logs, not accommodate specific readers.

I disagree with this PR. We need clear guidelines and a concrete plan to improve the logs. Let's discuss this, please!

     pubsubTopic = pubsubTopic,
     contentTopic = message.contentTopic,
-    peer = peer.peerId
+    target_peer_id = peer.peerId,
+    msg_hash = pubsubTopic.computeMessageHash(message).to0xHex()
   return await node.wakuLightpushClient.publish(pubsubTopic, message, peer)

 if not node.wakuLightPush.isNil():
   debug "publishing message with self hosted lightpush",


Is this supposed to be debug?

Because in the "publishing message with lightpush" case, it is being logged at info level.

Collaborator Author


As per the previous conversations we had in the PM meeting, I will move all info logs back to debug.

Contributor

@chaitanyaprem chaitanyaprem left a comment


A few comments

@@ -914,7 +914,7 @@ proc mountLightPush*(

 if publishedCount == 0:
   ## Agreed change expected to the lightpush protocol to better handle such case. https://github.com/waku-org/pm/issues/93
-  debug "Lightpush request has not been published to any peers"
+  info "Lightpush request has not been published to any peers"
Contributor


Shouldn't we log the hash here? It would be hard to trace otherwise with many messages going in parallel.

Collaborator Author


Thanks, I will submit a change with that.


let insertStartTime = getTime().toUnixFloat()

(await self.driver.put(pubsubTopic, msg, msgDigest, msgHash, msgTimestamp)).isOkOr:
waku_archive_errors.inc(labelValues = [insertFailure])
debug "failed to insert message", err = error

info "message archived",
Contributor


One suggestion I have: rather than making all of these info logs, can we put them behind a compile flag or some config option that enables such messages for debugging/tracing? I remember Jakubs mentioning that something similar already exists in nwaku, which they enable in status fleets for tracing.
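
As a rough sketch of the compile-flag idea, plain Nim can gate trace output on a `booldefine` constant. The names `messageTracing`, `traceMessage` and `demoHash` are hypothetical and are not taken from nwaku or chronicles; the mechanism already used in the status fleets may work differently.

```nim
# Off by default; compile with `-d:messageTracing=true` to enable.
const messageTracing {.booldefine.} = false

var hashCalls = 0 # counts how often the (stand-in) hash is computed

proc demoHash(payload: string): string =
  ## Hypothetical stand-in for the real message hash computation.
  inc hashCalls
  "0x" & $(payload.len)

template traceMessage(line: untyped) =
  ## Emits the line only when tracing was enabled at compile time.
  ## When disabled, `line` is never expanded, so its cost disappears too.
  when messageTracing:
    echo "TRACE ", line

traceMessage("message archived msg_hash=" & demoHash("hello"))
```

With the flag off, the hash in the argument is not computed at all, which keeps the tracing free in normal builds. The trade-off is that enabling it needs a rebuild, whereas a runtime config option would not.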

@Ivansete-status Ivansete-status merged commit d5e0e4a into master May 1, 2024
26 of 30 checks passed
@Ivansete-status Ivansete-status deleted the enhance-logs branch May 1, 2024 08:25
@Ivansete-status
Collaborator Author

@NagyZoltanPeter - for some unknown reason I couldn't change a single line of waku_lightpush/protocol.nim, because doing so caused a SIGSEGV crash in the macOS tests. Did you face a similar issue?

@NagyZoltanPeter
Contributor

@NagyZoltanPeter - for some unknown reason I couldn't change a single line of waku_lightpush/protocol.nim, because doing so caused a SIGSEGV crash in the macOS tests. Did you face a similar issue?

Hm... the issue I had with the macOS tests I found in my PR and fixed. Nothing particular. But which change do you mean here?

@Ivansete-status
Collaborator Author

@NagyZoltanPeter - for some unknown reason I couldn't change a single line of waku_lightpush/protocol.nim, because doing so caused a SIGSEGV crash in the macOS tests. Did you face a similar issue?

Hm... the issue I had with the macOS tests I found in my PR and fixed. Nothing particular. But which change do you mean here?

Sorry for the late response. The following PR shows how the test fails on macOS due to the changes in waku_lightpush/protocol.nim: #2655
In that case there are many changes, but it fails even with just a few log lines :)
