convert youki to use tracing #1899
Codecov Report
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1899      +/-   ##
==========================================
- Coverage   67.36%   67.19%   -0.17%
==========================================
  Files         126      126
  Lines       14265    14287      +22
==========================================
- Hits         9609     9600       -9
- Misses       4656     4687      +31
@utam0k A few clarifying questions.
eng@main:~/youki$ sudo make hack/bpftrace
BPFTRACE_STRLEN=125 ./hack/debug.bt
./hack/debug.bt:14:12-14: WARNING: Addrspace mismatch
if ($s != "\n") {
~~
Attaching 13 probes...
ERROR: Could not resolve symbol: /proc/self/exe:BEGIN_trigger
make: *** [Makefile:125: hack/bpftrace] Error 255
eng@main:~/youki$ bpftrace --version
bpftrace v0.14.0
eng@main:~/youki$
Edit: Installing the latest bpftrace fixes the issue. When we are ready, we should provide detailed instructions on how to install and use bpftrace.
Second, can you elaborate more on how you are using
Lastly, I suspect logging to stderr may resolve this issue if you did not see log lines before this PR. I made the change.
Signed-off-by: yihuaf <[email protected]>
Signed-off-by: yihuaf <[email protected]>
I am able to reproduce this now. First, the old behavior is the same as the new one: the log line is slightly different, but the overall length is similar. In fact, the tracing in this PR produces a shorter line, since the file path and line number are not turned on by default. Therefore, I believe this is a separate issue, not related to logging or tracing.
As for the cause, it seems that bpftrace is cutting off the long lines. I can confirm that my shell window wraps the line correctly, so it is bpftrace that truncates them. This is likely due to this issue: bpftrace/bpftrace#305
My 2 cents: bpftrace is tracking the write syscalls. We don't actually care that log lines are truncated here, because we care more about which syscalls are called. It may even be a feature, because a log line can potentially be really long. For example, we may decide to log the content of a large struct in case of an error. So no matter how short we make the log line, some lines will always be truncated.
Fixed
Signed-off-by: yihuaf <[email protected]>
@yihuaf bpftrace is probably the most usable way to debug issues in Kubernetes at the moment. I am not sure how to do it any other way, because it is too deep in the hierarchy below kubelet, which makes it difficult. I am also interested in your opinion. It would be great to get some color on this change, and I would love to merge this PR.
This is a good point.
Oh, this is how you are using
Signed-off-by: yihuaf <[email protected]>
I turned off the timestamp for the text format to stderr. This should produce a shorter log line with log level, log target, and the actual log line. I left the other format unchanged for now. Let me know if you think we should remove timestamp for other cases as well (JSON and logging to file).
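To make the effect of dropping the timestamp concrete, here is a minimal sketch of a simplified model, not bpftrace's or youki's actual behavior. It assumes a tracer that keeps at most a fixed number of bytes per captured write, loosely mirroring the `BPFTRACE_STRLEN=125` setting seen in the Makefile output earlier in this thread. The `captured` and `log_line` helpers and the exact line layout are hypothetical and only illustrate the reasoning: a shorter prefix leaves more room for the message, but a long enough message is still cut off.

```rust
// Simplified model of a tracer that keeps at most `limit` bytes of each
// captured string. This is an assumption for illustration only.
fn captured(line: &str, limit: usize) -> &str {
    if line.len() <= limit {
        return line;
    }
    // Back off to a UTF-8 character boundary so the slice stays valid.
    let mut end = limit;
    while !line.is_char_boundary(end) {
        end -= 1;
    }
    &line[..end]
}

// Hypothetical log-line layout: optional timestamp, then level, target, message.
fn log_line(timestamp: Option<&str>, level: &str, target: &str, message: &str) -> String {
    match timestamp {
        Some(ts) => format!("{ts} {level} {target}: {message}"),
        None => format!("{level} {target}: {message}"),
    }
}
```

Calling `captured(&log_line(None, "DEBUG", "youki", msg), 125)` keeps more of `msg` than the same call with a timestamp, yet both results are capped at 125 bytes once `msg` is long enough.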
Normally, this is not too much of an issue, because we can make a long log line more human readable with nice formatting. In fact, the tracing crate has a really nice
However, in this specific case, we are using
In my view, assuming debugging is your only concern, the long-term solution is to use something more appropriate for viewing tracing/logging. For example, we can configure the trace/log to write to journald or syslog. Then, by observing syslog or journald or using
As a more concrete example, the tracing crate provides a https://docs.rs/tracing-journald/latest/tracing_journald/. We can use this to duplicate the tracing data to journald. The caveat is that this behavior is beyond what
For now, let's keep it simple and see if the current setup works for us. But the real superpower of tracing is to log as much information as possible, along with metrics. It can be pretty powerful beyond just logging, and the amount of logs/data entries generated from tracing will become too much for
@yihuaf I agree with you, logging with bpftrace is pretty hacky. Ideally it should be turned off, and it would be best if we could do that in the future, but for now this debugging method is necessary.
Thanks a lot 🙏
Agreed. We will work towards that.
This PR converts youki to use the tracing crate for logging. I did the minimum needed to make the transition, but the tracing crate allows for much richer logging and instrumentation. This is related to #1348 in that it would be a prerequisite for implementing open-telemetry.
In addition, the reason I decided to do this now is that error handling can be difficult without tracing/logging support. In general, the audience for an error is either the user/operator (a human) or code that handles it. It is important to record as much of the context of an error as possible. There are usually 2 choices: either we store the context of the error in the error itself, or we log the context and keep the error size small. There is no right answer, and usually we need a combination of both. The tracing crate allows us to minimize the context we have to store inside the error itself and opt for rich logging instead.
Note, I tried to keep the change minimal, but switching logging infrastructure does have a wide impact. Hopefully this is not too disruptive to people.
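The two choices described above can be sketched roughly as follows. This is an illustrative example using only the standard library, not code from this PR: the `RichError` and `SmallError` types, the `read_rich` and `read_small` functions, and the `log_error` helper are all hypothetical names. `log_error` stands in for a real logging call such as the `tracing::error!` macro.

```rust
use std::fmt;
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

// Choice 1: the error itself carries the context (larger error type).
#[derive(Debug)]
struct RichError {
    path: PathBuf,
    source: io::Error,
}

impl fmt::Display for RichError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        write!(f, "failed to read {}: {}", self.path.display(), self.source)
    }
}

fn read_rich(path: &Path) -> Result<String, RichError> {
    fs::read_to_string(path).map_err(|source| RichError {
        path: path.to_path_buf(),
        source,
    })
}

// Choice 2: the error stays small; the context is logged where the failure occurs.
#[derive(Debug, PartialEq)]
enum SmallError {
    ReadFailed,
}

// Hypothetical stand-in for a logging call such as `tracing::error!`.
fn log_error(message: &str) {
    eprintln!("ERROR {message}");
}

fn read_small(path: &Path) -> Result<String, SmallError> {
    fs::read_to_string(path).map_err(|err| {
        log_error(&format!("failed to read {}: {}", path.display(), err));
        SmallError::ReadFailed
    })
}
```

With the first style, a caller can inspect `path` and `source` programmatically. With the second, the caller only sees `SmallError::ReadFailed`, and the operator reads the details in the log output.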