Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

convert youki to use tracing #1899

Merged
merged 4 commits into from
May 12, 2023
Merged

convert youki to use tracing #1899

merged 4 commits into from
May 12, 2023

Conversation

yihuaf
Copy link
Collaborator

@yihuaf yihuaf commented May 10, 2023

This PR converts youki to use tracing crate for logging. I did the minimal to make the transition, but tracing crate allows for much richer logging and instrumentation. This is related to #1348 in that this would be a pre-requirement to implement open-telemetry.

In addition, the reason I decide to do this now because error handling can be difficult without tracing/logging support. In general with error handling, the audience are either the user/operator (human) or for code to handle. It is important to log as much as possible of the context of an error. There are usually 2 choices. Either we store the context of the error into the error itself, or we log the context and keep the error size small. There is no right answer and usually we need a combination of both. tracing crate allows us to minimize the context we have to store inside the error itself, and instead opting for rich logging.

Note, I tried to keep the change minimal, but switching logging infrastructure does have a wide impact. Hopefully this is not too disruptive to people.

@yihuaf yihuaf self-assigned this May 10, 2023
@yihuaf yihuaf requested review from utam0k and a team May 10, 2023 05:18
@codecov-commenter
Copy link

codecov-commenter commented May 10, 2023

Codecov Report

Merging #1899 (415dee8) into main (69efb97) will decrease coverage by 0.17%.
The diff coverage is 42.99%.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1899      +/-   ##
==========================================
- Coverage   67.36%   67.19%   -0.17%     
==========================================
  Files         126      126              
  Lines       14265    14287      +22     
==========================================
- Hits         9609     9600       -9     
- Misses       4656     4687      +31     

Copy link
Member

@utam0k utam0k left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When debugging with bpftrace, there is a length limit to display output. So, how about deleting the timestamps?
image

@yihuaf
Copy link
Collaborator Author

yihuaf commented May 10, 2023

@utam0k A few clarifying question.

First, I can't run make hack/bpftrace:

eng@main:~/youki$ sudo make hack/bpftrace 
BPFTRACE_STRLEN=125 ./hack/debug.bt
./hack/debug.bt:14:12-14: WARNING: Addrspace mismatch
    if ($s != "\n") {
           ~~
Attaching 13 probes...
ERROR: Could not resolve symbol: /proc/self/exe:BEGIN_trigger
make: *** [Makefile:125: hack/bpftrace] Error 255
eng@main:~/youki$ bpftrace --version
bpftrace v0.14.0
eng@main:~/youki$ 

I am using Ubuntu 22. Do you have any suggestions?

Edit: Install the latest bpftrace fix the issue. When we are ready, we should provide detailed instruction to install and use bpftrace.

Second, can you elaborate more on how you are using bpftrace? Is tracing/logging important as part of the output during bpftrace? I ask because timestamp is an important part of logging and tracing, so I prefer not to turn them off except only in specific situation. So if we don't care about logging during bpftrace, we can turn off the log or log to file instead. Also, is the display output a limit of the bpftrace? Is there a way to work around the issue in bpftrace?

Lastly, I suspect logging to stderr may resolve this issue if you did not see log lines before this PR. I made the change.

@yihuaf
Copy link
Collaborator Author

yihuaf commented May 10, 2023

I am able to reproduce this now. First, the old behavior is the same as the new, with the log line being slightly different, but the overall length is similar. In fact, the tracing in this PR produced a shorter line since the file path and line number is not turned on by default. Therefore, I believe this is a separate issue not related to logging or tracing.

To resolve this, it seems that bpftrace is omitting the long lines. I can confirm that my shell window will wrap the line correctly. So bpftrace is cutting off the lines. Likely this is due to this issue: bpftrace/bpftrace#305

My 2cents. The bpftrace is tracking the write syscalls. We actually don't care that log lines are truncated here because we care more what syscall are called. It may even be a feature here because the log line can potentially be really long. For example, we may decide to log the content of a large struct in case of an error. So no matter how short we make the log line, it will always be truncated.

@yihuaf
Copy link
Collaborator Author

yihuaf commented May 10, 2023

The integration test is failing. I suspect this has to do with the tempfile change, not this PR. I will investigate.

Fixed

Signed-off-by: yihuaf <[email protected]>
@yihuaf yihuaf requested a review from utam0k May 11, 2023 03:09
@utam0k
Copy link
Member

utam0k commented May 11, 2023

@yihuaf bpftrace is probably currently the most usable way to fix bugs in kubernetes. I am not sure how to do it with other ways, because it is too deep in the hierarchy from kubelet, and it is difficult to do.
I would only need to be able to erase the timestamp part. At least the year, month, and date would not be needed. The container runtime does not work with this timescale.
I don't think it would be desirable to make the logs longer simply from the standpoint of readability.

I am also interested in your opinion. It would be great to get some color on this change, and I would love to merge this PR.

@yihuaf
Copy link
Collaborator Author

yihuaf commented May 11, 2023

The container runtime does not work with this timescale.

This is a good point.

bpftrace is probably currently the most usable way to fix bugs in kubernetes. I am not sure how to do it with other ways, because it is too deep in the hierarchy from kubelet, and it is difficult to do.

Oh, this is how you are using bpftrace. This makes sense.

@yihuaf
Copy link
Collaborator Author

yihuaf commented May 11, 2023

I turned off the timestamp for the text format to stderr. This should produce a shorter log line with log level, log target, and the actual log line. I left the other format unchanged for now. Let me know if you think we should remove timestamp for other cases as well (JSON and logging to file).

I don't think it would be desirable to make the logs longer simply from the standpoint of readability.

Normally, this is not too much of an issue because we can make a long log line to be more human readable with nice formatting. In fact, the tracing crate has a really nice pretty formatter to make the logs more readable. It breaks the long line into multiple lines and adds coloring to the log line.

However, in this specific case, we are using bpftrace. From this perspective, the log line is a single write syscall with a long line. It will also cut off at 225 ish char due to internal limits. This limits makes sense because bpftrace is not designed to read logs. It is designed to capture the syscall trace. In another word, you are hacking bpftrace as a log collector for youki, but I can see this is a nice way to debug youki under kubelet. Please correct me if I am misunderstanding you.

In my view, assume debugging is your only concern, the long term solution is to use something more appropriate to view tracing/logging. For example, we can configure the trace/log to write to journald or syslog. Then, by observing syslog or journald or using journalctl, we can view the logs from youki in a similar way on how you use bpftrace right now. Of course, there are other tracing viewers as well but they all work in similar ways. I believe these days, we can even ship the tracing to something like datadog if we really wants.

As a more concrete example, tracing crate provides a https://docs.rs/tracing-journald/latest/tracing_journald/. We can use this to duplicate the tracing data to journald. The caveat is that this behavior is beyond what runc can do, so how do we expose this behavior? Can we add an extra --log-journald or similar flag for this behavior?

For now, let's keep it simple and see this the current setup works for us. But, the real super power of tracing is to log as much information as possible along with metrics. It can be pretty powerful beyond just logging and the amount of logs/data entry generated from tracing will become too much for bpftrace to handle at some point. We are not there now, but we slowly will.

@utam0k
Copy link
Member

utam0k commented May 12, 2023

@yihuaf I agree with you, logging with bpftrace is pretty hacky. It should be turned off. It would be best if it could be turned off in the future, but for now this debugging method is necessary.
However, if OpenTelemetry tracing could be placed anywhere and the sampling rate could be set to 100%, I think it could be turned off.

Copy link
Member

@utam0k utam0k left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks a lot 🙏

@utam0k utam0k merged commit dcc13ff into youki-dev:main May 12, 2023
@yihuaf yihuaf deleted the yihuaf/tracing branch May 12, 2023 15:09
@yihuaf
Copy link
Collaborator Author

yihuaf commented May 12, 2023

However, if OpenTelemetry tracing could be placed anywhere and the sampling rate could be set to 100%, I think it could be turned off.

Agreed. We will work towards that.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants