Optimize /proc traversal (again) #497
8c3ed08 to eeb96d8
@tomwilkie @peterbourgon please take a look at this when you have some time. The only change since #450 is the cache.go file.
	creation time.Time
}

func (fce *filesCacheEntry) Pinned() bool { return fce.contents != nil }
Thanks for the change. Can we have a unit test which fails using bluele/gcache and passes using hashicorp/golang-lru? Otherwise, it's hard to have confidence this is a fix, and hard to prevent regressions. Also, let's avoid introducing the tech debt of removing bluele/gcache from other packages, by proactively eliminating the dependency throughout the repo, and making the required changes in this PR.
@inercia please give me a chance to test & review this before submitting. I should be able to look at it first thing next week.
@inercia I'm happy to remove the use of gcache in ids.go as part of this PR, if it helps.
@peterbourgon The problem with adding a unit test is that I never had a problem with […] @tomwilkie I will wait for an LGTM from you.
@tomwilkie @peterbourgon would you like to replace gcache in this PR or in a different one?
I would much rather replace gcache in this PR than another one, to avoid creating technical debt. I imagine @tomwilkie will have exactly the opposite opinion, to avoid conflating multiple concerns into a single PR.
@peterbourgon The last two commits remove all the references to […]
	return "", errNotFound
entry, ok := val.(*resolverCacheEntry)
if !ok {
	panic(fmt.Sprintf("unknown entry type in cache: %+v", val))
There are too many races / leaks in this PR. Going forward I'd drop all the caching, and focus on making a single walk of the proc filesystem, gathering both processes and connections at the same time, and porting the existing procspy code onto that. We can potentially add caching on top of that.
@tomwilkie the usage pattern for this code would be to call the […] Caching file handles and contents reduces the number of syscalls dramatically, and that is a main objective of this PR. I will fix the second […]
I've just realized that moving the […]
a1ea316 to 21446c5
Please take another look @tomwilkie
)

type filesCacheEntry struct {
	file File
What's the motivation for a separate proc and process directory in probe?
The […]
Yes please merge these two directories, keeping the original directory.
3fd37ab to 86595eb
…`probe`.
Cache processes and connections when reading from the /proc
Keep a cache of open files, reducing the number of open/close cycles in the /proc dir
Also cache some file contents
processes in the /proc directory by the name.
I've merged the […] I've also added an experimental feature where the Linux […] @peterbourgon or @tomwilkie, would you mind taking a look at this?
Alvaro - I've had a think about this PR, and a chat with Peter, and I think it's best if we take a different approach. This is a big change to the very core of Scope, and the risks associated with getting it wrong are large. I want you to avoid rewriting large swathes of procspy as the current code is mature, and its behaviour is sometimes subtle. So instead of proceeding with this PR, I suggest we take a much more incremental approach:
From there we can evaluate other optimisations one PR at a time. Please avoid needless renaming to make this process go faster. Thank you for working on this, we do appreciate your contribution here.
No problem, I'll change the code in several PRs. I'll close this PR then...
This is the same code found in #450, but replacing the caching library. This should fix #486.

In addition to the changes in #450, this PR also caches some file contents. In particular, it caches `/proc/<pid>/[comm, cmdline]`, as they do not change during the lifetime of a process (this is not always true, but I don't think we should care about these corner cases).

From the original #450 description:

The objective of this PR is to reduce the number of syscalls when collecting info from `/proc`. This PR does that by:

- […] `/proc/*/[cmdline, comm, stat, fd/*]`, `/proc/net/tcp*`) once per cycle (`spy.interval`), parsing and caching the processes/connections information
- […] `/proc/*/[cmdline, comm, stat]` so we do not `open()`/`close()` these files so often.
- […] `stat()`s when reading `/proc` directories

Changes:

- `/proc` has been moved from `procspy` to `/probe/proc`
- `Walker` has been replaced by a `proc.Reader` that provides both `Processes()` and `Connections()`. Users can walk the list of processes/connections in the same way by passing a function to these methods. A `CachingReader` can be used for caching and getting this info, and is refreshed when `Update()` is called. The code for reading and parsing connections and processes has been simplified.
- `CachingReader` is initialized in `/probe/main.go` and used in both `endpoint.Reporter` and `process.Reporter` for reporting about processes and connections, and updated periodically by calling `Update()` every `spy.interval` seconds.
- `Proc` and `Process` have been unified, and many things from `/probe/process` have been moved to `/probe/proc`

This is part of the solution to #284
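
For readers without the diff at hand, the reader/caching-reader split described above might look roughly like the sketch below. This is not the code from the PR: the `Process` and `Connection` fields, the method signatures, `NewCachingReader` and the `staticReader` fake are all assumptions made for illustration. Only the names `Reader`, `CachingReader`, `Processes()`, `Connections()` and `Update()` come from the description.

```go
package main

import (
	"fmt"
	"sync"
)

// Process and Connection are simplified, hypothetical stand-ins.
type Process struct {
	PID  int
	Comm string
}

type Connection struct {
	PID        int
	LocalAddr  string
	RemoteAddr string
}

// Reader walks processes and connections, calling f once per item.
// The signatures are assumed, not taken from the PR.
type Reader interface {
	Processes(f func(Process)) error
	Connections(f func(Connection)) error
}

// CachingReader serves walks from a snapshot refreshed only by Update.
type CachingReader struct {
	source Reader
	mu     sync.RWMutex
	procs  []Process
	conns  []Connection
}

func NewCachingReader(source Reader) *CachingReader {
	return &CachingReader{source: source}
}

// Update re-reads the source and swaps in a new snapshot.
// On error the previous snapshot is kept.
func (c *CachingReader) Update() error {
	var procs []Process
	var conns []Connection
	if err := c.source.Processes(func(p Process) { procs = append(procs, p) }); err != nil {
		return fmt.Errorf("reading processes: %w", err)
	}
	if err := c.source.Connections(func(cn Connection) { conns = append(conns, cn) }); err != nil {
		return fmt.Errorf("reading connections: %w", err)
	}
	c.mu.Lock()
	c.procs, c.conns = procs, conns
	c.mu.Unlock()
	return nil
}

// Processes walks the cached processes without touching the source.
func (c *CachingReader) Processes(f func(Process)) error {
	c.mu.RLock()
	procs := c.procs // snapshots are replaced, never mutated in place
	c.mu.RUnlock()
	for _, p := range procs {
		f(p)
	}
	return nil
}

// Connections walks the cached connections without touching the source.
func (c *CachingReader) Connections(f func(Connection)) error {
	c.mu.RLock()
	conns := c.conns
	c.mu.RUnlock()
	for _, cn := range conns {
		f(cn)
	}
	return nil
}

// staticReader is a fake source that counts how often it is walked.
type staticReader struct {
	procs     []Process
	conns     []Connection
	procWalks int
}

func (s *staticReader) Processes(f func(Process)) error {
	s.procWalks++
	for _, p := range s.procs {
		f(p)
	}
	return nil
}

func (s *staticReader) Connections(f func(Connection)) error {
	for _, cn := range s.conns {
		f(cn)
	}
	return nil
}

func main() {
	src := &staticReader{procs: []Process{{PID: 1, Comm: "init"}}}
	reader := NewCachingReader(src)
	if err := reader.Update(); err != nil {
		panic(err)
	}
	for i := 0; i < 2; i++ {
		reader.Processes(func(p Process) { fmt.Println(p.PID, p.Comm) })
	}
	fmt.Println("source walks:", src.procWalks)
}
```

In this sketch a periodic caller (for example, a ticker firing every `spy.interval`) would be the only thing that calls `Update()`, so any number of reporters can walk the snapshot between updates without additional reads of the source.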