Fix memory leak caused by saving stats counter to persist config #2279
Sometimes we need to express that something is equal to TRUE (typically in a unit test). In these cases TRUE is a strict value, 1. Signed-off-by: Laszlo Budai <[email protected]>
Signed-off-by: Laszlo Budai <[email protected]>
91b2254
to
531c26e
Compare
Build FAILURE |
Hmm... the reason Travis/Kira timed out is that I left a mutex in a locked state.
Signed-off-by: Laszlo Budai <[email protected]>
As set is just an assignment, it breaks SRP. This is not the only reason for this separation: if we have a setter and an init method, then it is possible to init the counter only when it is first registered to stats. Signed-off-by: Laszlo Budai <[email protected]>
… time Signed-off-by: Laszlo Budai <[email protected]>
531c26e
to
74d74cd
Compare
...and the failing assert was my own mistake... I think it is fixed now.
Build SUCCESS |
… 1st time Signed-off-by: Laszlo Budai <[email protected]>
… 1st time Signed-off-by: Laszlo Budai <[email protected]>
The way we were saving the values caused a memory leak. Of course it could be fixed, but there is a conceptual issue with this approach: the stats-counters can be re-used after reload (as they are not removed from the global stats-counter table). Signed-off-by: Laszlo Budai <[email protected]>
74d74cd
to
10c11d7
Compare
Build SUCCESS |
I would depend on this patch in my threaded logthrdestdriver work, as it wasn't easy to resolve the same So this is not just a minor leak, but helps me a great deal. I'll try to review this tonight. Thanks for fixing this!
@bazsi: there are some 'ugly' parts in the PR: I didn't want to lock/unlock the stats twice, so I extended the |
I like the approach in general, though I might give some thought to naming. It is not immediately clear why the existence of a counter should cause something to be initialized or not. What if we delegated the stats registration to LogQueue and took care of this initialization there? E.g. instead of log_queue_set_counters(), we would have log_queue_register_stats(), which would get the initialized StatsClusterKey as an argument. This way, the What do you think? OTOH, the patch by itself resolves an issue I am facing in the multithreaded driver, so I'd love to see this go in as soon as possible.
I’m going to check it tomorrow and move it to LogQueue.
@bazsi: the dropped counter is owned by logwriter/logthrdstdrv/afsql (and I would not touch that part), but the other two counters are owned by the LogQueue, so I'll modify the code.
53e2c00
to
145da81
Compare
@bazsi: I think (I'm sure...), later the dropped counter can be removed from the note: in afsql module the stats counter registration point changed, but I think it still works. |
Build SUCCESS |
Concept: queued_messages and memory_usage counters are used only by the LogQueue, so they can be owned by the LogQueue instances. The dropped_messages counter is owned by the LogWriter/LogThrDestDriver, or AFSqlDestDriver instances, so it won't be registered by this new method. Signed-off-by: Laszlo Budai <[email protected]>
Signed-off-by: Laszlo Budai <[email protected]>
Signed-off-by: Laszlo Budai <[email protected]>
Signed-off-by: Laszlo Budai <[email protected]>
Signed-off-by: Laszlo Budai <[email protected]>
145da81
to
6504f33
Compare
Build SUCCESS |
@kira-syslogng retest this please |
Build SUCCESS |
@kira-syslogng Do perftest |
Build SUCCESS |
@@ -244,15 +257,15 @@ stats_cluster_is_alive(StatsCluster *self, gint type)
 {
   g_assert(type < self->counter_group.capacity);

-  return ((1<<type) & self->live_mask);
+  return ((1<<type) & self->live_mask) == (1 << type);
For me, return !!((1<<type) & self->live_mask); would be more readable.
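The three forms discussed in this thread can be compared side by side. This is a minimal sketch, assuming a hypothetical `ToyCluster` struct that stands in for `StatsCluster` and keeps only a live mask and a capacity; the function names are made up for illustration and are not part of syslog-ng.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for StatsCluster: only the live mask matters here. */
typedef struct
{
  uint16_t live_mask;
  int capacity;
} ToyCluster;

/* Original form: returns the raw masked bit, e.g. 4 for type 2. */
int
toy_is_alive_raw(const ToyCluster *self, int type)
{
  assert(type >= 0 && type < self->capacity);
  return (1 << type) & self->live_mask;
}

/* Form from the patch: compares against the bit, so the result is 0 or 1. */
int
toy_is_alive_strict(const ToyCluster *self, int type)
{
  assert(type >= 0 && type < self->capacity);
  return ((1 << type) & self->live_mask) == (1 << type);
}

/* Form suggested in review: double negation, also 0 or 1. */
int
toy_is_alive_bang(const ToyCluster *self, int type)
{
  assert(type >= 0 && type < self->capacity);
  return !!((1 << type) & self->live_mask);
}
```

With a mask that has only bit 2 set, the raw form yields 4 for type 2, which is truthy but not equal to a strict TRUE of 1; the other two forms yield exactly 1.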
  stats_register_counter(stats_level, sc_key, SC_TYPE_QUEUED, &self->queued_messages);

  if (stats_check_level(STATS_LEVEL1))
Why do we need this check? Can we just omit it?
No, we cannot: if we didn't check this, then at stats level 0 need_to_reset_counters would be set to FALSE, which would leave the queued counter uninitialized. (Note that stats_counter_set handles the case when a NULL pointer is passed to it as the counter.)
  stats_register_counter(stats_level, sc_key, SC_TYPE_QUEUED, &self->queued_messages);

  if (stats_check_level(STATS_LEVEL1))
    need_to_reset_counters = !stats_contains_counter(sc_key, SC_TYPE_MEMORY_USAGE);
We could move this check into _reset_counters.
No, we couldn't. Why? Because at that point the counter is already registered, so the check would return TRUE (when stats-level is >= 1, of course).
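The ordering constraint described in this exchange can be illustrated with a toy registry. This is only a sketch under stated assumptions: `ToyRegistry` and every `toy_` function are hypothetical, stats levels are not modeled, and the real `stats_contains_counter` and `stats_register_counter` have different signatures. The only property carried over from the discussion is that registered counters stay in a global table across reloads.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum { TOY_TYPE_QUEUED, TOY_TYPE_MEMORY_USAGE, TOY_TYPE_MAX };

/* Hypothetical global table: entries survive a reload. */
typedef struct
{
  bool registered[TOY_TYPE_MAX];
  long value[TOY_TYPE_MAX];
} ToyRegistry;

bool
toy_contains_counter(const ToyRegistry *reg, int type)
{
  assert(type >= 0 && type < TOY_TYPE_MAX);
  return reg->registered[type];
}

long *
toy_register_counter(ToyRegistry *reg, int type)
{
  assert(type >= 0 && type < TOY_TYPE_MAX);
  reg->registered[type] = true;
  return &reg->value[type];
}

/* Check before registering: reset only when the counter is new. */
long *
toy_register_check_first(ToyRegistry *reg, int type)
{
  bool need_to_reset = !toy_contains_counter(reg, type);
  long *counter = toy_register_counter(reg, type);

  if (need_to_reset)
    *counter = 0;
  return counter;
}

/* Check after registering: the lookup always succeeds, so this
 * reports "no reset needed" even for a brand-new counter. */
bool
toy_needs_reset_checked_late(ToyRegistry *reg, int type)
{
  toy_register_counter(reg, type);
  return !toy_contains_counter(reg, type);
}
```

Calling `toy_register_check_first` twice on the same registry keeps the value written between the two calls, which mirrors a counter being re-used after reload; the late check can never request a reset.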
@@ -208,6 +208,19 @@ stats_cluster_track_counter(StatsCluster *self, gint type)
   return &self->counter_group.counters[type];
 }

+StatsCounterItem *
+stats_cluster_get_counter(StatsCluster *self, gint type)
In the end, this function is not used. Can you drop the patch?
I want to keep it as I’ll need it later.
@furiel: thanks for the review notes. I’ll check them, but I’m not sure I want to rebuild the whole patchset. It is true that I’m blocking on this PR, but for me it is just a marginal issue that I had to solve, and I want to focus on other things now.
@@ -1349,17 +1345,11 @@ _register_counters(LogWriter *self)
   stats_register_counter(self->options->stats_level, &sc_key, SC_TYPE_SUPPRESSED, &self->suppressed_messages);
   stats_register_counter(self->options->stats_level, &sc_key, SC_TYPE_DROPPED, &self->dropped_messages);
   stats_register_counter(self->options->stats_level, &sc_key, SC_TYPE_PROCESSED, &self->processed_messages);
These are also common counters. Can't we move these under _register_common_counters?
What do you mean by "under _common_counters"? _common_counter is in the LogQueue implementation, a private (static) method there, and has nothing to do with LogWriter.
Thanks, all my questions are answered. Other than that, I had a few optional comments, nothing critical. imho this is ready to go in.
I can do some modifications... but only if the PR gets merged today. I still don't want to rebuild the whole patchset.
@furiel: you were too fast :) |
Leak:
On each reload, the memory-usage counter value is written into a variable that is dynamically allocated on the heap and never freed (on my machine, 8 bytes are leaked per reload).
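The leak pattern described above can be sketched in isolation. This is not the syslog-ng code: `ToyArena` is a hypothetical fixed-size arena standing in for the heap so that outstanding blocks can be counted, and the `int64_t` block size is an assumption chosen to match the 8 bytes reported. The PR itself takes a different route (it stops saving the value and relies on the counters staying in the global table), so the freeing variant below is shown only for contrast.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_ARENA_BLOCKS 16

/* Tiny fixed arena standing in for the heap (assumption for the demo). */
typedef struct
{
  int64_t block[TOY_ARENA_BLOCKS];
  bool in_use[TOY_ARENA_BLOCKS];
} ToyArena;

int64_t *
toy_alloc(ToyArena *a)
{
  for (size_t i = 0; i < TOY_ARENA_BLOCKS; i++)
    if (!a->in_use[i])
      {
        a->in_use[i] = true;
        return &a->block[i];
      }
  return NULL;
}

void
toy_free(ToyArena *a, int64_t *p)
{
  if (p != NULL)
    a->in_use[p - a->block] = false;
}

size_t
toy_bytes_in_use(const ToyArena *a)
{
  size_t n = 0;
  for (size_t i = 0; i < TOY_ARENA_BLOCKS; i++)
    if (a->in_use[i])
      n += sizeof a->block[i];
  return n;
}

/* Leaky pattern: each reload stores a fresh copy and drops the old pointer. */
void
toy_save_on_reload_leaky(ToyArena *a, int64_t **slot, int64_t value)
{
  int64_t *copy = toy_alloc(a);
  assert(copy != NULL);
  *copy = value;
  *slot = copy; /* the previous block is never released */
}

/* Same save, but the previous block is released first. */
void
toy_save_on_reload_freeing(ToyArena *a, int64_t **slot, int64_t value)
{
  int64_t *copy;

  toy_free(a, *slot);
  copy = toy_alloc(a);
  assert(copy != NULL);
  *copy = value;
  *slot = copy;
}
```

In the leaky variant the bytes in use grow by one block per simulated reload, while in the freeing variant they stay at a single block.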