-
-
Notifications
You must be signed in to change notification settings - Fork 31.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
bpo-30962: Add caching to Logger.isEnabledFor() #2752
Conversation
Hello, and thanks for your contribution! I'm a bot set up to make sure that the project can legally accept your contribution by verifying you have signed the PSF contributor agreement (CLA). Unfortunately we couldn't find an account corresponding to your GitHub username on bugs.python.org (b.p.o) to verify you have signed the CLA (this might be simply due to a missing "GitHub Name" entry in your b.p.o account settings). This is necessary for legal reasons before we can look at your contribution. Please follow the steps outlined in the CPython devguide to rectify this issue. Thanks again to your contribution and we look forward to looking at it! |
Called when level changes are made | ||
""" | ||
|
||
_acquireLock() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure that a lock is useful here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Isn't it needed to keep another thread from changing loggerDict or the cache dictionaries?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh ok, right, I first understood that the lock was supposed to protect the cache :-) I don't think that the cache deserves a lock.
Lib/logging/__init__.py
Outdated
self._cache[level] = level >= self.getEffectiveLevel() | ||
_releaseLock() | ||
|
||
return self._cache[level] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is a risk of race condition if first you test if the key exists and then sets the key.
Why not using: "try: return self._cache[level] except KeyError: ..."?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The real race condition is if clear_cache() is called in-between. But, yes, your proposed idiom would fix it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Makes sense. Fixed that to use try/except
Lib/logging/__init__.py
Outdated
self._cache[level] = level >= self.getEffectiveLevel() | ||
_releaseLock() | ||
|
||
return self._cache[level] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There's a still a race condition here if _clear_cache() is called just after _releaseLock(). You will need to store the value in a local variable, e.g:
_acquireLock()
is_enabled = self._cache[level] = ...
_releaseLock()
return is_enabled
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Makes sense. Updated
I've requested a review from Vinay, who is the official author and maintainer of the logging module. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This needs tests which will confirm that the caching works correctly. Nothing too complicated, just a test which works on both root and non-root loggers, sets levels and gets/asserts the effective levels, sets levels again and confirms effective levels are as expected, clears the cache manually and confirms the levels again etc.
@vsajip Test added. Let me know if this is what you're looking for. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Detail is in comments against the code changes.
Lib/test/test_logging.py
Outdated
class CachingTest(BaseTest): | ||
def test_caching(self): | ||
root = self.root_logger | ||
level1 = logging.getLogger("abc") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These are loggers, not levels, and the names should reflect that. I understand you're talking about level in the logger hierarchy, but still, the use of level
for the loggers is potentially confusing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point. Will fix
Lib/test/test_logging.py
Outdated
self.assertEqual(level2._cache, {}) # Cache is empty | ||
|
||
self.assertTrue(level2.isEnabledFor(logging.ERROR)) | ||
self.assertEqual(level2._cache, {logging.ERROR: True}) # Cache is populated |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What values would the logger level cache dict have? Either empty or a level pointing to True
? Could there be multiple keys in this dict?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Keys are the levels passes to isEnabledFor(), values are always True or False. I can lookup more levels to get a more well-rounded test.
Lib/test/test_logging.py
Outdated
self.assertFalse(level2.isEnabledFor(logging.ERROR)) | ||
self.assertTrue(root.isEnabledFor(logging.ERROR)) | ||
self.assertEqual(root._cache, {logging.ERROR: True}) # Root cache is populated | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should do level2.setLevel(logging.NOTSET)
. and see that the isEnabledFor()
for the two loggers and the root logger are as expected.
Also, it makes sense to add some tests for getEffectiveLevel()
to ensure that the values are as expected without the presence of caching.
return self._cache[level] | ||
except KeyError: | ||
_acquireLock() | ||
if self.manager.disable >= level: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would it not make more sense to cache the result of getEffectiveLevel()
, rather than isEnabledFor()
? From what I can see, the time saving is really from cutting out the hierarchy walk in the former method.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That is where most of the time savings is, but it's still faster (Something like 5%) than checking self.manager.disable every time.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does have me thinking, in the LoggerAdapter class, there's currently no caching. LoggerAdapter.isEnabledFor() reimplements the logic from Logger.isEnabledFor(). If we point to Logger.isEnabledFor() instead of reimplementing, we get caching. I don't see a functional difference. Is there?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@vsajip The only thing I see is LoggerAdapter.getEffectiveLevel()
could be overwritten and that would change the behavior of LoggerAdapter.isEnabledFor()
. That would not be the case if LoggerAdapter.isEnabledFor()
pointed to Logger.isEnabledFor()
. So if we want caching on adapters (and we probably do), the caching logic should be duplicated on the adapter. The alternative is to give up some performance and move caching to getEffectiveLevel()
. Thoughts?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't see a functional difference. Is there?
The manager.disable
check is [duplicated] there because it would potentially allow skipping the hierarchy walk of getEffectiveLevel()
.
The alternative is to give up some performance and move caching to
getEffectiveLevel()
Although it slows down isEnabledFor()
because of still needing the manager.disable
check, I think it would be better to cache getEffectiveLevel()
. It's easier to reason about. The manager.disable
level shouldn't be conflated with the effective levels of individual loggers, IMO.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I understand why the manager.disable
check is in Logger.isEnabledFor()
. What I was trying to say is having LoggerAdapter.isEnabledFor()
reimplement Logger.isEnabledFor()
only makes sense if someone is going to subclass LoggerAdapter
and overwrite LoggerAdapter.getEffectiveLevel()
.
I wasn't associating the caching with the levels, but rather the isEnabledFor()
call. I guess another benefit of moving it would be it also speeds up calls to getEffectiveLevel()
. I'm not sure how often that's used outside of this module, but I'm sure it is.
Lib/test/test_logging.py
Outdated
@@ -4406,6 +4406,29 @@ def test_basic(self): | |||
self.assertTrue(found, msg=msg) | |||
|
|||
|
|||
class CachingTest(BaseTest): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You could add this to an existing test case class rather than adding a separate class just for this one method.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, I'll move it to the LoggerTest class
@vsajip I was wrong about how much slower it would be. I ran some benchmarks to compare putting the caching in Here's how I implemented caching in def getEffectiveLevel(self):
try:
return self._cache['_elevel']
except KeyError:
_acquireLock()
logger = self
effective_level = NOTSET
while logger:
if logger.level:
effective_level = logger.level
break
logger = logger.parent
self._cache['_elevel'] = effective_level
_releaseLock()
return effective_level I expanded my original benchmarking code to test at all the loggers in the hierarchy, not just the third level import timeit
count = 10000000
set_up = """
root = logging.getLogger()
logger1 = logging.getLogger('1')
logger2 = logging.getLogger('1.2')
logger3 = logging.getLogger('1.2.3')
"""
for logger in ('root', 'logger1', 'logger2', 'logger3'):
statement = "%s.debug('Debug message')" % logger
orig = timeit.timeit(statement, number=count,
setup='import logging_orig as logging\n' + set_up)
ief = timeit.timeit(statement, number=count,
setup='import logging_cache_ief as logging\n' + set_up)
gel = timeit.timeit(statement, number=count,
setup='import logging_cache_gel as logging\n' + set_up)
print('--- %s ---' % logger)
print('Non-caching: ', orig)
print('With Caching (isEnabledFor): ', ief)
print('With Caching (getEffectiveLevel): ', gel)
print('Time decrease (isEnabledFor): %.2f%%' % ((orig - ief) / orig * 100))
print('Time decrease (getEffectiveLevel): %.2f%%' % ((orig - gel) / orig * 100)) From the results you can see that as the logger hierarchy grows, the difference becomes less apparent, but caching at --- root ---
Non-caching: 5.672386677993927
With Caching (isEnabledFor): 3.272040506999474
With Caching (getEffectiveLevel): 5.023688525019679
Time decrease (isEnabledFor): 42.32%
Time decrease (getEffectiveLevel): 11.44%
--- logger1 ---
Non-caching: 6.022408013988752
With Caching (isEnabledFor): 3.288163874996826
With Caching (getEffectiveLevel): 5.099931188975461
Time decrease (isEnabledFor): 45.40%
Time decrease (getEffectiveLevel): 15.32%
--- logger2 ---
Non-caching: 7.034587770991493
With Caching (isEnabledFor): 3.271738503011875
With Caching (getEffectiveLevel): 5.257907326973509
Time decrease (isEnabledFor): 53.49%
Time decrease (getEffectiveLevel): 25.26%
--- logger3 ---
Non-caching: 7.899996431020554
With Caching (isEnabledFor): 3.420868766028434
With Caching (getEffectiveLevel): 5.191981957992539
Time decrease (isEnabledFor): 56.70%
Time decrease (getEffectiveLevel): 34.28% |
Good stuff. Thanks! |
This is a simple and brute-force caching method, but cuts the time required to log non-enabled levels in half.
Another approach would be to move the cache dictionary to the manager. This would speed up cache clearing, but take additional lookups during normal operation.
https://bugs.python.org/issue30962