-
Notifications
You must be signed in to change notification settings - Fork 128
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
API Review Feedback: Logging and DateTime updates #1752
Conversation
Please update the PR title to be more descriptive with the details of the change and intent of the PR |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Adding few comments
* @details Logs every HTTP request, response, or retry attempt. | ||
* @remark See #logging.hpp | ||
* @details Logs every HTTP request and response. | ||
* @remark See Azure::Core::Logger. | ||
*/ | ||
class LoggingPolicy : public HttpPolicy { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Renaem to LogPolicy
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do we prefer the name LogPolicy
over LoggingPolicy
?
.Net: LoggingPolicy
Java: HttpLoggingPolicy
Python: HttpLoggingPolicy
JS: LogPolicy
} | ||
} | ||
|
||
static void ToggleLogging(bool isEnabled); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do we need this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is used internally by Logger::SetListener()
:
void Logger::SetListener(Logger::Listener listener)
{
std::unique_lock<std::shared_timed_mutex> loggerLock(g_logListenerMutex);
g_logListener = std::move(listener);
Log::ToggleLogging(g_logListener != nullptr);
}
ToggleLogging is inline function that sets an atomic boolean flag:
inline void Log::ToggleLogging(bool isEnabled) { g_isLoggingEnabled = isEnabled; }
static AZ_CORE_DLLEXPORT std::atomic<bool> g_isLoggingEnabled;
static AZ_CORE_DLLEXPORT std::atomic<LogLevelInt> g_logLevel;
That boolean flag ends up being used by Log::ShouldWrite()
inline:
static bool ShouldWrite(Logger::Level level) { return GetLogListener(level) != nullptr; }
GetLogListener(Logger::Level level)
is inline and does lock-free check, based on that atomic boolean flag, and atomic logLevel.
static Logger::Listener GetLogListener(Logger::Level level)
{
return g_isLoggingEnabled && static_cast<LogLevelInt>(level) <= g_logLevel
? GetLogListener()
: Logger::Listener(nullptr);
}
Finally, GetLogListener()
(without arguments) uses mutex and is non-inline.
Logger::Listener Log::GetLogListener()
{
std::unique_lock<std::shared_timed_mutex> loggerLock(g_logListenerMutex);
return g_logListener;
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I still don't understand. If you're already using the read/writer lock, then why is toggling necessary? Just get the shared lock when getting a copy of the value and use the exclusive lock when setting the value - this is all you need.
/** | ||
* @brief Log message handler. | ||
*/ | ||
class Logger { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Something went wack y here. Why do we have both Log and Logger classes? We should NOT have these 2 things - this is so confusing!
I thought our plan was to have a Log class with private methods (ShouldWrite & Write) for our internal stuff and public methods for customer-callable stuff (SetLevel & SetListener). Then, an internal header file that made the 2 internal functions (ShouldWrite & Write) callable using just global functions that were freinds of the class.
This seems to have gotten overly complex.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Log class is for internal functions - ShouldWrite()
and Write()
, the same way as Logger is for SetListener()
and SetLevel()
.
Our team really dislikes friend class declarations. This is why there are no private methods in the Logger
class.
[Other side effect, if we have Write()
and ShouldWrite()
as free functions in Internal namespace, Level
cannot be declared inside Logger
class, it has to be outside. That's because to declare a function from another namespace, it has to be forward-declared before the class declaration. And in order to even forwarddeclare ShouldLog()
, Logger::Level
needs to be known at the point of forward-declaration (because ShouldLog()
takes it as argument), which is not possible - circular dependecny. But it is always possible to forward-declare Internal::Log
class, and make it a friend.]
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@JeffreyRichter, internal functions look better when inside a static class, especially that now it has data members and typedefs ().
I don't think that names are that confusing, given that they do live in different namespaces, and SDK code does not need to interact with the Logger
class - only with the Internal::Log
class.
(Internal::)Log::ShouldWrite()
and (Internal::)Log::Write()
looks readable to me.
Logger::SetListener()
, Logger::SetLevel()
, Logger::Level
, Logger::Level::Error
- readable as well.
We write to log, users listen to loggers.
sdk/core/azure-core/src/logger.cpp
Outdated
std::atomic<bool> Log::g_isLoggingEnabled(false); | ||
std::atomic<Log::LogLevelInt> Log::g_logLevel(static_cast<LogLevelInt>(Logger::Level::Warning)); | ||
|
||
inline void Log::ToggleLogging(bool isEnabled) { g_isLoggingEnabled = isEnabled; } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we keep this (and I need to undersatnd the usage of it), then it should be renamed sisnce it doesn't actually Toggle; it sets, so EnableLogging(bool isEnabled)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree with the naming, it is probably better (I was thinking of it as "toggling on" and "toggling off", but "enable on"/"enable off" is less confusing).
This is needed, because it sets an atomic boolean variable, which is lock free.
The moment when our code tries to determine whether it should log:
- It "calls" an inline function
ShouldWrite(Logger::Level logLevel)
ShouldLog(Level)
"calls" an inline functionGetLogListener(Logger::Level logLevel)
, and compares it to nullptr.GetLogListener(Logger::Level logLevel)
checks an atomic and lock free boolean static variable,g_isLoggingEnabled
. If it is false, returns nullptr.- If
g_isLoggingEnabled
is true, it checks an atomic and lock free variableg_logLevel
: logLevel of the message that we are trying to log should be<= g_logLevel
.
As you can see, ShouldWrite()
check is lock-free. Plus, it gets inlined, so the code ends up being as simple as checking a boolean variable, which is as fast as you can get. Maximum, you have two checks, when g_isLoggingEnabled == true
. A second check is integer =<
comparison, and is also inlined.
g_isLoggingEnabled
gets set by Logger::SetListener(listener)
: if (listener) is nullptr, g_isLoggingEnabled
gets set to false, otherwise it is true. g_isLoggingEnabled
can be read as "is g_logListener a non-null pointer"? Only that it is lock-free (we can't check g_logListener
for being null without a lock. Even though it is an inexpensive lock, especially now, because we now use shared lock when reading, it is still not as fast as lock-free inline comparison (probably something like CMP g_isLoggingEnabled, 0; JE :should_not_write; CMP EAX, g_logLevel; JG :should_not_write
). @ahsonkhan was suggesting that in WinHttp, logging may get responsible for 80% slowdown due to mutex.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
was suggesting that in WinHttp, logging may get responsible for 80% slowdown due to mutex.
Just to clarify, I think @vhvb1989 mentioned something about the perf framework showing a regression when we use Context
(see #1589). I didn't measure with winhttp yet (it doesn't log anything atm anyway), so I am not sure where that 80% came from.
} | ||
} | ||
|
||
static void ToggleLogging(bool isEnabled); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I still don't understand. If you're already using the read/writer lock, then why is toggling necessary? Just get the shared lock when getting a copy of the value and use the exclusive lock when setting the value - this is all you need.
This pull request is protected by Check Enforcer. What is Check Enforcer?Check Enforcer helps ensure all pull requests are covered by at least one check-run (typically an Azure Pipeline). When all check-runs associated with this pull request pass then Check Enforcer itself will pass. Why am I getting this message?You are getting this message because Check Enforcer did not detect any check-runs being associated with this pull request within five minutes. This may indicate that your pull request is not covered by any pipelines and so Check Enforcer is correctly blocking the pull request being merged. What should I do now?If the check-enforcer check-run is not passing and all other check-runs associated with this PR are passing (excluding license-cla) then you could try telling Check Enforcer to evaluate your pull request again. You can do this by adding a comment to this pull request as follows: What if I am onboarding a new service?Often, new services do not have validation pipelines associated with them, in order to bootstrap pipelines for a new service, you can issue the following command as a pull request comment: |
…sets to nullptr. Fix for using shared_lock instead of unique_lock when reading.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Adding comments and questions.
I would like to understand about the expectation of logging a URL before giving my full approval. I will be pending to your reply
sdk/core/azure-core/src/http/url.cpp
Outdated
full_url += GetRelativeUrl(); | ||
return full_url; | ||
} | ||
std::string Url::GetAbsoluteUrl() const { return GetUrlWithoutQuery() + GetRelativeUrl(); } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
❤️
sdk/core/azure-core/src/http/url.cpp
Outdated
if (!m_encodedPath.empty()) | ||
{ | ||
full_url += "/"; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need to add /
depending if the url has or not a path?
This integrates fine with the current implementation for calling GetUrlWithoutQuery() + GetRelativeUrl() for the absolute URL.
However, if you are calling GetUrlWithoutQuery()
for logging, are expecting show the path as well?
Right now, the path won't be displayed and only /
when it has a path or nothing for no path.
Is this what you expect for logging?
For example, a request to http://host.com/resource/foo
would be logged as http://host.com/
and
http://host.com
is logged as http://host.com
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If that's the behavior, this is not what I expect. I'll fix.
} | ||
else | ||
{ | ||
url += "[hidden]"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are other languages adding this hidden
placeholder? or are they just not logging the header at all? We could try to follow the same approach here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you, I'll unify it.
.NET says "REDACTED" - https://github.com/Azure/azure-sdk-for-net/blob/41a7cf2cca7badc4be320ce57aa8b47adcdff137/sdk/core/Azure.Core/src/Shared/HttpMessageSanitizer.cs
for (auto const& header : request.GetHeaders()) | ||
{ | ||
log << "\n\t" << header.first; | ||
|
||
if (header.second.empty()) | ||
{ | ||
log << " [empty]"; | ||
} | ||
else if ( | ||
options.AllowedHttpRequestHeaders.find(header.first) | ||
!= options.AllowedHttpRequestHeaders.end()) | ||
{ | ||
log << " : " << TruncateIfLengthy(header.second); | ||
} | ||
else | ||
{ | ||
log << " [hidden]"; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: This code is duplicated 3 times,
- Request -> query
- Request -> headers
- Response -> headers
Consider creating a helper method which takes the map, and the allowed values set and updates the ofstream.
Then if we want to change the [hidden]
value or anything in the logic it would be just one place
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you, I will try to unify.
@@ -22,6 +22,10 @@ | |||
- Renamed `TelemetryPolicyOptions` to `TelemetryOptions`. | |||
- Renamed `ValuePolicyOptions` to `ValueOptions`. | |||
- Removed `StartTry()` from `Azure::Core::Http::Request`. | |||
- Changed type of `Azure::Core::Http::RetryOptions::StatusCodes` from `std::vector` to `std::set`. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why? Please move this to a separate PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's because repetition doesn't do anything, and order does not matter. Also, lookup is faster.
This is a tiny change, and I touched related code/neighboring lines.
@@ -87,7 +91,7 @@ set( | |||
src/cryptography/md5.cpp | |||
src/http/bearer_token_authentication_policy.cpp | |||
src/http/http.cpp | |||
src/http/logging_policy.cpp | |||
src/http/log_policy.cpp |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
revert
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like the name that Jeffrey proposed more. Also, JS SDK uses LogPolicy.
@@ -22,4 +23,9 @@ namespace Azure { namespace Core { | |||
using CaseInsensitiveMap | |||
= std::map<std::string, std::string, Internal::Strings::CaseInsensitiveComparator>; | |||
|
|||
/** | |||
* @brief A type alias of `std::set<std::string>` with case-insensitive key comparison. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sets don't have a concept of keys
, do they?
{ | ||
Logging::Internal::Log(Logging::LogLevel::Verbose, GetRequestLogMessage(request)); | ||
Log::Write(Logger::Level::Verbose, GetRequestLogMessage(m_options, request)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please confirm that other languages also only log request/responses at verbose log level and not at info.
if (Log::ShouldWrite(Logger::Level::Error)) | ||
{ | ||
Log::Write(Logger::Level::Error, std::string("HTTP Transport error: ") + e.what()); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Recoverable exceptions are supposed to be logged as warning, not error.
https://azure.github.io/azure-sdk/general_implementation.html#logging
✅ DO use the Error logging level for failures that the application is unlikely to recover from (out of memory, etc.).
@@ -1,226 +0,0 @@ | |||
// Copyright (c) Microsoft Corporation. All rights reserved. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: leave this test file name as is.
this will also make it easier to review any test changes.
@@ -28,6 +28,8 @@ class DllExportTest { | |||
|
|||
TEST(SimplifiedHeader, core) | |||
{ | |||
EXPECT_NO_THROW(Azure::Core::CaseInsensitiveMap imap); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add test for the logging header in core.hpp.
* @brief HTTP query parameters that are allowed to be logged. | ||
* @remark URL-encoded parameter names are expected. | ||
*/ | ||
std::set<std::string> AllowedHttpQueryParameters; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we have an example code where some client library or unit tests would set these and what the outcome would be?
Presumably we want to follow the similar defaults for storage and other client libraries like .NET, once storage starts adding logging. For example: https://github.com/Azure/azure-sdk-for-net/blob/6d89936a1592996bd64fc402eea9c2a732b3bd4b/sdk/storage/Azure.Storage.Blobs/src/BlobClientOptions.cs#L128
How would the end user customize which headers to log?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we turn the following headers on by default, following .NET?
@tg-msft what do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, and Storage should add https://github.com/Azure/azure-sdk-for-net/blob/master/sdk/storage/Azure.Storage.Blobs/src/BlobClientOptions.cs#L128 which we worked carefully with their team on (and used across all the other languages).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@tg-msft do you expect users to pass in query parameter strings for the allow list that already URL-encoded? The docs don't specify.
https://github.com/Azure/azure-sdk-for-net/blob/d9a83c970d456c4d93c308869d44a84b531f44a2/sdk/core/Azure.Core/src/DiagnosticsOptions.cs#L85-L88
And why do you ignore casing when comparing headers and query parameter names?
https://github.com/Azure/azure-sdk-for-net/blob/675631c134f1ae01fe170bb3815f29cec96ea1fc/sdk/core/Azure.Core/src/Shared/HttpMessageSanitizer.cs#L27
https://github.com/Azure/azure-sdk-for-net/blob/675631c134f1ae01fe170bb3815f29cec96ea1fc/sdk/core/Azure.Core/src/Shared/HttpMessageSanitizer.cs#L100
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't know if Azure has any query parameter names that need to be URL encoded so I'm not sure if there's a difference in practice. I'm guessing our default is unencoded but @pakrym could correct me if I'm wrong. HTTP header names are case insensitive. Query parameter names are case sensitive, but I'm guessing we erred on the side of overabundant caution so nothing gets leaked if customers accidentally specified the wrong case (and again, I don't know of any distinct Azure query parameter names that are the same modulo case so I don't think it's a problem in practice).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I never really thought about urlencoding the query parameter names so both the UrlBuilder and sanitizer expect already pre-encoded names.
Query parameter names are case sensitive, but I'm guessing we erred on the side of overabundant caution so nothing gets leaked if customers accidentally specified the wrong case (and again, I don't know of any distinct Azure query parameter names that are the same modulo case so I don't think it's a problem in practice).
Yep, we are yet to see a scenario where you would like to make an opposite decision for headers that differ only in the casing. Missing a header during the sanitization feels like a more common case.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
HTTP header names are case insensitive.
OK, that makes sense for headers.
Query parameter names are case sensitive, but I'm guessing we erred on the side of overabundant caution so nothing gets leaked if customers accidentally specified the wrong case (and again, I don't know of any distinct Azure query parameter names that are the same modulo case so I don't think it's a problem in practice).
Regarding query parameters, I would have expected the opposite, if we wanted to avoid accidental leaks. Wouldn't having strict matching in the allow list ensure that some other parameters don't get leaked that weren't explicitly allowed? With case insensitive matching, we are allowing more today.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You're right, sorry, these are allow lists rather than deny lists. I'm guessing we just did that for usability then. I'm not particularly worried about it though because I've never seen distinct query parameter names in Azure that only differ by case. The REST API review board would never allow them through.
Per agreement with the team, merging together all the API review changes otherwise extractable to several PRs, to win us time on namespace/file moving changes. If there's any unaddressed feedback left, will be fixed in subsequent PRs. I addressed all the feedback that I took to my best knowledge, if something is not reflected in the code changes, I'll reply with the comment mentioning the fix was not taken intentionally. |
This takes the API review feedback, and I reviewed the design with Jeffrey, took his feedback.
Closes #1639.
Closes #1722.
AZURE_LOG_LEVEL
is set in the environment to a corresponding level, init the default log listener with console logger (see https://github.com/Azure/azure-sdk-for-java/wiki/Logging-with-Azure-SDK) || unless a build flag is specified, so the env-listening console logger functionality won't be built in.Write()
now usesstd::shared_timed_mutex
, which is more effective than the plainstd::mutex
in our case - we get shared reads for cheaper price. Atomic is not possible withstd::function
, and we do want to usestd::function
, Jeffrey agrees.ShouldWrite()
is 100% lock-free. It is an inlined atomic bool variable check, then, sometimes, another atomic variable check.std::set
instead ofstd::vector
.