HPCC-32452 Time security manager auth calls #19007

timothyklemm · 2024-08-19T10:54:28Z

Create an ISecManager decorator that can add internal timing spans for any security manager interface call.

Replace direct manager authorization calls with calls through the decorator. Disregard CLdapSecManager-specific calls for now.
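As an illustration of the decorator idea described above, here is a minimal sketch. It does not use the real `ISecManager` or the platform's span API: `IAuthManager`, `TimingDecorator`, and `TimingSink` are hypothetical stand-ins, and the sink callback only approximates what an internal timing span would record.

```cpp
#include <chrono>
#include <functional>
#include <string>
#include <utility>

// Hypothetical, pared-down stand-in for ISecManager; the real interface is far larger.
struct IAuthManager
{
    virtual ~IAuthManager() = default;
    virtual bool authorize(const std::string& user, const std::string& resource) = 0;
};

// Callback receiving a span name and the elapsed time; stands in for a tracing backend.
using TimingSink = std::function<void(const std::string&, std::chrono::nanoseconds)>;

// Decorator: implements the same interface as the wrapped object and times each call.
class TimingDecorator : public IAuthManager
{
public:
    TimingDecorator(IAuthManager& target, TimingSink timingSink)
        : decorated(target), sink(std::move(timingSink)) {}

    bool authorize(const std::string& user, const std::string& resource) override
    {
        const auto start = std::chrono::steady_clock::now();
        const bool allowed = decorated.authorize(user, resource); // forwarded unchanged
        sink("security.authorize",
             std::chrono::duration_cast<std::chrono::nanoseconds>(
                 std::chrono::steady_clock::now() - start));
        return allowed;
    }

private:
    IAuthManager& decorated;
    TimingSink sink;
};
```

Because the decorator implements the same interface, a caller can hold it wherever it would hold the manager. The sketch reports nothing if the wrapped call throws; a fuller version would need to decide how exceptions are handled.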

Type of change:

This change is a bug fix (non-breaking change which fixes an issue).
This change is a new feature (non-breaking change which adds functionality).
This change improves the code (refactor or other change that does not change the functionality)
This change fixes warnings (the fix does not alter the functionality or the generated code)
This change is a breaking change (fix or feature that will cause existing behavior to change).
This change alters the query API (existing queries will have to be recompiled)

Checklist:

Smoketest:

Send notifications about my Pull Request position in Smoketest queue.
Test my draft Pull Request.

Testing:

github-actions · 2024-08-19T10:54:48Z

Jira Issue: https://hpccsystems.atlassian.net//browse/HPCC-32452

Jirabot Action Result:
Workflow Transition: Merge Pending
Updated PR

timothyklemm · 2024-08-19T12:08:04Z

The complex use of template parameters is to handle the LDAP dependency, if not now then in the future. There are a couple of auth calls not handled by this PR due to this dependency.

@kenrowland how likely are we to standardize the security manager interface, removing this dependency? Is it more likely we may need an LDAP decorator first?

@rpastrana should I decorate every manager method, so instrumentation is available when we want it, or only decorate the specific calls being handled? The only risk of overdoing decoration now is if we do eliminate the LDAP dependency and can apply a decorator to each manager as it is created, automating instrumentation.

rpastrana

@timothyklemm good looking code. left you a couple of minor comments and some high level questions.

rpastrana · 2024-08-19T17:24:02Z

system/security/shared/secmanagertracedecorator.hpp

+    {
+        return decorated->unsubscribe(events, secureContext);
+    }
+    virtual bool authorize(ISecUser & user, ISecResourceList * resources, IEspSecureContext* secureContext = nullptr) override


what's the pattern for customizing the decoration behavior?
Let's say the esp knows it doesn't want to trigger a span for a specific secmanager call. Would that require a second TSecManagerTraceDecorator with a different method def?

This doesn't sound like a decorator question. As long as the ESP is making the decision to instrument, the question is how does the ESP decide which managers do or do not require instrumentation.

If we assume LDAP will always require instrumentation, we could define a new plugin configuration value (likely in the SecurityManager element wrapping the manager's configuration). This value could be passed into the decorator constructor, allowing the decorator to do what it needs to do. We could create a threaded active decorator at manager construction, or use a "null decorator" when instrumentation is not needed.
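The configuration-driven option above might look roughly like the following sketch. It assumes a boolean read from configuration is handed to the decorator's constructor; `IManager`, `ConfigurableDecorator`, and the span counter are hypothetical placeholders, not names from the PR.

```cpp
#include <string>

// Hypothetical stand-in for the security manager interface.
struct IManager
{
    virtual ~IManager() = default;
    virtual bool authorize(const std::string& user) = 0;
};

// Hypothetical decorator whose behavior is fixed at construction from configuration.
class ConfigurableDecorator : public IManager
{
public:
    ConfigurableDecorator(IManager& manager, bool enableTrace)
        : target(manager), traceEnabled(enableTrace) {}

    bool authorize(const std::string& user) override
    {
        if (!traceEnabled)
            return target.authorize(user);   // acts as a "null decorator"
        ++spansStarted;                      // placeholder for starting an internal span
        return target.authorize(user);
    }

    unsigned spanCount() const { return spansStarted; }

private:
    IManager& target;
    bool traceEnabled;
    unsigned spansStarted = 0;
};
```

With this shape the call sites stay identical whether or not instrumentation is wanted; only the value passed at construction differs.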

A second option could be to make the determination a function of the manager itself. I'm not sure there's an existing API to make this decision, but one could be added.

Let me ask it a different way.
How would we suppress the spans added by the decorator if a very simple secmanager is used which doesn't need to trace its calls.

The ESP doesn't currently know if it is a very simple manager. There is an enumerated value that managers may return, which we could use to decide, but no values are defined for managers outside the platform, so it could be unreliable. There is a label associated with the enumeration, but there are no naming conventions in place through which we could extract anything like this.

The only way the ESP will know that it is a simple manager is to ask it, or to look for a new configuration value. I've intentionally omitted the third option, which is dynamic casting.

@timothyklemm I didn't ask how the ESP can determine if a given call should be traced or not, but I think I can infer that we're stuck w/ the spans declared by the decorator for all sec managers. Thanks.

rpastrana · 2024-08-19T17:32:00Z

system/security/shared/secmanagertracedecorator.hpp

+ * - If nothing else changes, create a subclasses of TSecManagerTraceDecorator, with template
+ *   parameters CLdapSecManager and ISecManager, to decorate the LDAP-specific interfaces.
+ */
+template <typename secmgr_t = ISecManager, typename secmgr_interface_t = ISecManager>


what would be the advantage/disadvantage of this approach vs a base implementation which all sec managers could extend?

Without the template, the base class holds a reference to an ISecManager instance, and any subclass holds a second reference to the same object but typed as the manager implementation class. Very minor, but I'm not a fan of holding the same value twice. The first template parameter allows the base class to hold the type needed by the subclass.

The two template parameters could be consolidated into one, reducing the complexity, if either:

The ISecManager and CLdapSecManager interfaces are standardized.

A new extension of ISecManager declaring the LDAP extensions is used.

I'm not married to either approach.
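To make the two-parameter arrangement concrete, here is a hedged sketch. `IBase`, `LdapLike`, `TDecorator`, and `LdapDecorator` are hypothetical stand-ins for `ISecManager`, `CLdapSecManager`, `TSecManagerTraceDecorator`, and a possible LDAP subclass. It shows only how the first parameter lets the base hold a single reference typed as the subclass needs it.

```cpp
#include <string>

// Hypothetical stand-ins: a base interface and an implementation with an extra method.
struct IBase
{
    virtual ~IBase() = default;
    virtual bool authorize(const std::string& user) = 0;
};

struct LdapLike : IBase
{
    bool authorize(const std::string& user) override { return !user.empty(); }
    int ldapOnlyCall() { return 42; }   // not declared by IBase
};

// decorated_t: type of the held reference; interface_t: interface the decorator implements.
template <typename decorated_t = IBase, typename interface_t = IBase>
class TDecorator : public interface_t
{
public:
    explicit TDecorator(decorated_t& target) : decorated(target) {}
    bool authorize(const std::string& user) override { return decorated.authorize(user); }

protected:
    decorated_t& decorated;   // the only reference, typed as the subclass needs it
};

// The subclass reaches implementation-specific methods without a second reference.
class LdapDecorator : public TDecorator<LdapLike, IBase>
{
public:
    using TDecorator<LdapLike, IBase>::TDecorator;
    int ldapOnlyCall() { return decorated.ldapOnlyCall(); }
};
```

If the two interfaces were standardized, both parameters would name the same type and could collapse into one, which is the simplification discussed above.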

rpastrana · 2024-08-19T18:03:01Z

should I decorate every manager method,

@timothyklemm I don't think decorating every method would be helpful.
In fact as we've discussed, for some managers, the 2 methods decorated might result in too many spans...

kenrowland

Generally looks like what we discussed. Just a few comments.

kenrowland · 2024-08-19T17:54:08Z

common/workunit/workunit.cpp

@@ -113,7 +114,7 @@ static bool checkWuScopeSecAccess(const char *wuscope, ISecManager *secmgr, ISec
 {
    if (!secmgr || !secuser)
        return true;
-    bool ret = secmgr->authorizeEx(RT_WORKUNIT_SCOPE, *secuser, wuscope)>=required;
+    bool ret = CSecManagerTraceDecorator(*secmgr).authorizeEx(RT_WORKUNIT_SCOPE, *secuser, wuscope)>=required;


My concern is that a CSecManagerTraceDecorator is constructed and destructed each time it is used. Any thought on making the decorated calls static?

It may be a best practice to use a single security manager configuration (implying a single implementation) within an ESP process. ESPs can be configured to use multiple managers and configurations.

A static interface would require either passing the manager to each decorated call, negating the benefit of extending ISecManager, or a threaded security manager (similar to the threaded active span). The latter option would likely complicate the implementation by requiring manager checks in every decorated method.

I'm not sure which is preferred. Also, one of @rpastrana's comments might indirectly justify a threaded active decorator, so only one decorator would be created per service.

I'm concerned about the overhead of creating a new decorator every time a secmanager method is called. What's the overall overhead added, and what is the overriding reason to implement it this way?

kenrowland · 2024-08-19T18:02:27Z

esp/bindings/http/platform/httpbinding.cpp

@@ -972,7 +973,7 @@ bool EspHttpBinding::basicAuth(IEspContext* ctx)
        return false;
    }

-    bool authenticated = m_secmgr->authorize(*user, rlist, ctx->querySecureContext());
+    bool authenticated = CSecManagerTraceDecorator(*m_secmgr).authorize(*user, rlist, ctx->querySecureContext());


Concern here that m_secmgr is an OWNED and you are giving the CSecManagerTraceDecorator instance the raw pointer and it's creating a LINKED pointer. Not sure how those would interact. I see the same in at least one other instance.

The decorator always adds, and releases, its own reference to the object, without affecting ownership of the reference passed to it. I believe it is safe.

As it's being used, linking might not be needed. If the decorator ever has a longer life than one manager call, it will be.

Not being an expert with Owned and Linked, but is the reference count for the object shared between the original Owned and the Linked pointers?
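The question above can be illustrated with a small sketch. It assumes intrusive reference counting, where the count is stored in the object itself rather than in the smart pointer. The classes below are hypothetical and are not jlib's `Owned` or `Linked`. They only model one pointer that adopts an existing reference and another that adds its own.

```cpp
// Hypothetical intrusively counted object: the count lives in the object.
class Counted
{
public:
    void link() { ++refs; }
    // Returns true when the last reference is dropped; a real implementation would delete here.
    bool release() { return --refs == 0; }
    int count() const { return refs; }

private:
    int refs = 1;   // the creator starts with one reference
};

// Adopts the caller's existing reference without adding one.
class OwnedRef
{
public:
    explicit OwnedRef(Counted* p) : ptr(p) {}
    ~OwnedRef() { if (ptr) ptr->release(); }
    OwnedRef(const OwnedRef&) = delete;
    OwnedRef& operator=(const OwnedRef&) = delete;
    Counted* get() const { return ptr; }

private:
    Counted* ptr;
};

// Adds its own reference on construction and drops it on destruction.
class LinkedRef
{
public:
    explicit LinkedRef(Counted* p) : ptr(p) { if (ptr) ptr->link(); }
    ~LinkedRef() { if (ptr) ptr->release(); }
    LinkedRef(const LinkedRef&) = delete;
    LinkedRef& operator=(const LinkedRef&) = delete;

private:
    Counted* ptr;
};
```

Under that assumption both pointer kinds update the same counter, so a short-lived linking pointer raises the count by one and restores it when it goes out of scope, leaving the owning pointer's reference untouched.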

kenrowland · 2024-08-19T18:05:04Z

system/security/shared/secmanagertracedecorator.hpp

+    }
+    virtual bool authorizeEx(SecResourceType rtype, ISecUser & user, ISecResourceList * resources, IEspSecureContext* secureContext = nullptr) override
+    {
+        START_SEC_MANAGER_TRACE_BLOCK("security.authorize_ex");


Should this be "security.authorize_ex_list" to distinguish between a single authorization and a list? It would seem that if a long list were passed, it could skew results if it's measured the same as a single authorization. There is at least one other case where a method is overloaded to accept both a single instance and a list.

It can be. I thought about it because it's already a distinction made in the feature flags.

kenrowland · 2024-08-19T18:09:14Z

system/security/shared/secmanagertracedecorator.hpp

+ * interface. By implementing the same interface, a decorator is interchangeable with the object
+ * it decoratos.
+ *
+ * In less than ideal situations, the decorated object's interface is an extension of a named


Trivial, should "an extension" be "a subclass"?

I think they're synonyms, but I'll change it.

kenrowland · 2024-08-19T19:45:04Z

@timothyklemm There are no plans at this time to make the LDAP specific interface part of ISecManager. If having them the same significantly reduces the decorator solution and the need exists to decorate and trace LDAP security manager calls, I can investigate further. Consolidating them would be a good interim step to eventual replacement.

timothyklemm · 2024-09-24T17:34:21Z

@rpastrana and @kenrowland, this is still a work in progress. There are two commits, each representing a separate Jira. The first commit sets up the ESP for using trace flags. The second has the decorator using trace flags to control span creation. Do these address prior concerns?

rpastrana

@timothyklemm a few comments on the traceflag commit

rpastrana · 2024-10-01T17:29:12Z

common/workunit/workunit.cpp

@@ -113,7 +114,7 @@ static bool checkWuScopeSecAccess(const char *wuscope, ISecManager *secmgr, ISec
 {
    if (!secmgr || !secuser)
        return true;
-    bool ret = secmgr->authorizeEx(RT_WORKUNIT_SCOPE, *secuser, wuscope)>=required;
+    bool ret = CSecManagerTraceDecorator(*secmgr).authorizeEx(RT_WORKUNIT_SCOPE, *secuser, wuscope)>=required;


I'm concerned about the overhead of creating a new decorator every time a secmanager method is called. What's the overall overhead added, and what is the overriding reason to implement it this way?

rpastrana · 2024-10-01T17:58:24Z

esp/platform/application_config.cpp

@@ -436,6 +436,16 @@ void setLDAPSecurityInWSAccess(IPropertyTree *legacyEsp, IPropertyTree *legacyLd
    }
 }

+void copyTraceFlags(IPropertyTree *legacyEsp, IPropertyTree *appEsp)


when would this method be useful?
Is it only capitalizing the traceflags name?
Perhaps a method comment header would help

rpastrana · 2024-10-01T18:25:49Z

esp/platform/espp.cpp

@@ -354,6 +358,74 @@ static void usage()

 IPropertyTree *buildApplicationLegacyConfig(const char *application, const char* argv[]);

+// Modified version of jlib's loadTraceFlags. The modification adds special handling for traceLevel.


Let's elaborate on the reasoning for this alternative loadTraceFlags implementation
Including examples of the proposed config format and the pre-existing format

rpastrana · 2024-10-01T18:29:27Z

esp/platform/espp.cpp

+// Modified version of jlib's loadTraceFlags. The modification adds special handling for traceLevel.
+static TraceFlags loadEspTraceFlags(const IPropertyTree *ptree, const std::initializer_list<TraceOption> &optNames, TraceFlags dft)
+{
+    for (auto &o: optNames)


rename o with a self documenting name

rpastrana · 2024-10-01T18:41:19Z

esp/platform/espp.cpp

+    {
+        VStringBuffer attrName("@%s", o.name);
+        const char* value = nullptr;
+        if (!(value = ptree->queryProp(attrName)) && !(value = ptree->queryProp(attrName.setf("_%s", o.name))))


what's the significance of the underscore prefix? is this a new convention you're setting?

The underscore prefix is an alternate xpath used in loadTraceFlags. I don't know when, or if, it's used but retained it to minimize functional differences between the two functions.
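The two-step lookup described here can be sketched in isolation. The example below uses a plain `std::map` as a hypothetical stand-in for a property tree node, and `findOption` is an illustrative name. It shows only the order of the two lookups: the `@name` attribute first, then the `_name` alternate.

```cpp
#include <initializer_list>
#include <map>
#include <string>

// Hypothetical flat property store standing in for an IPropertyTree node.
using Props = std::map<std::string, std::string>;

// Look up "@name" first, then fall back to the alternate "_name" path.
// Returns nullptr when neither key is present.
const std::string* findOption(const Props& props, const std::string& name)
{
    for (const std::string& key : {"@" + name, "_" + name})
    {
        auto found = props.find(key);
        if (found != props.end())
            return &found->second;
    }
    return nullptr;
}
```

When both keys exist, the attribute form wins, which matches the short-circuit order of the condition in the diff.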

rpastrana · 2024-10-01T19:17:03Z

esp/platform/espp.cpp

@@ -577,6 +649,7 @@ int init_main(int argc, const char* argv[])
            config->bindServer(*server.get(), *server.get());
            config->checkESPCache(*server.get());

+            initializeTrace(config);


I like the symmetry w/ the initializeMetrics call, but this name can be confused with Tracing.
InitializeTraceLevelSettings ?

asselitx

Overall this looks like a good solution to getting the trace configuration into the ESP and for timing the secmgr calls.

asselitx · 2024-10-01T16:15:46Z

esp/platform/espp.cpp

@@ -354,6 +358,74 @@ static void usage()

 IPropertyTree *buildApplicationLegacyConfig(const char *application, const char* argv[]);

+// Modified version of jlib's loadTraceFlags. The modification adds special handling for traceLevel.
+static TraceFlags loadEspTraceFlags(const IPropertyTree *ptree, const std::initializer_list<TraceOption> &optNames, TraceFlags dft)


Might it be better to just use the standard load from jlib? Names are more expressive than an integer, but then we start a habit of ESP diverging from the standard (as in the ESPLOG) that might cause us more frustration in the long term than if we had uniform configuration of common properties across components.

asselitx · 2024-10-01T17:52:18Z

system/security/shared/secmanagertracedecorator.hpp

+ * only the first.
+ * - decorated_t is the type of the object to be decorated. If the the decorated object conforms
+ *   to an interface, use of the interface is preferred.
+ * - secorated_interface_t is the interface implemented by the decorated object. If not the same


spelling: "secorated_interface_t" should be "decorated_interface_t"

asselitx · 2024-10-01T17:55:32Z

system/security/shared/secmanagertracedecorator.hpp

+ * @brief Macro used start tracing a block of code in the security manager decorator.
+ *
+ * Create a new named internal span and enter a try block. Used with END_SEC_MANAGER_TRACE_BLOCK,
+ * provides consistent timing and exception handling for the inned code block.


spelling: should be "handling for the inner code block"

- Define an ESP process configuration node that supports specification of global TraceFlags values for each ESP.
- Reserve traceLevel to request a specific trace level (0: none, 1: standard, 2: detailed, 3: max). This replaces acting on the last observed occurrence of either traceNone, traceStandard, traceDetailed, and traceMax.
- Override default flag settings when not configured and a debug build.

Signed-off-by: Tim Klemm <[email protected]>
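A rough sketch of the traceLevel mapping described in this commit message follows. The enum and function names are hypothetical (the real TraceFlags type and constants live in jlib), and falling back to the default for out-of-range values is an assumption, not something the commit message states.

```cpp
// Hypothetical level names mirroring the 0-3 values listed in the commit message.
enum class TraceLevel { None = 0, Standard = 1, Detailed = 2, Max = 3 };

// Map a configured traceLevel integer to a level.
TraceLevel levelFromConfig(int configured, TraceLevel dft)
{
    if (configured < 0 || configured > 3)
        return dft;   // assumption: invalid values leave the default untouched
    return static_cast<TraceLevel>(configured);
}
```

A single integer makes the requested level unambiguous, whereas several boolean options could contradict each other and force a "last one wins" rule.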

…tion calls.
- Define a new, common, trace option for security manager tracing. Both ESP and Roxie use security managers, and the flag can be shared by both.
- Refactor CEspHttpServer to create server spans before initial authorization requests.
- Create a security manager decorator encapsulating the logic of what to trace and when.
- Replace security manager authorization calls with decorated calls. This change excludes calls specific to the LDAP security manager.

Signed-off-by: Tim Klemm <[email protected]>

Introduce a persistent decorator instance to the HTTP binding, ESP context, and security handler to avoid repeated construction and destruction. Improve the binding's handling of security manager creation failures to avoid null dereferences when creating the decorator, or resource auth maps.

ghalliday

This does not seem to be the correct approach to me. It is more complicated than I would expect for the problem that I think it is trying to solve.

The jira could do with a discussion of the potential designs, exactly what it would be beneficial to trace and why. Adding the code in this place means that all security managers - including fixed username, or simple table of users would be tracing all authentication calls.

Similarly for ldap security managers it will be creating spans when the value is returned immediately from the cache. We do not want to generate lots of short-lived spans, especially if they do not involve a server.

What is potentially important is tracing the requests to the LDAP server. I imagine that can be best achieved by adding code to create a client span within the ldap code itself. That also allows ldap-specific details to be added to the span (if there are any).

timothyklemm · 2024-11-22T13:00:59Z

@ghalliday the code currently included in the PR is more complex than it should be. If #19292 is accepted, we will be able to decorate managers as they are constructed, and no additional awareness of the decorator is required.

@rpastrana and I discussed whether it would be preferable for the ESP or the manager to be responsible for instrumentation. With the potential for 3rd party managers (not just mine, but an OAuth 2 manager is being developed), we concluded it would be better for the ESP to produce consistent output instead of hoping managers will do it for us.

To reduce the number of unnecessary spans, the feature must be explicitly turned on and only "implemented" methods, as identified by the manager, will be instrumented. Unfortunately, there doesn't seem to be a reliable way, given the current interface, to identify which managers warrant instrumentation. Also, I can't predict whether something will be cached before creating a span.

It could be possible to replace spans with timing attributes. This could be start and end times (similar to current TxSummary content), start time and duration, or just duration. Inclusion of the attributes could be conditional on a minimum duration. Retroactive span creation given a start and duration, if possible, could also be conditional on a minimum duration.
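The timing-attribute alternative could be sketched as follows. This is an illustration under stated assumptions: `Attributes` is a hypothetical stand-in for a span's attribute set, `timeIfSlow` and the `.duration_us` suffix are invented for the example, and the wrapped call is assumed to return a value.

```cpp
#include <chrono>
#include <map>
#include <string>

// Hypothetical attribute bag standing in for a span's attributes.
using Attributes = std::map<std::string, long long>;

// Time a call and record its duration only when it reaches a minimum threshold.
// Assumes the callable returns a non-void value.
template <typename Fn>
auto timeIfSlow(Attributes& attrs, const std::string& name,
                std::chrono::microseconds minimum, Fn&& call) -> decltype(call())
{
    const auto start = std::chrono::steady_clock::now();
    auto result = call();
    const auto elapsed = std::chrono::duration_cast<std::chrono::microseconds>(
        std::chrono::steady_clock::now() - start);
    if (elapsed >= minimum)
        attrs[name + ".duration_us"] = elapsed.count();   // skipped for fast calls
    return result;
}
```

Calls that return quickly, for example from a cache, would then leave no trace output at all, at the cost of losing a parent span under which a manager could group its own child spans.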

Both options would prevent grouping any spans created by the manager itself. If I were actively working on my plugin, I would probably have it create child spans for each of its multiple HTTP and MySQL requests. Any authorize request to one of my managers could require at least 3 HTTP requests and 2 MySQL requests to complete.

rpastrana · 2024-11-22T16:51:29Z

@timothyklemm as I recall there were several long conversations regarding this feature. I don't recall what might have convinced me to agree to the current proposal, but in general I'd expect the ESP to span potentially lengthy/complex calls. I'd also expect the call to have the ability to declare its own child span to represent some interesting subtask (I think you pointed out why this might be difficult to achieve, but I don't recall). Another important aspect we've emphasized is to avoid affecting existing functionality for the sake of tracing.

timothyklemm marked this pull request as draft August 19, 2024 11:26

timothyklemm marked this pull request as ready for review August 19, 2024 11:27

timothyklemm marked this pull request as draft August 19, 2024 11:28

timothyklemm requested review from kenrowland and rpastrana August 19, 2024 11:29

rpastrana reviewed Aug 19, 2024

kenrowland requested changes Aug 19, 2024

timothyklemm force-pushed the hpcc-32452-secmgr-trace-decorator branch 2 times, most recently from 18e7e08 to 3e2f130 Compare September 24, 2024 17:24

rpastrana reviewed Oct 1, 2024

asselitx reviewed Oct 1, 2024

Tim Klemm added 3 commits October 23, 2024 13:56

timothyklemm force-pushed the hpcc-32452-secmgr-trace-decorator branch from 3e2f130 to f23b4e4 Compare October 29, 2024 14:36

timothyklemm changed the base branch from candidate-9.8.x to master October 29, 2024 15:14

timothyklemm mentioned this pull request Oct 31, 2024

HPCC-32465 Add ESP support for trace level #19246

Open

39 tasks

ghalliday requested changes Nov 22, 2024
