Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Audit Logging] Add AuditTrail service #69278

Merged
merged 45 commits into from
Jul 7, 2020

Conversation

mshustov
Copy link
Contributor

@mshustov mshustov commented Jun 16, 2020

Summary

First part of #60119
Adds:

  • general-purpose AuditTrail core service.
    Plugins can get scoped Auditor from the core service to add events to introspect.
// in a plugin
const auditor: Auditor = coreStart.auditTrail.asScoped(request);
auditor.add({...});
  • AuditTrail plugin in x-pack.
    This plugin provides an example implementation of the AuditLogging plugin. It depends on Security, Spaces, and (in theory) on other plugins. It allows enhancing auditable data with an additional context (the current user, space, etc.).
    It's going to be overtaken by the Security team for further enhancements.

Integration

Elasticsearch service is the only core service with AuditTrail service integration at the moment. It logs callAsCurrentUser, callAsInternalUser scoped client calls but it cannot track callAsInternalUser calls of un-scoped client. The latter is used by SO client factory provider in Security plugin Thus we have a blind spot in SO calls from this client. Considering this drawback, it looks like the current implementation is a no-go option. Want to hear from @jportner @thomheymann
We can consider changing createInternalRepository signature to pass scoped auditor as an argument, but we cannot make this argument required since it's internal repository. We cannot make the argument optional either since it's easy to forget to pass one.
Another not-covered case is ES requests from outside of RequestHandlerContext scoped to FakeRequest. They don't have X-Opaque-Id or any other way to identify a request, so I skipped this case at the moment. We can get back to the problem after unifying the Scope-able interface. #69278 (comment) Let me know if it cannot wait.

Actions points:

Usage example

// consume via RequestHandlerContext in /login
try {
  // adds an upper-level scope name to identify low-level operations with
  context.core.withScope('login');
  context.core.auditor.add({
    message: 'try to auth user',
    type: 'login',
   });
  const authenticationResult = await authc.login(request, loginAttempt);
  if (!authenticationResult.succeeded()) {
    context.core.auditor.add({
      message: 'user authentication failed',
      type: 'login.failed',
    });
    return response.unauthorized({ body: authenticationResult.error });
  }
  context.core.auditor.add({
    message: 'user authenticated',
    type: 'login.success',
});

// consume via CoreStart in Security SOC 
savedObjects.addClientWrapper(Number.MAX_SAFE_INTEGER - 1, 'security', ({ client, request }) => {
  const auditor = getAuditTrail().asScoped(request);
  new SOCWrapper({auditor, ... });
}

class SOCWrapper {
  bulkGet(...) {
    // check privileges
    this.auditor.add({
      message: 'audit authz',
      type: 'authz.privileges.granted',
      privileges: ...
    });
    //  data fetching
    this.auditor.add({
      message: 'audit bulk_get',
      type: 'data.bulk_get',
      data: ....
    });
  }

@mshustov mshustov added release_note:plugin_api_changes Contains a Plugin API changes section for the breaking plugin API changes section. Team:Core Core services & architecture: plugins, logging, config, saved objects, http, ES client, i18n, etc v7.9.0 labels Jun 16, 2020
@joshdover
Copy link
Contributor

I decided not to use auditTrail service in other core services (ES, SO, UiSettings) since they implement low-level mechanics, that do not aware of the execution context

I think I'm missing the connection between this design and the linked comment which explored logging both low-level and high-level events.

Would it not make sense for the scoped versions of these services to accept an Auditor by default?

In all of those cases, we do know the execution context (an HTTP request). If the specific HTTP route would like to set the logical execution context (eg. import_saved_objects) to bind the low-level details to, it seems like we could have an API like:

router.post(
  { path: '/api/saved_objects/import' },
  async (context, req, res) => {
    const { auditor, savedObjects } = context.core;
    const result = await auditor.withContext(
      'import_saved_objects',
      async () => {
        // any operations against context.core.savedObjects.client would log audit
        // events that are tied to the `import_saved_objects` context.
        return savedObjects.client.create(...);
      }
    );

    return res.ok({ body: result });
  }
);

This would just require that the same Auditor instance be piped to the coreStart.savedObjects.getScopedClient(req, auditor) call in the request context provider and returned on the context.core.auditor key.

Advantages:

  • Low-level details get logged without any extra work
  • Low-level events can be correlated to the logical user operation
  • SOC wrappers can still add additional events (such as authz events in your example above)

Disadvantages:

  • More low-level events may be less meaningful

IMO having narrow coverage of audit logging is worse than having too many low-level events.

Questions:

  • Would we need to support nested calls to auditor.withContext?

Should I rename AuditTrail plugin to reduce the possibility of confusion with AuditTrail service?

With the current design, I actually wonder if the AuditTrail service should be in Core at all. If we don't provide integrations out of the box, it seems like this service simply serves as a global registry. Could probably be implemented as a (OSS?) plugin entirely?

However, if we do go with the out-of-the-box integrations with SOC, UiSettings, ES client, then we would need this in Core.

@mshustov
Copy link
Contributor Author

mshustov commented Jun 24, 2020

I think I'm missing the connection between this design and the linked comment which explored logging both low-level and high-level events.

In my example, both events in SOCWrapper are still high-level. By low level events I meant requests to ES. Giving the same example: every privilege check performs several separate requests to ES, plus there are some other ES requests performed at the same time. The number of such records could be overwhelming. That's why I want to hear from @elastic/kibana-security what events exactly we want to log. There are Elasticsearch queries listed in the Kibana Audit Logging Proposal, but I believe this is related to data requests only (via data plugin?).

Would it not make sense for the scoped versions of these services to accept an Auditor by default?

That's one of the options. However, as long as not all of Kibana plugins use Context pattern (namely: Actions, Alerts, etc.), we have to provide API to instantiate Auditor in any place. We could bake auditor instantiation in ES / SO clients, but it doesn't play well with SOC wrappers since they are created outside by external plugins. If we pass auditor in SOC wrapper as an argument during instantiation

savedObjects.addClientWrapper(Number.MAX_SAFE_INTEGER - 1, 'security', ({ client, request, auditor }) => {
  new SOCWrapper({auditor, ... });
}

it could solve the problem, but:

  • it still doesn't enforce plugins to use auditor
  • we will have several different ways to get an auditor within a plugin (prefer the one way to do things!)

With the current design, I actually wonder if the AuditTrail service should be in Core at all. If we don't provide integrations out of the box, it seems like this service simply serves as a global registry. Could probably be implemented as a (OSS?) plugin entirely?

  1. It depends on the decision on how deep we want to integrate Auditor. If core API uses it, we will have to keep it in the core (as you already noticed)
  2. If the AuditTrail plugin should have access to any plugin API (it already uses Security & Spaces), it makes circular dependencies possible, so we will have to adopt plugin API accordingly.

@jportner
Copy link
Contributor

In all of those cases, we do know the execution context (an HTTP request). If the specific HTTP route would like to set the logical execution context (eg. import_saved_objects) to bind the low-level details to, it seems like we could have an API like:

@joshdover I like the idea of this, and the example you gave.

IMO having narrow coverage of audit logging is worse than having too many low-level events.

Agreed.


Giving the same example: every privilege check performs several separate requests to ES, plus there are some other ES requests performed at the same time. The number of such records could be overwhelming. That's why I want to hear from @elastic/kibana-security what events exactly we want to log. There are Elasticsearch queries listed in the Kibana Audit Logging Proposal, but I believe this is related to data requests only (via data plugin?).

@restrry In the document you mention, the question of "exactly what events do we need to audit?" is still partially unanswered; that's basically where I left off before I got pulled to other tasks. I haven't undertaken a full audit (ha!) of all of the various plugins in Kibana, but it's my understanding that many plugins rely on using the raw Elasticsearch client instead of the SavedObjectsClient. So we'll probably want to audit those low-level events, too.

It's worth noting here that we might have those low-level events excluded from the audit log by default (depending on just how noisy they are). The assumption is that organizations generally don't want to "audit all users, all events, all the time". E.g., we should send everything to the auditTrail, and a subset of that (depending on config) will actually get written to the audit log. I'd rather err on the side of adding the low-level events to the auditTrail in case an organization wants to audit a user (or group of users) in-depth for a period of time.


Side note: it could be really useful if we could do some sort of conditional exclusions of event types based on the execution context. So we may want to exclude only the low-level Elasticsearch client events that are sourced from one of the SavedObjectsClient execution contexts. Then we sort of have the best of both worlds; when plugins aren't using the SavedObjectsClient, we would be able to "fall back" to the low-level Elasticsearch client events.

@mshustov
Copy link
Contributor Author

mshustov commented Jun 25, 2020

the question of "exactly what events do we need to audit?" is still partially unanswered;

Ok, what we want to get as a result of this task? I supposed that it should be a generic service allowing us to log an audit event in any place of plugin code.
I added logging for all ES requests done on behalf of the current user (ignore internal user ones) as an example of integrating it into the platform code. We can do the same for other Core services: SavedObjects, UiSettings, HTTP resources. I don't understand whether we should though until there are clear requirements for this. WDYT?
The current implementation allows to use auditor from:

// route handler context
router.get(..., handler(context, req, res) => {
  return context.core.auditor.withScope('saved object export', async () => {...});
});

// from start contract
savedObjects.addClientWrapper(Number.MAX_SAFE_INTEGER - 1, 'security', ({ client, request }) => {
  const auditor = getAuditTrail().asScoped(request);
  new SOCWrapper({ auditor, ... });
}

class SOCWrapper {
  bulkGet(...) {
    // check privileges
    this.auditor.add({
      message: 'audit authz',
      type: 'authz.privileges.granted',
      privileges: ...
    });
    //  data fetching
    this.auditor.add({
      message: 'audit bulk_get',
      type: 'data.bulk_get',
      data: ....
    });
  }

All interactions with ES logged automatically.

if we don't want to support cases outside of RequestHandlerContext, I'm inclining towards removing access to remove access to auditor API from plugins and create only one instance during context instantiation. as proposed by @joshdover in #69278 (comment)

Side note: it could be really useful if we could do some sort of conditional exclusions of event types based on the execution context

We should extend the AuditTrail plugin with this functionality based on type/scope/whatever else. I don't think it's the necessity for the current task. Is it? We also have #57547 to support filtering on logging level.


async add(event: AuditableEvent) {
const user = this.deps.getCurrentUser(this.request);
const spaceId = this.deps.getSpaceId(this.request);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we cannot use any API (getActiveSpace) calling ES otherwise we stuck in a loop

@@ -209,7 +213,12 @@ export class ClusterClient implements IClusterClient {
})
);
}

if (request && isRealRequest(request)) {
Copy link
Contributor Author

@mshustov mshustov Jun 25, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we have this to narrow lax asScoped interface

public asScoped(request?: ScopeableRequest): IScopedClusterClient 

from ScopeableRequest to KibanaRequest

getAuthHeaders
);
}

private getAuditorFactory = () => {
if (!this.auditorFactory) {
Copy link
Contributor Author

@mshustov mshustov Jun 25, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a workaround for the Legacy ES client being created during the setup lifecycle. This is not necessary for the new ES client instantiating during the start lifecycle.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm surprised this doesn't cause failures. That means we don't have any KP plugin performing ES calls during the setup phase?

Copy link
Contributor Author

@mshustov mshustov Jul 2, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

They shouldn't. It could cause data loss since data aren't migrated at this point.
This method is called from asScoped that should be called with KibanaRequest, which can be created only when the server bound to a port.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh right, that's only for scoped clients.

getAuthHeaders
);
}

private getAuditorFactory = () => {
if (!this.auditorFactory) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm surprised this doesn't cause failures. That means we don't have any KP plugin performing ES calls during the setup phase?

"configPath": ["xpack", "audit_trail"],
"server": true,
"ui": false,
"requiredPlugins": ["licensing", "security", "spaces"],
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should security and spaces be optional? I mean, shouldn't the auditTrail factory be capable of emitting 'degraged' events with only the info from the enabled sources?

Copy link
Contributor Author

@mshustov mshustov Jul 2, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@pgayvallet probably. I left the detailed implementation for the Security team. For example, it's not obvious for me how to handle security plugin disabled...would audit logging make sense at all without any information about an end-user?

Comment on lines 26 to 31
public withScope(name: string) {
if (this.scope !== undefined) {
throw new Error(`AuditTrail scope is already set to: ${this.scope}`);
}
this.scope = name;
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As there is only one auditor for a specific request, that means we can only have a single scope for a request's whole lifecycle. Is that the expected granularity, or just a limitation caused by the fact that changing the scope of the auditor would just have side effect with asynchronous calls?

to adapt an example from the FTR test plugin, this is something that can will just not be able to do:

 router.get(
      { path: '/audit_trail_test/context/as_current_user', validate: false },
      async (context, request, response) => {
        context.core.auditor.withScope('audit_trail_test/context/as_current_user/ping');
        await context.core.elasticsearch.legacy.client.callAsCurrentUser('ping');
        context.core.auditor.withScope('audit_trail_test/context/as_current_user/pong'); // BOOM.
        await context.core.elasticsearch.legacy.client.callAsCurrentUser('pong');
        return response.noContent();
      }
    );

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As there is only one auditor for a specific request, that means we can only have a single scope for a request's whole lifecycle. Is that the expected granularity, or just a limitation caused by the fact that changing the scope of the auditor would just have side effect with asynchronous calls?

It's the desired behavior. @jportner pointed out that we are interested in the root cause of an event, so I rolled back the possibility to stack 'scopes'.

@mshustov mshustov requested a review from a team July 2, 2020 08:41
@mshustov mshustov requested a review from pgayvallet July 2, 2020 12:21
@joshdover
Copy link
Contributor

If we want to track API calls outside of RequestContextHandler we need to consider X-Opaque-Id implementation for FakeRequests as well. @joshdover Should it be done at #39430 or #62018?

I lean towards doing this as part of #39430 which we do not have prioritized very soon. I think it makes sense to couple this feature to the scope-able clients because the generated "execution ID" should be part of the same "execution context" as the authentication credentials.

I'll defer to @jportner on the requirement here and whether or not we should prioritize this sooner. Generating execution IDs for system-level logic would make it easier to correlate individual events for a single logical unit, but we may already get that in a roundabout fashion by associating a user and a audittrail scope to these events. Users of the audit log just won't be able to tease apart different instances of the same audittrail scope for the same user if they are executed around the same time. For example, if two alerts created by the same user are processed at the same time, it may not be possible to separate the audit log events for each alert without an "execution ID".

@mshustov
Copy link
Contributor Author

mshustov commented Jul 6, 2020

@elasticmachine merge upstream

Comment on lines +69 to +70
// plugins.auditTrail prepended automatically
context: '',
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wonder if we should make context optional for this use case? Though that would make the Map version of this strange.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd rather not do that because:

  • it diverges interface from the one defined in config
  • it requires to override complex validation schema since it's a union of different schemas

x-pack/plugins/audit_trail/server/config.ts Outdated Show resolved Hide resolved
type: event.type,
user: user?.username,
space: spaceId,
scope: this.scope,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the name of the ECS field that the scope parameter will be mapped to? If it uses a different term other than scope, then maybe we should use that here too.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

from #52125 (comment) it seems to be document.space, but I'd rather the Security team formalised the interface later

* Add a high-level scope name for logged events.
* It helps to identify the root cause of low-level events.
*/
withScope(name: string): void;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do wonder if we should name this something else than withScope since it sort of overlaps with asScoped. Ideas:

  • withAuditScope
  • setAuditScope
  • setAuditableSpan
  • setAuditableEvent

Also see below on my question about how this field will map to the ECS output. I think we should use the same terminology here if it's different than scope.

import { PluginConfigDescriptor, config as _config } from '../../../../src/core/server';

const configSchema = schema.object({
enabled: schema.boolean({ defaultValue: true }),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should we default to false while we're iterating on this?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the logging disabled anyway, but it doesn't hurt to disable the whole plugin

@mshustov mshustov force-pushed the issue-60119-audit-trail branch from 08d9e94 to 1133469 Compare July 7, 2020 17:11
@mshustov mshustov requested a review from a team as a code owner July 7, 2020 17:11
@mshustov mshustov requested a review from joshdover July 7, 2020 17:11
@mshustov mshustov removed the request for review from a team July 7, 2020 17:15
@kibanamachine
Copy link
Contributor

💚 Build Succeeded

Build metrics

✅ unchanged

History

To update your PR or re-run it, just comment with:
@elasticmachine merge upstream

@mshustov mshustov merged commit aeff8c1 into elastic:master Jul 7, 2020
mshustov added a commit to mshustov/kibana that referenced this pull request Jul 7, 2020
* add generic audit_trail service in core

* expose auditTraik service to plugins

* add auditTrail x-pack plugin

* fix type errors

* update mocks

* expose asScoped interface via start. auditor via  request context

* use type from audit trail service

* wrap getActiveSpace in safeCall only. it throws exception for non-authz

* pass message to log explicitly

* update docs

* create one auditor per request

* wire es client up to auditor

* update docs

* withScope accepts only one scope

* use scoped client in context for callAsInternalUser

* use auditor in scoped cluster client

* adopt auditTrail plugin to new interface. configure log from config

* do not log audit events in console by default

* add audit trail functional tests

* cleanup

* add example

* add mocks for spaces plugin

* add unit tests

* update docs

* test description

* Apply suggestions from code review

apply @jportner suggestions

Co-authored-by: Joe Portner <[email protected]>

* add unit tests

* more robust tests

* make spaces optional

* address comments

* update docs

* fix WebStorm refactoring

Co-authored-by: Joe Portner <[email protected]>
@mshustov mshustov deleted the issue-60119-audit-trail branch July 8, 2020 07:10
mshustov added a commit that referenced this pull request Jul 8, 2020
* add generic audit_trail service in core

* expose auditTraik service to plugins

* add auditTrail x-pack plugin

* fix type errors

* update mocks

* expose asScoped interface via start. auditor via  request context

* use type from audit trail service

* wrap getActiveSpace in safeCall only. it throws exception for non-authz

* pass message to log explicitly

* update docs

* create one auditor per request

* wire es client up to auditor

* update docs

* withScope accepts only one scope

* use scoped client in context for callAsInternalUser

* use auditor in scoped cluster client

* adopt auditTrail plugin to new interface. configure log from config

* do not log audit events in console by default

* add audit trail functional tests

* cleanup

* add example

* add mocks for spaces plugin

* add unit tests

* update docs

* test description

* Apply suggestions from code review

apply @jportner suggestions

Co-authored-by: Joe Portner <[email protected]>

* add unit tests

* more robust tests

* make spaces optional

* address comments

* update docs

* fix WebStorm refactoring

Co-authored-by: Joe Portner <[email protected]>

Co-authored-by: Joe Portner <[email protected]>
gmmorris added a commit to gmmorris/kibana that referenced this pull request Jul 8, 2020
* master: (36 commits)
  fixed api url in example plugin (elastic#70934)
  [data.search.aggs]: Remove remaining client dependencies (elastic#70251)
  [Security Solution][Endpoint] Fix base64 download bug and adopt new user artifact/manifest format (elastic#70998)
  [Security Solution][Exceptions] - Exception Modal Part I (elastic#70639)
  [SIEM][Detection Engine][Lists] Adds additional data types to value based lists
  [SIEM][Detection Engine][Lists] Removes feature flag for lists
  [APM] Show license callout in ML settings (elastic#70959)
  Migrate service settings test to jest (elastic#70992)
  [APM] Add cloud attributes to data telemetry (elastic#71008)
  Fix breadcrumb on panels for visibility / round corners (elastic#71010)
  Improve search typescript (elastic#69333)
  [savedObjects field count] run in baseline job (elastic#70999)
  [Security Solution] [Timeline] Timeline manager tweaks (elastic#69988)
  [Endpoint] Support redirect from Policy Details to Ingest when user initiates Edit Policy from Datasource Edit page (elastic#70874)
  [APM] Add API tests (elastic#70740)
  [Security Solution][Exceptions] - Tie server and client code together (elastic#70918)
  [Audit Logging] Add AuditTrail service (elastic#69278)
  [Usage Collection] Ensure no type duplicates (elastic#70946)
  [Security Solution] [Timeline] Bugfix for timeline row actions disappear sometimes (elastic#70958)
  [CI] Add pipeline task queue framework and merge workers into one (elastic#64011)
  ...
@mshustov mshustov added release_note:skip Skip the PR/issue when compiling release notes and removed release_note:plugin_api_changes Contains a Plugin API changes section for the breaking plugin API changes section. labels Aug 4, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
NeededFor:Security release_note:skip Skip the PR/issue when compiling release notes Team:Core Core services & architecture: plugins, logging, config, saved objects, http, ES client, i18n, etc v7.9.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants