RFD 68: Session recording modes #11631

gabrielcorado · 2022-03-31T20:55:55Z

klizhentas

This looks good to me

rfd/0064-audit-modes.md

smallinsky · 2022-04-01T07:26:33Z

rfd/0064-audit-modes.md

+**Note**: Currently, the option `record_session.desktop` uses a boolean value to
+disable/enable session recording. It will have to be converted into a new
+format: `true` will be changed to the default audit mode, and `false` will be
+`off`. Disabling session recording is out of the scope of this RFD, meaning that
+the `off` value will only be present on the `desktop` kind.


I'm not sure that switching the desktop field type from bool to string will be a backward-compatible change.

rfd/0064-audit-modes.md

marcoandredinis · 2022-04-01T08:28:27Z

rfd/0064-audit-modes.md

+      app: strict|best_effort
+      db: strict|best_effort
+      k8s: strict|best_effort
+      desktop: strict|best_effort|off


Just a technical note, probably not relevant here, but it may cause some confusion during development:
`on`/`off` are treated as booleans in YAML: https://yaml.org/type/bool.html
We must surround the value with double quotes to ensure it is treated as a string: "off"
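To illustrate the pitfall, here is a minimal sketch of how a YAML 1.1 style resolver would treat an unquoted `off`. It does not use a real YAML library: `resolve_scalar` is a hypothetical helper, and the boolean spellings are the ones listed on the linked yaml.org page. Actual parsers may differ in which spellings they accept.

```python
import re

# Plain (unquoted) scalars that the YAML 1.1 bool type page lists as booleans.
YAML11_BOOL = re.compile(
    r"^(?:y|Y|yes|Yes|YES|n|N|no|No|NO"
    r"|true|True|TRUE|false|False|FALSE"
    r"|on|On|ON|off|Off|OFF)$"
)
TRUTHY = {"y", "yes", "true", "on"}


def resolve_scalar(token):
    """Resolve one scalar token roughly as a YAML 1.1 parser would (illustration only)."""
    # Quoted scalars are always strings, whatever their content.
    if len(token) >= 2 and token[0] == token[-1] and token[0] in "\"'":
        return token[1:-1]
    # Unquoted scalars matching the bool pattern become Python booleans.
    if YAML11_BOOL.match(token):
        return token.lower() in TRUTHY
    return token


print(repr(resolve_scalar("off")))     # unquoted: resolved as a boolean
print(repr(resolve_scalar('"off"')))   # quoted: stays a string
print(repr(resolve_scalar("strict")))  # not a bool spelling: stays a string
```

So a role that sets `desktop: off` without quotes could reach the application as a boolean rather than the string `off`.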

+1

A few questions:

Can we change `all` to `default`? Then it would look and feel more like a switch statement:

options:
  record_sessions:
    desktop: strict
    ssh: strict
    default: best-effort

Is `all` / `default` required? What happens if it isn't specified? I guess we could assume there's an implicit default of best-effort (today's behavior) if not specified.

I think we should be clear about precedence, since this is role-based and a user can have multiple roles. For example, what would the behavior be if my user has these two roles:

role1:

options:
  record_sessions:
    desktop: best-effort
    ssh: best-effort
    default: strict

role2:

options:
  record_sessions:
    kube: strict
    ssh: strict

My guess - if they:

- start a kube session: role2's strict option should apply
- start an ssh session: role2's strict option should apply, since it is stricter than role1's best-effort option
- start an app session: role1's default of strict applies, because it is stricter than role2's "implicit" default of "best-effort"

I agree with the change to make it look like a switch statement.

It is not going to be required. The current behavior is somewhat mixed: it is strict when starting new sessions, but best effort for ongoing sessions. So I'm not sure what the default value should be. Do you have any suggestions?

That's correct. Thanks for your example. I'll add it to the RFD to make it clearer.

Updated the RFD with the explanation and example.
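The precedence guess in this thread (the strictest mode across all of a user's roles wins) can be sketched as follows. This is only an illustration of the discussion, not Teleport's implementation: `effective_mode`, `mode_for_role`, and the implicit `best-effort` default are assumptions taken from the example roles above, and the thread leaves the real default undecided.

```python
# Higher number means stricter.
STRICTNESS = {"best-effort": 0, "strict": 1}

# Assumption: a role that says nothing falls back to best-effort.
IMPLICIT_DEFAULT = "best-effort"


def mode_for_role(record_sessions, kind):
    """Mode one role asks for: explicit kind, then the role's default, then the implicit default."""
    if kind in record_sessions:
        return record_sessions[kind]
    return record_sessions.get("default", IMPLICIT_DEFAULT)


def effective_mode(roles, kind):
    """Strictest mode requested for this session kind by any of the user's roles."""
    result = IMPLICIT_DEFAULT
    for record_sessions in roles:
        candidate = mode_for_role(record_sessions, kind)
        if STRICTNESS[candidate] > STRICTNESS[result]:
            result = candidate
    return result


# The two roles from the example in this thread.
role1 = {"desktop": "best-effort", "ssh": "best-effort", "default": "strict"}
role2 = {"kube": "strict", "ssh": "strict"}

for kind in ("kube", "ssh", "app", "desktop"):
    print(kind, effective_mode([role1, role2], kind))
```

Under these assumptions a desktop session is the only kind that stays best-effort, because neither role asks for strict desktop recording.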

zmb3

I'm a little bit concerned that we're taking an option that was meant to enable/disable and making it mean something else.

There's already a lot of complexity for reconciling different roles with the strict and best-effort options. If/when we add 'off' as a state, things get even trickier.

rfd/0064-audit-modes.md

zmb3 · 2022-03-31T22:14:33Z

rfd/0064-audit-modes.md

+| `node` and `proxy`             | I/O errors while writing audit logs/session recording data on the server disk. |
+| `node-async` and `proxy-async` | Connection errors with Proxy/Auth server while streaming audit logs/session recording data. |
+
+There are going to be two audit modes:


I think this RFD should really be called "recording modes" not "audit modes".

Audit events are always emitted directly to the auth server and never written to disk first, right? If this is true, then running out of disk isn't so much of an issue for audit events.

> Audit events are always emitted directly to the auth server and never written to disk first, right?

Not really. In some cases, they're also written to disk. We use a TeeStreamer when the recording mode is not sync (as in SSH sessions), so those events are sent to both the disk and the auth server (except for a few events). For example, looking at a session recording, we can see the start and end events:

$ tsh play -f json <session-id>
{"ei":0,"event":"session.start", ...}
{"ei":1,"event":"print", ...}
...
{"ei":10,"event":"print", ...}
{"ei":11,"event":"session.end", ...}

However, failing to emit an audit event will only log an error message, since they're already sent to the auth server.

> If this is true, then running out of disk isn't so much of an issue for audit events.

Right, this is not an audit event issue. But the RFD mentions that it will also cover errors while emitting events to the auth server (we can consider removing this part if it makes sense).
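As a rough illustration of the "tee" idea described in this reply, the sketch below forwards each event to two sinks: a disk writer and an auth-server emitter. It is a loose model, not Teleport's TeeStreamer: `TeeEmitter`, `write_to_disk`, and `send_to_auth` are hypothetical names, and the choice to only log emit failures follows the behavior described above. How disk write failures should be handled is exactly what the RFD discusses, so it is deliberately not modeled here.

```python
import logging

logger = logging.getLogger("tee_sketch")


class TeeEmitter:
    """Forward every event to a disk sink and to an auth-server sink (hypothetical model)."""

    def __init__(self, write_to_disk, send_to_auth):
        self._write_to_disk = write_to_disk
        self._send_to_auth = send_to_auth

    def emit(self, event):
        # Disk write: any error propagates to the caller unchanged.
        self._write_to_disk(event)
        # Auth-server emit: per the thread, a failure here is only logged.
        try:
            self._send_to_auth(event)
        except ConnectionError as err:
            logger.error("failed to emit %s: %s", event.get("event"), err)


def unreachable_auth(event):
    """Stand-in for an auth server that cannot be reached."""
    raise ConnectionError("auth server unreachable")


on_disk = []
tee = TeeEmitter(on_disk.append, unreachable_auth)
tee.emit({"ei": 0, "event": "session.start"})
print(len(on_disk))  # the event still reached the disk sink
```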

zmb3 · 2022-03-31T22:14:54Z

rfd/0064-audit-modes.md

+### Configuration
+
+The configuration is going to be done at the role level. The role option
+`record_session` will be extended to hold the audit mode values: `strict` and


Suggested change

-`record_session` will be extended to hold the audit mode values: `strict` and
+`record_sessions` will be extended to hold the audit mode values: `strict` and

The option is called record_session. Do you think we should rename it?

zmb3 · 2022-04-01T15:56:18Z

rfd/0064-audit-modes.md


zmb3 · 2022-04-01T16:16:14Z

The more I think about this, I wonder if it should be tied to roles at all.

Roles are user-based (and since a user can have many roles the behavior quickly gets complicated and hard to understand, see my previous comment about precedence).

Whether or not to abort a session if we can't continue recording seems like maybe it should be more based on the resource, not the role (or roles) of the user.

I wonder what it would look like if this was based on labels instead.

For example, maybe it's a cluster-wide setting with label selectors.

recording_preference:
  # recording must be strictly enforced for all resources in the prod env
  - mode: strict
    resources: all
    selectors:
      env: prod
  # recording must also be strictly enforced for desktops and kube clusters
  # in the staging env
  - mode: strict
    resources: [ desktop, kube ]
    selectors:
      env: staging
  # for everything else, best effort is okay
  # note: maybe we don't need this, and we always assume best-effort unless
  # an explicit match for a strict policy is found..
  - mode: best-effort
    resources: all
    selectors:
      '*': '*'

Having this in one place might be easier for users to reason about, rather than having to understand our precedence rules and manually compare each of a user's roles when the behavior doesn't match their expectations.

This has the added advantage of not conflicting with the enable/disable behavior we have for desktops today.
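A minimal sketch of how such label-based rules might be evaluated, assuming the semantics hinted at in the YAML comments: a session is strict if any `strict` rule matches the resource, and best-effort otherwise. The `recording_preference` format is only a proposal in this comment, so `RULES`, `rule_matches`, and `recording_mode` are hypothetical.

```python
# The proposed rules, written as Python data (format is hypothetical).
RULES = [
    {"mode": "strict", "resources": "all", "selectors": {"env": "prod"}},
    {"mode": "strict", "resources": ["desktop", "kube"], "selectors": {"env": "staging"}},
    {"mode": "best-effort", "resources": "all", "selectors": {"*": "*"}},
]


def rule_matches(rule, kind, labels):
    """True if the rule covers this resource kind and every selector matches its labels."""
    if rule["resources"] != "all" and kind not in rule["resources"]:
        return False
    for key, value in rule["selectors"].items():
        if key == "*" and value == "*":
            continue  # wildcard selector matches any labels
        if labels.get(key) != value:
            return False
    return True


def recording_mode(rules, kind, labels):
    """Strict if any strict rule matches; otherwise best-effort (assumed fallback)."""
    for rule in rules:
        if rule["mode"] == "strict" and rule_matches(rule, kind, labels):
            return "strict"
    return "best-effort"


print(recording_mode(RULES, "ssh", {"env": "prod"}))
print(recording_mode(RULES, "desktop", {"env": "staging"}))
print(recording_mode(RULES, "ssh", {"env": "staging"}))
```

With this evaluation order the explicit best-effort catch-all rule is redundant, which matches the note in the YAML comment that it may not be needed.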

What do you think, @klizhentas?

r0mant · 2022-04-01T16:50:25Z

I don't think making this a part of cluster recording preference fully solves the original problem tbh. We need to let users specify "strict" recording/audit mode but also provide a "best_effort" escape hatch for certain cases. So the behavior needs to be dynamic and I think roles are a good place for this kind of stuff. Esp. paired with things like access requests.

r0mant

Looks mostly good to me, I only have a couple of general comments:

- Be more specific about "audit logging" vs "session recording" terminology.
- Make sure that any changes to the existing `record_session.desktop` setting are backwards compatible.

r0mant · 2022-04-01T16:53:00Z

rfd/0064-audit-modes.md

+
+## Details
+
+Audit modes will define how Teleport proceeds in case of audit failures. Those


I would be specific with "audit" vs "session recording" terminology.

For audit events I think we only do "best effort" now since IIRC we basically ignore all errors from EmitAuditEvent (unless I'm mistaken). I would include it in the RFD also but focus on the session recording first to fix the original problem.

Not in all cases. For example, in application access, failing to emit events causes the request to fail.

Do you think we should also cover these cases?


gabrielcorado · 2022-04-08T19:00:19Z

@marcoandredinis @zmb3 @r0mant Thanks for your review. After speaking with @smallinsky, we agreed to reduce the scope of this RFD to session recordings only (removing the audit logging part). I've made some updates to the contents, so some comments might no longer be valid.

The idea is to focus this RFD on the initial issue of accessing nodes that have disk issues. After covering and implementing it, we can work on supporting more protocols in the session recording modes (for example, desktop) and, if we feel it is necessary, work on another RFD to cover audit logging (events emitted by the EmitAuditEvent function).
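For the disk-issue case that the reduced scope targets, the difference between the two modes can be sketched as below. This is an assumption-laden illustration, not the RFD's implementation: `RecordedSession` and `SessionTerminated` are hypothetical names, and the behavior (strict ends the session when recording fails, best effort keeps the session alive without recording) is inferred from the discussion in this thread.

```python
import errno


class SessionTerminated(Exception):
    """Raised when a strict-mode session must end because recording failed."""


class RecordedSession:
    """Hypothetical session wrapper that records output according to a recording mode."""

    def __init__(self, mode, write_recording):
        if mode not in ("strict", "best_effort"):
            raise ValueError(f"unknown recording mode: {mode!r}")
        self.mode = mode
        self._write_recording = write_recording
        self.recording = True

    def handle_output(self, data):
        if not self.recording:
            return  # best_effort session that already lost its recording
        try:
            self._write_recording(data)
        except OSError as err:
            if self.mode == "strict":
                # Strict: the session cannot continue unrecorded.
                raise SessionTerminated("recording failed in strict mode") from err
            # Best effort: keep the session, stop recording.
            self.recording = False


def full_disk(data):
    """Stand-in for a recorder writing to a disk with no space left."""
    raise OSError(errno.ENOSPC, "No space left on device")


session = RecordedSession("best_effort", full_disk)
session.handle_output(b"ls\n")
print(session.recording)  # recording stopped, session still usable
```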

marcoandredinis

LGTM

rfd/0068-session-recording-modes.md

chore(rfd): audit modes

853b469

gabrielcorado added the rfd Request for Discussion label Mar 31, 2022

gabrielcorado requested review from r0mant and smallinsky March 31, 2022 20:55

gabrielcorado self-assigned this Mar 31, 2022

github-actions bot requested review from marcoandredinis and nklaassen March 31, 2022 20:56

klizhentas approved these changes Mar 31, 2022

marcoandredinis reviewed Apr 1, 2022

marcoandredinis reviewed Apr 1, 2022

smallinsky reviewed Apr 1, 2022

marcoandredinis reviewed Apr 1, 2022

marcoandredinis reviewed Apr 1, 2022

zmb3 reviewed Apr 1, 2022

r0mant reviewed Apr 1, 2022

gabrielcorado and others added 3 commits April 4, 2022 12:53

Update rfd/0064-audit-modes.md

a08c2a2

Co-authored-by: Zac Bergquist <[email protected]>

refactor(rfd): remove audit logging from the scope

3d6cd37

refactor(rfd): rename rfd file

c8190a1

gabrielcorado changed the title ~~RFD: Audit modes~~ RFD 68: Session recording modes Apr 8, 2022

chore(rfd): update rfd number

3f6b5c8

marcoandredinis approved these changes Apr 11, 2022

gabrielcorado added 2 commits April 11, 2022 10:40

Merge branch 'master' into gabrielcorado/rfd-audit-modes

8b55671

chore(rfd): add modes precedence section

8e05a9a

smallinsky approved these changes Apr 20, 2022

nklaassen approved these changes Apr 28, 2022

Joerger mentioned this pull request May 25, 2022

SSH Session recording modes #12916

Merged

Joerger reviewed Jun 3, 2022

gabrielcorado added 2 commits June 6, 2022 11:07

Merge branch 'master' into gabrielcorado/rfd-audit-modes

68a7204

refactor(rfd): update with implementation changes

93c3410

Joerger approved these changes Jun 6, 2022

gabrielcorado added 2 commits June 6, 2022 17:38

Merge branch 'master' into gabrielcorado/rfd-audit-modes

bd94940

chore(rfd): update status

9c0b9eb

gabrielcorado enabled auto-merge (squash) June 6, 2022 20:39

gabrielcorado merged commit fdc60b2 into master Jun 6, 2022

gabrielcorado deleted the gabrielcorado/rfd-audit-modes branch November 2, 2022 00:24
