Feat: log rate limits #2900

ctlong · 2022-08-03T21:51:27Z

Introduce log rate limits to org and space quotas, and tasks and processes.

cloudfoundry/capi-release#245

I have reviewed the contributing guide
I have viewed, signed, and submitted the Contributor License Agreement
I have made this pull request to the main branch
I have run CF Acceptance Tests

cc/ @sethboyles

mkocher · 2022-08-19T17:53:55Z

@sethboyles we think this is ready to go! Can you please take a look?

linux-foundation-easycla · 2022-08-19T18:47:34Z

The committers listed above are authorized under a signed CLA.

✅ login: ctlong / name: Carson Long (6f115cd)
✅ login: rroberts2222 / name: Rebecca Roberts (20d07d8, 5030793, a19a6ff)
✅ login: acrmp / name: Andrew Crump (bb31b14, fc22209, 6d3e2e8, 230e6a5, 2048c71, be468d6)
✅ login: Benjamintf1 / name: Benjamin Fuller (5f0da2e)
✅ login: mkocher / name: Matthew Kocher (bb70156, 3760cfa)

Updates the `/v3/organization_quotas/` and `/v3/space_quotas/` endpoints to allow setting and retrieving a new parameter (`log_rate_limit_in_bytes_per_second`). This will eventually permit the user to set log line production limits in bytes per second, rather than lines per second.

Updates the `v3/processes` and `v3/tasks` endpoints to support `log_rate_limit_in_bytes_per_second`.

An unlimited log rate limit is represented as -1.

Tracker Story ID: [#182311424]
Tracker Story ID: [#182353823]
Tracker Story ID: [#182311433]
Tracker Story ID: [#182624538]
Tracker Story ID: [#182624530]
Github Issue: cloudfoundry/capi-release#245

Signed-off-by: Carson Long
Signed-off-by: Kenneth Lakin
Signed-off-by: Duane May
Signed-off-by: Matthew Kocher
Signed-off-by: Ben Fuller
Signed-off-by: Seth Boyles
Co-authored-by: Matthew Kocher

- We accept -1 without a byte suffix to mean unlimited
- We also return -1 in the rendered manifest

[#182624538](https://www.pivotaltracker.com/story/show/182624538)
Github Issue: cloudfoundry/capi-release#245

Co-authored-by: Rebecca Roberts
Co-authored-by: Andrew Crump
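For illustration, the unit handling described in this commit could look roughly like the sketch below. The helper name `log_rate_limit_to_bytes`, the 1024-based multipliers, and the error messages are assumptions made for the example and are not taken from the Cloud Controller code. The unit list is the one quoted in the disk quota error later in this thread, and accepting a bare `0` follows the later commit "Manifests handle -1 and 0 log-rate-limits without units".

```ruby
# Hypothetical sketch, not the Cloud Controller implementation.
# Assumption: units are powers of 1024.
UNIT_MULTIPLIERS = {
  'B' => 1,
  'K' => 1024, 'KB' => 1024,
  'M' => 1024**2, 'MB' => 1024**2,
  'G' => 1024**3, 'GB' => 1024**3,
  'T' => 1024**4, 'TB' => 1024**4
}.freeze

# Converts a manifest value such as "1K", "-1" or "0" to bytes per second.
# -1 is passed through unchanged and means "unlimited".
def log_rate_limit_to_bytes(value)
  text = value.to_s.strip
  return text.to_i if %w[-1 0].include?(text)

  match = /\A(\d+)\s*([A-Za-z]+)\z/.match(text)
  raise ArgumentError, "unsupported log rate limit: #{text.inspect}" unless match

  multiplier = UNIT_MULTIPLIERS[match[2].upcase]
  raise ArgumentError, "unsupported unit: #{match[2]}" unless multiplier

  match[1].to_i * multiplier
end
```

In this sketch any other value without a unit, such as `"5"`, is rejected with an `ArgumentError`; whether the real validation behaves the same way is not shown in this thread.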

- We found `scaling_operation?` to be a misleading name
- Just refer to `started?` to make it clearer when validation will be performed

[#182624538](https://www.pivotaltracker.com/story/show/182624538)

Co-authored-by: Andrew Crump

[#182311441](https://www.pivotaltracker.com/story/show/182311441) Co-authored-by: Rebecca Roberts

[#182311441](https://www.pivotaltracker.com/story/show/182311441) Co-authored-by: Matthew Kocher

- The log rate limit from the application web process is applied to the staging task
- The staging log rate limit can be customized when creating a build, for consistency with memory and disk limits

[#182311441](https://www.pivotaltracker.com/story/show/182311441)

Co-authored-by: Rebecca Roberts

[#182311441](https://www.pivotaltracker.com/story/show/182311441) cloudfoundry/capi-release#245 Co-authored-by: Rebecca Roberts

- Found that if there is a new version of Diego and an old version of cloud controller, the field would not be provided and defaulted to 0.
- Wrapped the integer in a MessageType so it would default to null.

Co-authored-by: Duane May
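The difference between the two field shapes can be sketched in plain Ruby, without the protobuf library. The struct names and `decode` helpers below are hypothetical stand-ins for generated protobuf classes; they only mimic the defaulting behaviour this commit describes (an omitted scalar integer reads as 0, an omitted message-typed field reads as null).

```ruby
# Hypothetical stand-ins for generated protobuf classes.

# A scalar integer field cannot express "absent": it reads back as 0.
ScalarLimit = Struct.new(:bytes_per_second) do
  def self.decode(fields)
    new(fields.fetch(:bytes_per_second, 0))
  end
end

# A message-typed field reads back as nil when the sender omits it.
LogRateLimit = Struct.new(:bytes_per_second)
WrappedLimit = Struct.new(:log_rate_limit) do
  def self.decode(fields)
    raw = fields[:log_rate_limit]
    new(raw && LogRateLimit.new(raw.fetch(:bytes_per_second, 0)))
  end
end

# With the wrapper, a receiver can tell "not sent" apart from "sent as 0".
def limit_specified?(wrapped)
  !wrapped.log_rate_limit.nil?
end

old_sender = {}                                       # field unknown to the sender
new_sender = { log_rate_limit: { bytes_per_second: 0 } }

puts ScalarLimit.decode(old_sender).bytes_per_second   # indistinguishable from an explicit 0
puts limit_specified?(WrappedLimit.decode(old_sender)) # absent
puts limit_specified?(WrappedLimit.decode(new_sender)) # explicitly set
```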

* Add Log Rate container metrics. These metrics allow users to see how much each application is logging.
* Remove support for getting metrics from Traffic Controller. Log Cache has been the default for retrieving metrics for a long time. This removes the "temporary" config flag and support for the old way of connecting to Traffic Controller for metrics.

We did the two items above together because we did not want to add support for getting the new container metrics from Traffic Controller. The code was manually creating Traffic Controller protobufs and shoving Log Cache data into them, which we had to move away from to make this work without delving into Traffic Controller protobufs.

Signed-off-by: Duane May
Co-authored-by: Duane May
Signed-off-by: Matthew Kocher
Co-authored-by: Matthew Kocher
Signed-off-by: Ben Fuller
Co-authored-by: Ben Fuller

All our archeology seems to indicate that dashes should be preferred in manifests, as they are YAML and YAML prefers kebab-case to snake_case.

We also uncovered some oddities around setting process-level properties on the top-level app object. We added tests to document that disk and memory don't show up in the diff when changed on the app instead of the process. We also discovered that properties like health-check that should be kebab-case were not working at the app level and fixed them to be consistent.

Signed-off-by: Rebecca Roberts
Co-authored-by: Rebecca Roberts

sethboyles · 2022-08-19T21:35:21Z

Awesome! I'll go ahead and mark it as 'ready for review'

moleske

I haven't looked at this in detail but have some questions just from a 5 minute skim

moleske · 2022-08-19T22:57:22Z

db/migrations/20220606172913_add_log_rate_limit_to_quota_definitions.rb

@@ -0,0 +1,5 @@
+Sequel.migration do


Do we need five separate migrations? I assume they came to be because of the many commits, but if all five are needed for the feature, it seems like it should be one migration.

👍 I pushed a change to combine the migrations.

lib/diego/bbs/models/desired_lrp_pb.rb

Signed-off-by: Rebecca Roberts
Co-authored-by: Rebecca Roberts

wip: group with app manifest message updates for unlimited

- Adds validation to messages for builds, manifest processes and tasks
- Avoids 'An unknown error occurred' caused by the database insert failing due to an out of range error.

[#182969510](https://www.pivotaltracker.com/story/show/182969510)

Co-authored-by: Matthew Kocher

Fixes: Using the `raise_error` matcher without providing a specific error or message risks false positives

[#182969510](https://www.pivotaltracker.com/story/show/182969510)

Co-authored-by: Rebecca Roberts

jdgonzaleza · 2022-08-29T19:37:34Z

Hello! Please let the CLI team know when this PR is merged, since it's related to cloudfoundry/cli#2303

sethboyles · 2022-09-01T17:34:28Z

@Benjamintf1 for the docs, should we mark this as experimental or upcoming? Are the changes in upstream components like Diego merged in?

ctlong · 2022-09-01T17:40:45Z

@sethboyles I'd lean towards no, but defer to your process for introducing new features. The Diego work has been merged, and released in v2.66.0. That's the only upstream component.

Note: Ben's on vacation for the next couple weeks.

sethboyles · 2022-09-01T17:44:48Z

That's good enough for me. Thanks!

sethboyles · 2022-09-01T18:37:20Z

Hey @ctlong ,

We are seeing a couple of errors with manifests.

When supplying -1 or 0 for log-rate-limit-per-second, we are getting 500s from the manifest_diff and apply_manifest endpoints. It seems like strip is being called on the value, which errors because 0 and -1 are integers, not strings.

The other issue we are seeing is if we set disk_quota to 0. Normally this errors with

For application 'dora2': Process "web": Disk quota must use a supported unit: B, K, KB, M, MB, G, GB, T, or TB
FAILED

However, with this branch, we are seeing this error rendered in the manifest_diff:

Pushing app dora to org org / space space as admin...
Applying manifest file dora-manifest2.yml...

Updating with these attributes...
  ---
  applications:
  - name: dora
    processes:
-   - disk_quota: 1024M
+   - disk_quota: 'disk_quota must use a supported unit: B, K, KB, M, MB, G, GB, T, or TB'
      health-check-type: port
      instances: 1
      memory: 256M
      type: web

The error is then not picked up by apply_manifest, so the deployment continues despite the invalid manifest. Weirdly, this only happens with 0 and not with -1 or other integers. We double-checked that this error doesn't occur on envs with our latest release.

We looked at the code a little and are not sure how this odd error happened.

The cf CLI would provide these values as strings; however, directly curling the endpoint with the values as integers would result in a 500.

Co-authored-by: Carson Long

Co-authored-by: Carson Long

acrmp · 2022-09-01T22:39:03Z

@sethboyles Thanks! We pushed some more changes to address these issues.

sethboyles · 2022-09-02T18:49:15Z

@acrmp thanks! Confirmed that both issues are fixed.

Sending an int value like 9999 for log-rate-limit-per-second causes a 500. This is the error:

undefined method `strip' for 9999:Integer
byte_converter.convert_to_b(human_readable_byte_value.strip)

I think coercing human_readable_byte_value to a string at the beginning of that method would protect against that.

(oh the joys of yaml...)
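To illustrate the suggestion above: YAML parses an unquoted `9999` or `-1` as an Integer, while a value such as `1M` stays a String, so calling `strip` directly only works for some manifests. The sketch below uses Ruby's standard `yaml` library; `normalize_byte_value` is a hypothetical stand-in for the first step of the conversion method quoted in the error, not the actual fix that was pushed.

```ruby
require 'yaml'

# Hypothetical stand-in for the start of the conversion method.
# Integer has no #strip, and String#to_s returns the string itself,
# so coercing first is safe for both types.
def normalize_byte_value(human_readable_byte_value)
  human_readable_byte_value.to_s.strip
end

manifest = YAML.safe_load(<<~MANIFEST)
  quoted: "9999"
  bare: 9999
  with_unit: 1M
  unlimited: -1
MANIFEST

# Print the parsed Ruby type next to the normalized string for each entry.
manifest.each do |key, value|
  puts "#{key}: #{value.class} -> #{normalize_byte_value(value).inspect}"
end
```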

acrmp · 2022-09-02T19:42:03Z

@sethboyles Thanks. I pushed a fix.

sethboyles · 2022-09-07T17:57:05Z

Noticed a 500 on processes/scale when sending null:

Mysql2::Error: Column 'log_rate_limit' cannot be null

sethboyles · 2022-09-07T18:21:45Z

Checked the other endpoints:
- builds
- tasks
- space_quotas
- org_quotas

They all seemed fine 👍

duanemay · 2022-09-07T21:00:01Z

@sethboyles While looking at the issue with sending null for log_rate_limit_in_bytes_per_second, we also noticed an existing issue with sending null for memory_in_mb:

cf curl /v3/apps/$(cf app dora --guid)/processes/web/actions/scale -X POST -d '{ "memory_in_mb": null}'

sethboyles · 2022-09-07T21:14:28Z

We noticed it with instances, too. I think that is related to the memory issue since it tries to do some math with those two values. I didn't think null is supposed to be valid for instances, not sure about memory_in_mb. Is null valid in the scale endpoint for log_rate_limit_in_bytes_per_second?

sethboyles · 2022-09-07T21:17:05Z

that is to say I'm not super concerned if it's not supposed to be valid. If it's not, we can toss a generic story in the backlog to deal with all three of these fields if you'd rather get this PR merged sooner than later

Signed-off-by: Andrew Crump

duanemay · 2022-09-07T21:49:33Z

@sethboyles We don't think null is valid in this case. But we have put in a fix so that the error is not produced in this case.

@acrmp & Duane
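One way to picture "ignoring" a null on scale is to drop the key before the update is applied, so the stored value is left untouched and no null reaches the NOT NULL column. This is only a sketch under that assumption: the helper name `scale_attributes` and the hash-based process are hypothetical, and the thread does not show how the actual fix is implemented.

```ruby
# Hypothetical helper: removes a nil log rate limit from scale parameters
# so that the existing value is kept instead of writing NULL.
def scale_attributes(params)
  params.reject do |key, value|
    key == :log_rate_limit_in_bytes_per_second && value.nil?
  end
end

# Stand-in for the stored process attributes.
current = { instances: 1, memory_in_mb: 256, log_rate_limit_in_bytes_per_second: 1024 }

# A scale request that sends null for the log rate limit.
request = { instances: 2, log_rate_limit_in_bytes_per_second: nil }

updated = current.merge(scale_attributes(request))
puts updated.inspect
```

An explicit `-1` (unlimited) is not nil, so it passes through this filter unchanged.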

- this [pr](cloudfoundry/cloud_controller_ng#2900) added log rate limits which are now part of the expected manifest

Co-authored-by: Michael Oleske
Co-authored-by: David Alvarado

We stopped using it as of #2900

Co-authored-by: Joseph Palermo
Co-authored-by: Merric de Launey

cf-gitbot added the unscheduled label Aug 3, 2022

ctlong force-pushed the log-rate-limit-rebased branch 2 times, most recently from e42958b to e341e31 Compare August 4, 2022 22:15

mkocher force-pushed the log-rate-limit-rebased branch from aea5d28 to 820ebce Compare August 19, 2022 17:43

mkocher marked this pull request as ready for review August 19, 2022 17:53

mkocher force-pushed the log-rate-limit-rebased branch from 820ebce to 76e9f9e Compare August 19, 2022 18:47

mkocher marked this pull request as draft August 19, 2022 18:52

ctlong and others added 10 commits August 19, 2022 18:53

Pass log rate limit to Diego

bb31b14

[#182311441](https://www.pivotaltracker.com/story/show/182311441) Co-authored-by: Rebecca Roberts

Pass task log rate limit to Diego

fc22209

[#182311441](https://www.pivotaltracker.com/story/show/182311441) Co-authored-by: Matthew Kocher

Wrap log rate limit quota errors

230e6a5

[#182311441](https://www.pivotaltracker.com/story/show/182311441) cloudfoundry/capi-release#245 Co-authored-by: Rebecca Roberts

mkocher force-pushed the log-rate-limit-rebased branch from 76e9f9e to be468d6 Compare August 19, 2022 18:54

rroberts2222 force-pushed the log-rate-limit-rebased branch from be468d6 to d1f388c Compare August 19, 2022 20:38

sethboyles marked this pull request as ready for review August 19, 2022 21:35

moleske reviewed Aug 19, 2022

mkocher and others added 3 commits August 29, 2022 16:41

Manifests handle -1 and 0 log-rate-limits without units

a9e65c0

Signed-off-by: Rebecca Roberts
Co-authored-by: Rebecca Roberts

wip: group with app manifest message updates for unlimited

Address rspec warning for potential false positive

e30a91f

Fixes: Using the `raise_error` matcher without providing a specific error or message risks false positives

[#182969510](https://www.pivotaltracker.com/story/show/182969510)

Co-authored-by: Rebecca Roberts

mkocher force-pushed the log-rate-limit-rebased branch from e8afa9c to e30a91f Compare August 29, 2022 16:41

Benjamintf1 mentioned this pull request Aug 31, 2022

Log rate limit changes cloudfoundry/capi-release#266

Merged

3 tasks

acrmp mentioned this pull request Sep 1, 2022

Prevent quota updates or assignment with finite log rates #2948

Merged

5 tasks

acrmp and others added 2 commits September 1, 2022 22:18

Handle -1 and 0 when applying/diffing manifests

b041419

The cf CLI would provide these values as strings; however, directly curling the endpoint with the values as integers would result in a 500.

Co-authored-by: Carson Long

Raise error when disk quota is zero

adfb5ca

Co-authored-by: Carson Long

Don't error when limit is passed as an integer

b98d3ab

Ignore a null log rate limit on process scale

b11e37e

Signed-off-by: Andrew Crump

sethboyles merged commit 5dba613 into cloudfoundry:main Sep 7, 2022

cf-gitbot removed the unscheduled label Sep 7, 2022

acrmp deleted the log-rate-limit-rebased branch September 7, 2022 22:46

sethboyles mentioned this pull request Sep 26, 2022

Logging Rate Limits - Allow operators to better control log production cloudfoundry/capi-release#245

Closed

MerricdeLauney added a commit that referenced this pull request Sep 26, 2022

Remove protobuf as dependency

0e16397

We stopped using it as of #2900

Co-authored-by: Joseph Palermo
Co-authored-by: Merric de Launey

MerricdeLauney mentioned this pull request Sep 26, 2022

Remove protobuf as dependency #2983

Merged

5 tasks

MerricdeLauney added a commit that referenced this pull request Sep 27, 2022

Remove protobuf as dependency

148c8b4

We stopped using it as of #2900

Co-authored-by: Joseph Palermo
Co-authored-by: Merric de Launey

Benjamintf1 mentioned this pull request Mar 1, 2024

move benjamintf1 to approver for capi cloudfoundry/community#786

Merged

beyhan mentioned this pull request May 3, 2024

Add David Alvarado as CLI approver cloudfoundry/community#835

Merged
