Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[AWS] [S3] fix: improve object size metric calculation #41755

Conversation

Kavindu-Dodan
Copy link
Contributor

@Kavindu-Dodan Kavindu-Dodan commented Nov 22, 2024

Proposed commit message

This fixes a bug with Cloudflare logpush integration, which uses the S3 input.

#40775 introduced a feature introducing s3_object_size_in_bytes histogram. Internally it utilized S3 SDK's GetObjectOutput instances' ContentLength property 1. In code level (SDK internally), this is derived using Content-Length header 2.

It seems that Cloudflare R2 (API compatible with S3) could omit Content-Length header 3. Hence, I have improved how we derive the s3_object_size_in_bytes metric by avoiding using Content-Length backed method.

Checklist

  • My code follows the style guidelines of this project
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • I have made corresponding change to the default configuration files
  • I have added tests that prove my fix is effective or that my feature works
  • I have added an entry in CHANGELOG.next.asciidoc or CHANGELOG-developer.next.asciidoc.

How to test this PR locally

  • Build filebeat from this branch
  • Enable monitoring endpoints
  • Configure buckets with data and observe metrics in monitoring endpoint

Screenshots

See metric comparison with proposed collection method (before removing content length usage). Tracked total bytes are same as content length.

Handling validated for gzip

image

Handling validated for plain text

image


Footnotes

  1. https://docs.aws.amazon.com/AmazonS3/latest/API/API_GetObject.html#API_GetObject_ResponseSyntax

  2. https://github.com/aws/aws-sdk-go-v2/blob/service/s3/v1.60.1/service/s3/deserializers.go#L5477-L5484

  3. https://community.cloudflare.com/t/no-content-length-header-when-serving-plaintext-from-r2/565521

@Kavindu-Dodan Kavindu-Dodan added bug Team:obs-ds-hosted-services Label for the Observability Hosted Services team backport-8.x Automated backport to the 8.x branch with mergify backport-8.16 Automated backport with mergify labels Nov 22, 2024
@Kavindu-Dodan Kavindu-Dodan requested a review from a team November 22, 2024 19:20
@Kavindu-Dodan Kavindu-Dodan requested a review from a team as a code owner November 22, 2024 19:20
@elasticmachine
Copy link
Collaborator

Pinging @elastic/obs-ds-hosted-services (Team:obs-ds-hosted-services)

@botelastic botelastic bot added needs_team Indicates that the issue/PR needs a Team:* label and removed needs_team Indicates that the issue/PR needs a Team:* label labels Nov 22, 2024
@Kavindu-Dodan Kavindu-Dodan force-pushed the fix/improve-object-size-metric-calculation branch from 78c0c74 to 04531f3 Compare November 22, 2024 19:47
Comment on lines 212 to 214
totalBytesRead *monitoring.Uint

trackBytes int64
Copy link
Member

@andrewkroh andrewkroh Nov 22, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the naming could be more clear. "trackBytes" does not mean much to me. Maybe...

	totalBytesReadMetric *monitoring.Uint
	totalBytesRead       int64

perhaps with some godoc for each field.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done, please see - 91c7d87

I have renamed and added doc comments so the tracking is clear :)

return n, err
}

func (m *monitoredReader) GetTrackedBytes() int64 {
Copy link
Member

@andrewkroh andrewkroh Nov 22, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The accessor seems unnecessary. The caller can read straight from the field.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have removed accessor :)

@@ -219,5 +221,10 @@ func newMonitoredReader(r io.Reader, metric *monitoring.Uint) *monitoredReader {
func (m *monitoredReader) Read(p []byte) (int, error) {
n, err := m.reader.Read(p)
m.totalBytesRead.Add(uint64(n))
m.trackBytes += int64(n)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the diff between totalBytesRead vs trackBytes?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@gizas good point. I added doc comment please see here - 91c7d87#diff-f8abd76e8ccf8f3244f2153161d307bdfd220d7d8b691423bb79fc43cc4a94eaR209-R214

This (renamed to totalBytesReadCurrent) tracks bytes read for currently tracked object. On the other hand, totalBytesRead (renamed to totalBytesReadMetric) track all byes read for all tracked objects. This is a cumulative counter.

Copy link
Contributor

@gizas gizas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Approving as LGTM the logic!
I agree accessor GetTrackedBytes() can be ommited

@Kavindu-Dodan Kavindu-Dodan force-pushed the fix/improve-object-size-metric-calculation branch from 3a1b4de to 25ea16c Compare November 25, 2024 15:42
Signed-off-by: Kavindu Dodanduwa <[email protected]>
@Kavindu-Dodan Kavindu-Dodan force-pushed the fix/improve-object-size-metric-calculation branch from 25ea16c to 91c7d87 Compare November 25, 2024 15:56
@Kavindu-Dodan Kavindu-Dodan merged commit 92bb2c5 into elastic:main Nov 25, 2024
22 checks passed
mergify bot pushed a commit that referenced this pull request Nov 25, 2024
* rely on monitored reader for content length

Signed-off-by: Kavindu Dodanduwa <[email protected]>

* add tests to validate metrics

Signed-off-by: Kavindu Dodanduwa <[email protected]>

* add changelog entry

Signed-off-by: Kavindu Dodanduwa <[email protected]>

* fix lint - ignore ok values as we know the stored value type

Signed-off-by: Kavindu Dodanduwa <[email protected]>

* review change - rename and accessor

Signed-off-by: Kavindu Dodanduwa <[email protected]>

---------

Signed-off-by: Kavindu Dodanduwa <[email protected]>
(cherry picked from commit 92bb2c5)
mergify bot pushed a commit that referenced this pull request Nov 25, 2024
* rely on monitored reader for content length

Signed-off-by: Kavindu Dodanduwa <[email protected]>

* add tests to validate metrics

Signed-off-by: Kavindu Dodanduwa <[email protected]>

* add changelog entry

Signed-off-by: Kavindu Dodanduwa <[email protected]>

* fix lint - ignore ok values as we know the stored value type

Signed-off-by: Kavindu Dodanduwa <[email protected]>

* review change - rename and accessor

Signed-off-by: Kavindu Dodanduwa <[email protected]>

---------

Signed-off-by: Kavindu Dodanduwa <[email protected]>
(cherry picked from commit 92bb2c5)
@Kavindu-Dodan Kavindu-Dodan deleted the fix/improve-object-size-metric-calculation branch November 25, 2024 19:33
Kavindu-Dodan added a commit that referenced this pull request Nov 25, 2024
…lculation (#41777)

* [AWS] [S3] fix: improve object size metric calculation (#41755)

* rely on monitored reader for content length

Signed-off-by: Kavindu Dodanduwa <[email protected]>

* add tests to validate metrics

Signed-off-by: Kavindu Dodanduwa <[email protected]>

* add changelog entry

Signed-off-by: Kavindu Dodanduwa <[email protected]>

* fix lint - ignore ok values as we know the stored value type

Signed-off-by: Kavindu Dodanduwa <[email protected]>

* review change - rename and accessor

Signed-off-by: Kavindu Dodanduwa <[email protected]>

---------

Signed-off-by: Kavindu Dodanduwa <[email protected]>
(cherry picked from commit 92bb2c5)

* Update CHANGELOG.next.asciidoc

Remove unwanted backport entry

---------

Co-authored-by: Kavindu Dodanduwa <[email protected]>
Kavindu-Dodan added a commit that referenced this pull request Nov 25, 2024
* rely on monitored reader for content length

Signed-off-by: Kavindu Dodanduwa <[email protected]>

* add tests to validate metrics

Signed-off-by: Kavindu Dodanduwa <[email protected]>

* add changelog entry

Signed-off-by: Kavindu Dodanduwa <[email protected]>

* fix lint - ignore ok values as we know the stored value type

Signed-off-by: Kavindu Dodanduwa <[email protected]>

* review change - rename and accessor

Signed-off-by: Kavindu Dodanduwa <[email protected]>

---------

Signed-off-by: Kavindu Dodanduwa <[email protected]>
(cherry picked from commit 92bb2c5)

Co-authored-by: Kavindu Dodanduwa <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backport-8.x Automated backport to the 8.x branch with mergify backport-8.16 Automated backport with mergify bug Team:obs-ds-hosted-services Label for the Observability Hosted Services team
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants