Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Parse the url only once per Client #467

Merged
merged 3 commits into from
Mar 6, 2023

Conversation

mattrobenolt
Copy link
Contributor

@mattrobenolt mattrobenolt commented Feb 18, 2023

This stood out on flame graphs due ultimately to the http.NewRequestFromContext call, which ends up calling url.Parse and this happens on every single RPC call.

I did an optimization in #447 which similarly, and this is a natural extension of this.

We can just parse the url string once during NewClient, validate at that point, then tag along the parsed *url.URL everywhere else and never use it as a string again. This value never mutates through the lifetime of a Client and isn't publicly available on any structs.

Before:
image

After:
image

client.go Show resolved Hide resolved
Comment on lines +65 to +71
Proto: "HTTP/1.1",
ProtoMajor: 1,
ProtoMinor: 1,
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is mirroring what's expected in stdlib. I hate it, but it is what it is. Happy to comment this up, but we can really skip any of the helpers and conveniences that NewRequestWithContext provides since our inputs are well known.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense. This is fine with me, but I would like a few comments and a link the the implementation of NewRequestWithContext. (The hard-coded HTTP/1.1 constants are offputting to read!)

duplex_http_call.go Show resolved Hide resolved
protobuf_util.go Show resolved Hide resolved
protocol_connect.go Show resolved Hide resolved
client.go Show resolved Hide resolved
This stood out on flame graphs due ultimately to the
http.NewRequestFromContext call, which ends up calling url.Parse.

I did an optimization in connectrpc#447 which similarly, and this is a natural
extension of this.

We can just parse the url string once during NewClient, validate at that
point, then tag along the parsed *url.URL everywhere else and never use
it as a string again. This value never mutates through the lifetime of a
Client and isn't publicly available on any structs.
Comment on lines +60 to +63
// ensure we make a copy of the url before we pass along to the
// Request. This ensures if a transport out of our control wants
// to mutate the req.URL, we don't feel the effects of it.
url = cloneURL(url)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is erroring on the side of caution since we can't really control what an http.RoundTripper may do, and I'm not sure if it's expected that a RoundTripper is not allowed to mutate this, or is expected to make it's own copy if it were?

Doing this at least mirrors the safety we had when we passed along a string and it was parsed every time.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

RoundTripper implementations aren't supposed to mutate the URL:

// RoundTrip should not modify the request, except for
// consuming and closing the Request's Body. RoundTrip may
// read fields of the request in a separate goroutine. Callers
// should not mutate or reuse the request until the Response's
// Body has been closed.

In practice, I don't think many people read the docs very carefully so this extra safety is a good idea.

duplex_http_call.go Outdated Show resolved Hide resolved
@mattrobenolt
Copy link
Contributor Author

Just to follow up on this, we've been running this patch in production now and the improvement is measurable in that the url.Parse time is entirely gone.

This isn't significant on it's own, but since this is per-RPC, it's definitely a relatively expensive operation when the URL doesn't change. It also skips the method validation http.validateMethod since we're just using the constants and know it's valid.

Creating the io.Pipe() now what dominates here, and that's pretty essential.

@akshayjshah
Copy link
Member

Creating the io.Pipe() now what dominates here, and that's pretty essential.

Weeeeeelllllll.....it's not really. IIRC most of your RPCs are unary, so we don't actually need the pipe or the extra goroutine - we've got all the data in memory and can easily pass it as the request body. We only need all the code in duplex_http_call.go for streaming requests; unary and server streaming RPCs could use a different, faster implementation of StreamingClientConn. We had code written for this early in Connect's life, and IIRC it was ~30% faster.

I ended up deciding to drop the alternate code path and exercise the request-streaming code on every RPC because it's by far the most complex code in connect-go, and I wanted to flush out any bugs quickly. I think we're past that point now, so we can at least consider a PR to speed up unary and server streaming calls.

@mattrobenolt
Copy link
Contributor Author

Oh, if we can drop this and the goroutine, that'd be a decently large win. I guess it makes sense this is only for a bidirectional stream. I'd suspect it can be dropped also for a server steam only with a unary request? I'd be down to look into this unless you have a good grasp or what would be involved.

Regardless, deduping the url.Parse is still good.

@akshayjshah
Copy link
Member

Oh, if we can drop this and the goroutine, that'd be a decently large win. I guess it makes sense this is only for a bidirectional stream. I'd suspect it can be dropped also for a server steam only with a unary request? I'd be down to look into this unless you have a good grasp or what would be involved.

Yep, it should also work for server streaming. IIRC you'd need a different implementation of StreamingClientConn that errors when Send is called more than once; other than that it shouldn't be too too bad. We'd love it if you took a stab at a PR!

Agreed that this PR is good either way :)

Copy link
Member

@akshayjshah akshayjshah left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Definitely much nicer! Appreciate the optimization, especially since it also simplifies some irritating error handling.

Please do fix the lint errors about variable name length - our house style frowns upon 1-2 letter variable names unless their scope is very small. Once the lint errors are fixed and the NewRequestWithContext comments are added in, this is good to merge.

@akshayjshah akshayjshah changed the title client: Parse the url only once per Client Parse the url only once per Client Mar 6, 2023
@mattrobenolt
Copy link
Contributor Author

@akshayjshah pushed both changes.

In my defense, I copy/pasted cloneURL from stdlib. I'm sure you preferred that over me doing a //go:linkname. :)

@akshayjshah akshayjshah merged commit 6118a20 into connectrpc:main Mar 6, 2023
renovate bot referenced this pull request in open-feature/flagd Apr 14, 2023
[![Mend
Renovate](https://app.renovatebot.com/images/banner.svg)](https://renovatebot.com)

This PR contains the following updates:

| Package | Type | Update | Change |
|---|---|---|---|
|
[github.com/bufbuild/connect-go](https://togithub.com/bufbuild/connect-go)
| require | minor | `v1.5.2` -> `v1.6.0` |

---

### Release Notes

<details>
<summary>bufbuild/connect-go</summary>

###
[`v1.6.0`](https://togithub.com/bufbuild/connect-go/releases/tag/v1.6.0)

[Compare
Source](https://togithub.com/bufbuild/connect-go/compare/v1.5.2...v1.6.0)

<!-- Release notes generated using configuration in .github/release.yml
at main -->

#### What's Changed

##### Enhancements

- Improve comments & add procedure consts to generated code by
[@&#8203;akshayjshah](https://togithub.com/akshayjshah) in
[https://github.com/bufbuild/connect-go/pull/480](https://togithub.com/bufbuild/connect-go/pull/480)
- Reduce per-call URL parsing cost by
[@&#8203;mattrobenolt](https://togithub.com/mattrobenolt) in
[https://github.com/bufbuild/connect-go/pull/467](https://togithub.com/bufbuild/connect-go/pull/467)
- Improve errors for outdated protobuf runtimes by
[@&#8203;akshayjshah](https://togithub.com/akshayjshah) in
[https://github.com/bufbuild/connect-go/pull/465](https://togithub.com/bufbuild/connect-go/pull/465)
- Switch README to use `buf curl` by
[@&#8203;akshayjshah](https://togithub.com/akshayjshah) in
[https://github.com/bufbuild/connect-go/pull/474](https://togithub.com/bufbuild/connect-go/pull/474)

##### Bugfixes

- Clarify purpose of handler_stream_test.go by
[@&#8203;Hirochon](https://togithub.com/Hirochon) in
[https://github.com/bufbuild/connect-go/pull/472](https://togithub.com/bufbuild/connect-go/pull/472)
- Make StreamType constants typed numerics by
[@&#8203;jhump](https://togithub.com/jhump) in
[https://github.com/bufbuild/connect-go/pull/486](https://togithub.com/bufbuild/connect-go/pull/486)
- Populate Spec and Peer in Client.CallServerStream by
[@&#8203;jhump](https://togithub.com/jhump) in
[https://github.com/bufbuild/connect-go/pull/487](https://togithub.com/bufbuild/connect-go/pull/487)

#### New Contributors

- [@&#8203;Hirochon](https://togithub.com/Hirochon) made their first
contribution in
[https://github.com/bufbuild/connect-go/pull/472](https://togithub.com/bufbuild/connect-go/pull/472)

**Full Changelog**:
bufbuild/connect-go@v1.5.2...v1.6.0

</details>

---

### Configuration

📅 **Schedule**: Branch creation - At any time (no schedule defined),
Automerge - At any time (no schedule defined).

🚦 **Automerge**: Enabled.

♻ **Rebasing**: Whenever PR becomes conflicted, or you tick the
rebase/retry checkbox.

🔕 **Ignore**: Close this PR and you won't be reminded about this update
again.

---

- [ ] <!-- rebase-check -->If you want to rebase/retry this PR, check
this box

---

This PR has been generated by [Mend
Renovate](https://www.mend.io/free-developer-tools/renovate/). View
repository job log
[here](https://app.renovatebot.com/dashboard#github/open-feature/flagd).

<!--renovate-debug:eyJjcmVhdGVkSW5WZXIiOiIzNS4zMS40IiwidXBkYXRlZEluVmVyIjoiMzUuMzEuNCJ9-->

Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
@mattrobenolt mattrobenolt deleted the optimize-newconn branch June 8, 2023 16:56
akshayjshah pushed a commit that referenced this pull request Jul 26, 2023
This stood out on flame graphs due ultimately to the
http.NewRequestFromContext call, which ends up calling url.Parse and
this happens on every single RPC call.

I did an optimization in #447 which similarly, and this is a natural
extension of this.

We can just parse the url string once during NewClient, validate at that
point, then tag along the parsed *url.URL everywhere else and never use
it as a string again. This value never mutates through the lifetime of a
Client and isn't publicly available on any structs.

Before:
<img width="1026" alt="image"
src="https://user-images.githubusercontent.com/375744/219882639-b05416ee-3a8f-4b38-9716-9d66a34eeffc.png">

After:
<img width="1177" alt="image"
src="https://user-images.githubusercontent.com/375744/220477453-c1b48350-665b-495b-a070-3efb08e53067.png">
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants