how do I get the most performance out of the subscribe API? #824

Closed
willie opened this issue Dec 12, 2017 · 18 comments
Labels: api: pubsub (Issues related to the Pub/Sub API.) · type: feature request (‘Nice-to-have’ improvement, new feature or different behavior or design.)

willie commented Dec 12, 2017

What do I need to make this code pull and ack messages faster?

	sub := client.Subscription(*subscription)
	// monkey with sub.ReceiveSettings.*
	err = sub.Receive(ctx, func(ctx context.Context, msg *pubsub.Message) {
		msg.Ack()
	})

Right now, one instance of my app with:
sub.ReceiveSettings.NumGoroutines = 10 * 6 * runtime.NumCPU()
can pull about 5K messages per minute, but if I take the same program and run 6 copies of it, I get more than 6x the throughput from pubsub.

Any suggestions? I don't want to have to go back to the old API, but will if I have to.

jba (Contributor) commented Dec 12, 2017

What about MaxOutstandingMessages and MaxOutstandingBytes? They will throttle your throughput no matter how many goroutines you have.

willie (Author) commented Dec 12, 2017

I have PLENTY of memory and bandwidth. What values should I try for MaxOutstandingMessages & MaxOutstandingBytes?

jba (Contributor) commented Dec 12, 2017

Use negative values for both to turn off throttling. I would start there.
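
A minimal sketch of what that looks like, assuming sub is the subscription from your snippet:

	// Negative values disable the corresponding flow-control limit.
	sub.ReceiveSettings.MaxOutstandingMessages = -1 // no cap on unacked messages held client-side
	sub.ReceiveSettings.MaxOutstandingBytes = -1    // no cap on total bytes of unacked messages
	err = sub.Receive(ctx, func(ctx context.Context, msg *pubsub.Message) {
		msg.Ack()
	})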

willie (Author) commented Dec 12, 2017

Holy moly. That worked. THANK YOU.
I was banging my head hard about that.

It's completely different performance. Utterly blazing. Seems like the documentation needs a clear section on tuning ReceiveSettings for different scenarios.

@jba jba self-assigned this Dec 12, 2017
@jba jba added api: pubsub Issues related to the Pub/Sub API. documentation priority: p1 Important issue which blocks shipping the next release. Will be fixed prior to next release. labels Dec 12, 2017
jba (Contributor) commented Dec 12, 2017

Sweet! You had us worried there. We designed Receive for high throughput, and our internal testing told us that it delivered.

I will convert this to a feature request to improve the docs.

willie (Author) commented Dec 12, 2017

I know originally this library defaulted to something like this: sub.ReceiveSettings.NumGoroutines = 10 * runtime.NumCPU() but, due to concerns about too much concurrency, it was dropped to a value of 1. What is the recommended amount for high-performance?

jba (Contributor) commented Dec 12, 2017

I would have said 10x, but you've got it up to 60x. So I don't have a good answer. With throttling off, that is your only remaining tunable parameter, so play with it.

@pongad, any suggestions?

pongad (Contributor) commented Dec 13, 2017

I'm not sure I have a better suggestion; I think some experimentation is necessary.

If your computer ran infinitely fast, it would make sense to make NumGoroutines large so there are more goroutines pulling messages. Real computers don't run that fast, so setting the number too high will actually make performance worse: messages get stuck waiting for CPU on one machine, and the pubsub server can't send them to another machine.

I suppose it's possible for us to implement some kind of feedback mechanism to create just enough goroutines to fill flow control. I experimented with the idea but it's quite complicated and I never actually got it right.

@vchudnov-g vchudnov-g added the type: question Request for information or clarification. Not an issue. label Dec 13, 2017
willie (Author) commented Dec 13, 2017

To summarize, based on this discussion (and the past code defaults), I'm going to stick with this:

	sub.ReceiveSettings.NumGoroutines = 10 * runtime.NumCPU()
	sub.ReceiveSettings.MaxOutstandingMessages = -1
	sub.ReceiveSettings.MaxOutstandingBytes = -1

as my defaults for high performance situations. Thank you all very much for the quick and direct feedback to make this work.

pongad (Contributor) commented Dec 13, 2017

@willie Just a fair warning: NumGoroutines is the number of goroutines pulling messages, not processing them. We keep spawning goroutines until we're capped by either MaxOutstandingMessages or MaxOutstandingBytes.

If you ack messages that quickly, this is probably not a big deal. If you ack slowly, though, you might run out of memory. -1 is OK for trying things out but might not be for production systems.
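
For production, something bounded along these lines (the numbers are purely illustrative, not recommendations) keeps memory in check while still allowing high throughput:

	// Illustrative bounds only; tune for your message size and ack latency.
	sub.ReceiveSettings.NumGoroutines = 10 * runtime.NumCPU() // goroutines pulling from the server
	sub.ReceiveSettings.MaxOutstandingMessages = 10000        // cap on unacked messages buffered client-side
	sub.ReceiveSettings.MaxOutstandingBytes = 1 << 30         // cap buffered message bytes at ~1 GiB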

willie (Author) commented Dec 13, 2017

@pongad Thanks for the extra information. I will try the NumGoroutines default value on my next run, as I do ack immediately (I rate limit and resource control downstream from the pubsub receive). (I guess I was messing with the wrong knob all this time.)

@danoscarmike danoscarmike added type: feature request ‘Nice-to-have’ improvement, new feature or different behavior or design. and removed type: question Request for information or clarification. Not an issue. labels Dec 13, 2017
willie (Author) commented Dec 14, 2017

Good news. I was able to achieve the performance I needed by leaving NumGoroutines at the library default and passing -1 for MaxOutstandingMessages & MaxOutstandingBytes. I understand the caveats of that firehose, but it resolves the specific issue I was having.

I would like to understand, however, why MaxOutstandingMessages & MaxOutstandingBytes at their default values were slowly throttling me over time, when I was calling msg.Ack() immediately as the first line of my lightweight receive func.

@pongad pongad removed the priority: p1 Important issue which blocks shipping the next release. Will be fixed prior to next release. label Dec 15, 2017
pongad (Contributor) commented Dec 15, 2017

@jba Thinking about this a little more, I'm also confused. I just ran a test. Leaving everything at default, I was able to pull 20K messages per second, orders of magnitude more than what @willie reported.

@willie If you want to dig more into this, can you let us know how many goroutines you have running with default settings? If you can share stack trace of some of them, that'd be really useful as well (http/pprof should help here).
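
If it helps, a minimal way to expose that is via net/http/pprof (a sketch; port 6060 is just a convention):

	import (
		"log"
		"net/http"
		_ "net/http/pprof" // registers /debug/pprof/* handlers on the default mux
	)

	func init() {
		go func() {
			// Full goroutine stacks: curl http://localhost:6060/debug/pprof/goroutine?debug=2
			log.Println(http.ListenAndServe("localhost:6060", nil))
		}()
	}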

jba (Contributor) commented Dec 15, 2017

@willie do you ack early in your callback, but then keep doing stuff? Our current implementation calls the flow controller's release method after the callback returns, not when the message is acked.

willie (Author) commented Dec 15, 2017

@jba Yes, I ack immediately (because I thought it would speed things up), then I parse a JSON message from the data, do 4 string manipulations, spawn a goroutine, and return from the callback.

@pongad That was not my experience. Well, sort of: it was fast at the beginning (not as fast as you describe, but that may have been my network path, since I'm not consuming from within GCP at the moment), then it trailed off to a much slower rate, like a shallow curve.

pongad (Contributor) commented Dec 18, 2017

@jba I recently talked to the pubsub team. They recommend that we release flow control after the message is acked/nacked, not when the function returns. This isn't top priority; I can pick it up if you'd like.

@willie In general, we recommend calling ack/nack after you're "finished" with the message. In that way, if your machine dies while processing, the message is redelivered.
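
In code that pattern looks roughly like this (a sketch; processMessage stands in for whatever work you do per message):

	err = sub.Receive(ctx, func(ctx context.Context, msg *pubsub.Message) {
		if err := processMessage(ctx, msg.Data); err != nil {
			msg.Nack() // let the server redeliver if processing fails
			return
		}
		msg.Ack() // ack only after the work is actually done
	})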

willie (Author) commented Dec 18, 2017

@pongad I agree with the general recommendation.

My particular use case is OK the way it is because the publisher will keep publishing until the situation the subscriber is resolving is actually resolved. Resolution involves uploads (and I have request-coalescing logic in place), so I'd rather ack and then process than wait to ack until after processing.

jba (Contributor) commented Jan 31, 2018

Closing, since we've fixed #870.
