Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Do not assume the vert.x context will never switch threads legally #1586

Closed
wants to merge 1 commit into from

Conversation

Sanne
Copy link
Member

@Sanne Sanne commented Apr 12, 2023

@tsegismont
Copy link
Contributor

@Sanne can you tell me more about your expectations and what might have changed between two vert.x versions?

I'm asking to make sure we haven't introduced a regression in Vert.x SQL Clients.

@Sanne
Copy link
Member Author

Sanne commented Apr 13, 2023

@tsegismont hi :)

I'm not sure if it relates to vert.x versions - we've had several users repor that my assertions would trigger; but the few reproducers we got would do some illegal stuff with threads leading me to think the assertion was just doing its job, but I wonder now if I underestimated those reports as there have been a number of them around this same theme.

The reproducer at quarkusio/quarkus#32533 is straight forward and triggered it; we further simplified it (e.g. remove Panache and the background timers) and it's still triggering, and I observe that an operation running on an event loop context does indeed occasionally shift to a different event-loop-thread.

I didn't expect that to be a possibility (and that's why I had introduced this assertion to begin with, so to spot illegal thread usages) but I suppose there could be valid reasons within vert.x internals (e.g. load-balancing of eventloops?); please let me know if it's totally unexpected.
I can see according to https://vert-x3.github.io/advanced-vertx-guide/index.html#_dealing-with-contexts that it might not be expected, but I'm not sure if I'm interpreting the doc accurately.

I have not been able to pinpoint how/when this happens, and it puzzles me that we never observed this behaviour in other load tests (aka Techempower), so we certainly need to look better - but I won't be able to get to this today (and Davide is off). At very least I'd like to know why it doesn't ever happen on Techempower benchmark runs.

In short, the issue might have been there in older versions of vert.x too - if not, that might explain why Techempower works fine as I've not upgraded it in a while, but only if you confirm this is totally unexpected from a vert.x point of view.

Thanks!

@Sanne Sanne marked this pull request as draft April 13, 2023 08:26
@Sanne
Copy link
Member Author

Sanne commented Apr 13, 2023

Converting to draft - best to understand it better first.

@Sanne
Copy link
Member Author

Sanne commented Apr 13, 2023

I verified the status with techempower: assertions don't trigger. Now I'm puzzled, really need to figure it out

@franz1981
Copy link

I suggest to increase the number of event loop, decrease the capacity of the database connection pool, and increase the number of http connected clients: in this way some of the dB connections will be assigned to a "wrong" event loop causing cross-talk between the http end point event loop to the borrowed connection db one, and back.
Did makes sense @tsegismont ? There is a way to force the dB connection pool to get into the slow path of connection acquisition?

@tsegismont
Copy link
Contributor

I didn't expect that to be a possibility (and that's why I had introduced this assertion to begin with, so to spot illegal thread usages) but I suppose there could be valid reasons within vert.x internals (e.g. load-balancing of eventloops?); please let me know if it's totally unexpected.

@Sanne found some time to give a try to the reproducer today. I can see the assertion failures in the logs. I'll come back to you asap.

@tsegismont
Copy link
Contributor

tsegismont commented Apr 17, 2023

@Sanne @franz1981 I found the problem and filed this: #1595

I don't know which particular techempower benchmark you are talking about in previous comments, but I suppose it doesn't use ID generation with sequence or table.

Also, @jponge has brought quarkusio/quarkus#32665 to my attention today. This is another problem: persist is invoked on a worker thread which seems no longer tolerated by HR.

@Sanne
Copy link
Member Author

Sanne commented Apr 17, 2023

@Sanne @franz1981 I found the problem and filed this: #1595

Awesome work!

I don't know which particular techempower benchmark you are talking about in previous comments, but I suppose it doesn't use ID generation with sequence or table.

You're right, our existing load tests don't use ID generation - that explains it.

Also, @jponge has brought quarkusio/quarkus#32665 to my attention today. This is another problem: persist is invoked on a worker thread which seems no longer tolerated by HR.

I've noticed the issue - in theory this was never tolerated.

@Sanne
Copy link
Member Author

Sanne commented Apr 17, 2023

So I'll close this as removing the assertions is not the right approach. We'll continue with #1595

@Sanne Sanne closed this Apr 17, 2023
@Sanne Sanne deleted the RelaxingAssertions branch April 17, 2023 20:10
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants