Unbounded message queue #1

Open: schets wants to merge 2 commits into base: workers-implementation


@schets schets commented Oct 4, 2015

Changed the message queue to be unbounded to simplify message passing logic and avoid a lock.

This passes the unit tests and the workers test, and the FIFO properties and memory safety have been checked with valgrind under contention.

git botched the diff on producer-consumer-queue.h, but the entire thing has been changed.
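For readers without the diff at hand, here is a minimal sketch of what an unbounded single-producer/single-consumer queue can look like. It assumes a linked-list design with exactly one producer thread and one consumer thread; the class name `UnboundedSpscQueue` is hypothetical, and this is not the code in producer-consumer-queue.h.

```cpp
#include <atomic>
#include <optional>
#include <utility>

// Hypothetical illustration: unbounded SPSC queue built on a linked list.
// Assumes exactly one producer thread and one consumer thread (C++17).
template <typename T>
class UnboundedSpscQueue {
    struct Node {
        std::optional<T> value;               // empty in the stub node
        std::atomic<Node*> next{nullptr};
    };

    Node* head_;  // consumer side only; always points at a stub node
    Node* tail_;  // producer side only; last node in the list

public:
    UnboundedSpscQueue() : head_(new Node), tail_(head_) {}
    UnboundedSpscQueue(const UnboundedSpscQueue&) = delete;
    UnboundedSpscQueue& operator=(const UnboundedSpscQueue&) = delete;

    ~UnboundedSpscQueue() {
        // Free the stub and any items that were never popped.
        while (head_ != nullptr) {
            Node* next = head_->next.load(std::memory_order_relaxed);
            delete head_;
            head_ = next;
        }
    }

    // Producer only. Never fails for lack of space; allocates one node per item.
    void push(T value) {
        Node* node = new Node;
        node->value.emplace(std::move(value));
        // Release store publishes the node's contents to the consumer.
        tail_->next.store(node, std::memory_order_release);
        tail_ = node;
    }

    // Consumer only. Returns std::nullopt when the queue is empty.
    std::optional<T> pop() {
        Node* next = head_->next.load(std::memory_order_acquire);
        if (next == nullptr) return std::nullopt;
        std::optional<T> out = std::move(next->value);
        next->value.reset();
        delete head_;  // the old stub is no longer reachable by the producer
        head_ = next;  // the node just read becomes the new stub
        return out;
    }
};
```

With this shape, `push` has no "queue full" branch, so the caller needs no fallback path; the cost is one allocation per item, which a real implementation might amortize by reusing nodes.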

@petkaantonov
Owner

I'd like to keep the current queue for now; afaik it's the fastest possible approach as long as the bounds aren't reached.
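To illustrate what "as long as the bounds aren't reached" means in practice, here is a minimal sketch of a bounded single-producer/single-consumer queue, assuming a fixed-size ring buffer. The name `BoundedSpscQueue` and its interface are hypothetical; the existing queue in this repository may be structured differently.

```cpp
#include <atomic>
#include <cstddef>
#include <optional>
#include <utility>
#include <vector>

// Hypothetical illustration: bounded SPSC queue on a fixed ring buffer.
// Assumes exactly one producer thread and one consumer thread (C++17).
template <typename T>
class BoundedSpscQueue {
    std::vector<std::optional<T>> slots_;
    std::atomic<std::size_t> head_{0};  // next slot to read; written by consumer
    std::atomic<std::size_t> tail_{0};  // next slot to write; written by producer

public:
    // One extra slot stays empty so that "full" and "empty" are distinguishable.
    explicit BoundedSpscQueue(std::size_t capacity) : slots_(capacity + 1) {}

    // Producer only. Returns false when the bound is reached.
    bool try_push(T value) {
        const std::size_t tail = tail_.load(std::memory_order_relaxed);
        const std::size_t next = (tail + 1) % slots_.size();
        if (next == head_.load(std::memory_order_acquire)) return false;  // full
        slots_[tail].emplace(std::move(value));
        tail_.store(next, std::memory_order_release);
        return true;
    }

    // Consumer only. Returns std::nullopt when the queue is empty.
    std::optional<T> try_pop() {
        const std::size_t head = head_.load(std::memory_order_relaxed);
        if (head == tail_.load(std::memory_order_acquire)) return std::nullopt;
        std::optional<T> out = std::move(slots_[head]);
        slots_[head].reset();
        head_.store((head + 1) % slots_.size(), std::memory_order_release);
        return out;
    }
};
```

The fast path involves no allocation, but every caller of `try_push` has to decide what to do when it returns `false` (block, drop, or spill to a secondary structure), which is the extra logic an unbounded queue avoids.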

@schets
Author

schets commented Oct 7, 2015

It depends on the CPU and compiler: on my laptops with newer Intel CPUs and clang/MSVC, the unbounded version performs much better. On an older Sandy Bridge server with an old gcc, it's reversed. I haven't had a chance to test on a newer server or to investigate what is causing the difference. In both cases, the unbounded queue performs better when a small amount of work is performed after successful pushes/pops.

Counterintuitively, on the server, removing buffers greatly improves the performance of the unbounded queue while having no effect on the bounded queue. The opposite holds on my laptop: the buffer greatly increases performance.

The difference between the two is at most ~30ns per push-pop pair in all situations that I've tested. I doubt that will make any difference, given that objects must be serialized and deserialized before sending. Anyway, as the benchmarks show, 'fastest' for this type of data structure is highly architecture-dependent and full of magic. I think the simpler interface is worth the potential performance difference, especially if it's the newer architectures that perform better with the unbounded queue (and the gap isn't just Xeon vs Core).

Here's the benchmark code.
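The linked benchmark is not reproduced in this thread. As a rough illustration only, a per-pair cost like the ~30ns figure could be measured along these lines. This sketch is single-threaded, uses `std::deque` as a stand-in queue, and assumes `pairs > 0`; the names `ns_per_pair` and `deque_ns_per_pair` are hypothetical. It does not capture the contended two-thread case discussed above.

```cpp
#include <chrono>
#include <cstddef>
#include <deque>

// Average nanoseconds per push-pop pair for the given callables.
// Single-threaded: measures the uncontended path only. Assumes pairs > 0.
template <typename Push, typename Pop>
double ns_per_pair(std::size_t pairs, Push push, Pop pop) {
    const auto start = std::chrono::steady_clock::now();
    for (std::size_t i = 0; i < pairs; ++i) {
        push(i);
        pop();
    }
    const auto stop = std::chrono::steady_clock::now();
    const std::chrono::duration<double, std::nano> elapsed = stop - start;
    return elapsed.count() / static_cast<double>(pairs);
}

// Stand-in workload: std::deque takes the place of the queue under test.
inline double deque_ns_per_pair(std::size_t pairs) {
    std::deque<std::size_t> q;
    volatile std::size_t sink = 0;  // keeps the popped values observable
    return ns_per_pair(
        pairs,
        [&](std::size_t v) { q.push_back(v); },
        [&] {
            sink = sink + q.front();
            q.pop_front();
        });
}
```

Swapping the two lambdas for calls into the bounded and unbounded queues, and running the producer and consumer on separate threads, would be needed before comparing against the numbers quoted in this thread.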

@schets
Author

schets commented Oct 7, 2015

Actually, I noticed an error in my testing: without the buffer, the unbounded queue performs better than the bounded queue on my laptop as well. This still probably doesn't mean a whole lot, as I've only tested on two architectures. In any case, the performance difference is probably very small compared to the actual work done on objects entering the queue.

I'll push those changes when I get home from work.

@petkaantonov petkaantonov force-pushed the workers-implementation branch 3 times, most recently from 7e17b1f to 7ece8d7 Compare February 13, 2016 14:58