Start of benchmark suite #11
@parroty, I was hoping you could help me figure out what I'm doing wrong with the mocking here. I attempted to duplicate essentially what the streaming test was doing, but the stream object doesn't seem to receive events when sending mock data to the http module. (Also: see a number of TODOs in my code to see where I'm going with this, and for some open questions you might be able to shed some light on. There is certainly a bit of code duplication here that I will work on refactoring out once I can get things working properly.)
Thanks, the benchmark is great. I don't have a clear idea about the mocking yet, but I will look into it later (as I don't have my laptop with me now).
The issue seems to be reproduced when the pids of It means the information is not isolated in the stream (not good) and I'll try to fix it (not sure yet, but maybe through a separate gen_server for storing tweets), but a tentative solution might be calling both from the
diff --git a/bench/stream_bench.exs b/bench/stream_bench.exs
index 5fb0a4b..458ec67 100644
--- a/bench/stream_bench.exs
+++ b/bench/stream_bench.exs
@@ -36,7 +36,10 @@ defmodule BasicBench do
   # actual benchmark iteration, process tweet message from the streaming API.
   # mocks the stream ahead of time so mock setup doesn't affect benchmark time.
-  bench "stream tweets", [stream: mock_stream()] do
+  bench "stream tweets", [mock: mock_stream()] do
+    # create stream
+    stream = ExTwitter.stream_filter(track: "twitter", receive_messages: true)
+    :timer.sleep(100) # put small wait for async, as per _stream_test.exs
     send_mock_data(TestHelper.TestStore.get, @mock_tweet_json)
     tweet = Enum.take(stream, 1) |> List.first
     # TODO[bench]: remove below line for actual benchmarking once we get working
@@ -52,11 +55,6 @@ defmodule BasicBench do
     # mock oauth so when we create the stream it uses the mocked post method
     mock_oauth!
-
-    # create stream
-    stream = ExTwitter.stream_filter(track: "twitter", receive_messages: true)
-    :timer.sleep(100) # put small wait for async, as per _stream_test.exs
-    stream
   end
   defp mock_oauth! do
Oh interesting. Hmm. The problem with putting the stream creation in the bench block is that it will be counted as part of the benchmark, when in order to benchmark we need to isolate the time taken just to parse a single message. This may be more a limitation of the benchfella library than an issue with extwitter. I'll see if I can figure out another way to get the data in.
Okay, I think I have an idea for how to do this by mocking input to the process_stream function directly. Will report back soon.
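To illustrate the isolation concern raised above, here is a minimal sketch in plain Elixir (not benchfella, and not ExTwitter code). It builds the input outside the timed region and measures only the per-message work with `:timer.tc/1`. The module `IsolatedTiming` and the function `process_message/1` are hypothetical stand-ins for whatever step is being benchmarked.

```elixir
defmodule IsolatedTiming do
  # Hypothetical stand-in for the per-message work being measured.
  def process_message(raw) do
    raw
    |> String.split(",")
    |> Enum.map(&String.trim/1)
  end

  # Calls `fun` on `input` `iterations` times inside a single timed region
  # and returns the average time per call in microseconds (a float).
  def measure(fun, input, iterations) do
    {micros, _result} =
      :timer.tc(fn ->
        Enum.each(1..iterations, fn _ -> fun.(input) end)
      end)

    micros / iterations
  end
end

# Setup happens here, outside the timed region, so it does not
# contribute to the measurement.
mock_message = "id: 1, text: hello, user: someone"

avg = IsolatedTiming.measure(&IsolatedTiming.process_message/1, mock_message, 10_000)
IO.puts("average per message: #{avg} microseconds")
```

The same idea applies to feeding prepared input directly into the function under test: anything that can be built ahead of time stays out of the measured block.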
2036fbc to 7188bdd
Okay, made a little progress after banging my head on this for 5 hours. I rebased on the Now the way it works is much simpler, because I made two methods public in the It somewhat works:
What I still need to do here is:
Thanks, it's great. Shall I merge with this commit? (Some of the remaining items might be ongoing efforts.) Once it becomes a good set of commits, please indicate so.
I still need to figure out how to mock out the tweet JSON parsing. I'm having trouble getting meck to work the way I would expect it to... |
@@ -102,7 +103,7 @@ defmodule ExTwitter.API.Streaming do
       is_end_of_message(part) ->
         message = Enum.reverse([part|acc])
                   |> Enum.join("")
-                  |> parse_tweet_message(configs)
+                  |> __MODULE__.parse_tweet_message(configs)
Note this change, which was required to get Meck working for this.
Okay, I think this is pretty much working now. You might want to give my code a review to make sure I didn't do anything dumb. Note that in order to properly stub out the tweet parsing for benchmark isolation, the benchmark scripts must now be run individually instead of in aggregate via Sample output (note these numbers should look different from earlier output, since I'm working on a much faster desktop machine right now):
There is still some extra overhead in the stream process benchmarks due to spawning the process, but I believe it's minimal enough now that these benchmarks can be used to aid any performance-oriented refactoring (and provide a framework for adding more benchmarks as needed). Please let me know any comments/suggestions/concerns, and if you'd like me to squash the commits I can do that as well. Thanks for your help and for the great library!
Thanks, it's great. I can merge, but maybe it's good to squash as you mentioned. They comprise a good meaningful set of commits for this PR item.
see notes in bench/README.md regarding some caveats introduced with the current benchmark setup
requires minor restructuring of some code to enable stubbing
Okay, I squashed this down to two commits now (I preserved the modification of the mix file to introduce the benchfella dependency as separate from the introduction of the benchmarks).
Thanks! I'll be experimenting some with the bench.
I'll dig in. I am not sure if the current benchmarks even represent the biggest areas for improvement; they were just my guesses based on my experiences with ttezel/twit#150 and what I saw in other clients in the http://github.com/mroth/twitter-streaming-showdown tests. A good place to start might also be using eprof or something like that to find the hot spots during streaming, then writing some benchmarks around those to experiment. I don't have any real experience with Erlang code profiling, but I'll try to give it a shot if I have some spare time next week.
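As a rough illustration of the eprof idea mentioned above, the following sketch profiles a single function call with OTP's `:eprof` module and prints a per-function time breakdown. The module `ProfileSketch` and its `handle_chunks/1` function are hypothetical stand-ins, not ExTwitter's streaming code, and the sketch assumes the OTP `tools` application is available on the code path (inside a Mix project it may need to be listed under `extra_applications`).

```elixir
defmodule ProfileSketch do
  # Hypothetical stand-in for a streaming hot path: joins raw chunks
  # and splits them back into individual messages.
  def handle_chunks(chunks) do
    chunks
    |> Enum.join("")
    |> String.split("\r\n", trim: true)
  end

  def run do
    # Build the input before profiling starts so setup is not measured.
    chunks = List.duplicate("{\"text\":\"hi\"}\r\n", 10_000)

    :eprof.start()
    # Profile only the call of interest.
    :eprof.profile(fn -> handle_chunks(chunks) end)
    # Print the per-function call counts and time to standard output.
    :eprof.analyze()
    :eprof.stop()
  end
end

ProfileSketch.run()
```

The functions that dominate the printed table would be natural candidates for dedicated benchmarks.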
As part of the work I've been doing in mroth/twitter-streaming-showdown, I've been looking at the performance and features of various Twitter streaming libraries. While ExTwitter is miles ahead of most in terms of having clear and well-tested code, its relative performance in high-throughput environments is less than optimal.
Looking through the stream code, I have some ideas I'd like to experiment with for making it more efficient. But of course, you can't improve performance if you can't measure it!
Thus, as a first step, I wanted to create a benchmark test around stream processing so I could test these changes in a controlled fashion.
However, I'm having some difficulty getting the benchmarks to work so I'm doing an early PR to start a conversation. Details to follow in comment.