
feat: add performance investigation #12

Closed
wants to merge 9 commits into from

Conversation

ddanielsantos
Contributor

closes #8

details

Implemented a benchmark for the map_to_html function. To do it, I moved the content from src/util.rs to src/lib.rs (see library crate)

I also tried to add a benchmark for the download_blogs function, but the 100 iterations would take a lot of time:

Warning: Unable to complete 100 samples in 5.0s. You may wish to increase target time to 1603.1s, or reduce sample count to 10.
Benchmarking map to html: Collecting 100 samples in estimated 1603.1 s (100 iterations)

I can reduce the number of iterations or try to figure out another solution; which do you think fits better?

@AntoniosBarotsis
Owner

AntoniosBarotsis commented Sep 29, 2022

Made a small change to make adding more benchmarks easier.

The one you added works great and definitely shows that the performance bottleneck is in the HTTP calls.

I was thinking of adding a benchmark that just runs the code once for the downloads, that should be representative enough for me. (could you work on that?) I would be interested in getting a call heatmap for that.

I was looking into pprof-rs but I keep getting compilation errors whenever I try to add it, wondering if you could get it to work somehow.

Also let me know if you are participating in hacktober because in that case I'll merge this in a few days :)

@AntoniosBarotsis
Owner

Ok so after digging into it for a little bit, it seems that the library is currently not officially supported on Windows.

Found this issue and this MR but it's not fully implemented yet, unfortunately.
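For reference, the usual pprof-rs + Criterion wiring looks roughly like the fragment below (it needs the `criterion` and `flamegraph` features of pprof in Cargo.toml, and per the note above it only builds on Unix-like platforms, not Windows). bench_download_blogs is a hypothetical target name:

```rust
use criterion::{criterion_group, Criterion};
use pprof::criterion::{Output, PProfProfiler};

criterion_group! {
    name = benches;
    // Sample the benchmarked code at 100 Hz and emit a flamegraph,
    // which would give the call heatmap discussed above.
    config = Criterion::default()
        .with_profiler(PProfProfiler::new(100, Output::Flamegraph(None)));
    targets = bench_download_blogs
}
```

This is a configuration fragment only; the profiler output lands under target/criterion/.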

@ddanielsantos
Contributor Author

ddanielsantos commented Sep 29, 2022

I was thinking of adding a benchmark that just runs the code once for the downloads, that should be representative enough for me. (could you work on that?) I would be interested in getting a call heatmap for that.

Yeah, for sure, I'll dig more into the docs later and try to do this.

Also let me know if you are participating in hacktober because in that case I'll merge this in a few days :)

Yes, I am. Also, this project looks good btw, I'll try to contribute to it whenever possible.

@AntoniosBarotsis
Owner

Another thing that might be useful is to be able to view download times per link (in case one specific website is disproportionately slow for some reason, it would be nice to know). Not sure how this would work with a proper benchmark lib (we could also just manually time the downloads) but just throwing ideas out there 👍

I basically want to have a good enough idea of what hinders performance before I decide whether I want async or not.
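The "just manually time the downloads" idea from the comment above can be sketched with nothing but the standard library; download_blog and the URLs here are hypothetical stand-ins for the real per-feed download:

```rust
use std::time::Instant;

// Hypothetical stand-in for the real per-feed download; it just does
// some fake work so the example is self-contained and runnable.
fn download_blog(url: &str) -> usize {
    url.bytes().map(|b| b as usize).sum()
}

fn main() {
    let links = [
        "https://example.com/feed-a.xml",
        "https://example.com/feed-b.xml",
    ];
    for link in links {
        let start = Instant::now();
        let size = download_blog(link);
        // One line per link makes a disproportionately slow feed obvious.
        println!("{link}: {size} bytes in {:?}", start.elapsed());
    }
}
```

Unlike a Criterion bench, this runs each download once, so the numbers are noisy, but it is enough to spot one link that is far slower than the rest.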

@ddanielsantos
Contributor Author

I was looking into setting a lower number of samples for Criterion benchmarks (to bench the download_blogs function) and discovered that the number of samples can't be < 10.

here's the documentation about this

running the bench with 10 samples resulted in this:

Benchmarking download blogs/download blogs: Warming up for 3.0000 s
Warning: Unable to complete 10 samples in 5.0s. You may wish to increase target time to 204.2s.
download blogs/download blogs                                                                          
                        time:   [5.4904 s 7.1811 s 9.3531 s]
Found 1 outliers among 10 measurements (10.00%)
  1 (10.00%) high mild

what do you think?
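Concretely, the 10-sample run above corresponds to a Criterion configuration fragment like this (bench_download_blogs is a hypothetical target name, with the body elided):

```rust
use criterion::{criterion_group, criterion_main, Criterion};

fn bench_download_blogs(c: &mut Criterion) {
    c.bench_function("download blogs", |b| {
        b.iter(|| { /* call download_blogs() here */ })
    });
}

criterion_group! {
    name = benches;
    // 10 is Criterion's documented minimum; sample_size panics below that.
    config = Criterion::default().sample_size(10);
    targets = bench_download_blogs
}
criterion_main!(benches);
```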

@AntoniosBarotsis
Owner

So the < 10 limit makes sense for Criterion since it's supposed to be a "statistics-driven" benchmarking lib, but this works fine for me.

We could increase the measurement time with this function but I'm not sure if there's a reason to do that since that will always depend on a lot of external factors.

I think this is basically what I wanted! Could you push your work so I could take a look?
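The measurement-time knob mentioned above is a builder method on Criterion; a configuration fragment bumping it toward what the warning suggested might look like this (the 205 s figure is taken from the warning output, not a recommendation):

```rust
use std::time::Duration;
use criterion::Criterion;

// Hypothetical config helper: 10 samples, each given up to ~205 s of
// measurement time instead of Criterion's 5 s default. For network-bound
// code the extra time mostly measures external factors, as noted above.
fn slow_bench_config() -> Criterion {
    Criterion::default()
        .sample_size(10)
        .measurement_time(Duration::from_secs(205))
}
```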

@AntoniosBarotsis AntoniosBarotsis marked this pull request as draft September 30, 2022 15:55
@AntoniosBarotsis
Owner

Converting this to a draft so I don't click the big green "Merge" button accidentally

@AntoniosBarotsis
Owner

Ok so, I made a per-link benchmark and found out that one of the feeds I was using is disproportionately slow, which is interesting

[image: per-link benchmark timings]

But that's about it, everything else is great, and thanks a lot for contributing!! I'll merge in 1-2 days to make sure the MR counts.

@AntoniosBarotsis
Owner

I fixed some linting warnings I was getting after the introduction of lib.rs, thanks again!

@AntoniosBarotsis
Owner

AntoniosBarotsis commented Sep 30, 2022

From the Hacktoberfest website: "Your PR/MRs must be created between October 1 and October 31 (in any time zone, UTC-12 thru UTC+14)", so just merging it after the 1st doesn't count. You might want to update your local branch (pull in the last few changes I made) and create another MR.

@ddanielsantos
Contributor Author

Oh no 😅, ok I'll do it. Thanks again man, if you find any other issue someday, feel free to contact me

@ddanielsantos ddanielsantos marked this pull request as ready for review September 30, 2022 21:52
@AntoniosBarotsis
Owner

AntoniosBarotsis commented Oct 1, 2022

@ddanielsantos waiting for the MR again 👀

@ddanielsantos
Contributor Author

done 🫡

Successfully merging this pull request may close these issues.

Performance investigation [Rust]