First pass at C# benchmarks #5802
Conversation
```csharp
public SerializationConfig(string resource)
{
    var data = LoadData(resource);
    BenchmarkDataset dataset = BenchmarkDataset.Parser.ParseFrom(data);
```
nit: var?
Probably, yes.
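That is, the nit amounts to:

```csharp
// Let the compiler infer the type instead of naming BenchmarkDataset twice.
var dataset = BenchmarkDataset.Parser.ParseFrom(data);
```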
```csharp
    subTests = Configuration.Payloads.Select(p => new SubTest(p, parser.ParseFrom(p))).ToArray();
}

[IterationSetup]
```
IterationSetup is something to try and avoid - https://benchmarkdotnet.org/articles/features/setup-and-cleanup.html#sample-introsetupcleanupiteration
I think you should move Reset into the WriteToStream test. Changing the stream position on a MemoryStream shouldn't have a noticeable effect on the test. And it isn't needed for Parse.
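A sketch of that suggestion (assuming a `MemoryStream` field named `stream`, and that each `SubTest` exposes its parsed message - names here are assumptions, not the PR's actual code):

```csharp
[Benchmark]
public void WriteToStream()
{
    // Reset inline rather than in [IterationSetup]; repositioning a
    // MemoryStream is cheap enough not to distort the measurement.
    stream.Position = 0;
    foreach (var item in subTests)
    {
        item.Message.WriteTo(stream);
    }
}
```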
Sure, happy to do that.
```csharp
{
    /// <summary>
    /// Benchmark for serializing (to a MemoryStream) and deserializing (from a ByteString).
    /// Over time we may wish to test the various different approaches to serialization and deserialization separately.
```
The way gRPC uses Google.Protobuf is with byte arrays:
- Writing: `Google.Protobuf.MessageExtensions.ToByteArray(message)`
- Reading: `var byteArray = CopyMessageToNewByteArray(); staticMessageParserOfT.ParseFrom(byteArray)`
I think two more benchmarks would be useful that include the overhead of allocating the byte array each time.
I don't mind adding them in a follow up PR if this is just for the basics.
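A sketch of the two extra benchmarks being suggested (the `Message` and `Payload` properties on `SubTest` are assumptions for illustration):

```csharp
[Benchmark]
public void ToByteArray()
{
    foreach (var item in subTests)
    {
        // MessageExtensions.ToByteArray allocates a fresh byte[] per call,
        // matching the cost gRPC pays on every write.
        item.Message.ToByteArray();
    }
}

[Benchmark]
public void ParseFromByteArray()
{
    foreach (var item in subTests)
    {
        // ByteString.ToByteArray copies to a new byte[] each time, mimicking
        // gRPC copying the wire payload before parsing.
        parser.ParseFrom(item.Payload.ToByteArray());
    }
}
```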
It's fine, I'll do it in this one.
```csharp
[Benchmark]
public void WriteToStream()
{
    foreach (var item in subTests)
```
Does this aggregate all the tested scenarios into one result? It would be useful to see granular results, e.g. serializing a small message was fast but a message with a large string was slower.
The idea is that each scenario can contain multiple messages. As per benchmarks.proto:
```proto
// The payload(s) for this dataset. They should be parsed or serialized
// in sequence, in a loop, ie.
//
//   while (!benchmarkDone) {  // Benchmark runner decides when to exit.
//     for (i = 0; i < benchmark.payload.length; i++) {
//       parse(benchmark.payload[i])
//     }
//   }
//
// This is intended to let datasets include a variety of data to provide
// potentially more realistic results than just parsing the same message
// over and over. A single message parsed repeatedly could yield unusually
// good branch prediction performance.
```
For more granular results, you'd have separate datasets with one payload each. It's fine to have multiple datasets for the same message.
In case you haven't seen it, this is the issue where we're discussing how to improve gRPC+Google.Protobuf perf and allocations: grpc/grpc-dotnet#30
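In C# terms, a single benchmark invocation would follow that pattern roughly like this sketch (using `parser` and `Configuration.Payloads` as they appear above):

```csharp
[Benchmark]
public void Parse()
{
    // One invocation walks every payload in the dataset in sequence,
    // mirroring the loop described in benchmarks.proto.
    foreach (var payload in Configuration.Payloads)
    {
        parser.ParseFrom(payload);
    }
}
```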
Thanks, will add another commit on Monday.
@JamesNK: I've pushed an extra commit; PTAL.
The error from Kokoro looks like it's because it's using a very old .NET Core SDK:
I'm not sure why it's using anything from 1.0.3 when global.json requires 2.2.100...
It's still using 1.0.3 because the .NET Core target on Google.Protobuf.Test is netcoreapp1.0 @jskeet. The SDK is still being downloaded in the kokoro dockerfile. The global.json requires that the 2.2.100 SDK be used to run dotnet commands, but doesn't prevent other runtimes from running tests and such.
@ObsidianMinor: But it's not Google.Protobuf.Test that's causing the problem - it's Google.Protobuf.Benchmarks, which targets netcoreapp2.2. It appears the SDK thinks that netcoreapp2.2 doesn't support netstandard2.0 - which would be somewhat explained by it not using the expected SDK. I'll check the dockerfile when I get the chance. (The same project file works absolutely fine on my box, of course.)
Right, the real issue is the current working directory for the build. So it's possible a compatible SDK isn't being installed at all.
Aha. I'll try moving global.json higher up the directory hierarchy...
Still no joy. Will have another look tomorrow :(
FYI I'm sick at the moment. I'll look at this when I get back
More info on test failures: I believe whenever the Dockerfile gets updated, the corresponding image needs to be built and pushed to Docker Hub to enable testing (which hasn't been done - that's why all the Linux tests are failing with an odd error message).
```xml
<PropertyGroup>
  <OutputType>Exe</OutputType>
  <TargetFramework>netcoreapp2.2</TargetFramework>
```
why specifically netcoreapp2.2?
I'm expecting that to be more interesting than netcoreapp1.0, which is basically ancient. If we're looking at improving performance with modern APIs, I'm most interested in testing in modern runtimes too.
That said, I certainly wouldn't object to using multiple target frameworks. We can't target netcoreapp1.0 though with the current version of BenchmarkDotNet.
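For reference, if we did go with multiple runtimes, BenchmarkDotNet can run the same benchmarks against several .NET Core toolchains from one host project. A rough sketch (the API names here are from the 0.11.x releases, so worth double-checking against the version actually in use):

```csharp
using BenchmarkDotNet.Configs;
using BenchmarkDotNet.Jobs;
using BenchmarkDotNet.Toolchains.CsProj;

internal class MultiRuntimeConfig : ManualConfig
{
    public MultiRuntimeConfig()
    {
        // One job per runtime under test; netcoreapp1.0 isn't supported by
        // current BenchmarkDotNet, so only 2.x runtimes appear here.
        Add(Job.Default.With(CsProjCoreToolchain.NetCoreApp21));
        Add(Job.Default.With(CsProjCoreToolchain.NetCoreApp22));
    }
}
```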
Thanks. Let's work out exactly what we want to do before doing that.
kokoro/linux/64-bit/Dockerfile
```diff
@@ -30,7 +30,7 @@ RUN echo "deb http://ppa.launchpad.net/ondrej/php/ubuntu trusty main" | tee /etc
 # Install dotnet SDK based on https://www.microsoft.com/net/core#debian
 # (Ubuntu instructions need apt to support https)
 RUN apt-get update && apt-get install -y --force-yes curl libunwind8 gettext && \
-    curl -sSL -o dotnet.tar.gz https://go.microsoft.com/fwlink/?LinkID=847105 && \
+    curl -sSL -o dotnet.tar.gz https://download.visualstudio.microsoft.com/download/pr/69937b49-a877-4ced-81e6-286620b390ab/8ab938cf6f5e83b2221630354160ef21/dotnet-sdk-2.2.104-linux-x64.tar.gz && \
```
Looks like other projects are still using netcoreapp1.0, so they will need the older SDK installed to be able to run.
Good point. We should work out whether we want to still test against netcoreapp1.0 (by installing multiple SDKs), or update everything to 2.x at the same time.
I think we should update all the .exe projects to netcoreapp2.1 because that corresponds to an LTS version of the SDK. (Btw, that strategy isn't based on an MSFT best practice or anything; it just seems like a logical choice to me after experiencing some of the pains of missing SDK versions and the limitations of global.json, which doesn't allow wildcards.)
That SGTM. How about I split that task into a separate PR that just updates stuff, and we can get that working before coming back to benchmarks?
That would be great. Btw I think I have the ability to rebuild and push the docker image when needed.
Btw we also need to be careful about not breaking protobuf/csharp/build_packages.bat (line 4 in 8bdecd7):

```bat
dotnet pack -c Release src/Google.Protobuf.sln /p:SourceLinkCreate=true || goto :error
```
Right. So we need to update the Windows image as well. Are you in a position to do that?
There's no simple way to "update the Windows image" on kokoro. There's a version of the dotnet SDK preinstalled, but I don't know exactly what it is. For the Windows build, we might need to install the dotnet SDK via a script. I'm actually surprised that we don't have a presubmit build for protobuf C# on Windows - I will try to work with the protobuf team to set one up.
There's a PR already up for updating projects over at #5838
@ObsidianMinor: Wonderful, thanks very much :)
Here are more details, btw.
Overall it looks good. One improvement is to add a custom config (https://benchmarkdotnet.org/articles/configs/configs.html). Here is a config that I think will suit this project (usings added here for completeness):

```csharp
using BenchmarkDotNet.Columns;
using BenchmarkDotNet.Configs;
using BenchmarkDotNet.Diagnosers;
using BenchmarkDotNet.Exporters;
using BenchmarkDotNet.Loggers;
using BenchmarkDotNet.Validators;

internal class BenchmarkDotNetConfig : ManualConfig
{
    public BenchmarkDotNetConfig()
    {
        Add(ConsoleLogger.Default);
        Add(MarkdownExporter.GitHub);
        Add(MemoryDiagnoser.Default);
        Add(StatisticColumn.OperationsPerSecond);
        Add(DefaultColumnProviders.Instance);
        Add(JitOptimizationsValidator.FailOnError);
    }
}
```

With it enabled: (results screenshot omitted)
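To pick the config up, the runner would presumably be invoked along these lines (`SerializationBenchmark` is a placeholder for whatever benchmark class the project ends up with):

```csharp
using BenchmarkDotNet.Running;

public static class Program
{
    // Hypothetical entry point: run the benchmarks with the config above.
    public static void Main() =>
        BenchmarkRunner.Run<SerializationBenchmark>(new BenchmarkDotNetConfig());
}
```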
Done. Once we've got the environment sorted, we should be able to add this easily...
#5838 has been merged, let's rebase this PR
(Force-pushed from 70be6db to d5d621a.)
Have rebased to a single commit - hopefully that will get everything building. I'll add another commit to fix the Distcheck (which I still don't see the point of, in terms of C# files, but that's another story). As a heads-up, later in the week I'm hoping to find time to convert the proto2-only benchmarks to proto3 as well, so there'll be more to add here - but it doesn't matter to me whether that happens as part of this PR or as a new one later.
As a heads up - the C# linux tests might be lying - they are being skipped (fix is in #5876, might be safer to wait until it's in)
Looks good to me. Something to iterate on and add to in the future.
One suggestion: it would be a good idea to call the benchmarks during the CI build and run one iteration. We do this in the ASP.NET Core build to ensure code changes haven't broken the benchmarks.
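As a sketch of how that single-iteration run could work, BenchmarkDotNet's built-in `Job.Dry` executes one cold-start iteration with no warmup (`BenchmarkDotNetConfig` is the config defined earlier in this thread; `SerializationBenchmark` is again a hypothetical class name):

```csharp
// CI "smoke" run, e.g. invoked from Main behind a command-line flag:
// execute each benchmark once to prove it still runs, without taking
// real measurements.
var config = new BenchmarkDotNetConfig();
config.Add(BenchmarkDotNet.Jobs.Job.Dry); // single cold-start iteration, no warmup
BenchmarkDotNet.Running.BenchmarkRunner.Run<SerializationBenchmark>(config);
```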
Yes, once we've got the basic code in - and potentially some more benchmarks - there are all kinds of things we might want to do in CI etc. (For Noda Time I keep a record of all the benchmarks so I can see progress over time.)
Merging as the Ruby failure is unrelated to this change.
@jskeet I noticed that the "Windows Csharp Release" job now creates a nupkg for the benchmarks project - I guess that's unintended?
I don't know much about that side of things - @jtattermusch probably knows more.
We are running this command to create the packages: `dotnet pack -c Release src/Google.Protobuf.sln /p:SourceLinkCreate=true` (protobuf/csharp/build_packages.bat, line 4 in 66fb3ce). So it looks like packing the whole solution picks up the benchmarks project as well.
Ah, I can add `<IsPackable>false</IsPackable>` to the benchmarks project file to fix that.
For simplicity - and consistency with the conformance tests - I've put the benchmarking code in the same area of source control as the main Google.Protobuf code.
I'm sure I'll need to edit Makefile.am to list files etc before we merge, but I wanted to get everything else approved first.