
First pass at C# benchmarks #5802

Merged (2 commits, Mar 13, 2019)

Conversation

@jskeet (Contributor) commented Mar 1, 2019

For simplicity - and consistency with the conformance tests - I've put the benchmarking code in the same area of source control as the main Google.Protobuf code.

I'm sure I'll need to edit Makefile.am to list files etc before we merge, but I wanted to get everything else approved first.

public SerializationConfig(string resource)
{
    var data = LoadData(resource);
    BenchmarkDataset dataset = BenchmarkDataset.Parser.ParseFrom(data);
Contributor:

nit: var?

Contributor Author (jskeet):

Probably, yes.
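For context, the nit is just about using implicit typing on the second line as well; a trivial sketch of the suggested change:

var data = LoadData(resource);
// Implicitly typed to match the previous line; the type is still obvious from the right-hand side.
var dataset = BenchmarkDataset.Parser.ParseFrom(data);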

subTests = Configuration.Payloads.Select(p => new SubTest(p, parser.ParseFrom(p))).ToArray();
}

[IterationSetup]
Contributor (@JamesNK, Mar 1, 2019):

IterationSetup is something to try and avoid - https://benchmarkdotnet.org/articles/features/setup-and-cleanup.html#sample-introsetupcleanupiteration

I think you should move Reset into the WriteToStream test. Changing the stream position on a MemoryStream shouldn't have a noticeable effect on the test. And it isn't needed for Parse.

Contributor Author (jskeet):

Sure, happy to do that.
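A minimal sketch of what inlining the reset might look like (the SubTest member names used here, Stream and Message, are assumptions rather than the PR's actual shape):

[Benchmark]
public void WriteToStream()
{
    foreach (var item in subTests)
    {
        // Rewinding the MemoryStream inline replaces the [IterationSetup] step;
        // seeking is cheap compared to serializing the message itself.
        item.Stream.Position = 0;
        item.Message.WriteTo(item.Stream);
    }
}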

{
/// <summary>
/// Benchmark for serializing (to a MemoryStream) and deserializing (from a ByteString).
/// Over time we may wish to test the various different approaches to serialization and deserialization separately.
Contributor:

The way gRPC uses Google.Protobuf is with byte arrays:

Writing: Google.Protobuf.MessageExtensions.ToByteArray(message)
Reading: var byteArray = CopyMessageToNewByteArray(); staticMessageParserOfT.ParseFrom(byteArray)

I think two more benchmarks would be useful that include the overhead of allocating the byte array each time.

I don't mind adding them in a follow up PR if this is just for the basics.

Contributor Author (jskeet):

It's fine, I'll do it in this one.
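As a rough sketch, the two extra benchmarks could look like the following (the Message and PayloadBytes members are assumptions for illustration; ToByteArray and ParseFrom(byte[]) are the Google.Protobuf APIs mentioned above):

[Benchmark]
public void ToByteArray()
{
    foreach (var item in subTests)
    {
        // Allocates a fresh byte[] per message, mirroring how gRPC writes messages.
        item.Message.ToByteArray();
    }
}

[Benchmark]
public void ParseFromByteArray()
{
    foreach (var item in subTests)
    {
        // Parses from a plain byte[], including the per-call parsing allocations.
        parser.ParseFrom(item.PayloadBytes);
    }
}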

[Benchmark]
public void WriteToStream()
{
    foreach (var item in subTests)
Contributor:

Does this aggregate all the tested scenarios into one result? It would be useful to see granular results, e.g. serializing a small message was fast but a message with a large string was slower.

Contributor Author (jskeet):

The idea is that each scenario can contain multiple messages. As per benchmarks.proto:

  // The payload(s) for this dataset.  They should be parsed or serialized
  // in sequence, in a loop, ie.
  //
  //  while (!benchmarkDone) {  // Benchmark runner decides when to exit.
  //    for (i = 0; i < benchmark.payload.length; i++) {
  //      parse(benchmark.payload[i])
  //    }
  //  }
  //
  // This is intended to let datasets include a variety of data to provide
  // potentially more realistic results than just parsing the same message
  // over and over.  A single message parsed repeatedly could yield unusually
  // good branch prediction performance.

For more granular results, you'd have separate datasets with one payload each. It's fine to have multiple datasets for the same message.
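To make that concrete, a sketch of a parse benchmark walking every payload in the dataset per invocation, mirroring the proto comment above (member names are assumptions, not the PR's exact code):

[Benchmark]
public void ParseFromByteString()
{
    // One invocation parses payload[0..n-1] in sequence, so a dataset can mix
    // message shapes and sizes within a single result.
    foreach (var item in subTests)
    {
        parser.ParseFrom(item.Payload);
    }
}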

@JamesNK (Contributor) commented Mar 1, 2019

In case you haven't seen it, this is the issue where we're discussing how to improve gRPC+Google.Protobuf perf and allocations: grpc/grpc-dotnet#30

@jskeet (Contributor Author) left a comment

Thanks, will add another commit on Monday.


@jskeet (Contributor Author) commented Mar 4, 2019

@JamesNK: I've pushed an extra commit; PTAL.

@jskeet jskeet assigned jtattermusch and anandolee and unassigned anandolee Mar 5, 2019
@jskeet (Contributor Author) commented Mar 5, 2019

The error from Kokoro looks like it's because it's using a very old .NET Core SDK:

/opt/dotnet/sdk/1.0.3/NuGet.targets
...
Errors in /tmp/protobuf/protobuf/csharp/src/Google.Protobuf.Benchmarks/Google.Protobuf.Benchmarks.csproj
      Package BenchmarkDotNet 0.11.4 is not compatible with netcoreapp2.2 (.NETCoreApp,Version=v2.2). Package BenchmarkDotNet 0.11.4 supports: netstandard2.0 (.NETStandard,Version=v2.0)
      One or more packages are incompatible with .NETCoreApp,Version=v2.2.

I'm not sure why it's using anything from 1.0.3 when global.json requires 2.2.100...
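For reference, the SDK pin under discussion lives in global.json; a minimal sketch of an entry requiring the 2.2.100 SDK:

{
  "sdk": {
    "version": "2.2.100"
  }
}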

@ObsidianMinor (Contributor):

It's still using 1.0.3 because the .NET Core target on Google.Protobuf.Test is netcoreapp1.0 @jskeet. The SDK is still being downloaded in the kokoro dockerfile. The global.json requires the 2.2.100 sdk be used to run dotnet commands but doesn't prevent other runtimes from running tests and such.

@jskeet (Contributor Author) commented Mar 5, 2019

@ObsidianMinor: But it's not Google.Protobuf.Test that's causing the problem - it's Google.Protobuf.Benchmarks, which targets netcoreapp2.2. It appears the SDK thinks that netcoreapp2.2 doesn't support netstandard2.0 - which would be somewhat explained by it not using the expected SDK. I'll check the dockerfile when I get the chance.

(The same project file works absolutely fine on my box, of course.)

@ObsidianMinor (Contributor) commented Mar 5, 2019

Right, the real issue is

.NET Core SDK looks for a global.json file in the current working directory (which isn't necessarily the same as the project directory) or one of its parent directories.

The current working directory for dotnet commands in kokoro is the root repository directory. So dotnet restore csharp/src/Google.Protobuf.sln will never check for a global.json file. @jskeet

So it's possible a compatible SDK isn't being installed at all.

@jskeet (Contributor Author) commented Mar 5, 2019

So it's possible a compatible SDK isn't being installed at all.

Aha. I'll try moving global.json higher up the directory hierarchy...

@jskeet (Contributor Author) commented Mar 5, 2019

Still no joy. Will have another look tomorrow :(

@JamesNK (Contributor) commented Mar 5, 2019

FYI I'm sick at the moment. I'll look at this when I get back

@jtattermusch (Contributor):

More info on test failures:

I believe whenever the Dockerfile gets updated, the corresponding image needs to be built and pushed to dockerhub to enable testing (which hasn't been done, that's why all the linux tests are failing with an odd error message)

+ DOCKER_IMAGE_NAME=grpctesting/protobuf_9018e2155b9fd5f58a2c067aa1ab1eedc15fe704
+ docker pull grpctesting/protobuf_9018e2155b9fd5f58a2c067aa1ab1eedc15fe704
Using default tag: latest
Error response from daemon: pull access denied for grpctesting/protobuf_9018e2155b9fd5f58a2c067aa1ab1eedc15fe704, repository does not exist or may require 'docker login'


<PropertyGroup>
  <OutputType>Exe</OutputType>
  <TargetFramework>netcoreapp2.2</TargetFramework>
Contributor:

why specifically netcoreapp2.2?

Contributor Author (jskeet):

I'm expecting that to be more interesting than netcoreapp1.0, which is basically ancient. If we're looking at improving performance with modern APIs, I'm most interested in testing in modern runtimes too.

That said, I certainly wouldn't object to using multiple target frameworks. We can't target netcoreapp1.0 though with the current version of BenchmarkDotNet.
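For illustration, multi-targeting would just mean switching the project to the plural property; the framework list below is an assumption, not what the PR ships:

<PropertyGroup>
  <OutputType>Exe</OutputType>
  <!-- TargetFrameworks (plural) builds the benchmark project for several runtimes. -->
  <TargetFrameworks>netcoreapp2.1;netcoreapp2.2</TargetFrameworks>
</PropertyGroup>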

@jskeet (Contributor Author) commented Mar 6, 2019

I believe whenever the Dockerfile gets updated, the corresponding image needs to be built and pushed to dockerhub to enable testing (which hasn't been done, that's why all the linux tests are failing with an odd error message)

Thanks. Let's work out exactly what we want to do before doing that.

@@ -30,7 +30,7 @@ RUN echo "deb http://ppa.launchpad.net/ondrej/php/ubuntu trusty main" | tee /etc
# Install dotnet SDK based on https://www.microsoft.com/net/core#debian
# (Ubuntu instructions need apt to support https)
RUN apt-get update && apt-get install -y --force-yes curl libunwind8 gettext && \
-    curl -sSL -o dotnet.tar.gz https://go.microsoft.com/fwlink/?LinkID=847105 && \
+    curl -sSL -o dotnet.tar.gz https://download.visualstudio.microsoft.com/download/pr/69937b49-a877-4ced-81e6-286620b390ab/8ab938cf6f5e83b2221630354160ef21/dotnet-sdk-2.2.104-linux-x64.tar.gz && \
Contributor:

looks like other projects are still using netcoreapp1.0, so they will need the older SDK installed to be able to run.

Contributor Author (jskeet):

Good point. We should work out whether we want to still test against netcoreapp1.0 (by installing multiple SDKs), or update everything to 2.x at the same time.
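If netcoreapp1.0 coverage were kept, one option is installing both SDKs side by side in the test image; a rough sketch based on the RUN step quoted above (the /usr/share/dotnet layout is an assumption):

# Fetch both SDK tarballs and unpack them into the same dotnet root;
# SDK versions live in separate subdirectories, so they can coexist.
RUN curl -sSL -o dotnet-1.0.tar.gz https://go.microsoft.com/fwlink/?LinkID=847105 && \
    curl -sSL -o dotnet-2.2.tar.gz https://download.visualstudio.microsoft.com/download/pr/69937b49-a877-4ced-81e6-286620b390ab/8ab938cf6f5e83b2221630354160ef21/dotnet-sdk-2.2.104-linux-x64.tar.gz && \
    mkdir -p /usr/share/dotnet && \
    tar zxf dotnet-1.0.tar.gz -C /usr/share/dotnet && \
    tar zxf dotnet-2.2.tar.gz -C /usr/share/dotnet && \
    ln -sf /usr/share/dotnet/dotnet /usr/local/bin/dotnet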

Contributor:

I think we should update all the .exe projects to netcoreapp2.1 because that corresponds to an LTS version of the SDK (btw, that strategy is not based on an MSFT best practice or anything, it just seems like a logical choice to me after experiencing some of the pains with missing SDK versions and the limitations of global.json, which doesn't allow wildcards).

Contributor Author (jskeet):

That SGTM. How about I split that task into a separate PR that just updates stuff, and we can get that working before coming back to benchmarks?

Contributor:

That would be great. Btw I think I have the ability to rebuild and push the docker image when needed.

Contributor:

Btw we also need to be careful about not breaking

dotnet pack -c Release src/Google.Protobuf.sln /p:SourceLinkCreate=true || goto :error

Contributor Author (jskeet):

Right. So we need to update the Windows image as well. Are you in a position to do that?

Contributor:

There's no simple way to "update the Windows image" on kokoro. There's a version of the dotnet SDK preinstalled, but I don't know what exactly it is. For the Windows build, we might need to install the dotnet SDK via a script. I'm actually surprised that we don't have a presubmit build for protobuf C# on Windows - I will try to work with the protobuf team to set one up.

Contributor (ObsidianMinor):

There's a PR already up for updating projects over at #5838

Contributor Author (jskeet):

@ObsidianMinor: Wonderful, thanks very much :)

@jtattermusch (Contributor):

I believe whenever the Dockerfile gets updated, the corresponding image needs to be built and pushed to dockerhub to enable testing (which hasn't been done, that's why all the linux tests are failing with an odd error message)

Thanks. Let's work out exactly what we want to do before doing that.

Here's more details btw
https://github.com/protocolbuffers/protobuf/blob/c3340b20a8dbf230ddbc68096ea1e8631317d528/kokoro/linux/dockerfile/push_testing_images.sh

@JamesNK (Contributor) commented Mar 6, 2019

Overall it looks good. One improvement is to add MemoryDiagnoser to the BenchmarkDotNet configuration.

https://benchmarkdotnet.org/articles/configs/configs.html
https://benchmarkdotnet.org/articles/configs/diagnosers.html

Here is a config that I think will suit this project:

using BenchmarkDotNet.Columns;
using BenchmarkDotNet.Configs;
using BenchmarkDotNet.Diagnosers;
using BenchmarkDotNet.Exporters;
using BenchmarkDotNet.Loggers;
using BenchmarkDotNet.Validators;

internal class BenchmarkDotNetConfig : ManualConfig
{
    public BenchmarkDotNetConfig()
    {
        Add(ConsoleLogger.Default);
        Add(MarkdownExporter.GitHub);

        Add(MemoryDiagnoser.Default);
        Add(StatisticColumn.OperationsPerSecond);
        Add(DefaultColumnProviders.Instance);

        Add(JitOptimizationsValidator.FailOnError);
    }
}

With it enabled:

|              Method |        Configuration |       Mean |     Error |    StdDev |        Op/s | Gen 0/1k Op | Gen 1/1k Op | Gen 2/1k Op | Allocated Memory/Op |
|-------------------- |--------------------- |-----------:|----------:|----------:|------------:|------------:|------------:|------------:|--------------------:|
|       WriteToStream | googl(...)roto3 [22] |   984.4 ns | 11.192 ns | 10.469 ns | 1,015,867.7 |      5.2910 |           - |           - |              4168 B |
|         ToByteArray | googl(...)roto3 [22] |   779.1 ns |  7.575 ns |  6.715 ns | 1,283,590.2 |      0.3757 |           - |           - |               296 B |
| ParseFromByteString | googl(...)roto3 [22] |   828.5 ns |  5.796 ns |  5.138 ns | 1,207,052.3 |      1.2503 |           - |           - |               984 B |
|     ParseFromStream | googl(...)roto3 [22] | 1,337.3 ns | 25.909 ns | 26.607 ns |   747,754.0 |      6.4754 |           - |           - |              5104 B |
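As a usage note, wiring the config in is a one-liner in the runner; a sketch, where the SerializationBenchmark class name is an assumption:

using BenchmarkDotNet.Running;

internal class Program
{
    private static void Main(string[] args)
    {
        // Runs the serialization benchmarks with the config above.
        BenchmarkRunner.Run<SerializationBenchmark>(new BenchmarkDotNetConfig());
    }
}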

@jskeet (Contributor Author) commented Mar 7, 2019

Done. Once we've got the environment sorted, we should be able to add this easily...

@jtattermusch (Contributor):

#5838 has been merged, let's rebase this PR

@jskeet jskeet force-pushed the csharp-benchmarks branch from 70be6db to d5d621a on March 12, 2019 at 07:19
@jskeet (Contributor Author) commented Mar 12, 2019

Have rebased to a single commit - hopefully that will get everything building. I'll add another commit to fix the Distcheck (which I still don't see the point of, in terms of C# files, but that's another story).

As a heads-up, later in the week I'm hoping to find time to convert the proto2-only benchmarks to proto3 as well, so there'll be more to add here - but it doesn't matter to me whether that happens as part of this PR or as a new one later.

@jtattermusch (Contributor) commented Mar 12, 2019

As a heads-up - the C# linux tests might be lying - they are being skipped (fix is in #5876, might be safer to wait until it's in):

Running tests.
+ dotnet test -c Release -f netcoreapp1.0 csharp/src/Google.Protobuf.Test/Google.Protobuf.Test.csproj
Microsoft (R) Build Engine version 15.9.20+g88f5fadfbe for .NET Core
Copyright (C) Microsoft Corporation. All rights reserved.
  Restore completed in 44.11 ms for /tmp/protobuf/protobuf/csharp/src/Google.Protobuf/Google.Protobuf.csproj.
  Restore completed in 41.4 ms for /tmp/protobuf/protobuf/csharp/src/Google.Protobuf.Test/Google.Protobuf.Test.csproj.
Skipping running test for project /tmp/protobuf/protobuf/csharp/src/Google.Protobuf.Test/Google.Protobuf.Test.csproj. To run tests with dotnet test add "<IsTestProject>true<IsTestProject>" property to project file.
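For reference, the property the tooling is asking for is a one-line addition to Google.Protobuf.Test.csproj; a sketch (the actual fix is in #5876):

<PropertyGroup>
  <!-- Tells `dotnet test` that this project contains tests to run. -->
  <IsTestProject>true</IsTestProject>
</PropertyGroup>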

@JamesNK (Contributor) left a comment

Looks good to me. Something to iterate on and add to in the future.

One suggestion: it would be a good idea to call the benchmarks during the CI build and run one iteration. We do this in the ASP.NET Core build to ensure the code changes haven't broken the benchmarks.
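One way to get that single-iteration smoke run (a sketch, not something this PR adds) is a separate config that uses BenchmarkDotNet's predefined Dry job, which launches once, warms up once, and runs each benchmark a single time:

using BenchmarkDotNet.Configs;
using BenchmarkDotNet.Jobs;

// Hypothetical CI smoke-test config: each benchmark executes one iteration,
// so broken benchmark code fails the build quickly without measuring anything meaningful.
internal class SmokeTestConfig : ManualConfig
{
    public SmokeTestConfig()
    {
        Add(Job.Dry);
    }
}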

@jskeet (Contributor Author) commented Mar 12, 2019

Yes, once we've got the basic code in - and potentially some more benchmarks - there are all kinds of things we might want to do in CI etc. (For Noda Time I keep a record of all the benchmarks so I can see progress over time.)

@jskeet (Contributor Author) commented Mar 13, 2019

Merging as the Ruby failure is unrelated to this change.

@jskeet jskeet merged commit 66fb3ce into protocolbuffers:master Mar 13, 2019
@jskeet jskeet deleted the csharp-benchmarks branch March 13, 2019 15:27
@jtattermusch (Contributor):

@jskeet I noticed that the "Windows Csharp Release" job now creates a nupkg for the benchmarks project - I guess that's unintended?

http://cnsviewer2/placer/prod/home/kokoro-dedicated/build_artifacts/prod/protobuf/github/master/windows/csharp_release/presubmit/17/20190312-020808

@jskeet (Contributor Author) commented Mar 14, 2019

I don't know much about that side of things - @jtattermusch probably knows more.

@jtattermusch (Contributor):

We are running this command to create the packages:

dotnet pack -c Release src/Google.Protobuf.sln /p:SourceLinkCreate=true || goto :error

So it looks like dotnet thinks that Google.Protobuf.Benchmarks should create a nuget package (despite it being an exe project).

@jskeet (Contributor Author) commented Mar 14, 2019

Ah, I can add <IsPackable>false</IsPackable> to the project file very easily. Will do that in another PR.
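For reference, that change is a single property in Google.Protobuf.Benchmarks.csproj; a sketch extending the PropertyGroup quoted earlier:

<PropertyGroup>
  <OutputType>Exe</OutputType>
  <TargetFramework>netcoreapp2.2</TargetFramework>
  <!-- Stops `dotnet pack` from producing a .nupkg for this exe-only benchmark project. -->
  <IsPackable>false</IsPackable>
</PropertyGroup>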
