Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Performance issues on Bref PHP 8.1 and Later #1528

Open
yh1224 opened this issue May 10, 2023 · 29 comments
Open

Performance issues on Bref PHP 8.1 and Later #1528

yh1224 opened this issue May 10, 2023 · 29 comments

Comments

@yh1224
Copy link

yh1224 commented May 10, 2023

I am experiencing a significant performance issue with the AWS SDK for PHP when running on Bref PHP 8.1 and subsequent versions. Specifically, I have observed that accessing AWS with the SDK is approximately three times slower as compared to Bref PHP 8.0.

To further analyze this, I conducted the same processes on an EC2 instance and found no noticeable performance difference between them. The version of the AWS SDK for PHP I am using is 3.269.9.

Moreover, I conducted a test measuring the time it took to execute 100 queries on DynamoDB. Here are the results:

Environment Specification Elapsed
EC2 PHP 8.0.28 c5.large 7.5 sec
EC2 PHP 8.1.18 c5.large 7.5 sec
EC2 PHP 8.2.5 c5.large 7.5 sec
Bref PHP 8.0.28 (php-80:44) Memory: 1024 MB 4 sec
Bref PHP 8.1.18 (php-81:43) Memory: 1024 MB 11 sec
Bref PHP 8.2.5 (php-82:33) Memory: 1024 MB 11 sec
@yh1224 yh1224 changed the title Subject: Performance issues on Bref PHP 8.1 and Later Performance issues on Bref PHP 8.1 and Later May 10, 2023
@mnapoli
Copy link
Member

mnapoli commented May 10, 2023

Thanks for the report and the details!

The one thing that jumps to mind would be OpenSSL 3, that was applied to PHP 8.1 and 8.2: brefphp/aws-lambda-layers#74

A quick search yield these issues for example:

Would you be able to run the same test with Bref 2.0.2 (here are the layer versions: https://runtimes.bref.sh/?region=us-east-1&version=2.0.2)? (this is before OpenSSL 3)

cc @GrahamCampbell if that rings a bell to you?

@yh1224
Copy link
Author

yh1224 commented May 10, 2023

@mnapoli

I tried using the Layer compatible with Bref 2.0.2 and there was no degradation in performance as shown below.

Environment Layer Elapsed
Bref 2.0.2 / PHP 8.0.28 arn:aws:lambda:ap-northeast-1:534081306603:layer:php-80:42 4 sec
Bref 2.0.2 / PHP 8.1.17 arn:aws:lambda:ap-northeast-1:534081306603:layer:php-81:41 4 sec
Bref 2.0.2 / PHP 8.2.4 arn:aws:lambda:ap-northeast-1:534081306603:layer:php-82:31 4 sec
Bref 2.0.4 / PHP 8.0.28 arn:aws:lambda:ap-northeast-1:534081306603:layer:php-80:44 4 sec
Bref 2.0.4 / PHP 8.1.18 arn:aws:lambda:ap-northeast-1:534081306603:layer:php-81:43 11 sec
Bref 2.0.4 / PHP 8.2.5 arn:aws:lambda:ap-northeast-1:534081306603:layer:php-82:33 11 sec

I would like to use this older version for now. Thank you very much.

@GrahamCampbell
Copy link
Contributor

Interesting, I have not noticed any excess latency on my workloads (though my workloads are super spiky, so it's hard to notice), and I have some that send 1000s of requests to dynamo using the symfony curl http client with a parallelism of 64 (a curl parameter). I am reluctant to suggest that we rollback to openssl 1.1 because of the fact that it will be EOL very soon, leaving us without any security patches. I wonder if we should do a build with openssl 3.1, and see if it also have the same performance issues for @yh1224.

@GrahamCampbell
Copy link
Contributor

Moreover, I conducted a test measuring the time it took to execute 100 queries on DynamoDB.

Are you re-using the client? Can you share a minimal re-producer please?

@GrahamCampbell
Copy link
Contributor

Is the same issue present on arm64 - that is, is this just an issue for x86_64?

@yh1224
Copy link
Author

yh1224 commented May 10, 2023

@GrahamCampbell

Are you re-using the client? Can you share a minimal re-producer please?

I create the DynamoDbClient each time.

Is the same issue present on arm64 - that is, is this just an issue for x86_64?

Same with arm64. I feel that it is a little more slower than x86_64.

Environment Layer Elapsed
arm64 / Bref 2.0.2 / PHP 8.0.28 arn:aws:lambda:ap-northeast-1:534081306603:layer:arm-php-80:43 4s
arm64 / Bref 2.0.2 / PHP 8.1.17 arn:aws:lambda:ap-northeast-1:534081306603:layer:arm-php-81:22 4s
arm64 / Bref 2.0.2 / PHP 8.2.4 arn:aws:lambda:ap-northeast-1:534081306603:layer:arm-php-82:19 4s
arm64 / Bref 2.0.4 / PHP 8.0.28 arn:aws:lambda:ap-northeast-1:534081306603:layer:arm-php-80:45 4s
arm64 / Bref 2.0.4 / PHP 8.1.18 arn:aws:lambda:ap-northeast-1:534081306603:layer:arm-php-81:24 13s
arm64 / Bref 2.0.4 / PHP 8.2.5 arn:aws:lambda:ap-northeast-1:534081306603:layer:arm-php-82:21 13s

Here is the CDK project to check.
https://github.com/yh1224/aws-cdk-lambda-bref-test

The source is below.
https://github.com/yh1224/aws-cdk-lambda-bref-test/tree/main/src/TestFunc

@GrahamCampbell
Copy link
Contributor

GrahamCampbell commented May 10, 2023

I create the DynamoDbClient each time.

This is probably the issue. You should avoid doing that. I think what's happening here is that the TLS handshake has to happen over and over. That would also explain why I haven't noticed this. Not reusing connections is very expensive.

@GrahamCampbell
Copy link
Contributor

openssl/openssl#17064

I think this issue is unrelated. There is only one thread here.

@mnapoli
Copy link
Member

mnapoli commented May 10, 2023

This is probably the issue. You should avoid doing that. I think what's happening here is that the TLS handshake has to happen over and over. That would also explain why I haven't noticed this. Not reusing connections is very expensive.

I would argue that:

  • most PHP apps do this, so we have to consider it (it's not an edge case)
  • this lets us identify the underlying issue

Fact is that OpenSSL 3 has an issue. Reusing connections is just a workaround, and it's not good for most PHP apps.

@GrahamCampbell
Copy link
Contributor

I strongly disagree that it is a workaround. Setting up a new TLS connection is an expensive thing to do, and application developers should always reuse connections, in the same way that it is standard practice to reuse database connections.

@GrahamCampbell
Copy link
Contributor

Most PHP apps that I have seen do re-use connections. Anyone using the Laravel Guzzle wrapper will have connections re-used. Similarly, anyone using any Laravel native AWS implementations like SES or SQS will also have connections re-used.

@GrahamCampbell
Copy link
Contributor

Not re-using connections and not cleaning up can lead to serious issues like exceeding the max open sockets limit on Lambda (I have done this before :trollface:).

@yh1224
Copy link
Author

yh1224 commented May 10, 2023

@GrahamCampbell

I understand that this is not an issue with PHP/Bref, but is due to OpenSSL 3.

I'm aware that reusing the client is inefficient, but due to the legacy code of our current production being migrated to Lambda, it seems we can't address it immediately. As the difference in execution time directly affects the cost of using Lambda, I still want to hold out with the old version at this time.

If the performance degradation with OpenSSL is due to some "problem", I would like to wait for its resolution.

Thank you very much.

@mnapoli
Copy link
Member

mnapoli commented May 11, 2023

@GrahamCampbell maybe we are not talking about the same thing. Maybe you are talking about reusing the client in the same request? I am talking about reusing the client between Lambda invocations.

Most of the HTTP requests served with Bref are with FPM. That means each request has to do the SSL connection again (because the client cannot be kept alive). This is why I'm saying OpenSSL 3 introduces a performance regression that impacts a majority of Bref users.

@GrahamCampbell
Copy link
Contributor

Maybe you are talking about reusing the client in the same request?

Yes.

I am talking about reusing the client between Lambda invocations.

I see. I also do that, because I don't use php-fpm, but I appreciate many people don't.

This is why I'm saying OpenSSL 3 introduces a performance regression that impacts a majority of Bref users.

That is not so clear. We'd have to do some more investigation, maybe profiling with blackfire.

@M1ke
Copy link

M1ke commented Dec 12, 2023

In case it's useful, here's a Linux (Ubuntu) serverful example of this openssl 1.1 vs 3 issue when it comes to performance of PHP scripts making web requests: php/php-src#10753 (comment)

@mnapoli
Copy link
Member

mnapoli commented Dec 13, 2023

@M1ke thanks for sharing. We have hopes that OpenSSL 3.2 might fix things, but that needs to be tested.

@atrope
Copy link
Contributor

atrope commented Jan 29, 2024

Hey @mnapoli I've just found out that the performance issues i told you in the other issue is because we are making https calls.

When i can make the https in plain http the performance is the same(or better) as before.
Unfortunately there are some calls that we can't perform in http.

How can we test the new OpenSSL3.2? Maybe it will fix those issues. I was seeing on average 400ms with https calls and with plain http it is now 120ms on average.

With the fix i have hopes to make it under 100ms as it was before.

Thanks

@GrahamCampbell
Copy link
Contributor

We are waiting for the next release of postgres 15.x. The current release does not support OpenSSL 3.2.

@atrope
Copy link
Contributor

atrope commented Jan 29, 2024

@GrahamCampbell is it possible to generate a layer without postgres if we don't use it? Maybe we could have an specific version without postgres because 4x the pricing for using https is a lot.

@atrope
Copy link
Contributor

atrope commented Jan 29, 2024

Or do you know which version of php-82-fpm has openssl 3.2? That way i can test in a stressfull env anc check if this is the actual problerm

@GrahamCampbell
Copy link
Contributor

You can compile a layer yourself. Otherwise, it'll probably be around a month from now when OpenSSL 3.2.1 gets put into Bref, in combination with Postgres 15.6. It'll likely be PHP 8.1, 8.2 and 8.3. PHP 8.0 will remain on OpenSSL 1.1.1.

@atrope
Copy link
Contributor

atrope commented Jan 29, 2024

Okay, i've now tested in 1M requests the version 31 of the layer which has OpenSSL1.1.1
- arn:aws:lambda:us-east-1:534081306603:layer:php-82-fpm:31

Only by changing the newest one (58) to the old one I see the performance increase of 50%.
That said I really think we should not keep the OpenSSL3 in the production version of the Layer.

As we all know a function running on average in 140ms is 100% more expensive than one running at 70ms.

Screenshot 2024-01-29 at 16 54 56

@GrahamCampbell
Copy link
Contributor

The problem with OpenSSL 1.1.1 is that it has unpatched security vulnerabilities.

@mnapoli
Copy link
Member

mnapoli commented Jan 30, 2024

Hey @atrope! I've seen your other message, thanks for bumping this issue. Yeah, that is tempting to go back to OpenSSL 1.1.1 😞

it'll probably be around a month from now when OpenSSL 3.2.1 gets put into Bref, in combination with Postgres 15.6

Sounds like the best course of action right now unfortunately

@GrahamCampbell
Copy link
Contributor

And that time has arrived. 🎉

  1. postgres 16.2 aws-lambda-layers#161
  2. OpenSSL 3.2.1 aws-lambda-layers#162

Probably we can wait till next week to release this stuff (assuming those PRs are accepted) so that we can also include cURL 8.6.1 and the next round of PHP patch releases, both are due to land next week.

@mnapoli
Copy link
Member

mnapoli commented Feb 16, 2024

It's out in Bref 2.1.15 https://github.com/brefphp/bref/releases/tag/2.1.15 🎉

I'd love to hear if things are working better with this new version (which uses OpenSSL 3.2).

@atrope
Copy link
Contributor

atrope commented Feb 27, 2024

Hi!

Just tested it out the latest version.

We had an average of 75ms with layer 31 and with layer 61(php-82-fpm) it increased to ~85ms. Tested in the last 15minutes period.

Will leave it for the next days and check how it will perform.

@GrahamCampbell
Copy link
Contributor

What about the performance of layer 60 vs 61? Also, how does P99.9 compare - for me this was the big improvement, assuming you are reusing connections between invokes.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

5 participants