Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: add new lua-resty-events as events implementation #10550

Merged
merged 46 commits into from
Dec 14, 2023

Conversation

bzp2010
Copy link
Contributor

@bzp2010 bzp2010 commented Nov 27, 2023

Description

Provide lua-resty-events as a new implementation of the event system. It experimentally provides better performance and efficiency.
fix #10431

Checklist

  • I have explained the need for this PR and the problem it solves
  • I have explained the changes or the new features added to this PR
  • I have added tests corresponding to this change
  • I have updated the documentation to reflect this change
  • I have verified that this change is backward compatible (If not, please discuss on the APISIX mailing list first)

@bzp2010 bzp2010 force-pushed the feat-new-events-queue branch from f366245 to 7aea90c Compare December 4, 2023 03:15
@bzp2010 bzp2010 marked this pull request as ready for review December 8, 2023 01:59
ci/centos7-ci.sh Outdated
@@ -41,11 +41,15 @@ install_dependencies() {
# install openresty to make apisix's rpm test work
yum install -y yum-utils && yum-config-manager --add-repo https://openresty.org/package/centos/openresty.repo
wget "https://raw.githubusercontent.com/api7/apisix-build-tools/apisix-runtime/${APISIX_RUNTIME}/build-apisix-runtime-debug-centos7.sh"
wget "https://raw.githubusercontent.com/api7/apisix-build-tools/apisix-runtime/${APISIX_RUNTIME}/build-apisix-runtime.sh"
#wget "https://raw.githubusercontent.com/api7/apisix-build-tools/apisix-runtime/${APISIX_RUNTIME}/build-apisix-runtime.sh"
wget https://raw.githubusercontent.com/Sn0rt/apisix-build-tools/guohao/update-lua-resty-events/build-apisix-runtime.sh
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

pls change this link

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

chmod +x build-apisix-runtime-debug-centos7.sh
chmod +x build-apisix-runtime.sh
./build-apisix-runtime-debug-centos7.sh

# patch lua-resty-events
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why update this ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is impossible to execute a reload without a transient interruption because the worker that was handling the unix socket connection is being restarted, and the other workers will not be able to connect to the broker over the unix socket again until the new worker that replaces it is up and running, so the logs will be some error level logs about connection closed, which will interrupt our tests, so we'll have to change the code here to warn level.

There is no need to apply this patch in production as it has no impact other than causing some error logs, when the worker is available, the worker and broker will reconnect, and event push will be available again.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok.

ci/redhat-ci.sh Outdated
@@ -94,7 +100,7 @@ run_case() {
make init
set_coredns
# run test cases
FLUSH_ETCD=1 prove --timer -Itest-nginx/lib -I./ -r ${TEST_FILE_SUB_DIR} | tee /tmp/test.result
FLUSH_ETCD=1 TEST_EVENTS_MODULE=lua-resty-worker-events prove --timer -Itest-nginx/lib -I./ -r ${TEST_FILE_SUB_DIR} | tee /tmp/test.result
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don’t quite understand this part. I see that the code already uses lua-resty-event as the default event library. Why do I need to specify this variable?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

First, this is the same logic executed in the other test scripts, with the difference that under ubuntu we are executing the tests for both event modules at the same time, where an environment variable controlled by GitHub workflow is used as input.

You can see the header of some of the test cases, some of the tests are related to the events library and because of the implementation of the library they have slightly different behaviors, such as proactive health checks, which use different test cases under different events modules.

If an events module is not explicitly configured, it will default to the value in config-default, which is lua-resty-events, so simultaneous testing is not possible.

But it looks like some of the logic can be ported to APISIX.pm, and I'll be making that change.

Copy link
Contributor Author

@bzp2010 bzp2010 Dec 8, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Moved to lua-resty-events as the default test events module

t/APISIX.pm Outdated
@@ -413,6 +413,11 @@ _EOC_

require "resty.core"

local events_sock_path = "$apisix_home/t/servroot/logs/stream_worker_events.sock"
if os.getenv("TEST_NGINX_USE_HUP") ~= "1" then
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why does the test case using the HUP signal need to remove the sock, and other cases do not need to remove the unix sock?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Theoretically they will process properly, I can try. Maybe it's not necessary.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed, it can work.

@@ -153,8 +156,6 @@ qr/(create new checker|try to release checker): table/
create new checker: table
try to release checker: table
create new checker: table
--- no_error_log
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why should it be deleted?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the cost of the lua-resty-events library, whose events are asynchronous and delayed, so when the health checker is first created, the events are not communicated in a timely manner, so these error logs must appear.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

They should have no impact because the main use of the test is to test that when the service discovers a change in the node provided, the new checker is created correctly rather than as a result of a health check.

If you are concerned about this, we will have to follow the example of the passive health check tests by listing them as separate test files and skipping them when they are not needed.

@bzp2010 bzp2010 requested a review from Sn0rt December 8, 2023 09:53
@moonming moonming merged commit 553ac42 into apache:master Dec 14, 2023
41 of 42 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

bug: the worker run into a dead end in events.lua
4 participants