help request: upgrade to 2.14.0 after the route matching error, adjust the priority also did not solve, back to 2.13.1 on the normal #7136

Closed
bingoku opened this issue May 26, 2022 · 26 comments
Labels: checking (check first if this issue occurred), stale

Comments

@bingoku

bingoku commented May 26, 2022

Description

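After upgrading to 2.14.0, route matching became chaotic. Adjusting priorities did not help, and rolling back to 2.13.1 made everything work again.

I don't know how to investigate this problem. Please help.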

Environment

apisix version:apache/apisix:2.14.0-centos

@tzssangglass tzssangglass changed the title help request: 升级到 2.14.0后路由匹配错乱,调整优先级也无解,回退到 2.13.1就正常了 help request: upgrade to 2.14.0 after the route matching error, adjust the priority also did not solve, back to 2.13.1 on the normal May 26, 2022
@tzssangglass
Member

Your description does not provide valid information and I need detailed reproduction steps to be able to reproduce this issue locally.

@bingoku
Author

bingoku commented May 26, 2022

image

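I didn't actually change anything; this image shows the current route plan. After upgrading to 2.14 it became chaotic, and rolling back to 2.13 makes it work again. As for the reproduction steps you asked for, I'm afraid I don't know how to show them to you.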

@lemonrains

I also get a similar error after upgrading to 2.14.0, and everything returns to normal if I downgrade to 2.13.1.

The error occurs on a route that has the openid-connect plugin enabled (based on Keycloak). It always redirects to the URI /auth, for example xxxx.xx.com/auth.

@tzssangglass tzssangglass self-assigned this May 26, 2022
@tzssangglass tzssangglass added the checking check first if this issue occurred label May 26, 2022
@tzssangglass
Member

I didn't actually change anything; this image shows the current route plan. After upgrading to 2.14 it became chaotic.

Is this image from your own dashboard? I'd like to know what you mean by chaos.

@tzssangglass
Member

The error occurs on a route that has the openid-connect plugin enabled (based on Keycloak). It always redirects to the URI /auth, for example xxxx.xx.com/auth.

Your problem is not related to this issue, please submit a new issue and give the steps to reproduce it.

@tzssangglass
Member

After upgrading to 2.14 it became chaotic, and rolling back to 2.13 makes it work again. As for the reproduction steps you asked for, I'm afraid I don't know how to show them to you.

As far as I know, the APISIX Admin API was not significantly changed in 2.14.0.

I need to know:

  1. What exactly do you mean by confusion? Does the Admin API return inaccurate data, or something else?

  2. The reproduction steps: a clear description of how the problem occurred, so that I can follow the same steps (what was done on which version of APISIX, what data was used, what the data looked like, and a step-by-step, objective, and detailed description of your actions and the observed behavior).
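
For readers preparing such a report, here is a minimal sketch of what "the data used" could look like. It assumes two routes that differ only by host and renders them as Admin API `curl` commands. The route IDs, hostnames, upstream addresses, the admin address `http://127.0.0.1:9080`, and the `<admin-key>` placeholder are all hypothetical and should be replaced with the real configuration.

```python
import json

# Hypothetical values for illustration; replace with the real deployment's.
ADMIN = "http://127.0.0.1:9080"   # assumed Admin API address
ADMIN_KEY = "<admin-key>"         # placeholder, not a real key


def route(uri, host, priority, node):
    """Build a route body with uri, host, priority, and one upstream node."""
    return {
        "uri": uri,
        "host": host,
        "priority": priority,
        "upstream": {"type": "roundrobin", "nodes": {node: 1}},
    }


def as_curl(route_id, body):
    """Render the PUT request that would create the route, as a curl command."""
    return (
        f"curl {ADMIN}/apisix/admin/routes/{route_id} "
        f"-H 'X-API-KEY: {ADMIN_KEY}' -X PUT -d '{json.dumps(body)}'"
    )


# Two made-up routes that differ only by host.
ROUTES = {
    "1": route("/*", "a.example.com", 0, "127.0.0.1:1980"),
    "2": route("/*", "b.example.com", 0, "127.0.0.1:1981"),
}

if __name__ == "__main__":
    for route_id, body in ROUTES.items():
        print(as_curl(route_id, body))
```

Posting the printed commands together with the request that was sent, the route it was expected to hit, and the route it actually hit would give maintainers something they can replay on both versions.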

@tokers
Contributor

tokers commented May 27, 2022

Also, please describe the route-matching disorder in more detail, so that we can troubleshoot why a request that should hit route A hits route B instead.

@lemonrains

The error occurs on a route that has the openid-connect plugin enabled (based on Keycloak). It always redirects to the URI /auth, for example xxxx.xx.com/auth.

Your problem is not related to this issue, please submit a new issue and give the steps to reproduce it.

Thank you for your reply. In the end I configured the response-rewrite plugin to redirect to /. Maybe it's my mistake; I'll check later whether it's related to the new version.

@bingoku
Author

bingoku commented May 27, 2022

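As you can see in the image above, in my route plan A should reach B, but requests end up at C, D, or E. It feels like the host is not being matched.
I tried specifying the host on the service, the host on the route, and their priorities, but none of this resolved the chaos in 2.14.

After all of the above, I rolled back to 2.13.1 and everything returned to normal.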

@tokers
Contributor

tokers commented May 27, 2022

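As you can see in the image above, in my route plan A should reach B, but requests end up at C, D, or E. It feels like the host is not being matched. I tried specifying the host on the service, the host on the route, and their priorities, but none of this resolved the chaos in 2.14.

After all of the above, I rolled back to 2.13.1 and everything returned to normal.

This is useless for troubleshooting. Could you show us some configs so that we can reproduce it?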

@bingoku
Author

bingoku commented May 30, 2022

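As you can see in the image above, in my route plan A should reach B, but requests end up at C, D, or E. It feels like the host is not being matched. I tried specifying the host on the service, the host on the route, and their priorities, but none of this resolved the chaos in 2.14.
After all of the above, I rolled back to 2.13.1 and everything returned to normal.

This is useless for troubleshooting. Could you show us some configs so that we can reproduce it?

How can I provide more information? Is there another way to communicate, such as WeChat? On my side this is 100% reproducible. Otherwise, you can wait for feedback from other, larger users. We are a small company with limited resources.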

@badx

badx commented Jun 17, 2022

I also get a similar error after upgrading to 2.14.1, and everything returns to normal if I downgrade to 2.13.1.
I checked the config in etcd and it looks correct.
I deleted all the configuration and added a new route, but the problem still exists.
I suspect the problem is caused by the upgrade.

@tokers
Contributor

tokers commented Jun 19, 2022

@spacewander @tzssangglass Do you have any idea about this?

@tzssangglass
Member

@spacewander @tzssangglass Do you have any idea about this?

I've gathered similar information from other channels, but there is no reproducible demo, and I've noticed one detail: one of the steps is to migrate etcd data.

@fatpa

fatpa commented Jul 13, 2022

We also get a similar error after upgrading from 2.11 to 2.14.1.
According to the error logs, some queries matched the wrong route during a high-concurrency simulation (about 600 qps) and got status codes such as 302 / 400 / 404.
We also ran tests on different versions: 2.13.1 works well, but 2.13.2 does not.

We did not migrate etcd data; we only enabled etcd TLS.

@tzssangglass
Member

Hi @fatpa, can you provide complete, step-by-step reproduction steps? This is important for us to solve this problem.

@fatpa

fatpa commented Jul 17, 2022

Would you like to talk on Slack or WeChat? Our production environment is a bit complex, and I can't show much here.

In my case, we did the following:

  1. Enabled etcd TLS on version 2.11; this did not cause any errors.
  2. Removed the APISIX node from the Aliyun SLB.
  3. Upgraded APISIX from 2.11 to 2.14.1.
  4. Ran some curl tests against this node; no errors.
  5. Added this node back to the Aliyun SLB.
  6. Some 302 / 400 / 404 status codes appeared, for maybe half of all queries (600 qps). At the same time, the other APISIX nodes on version 2.11 behind the same Aliyun SLB worked fine, with no errors.
  7. Downgraded to 2.13.2; still got the same 302 / 400 / 404 status codes.
  8. Downgraded again to 2.13.1; everything was fine.

I also found that it may only happen under high concurrency.
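
Since these errors only showed up under load, a small concurrent probe may help others check whether they see the same pattern. The following is a rough sketch, not a benchmark tool: it fires requests from a thread pool and counts the status codes. It assumes plain HTTP; the node address, `Host` header, and path in the usage note are hypothetical.

```python
import http.client
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def make_sender(addr, port, path, host_header):
    """Return a function that sends one GET and returns the status code."""
    def send():
        # http.client does not follow redirects, so a 302 is reported as 302.
        conn = http.client.HTTPConnection(addr, port, timeout=5)
        try:
            conn.request("GET", path, headers={"Host": host_header})
            resp = conn.getresponse()
            resp.read()  # drain the body before closing the connection
            return resp.status
        finally:
            conn.close()
    return send


def tally_statuses(send, total, workers=50):
    """Call send() `total` times on a thread pool and count the results."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = [pool.submit(send) for _ in range(total)]
        # result() re-raises any exception from send(), e.g. a refused connection.
        return Counter(f.result() for f in futures)
```

For example, `tally_statuses(make_sender("127.0.0.1", 9080, "/", "a.example.com"), total=1000)` returns a `Counter` keyed by status code. The sketch does not pace requests to a fixed rate such as 600 qps; it only keeps up to `workers` requests in flight at once.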

@tzssangglass
Member

ok, I will check this issue and #7449 together.

@tzssangglass
Member

from: #7449 (comment)

Can you give the content and header of the response when a 302 / 400 / 404 status code appears?
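
One way to collect this is to send the failing request without following redirects and dump everything that comes back. A minimal sketch, assuming plain HTTP; the node address and `Host` header in the usage note are hypothetical:

```python
import http.client


def fetch(addr, port, path, host_header):
    """Send one GET without following redirects; return (status, headers, body)."""
    conn = http.client.HTTPConnection(addr, port, timeout=5)
    try:
        conn.request("GET", path, headers={"Host": host_header})
        resp = conn.getresponse()
        return resp.status, resp.getheaders(), resp.read()
    finally:
        conn.close()


def format_report(status, headers, body, limit=200):
    """Render the status, the headers, and the first `limit` body bytes as text."""
    lines = [f"status: {status}"]
    lines += [f"{name}: {value}" for name, value in headers]
    lines.append("body: " + body[:limit].decode("utf-8", errors="replace"))
    return "\n".join(lines)
```

Calling `print(format_report(*fetch("127.0.0.1", 9080, "/", "a.example.com")))` for a request that returns 302 / 400 / 404 produces text that can be pasted into the issue, including the `Location` header of a redirect if one is present.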

@tzssangglass
Member

  • downgrade the version back to 2.13.2, still getting same 302 / 400 / 404 status code

Please also provide the error logs related to it.

@iori0758

iori0758 commented Sep 8, 2022

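The wrong route matching did happen on my server. It may have been caused by too many etcd connections. I have now changed the etcd version to 3.4.20, and it has been normal so far.
etcd-io/etcd#14185
etcd-io/etcd#14081
However, 404 status codes still appear sometimes, and only under high concurrency.

apisix:2.14.1
etcd:3.4.20

I also ran into the route chaos a few days ago. My earlier investigation suggested that the number of etcd connections exceeded 250, which made curl requests with TLS certificates slow, so I changed the etcd version to 3.4.20. After one week it has not recurred.
However, I still see occasional 404s: under high concurrency, some JS and HTML requests return 404. APISIX shows the request as sent to the backend, but the backend nginx has no log entry for it, as if the request went somewhere else.
There is one more scenario: when making changes in apisix-dashboard, requests occasionally return 403.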

@tzssangglass
Member

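I also ran into the route chaos a few days ago. My earlier investigation suggested that the number of etcd connections exceeded 250, which made curl requests with TLS certificates slow, so I changed the etcd version to 3.4.20. After one week it has not recurred.
However, I still see occasional 404s: under high concurrency, some JS and HTML requests return 404. APISIX shows the request as sent to the backend, but the backend nginx has no log entry for it, as if the request went somewhere else.
There is one more scenario: when making changes in apisix-dashboard, requests occasionally return 403.

There is nothing we can do with only a description of the phenomenon and no steps that reproduce it reliably.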

@enuoCM

enuoCM commented Nov 9, 2022

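As you can see in the image above, in my route plan A should reach B, but requests end up at C, D, or E. It feels like the host is not being matched. I tried specifying the host on the service, the host on the route, and their priorities, but none of this resolved the chaos in 2.14.

After all of the above, I rolled back to 2.13.1 and everything returned to normal.

Hi, do your configured upstream services include both HTTP and HTTPS? We use version 2.14.1, and when the upstreams include both HTTP and HTTPS, the route mismatching problem reproduces 100% of the time.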
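
To check whether a deployment matches this condition, one could scan the route definitions for mixed upstream schemes. The sketch below assumes route bodies shaped like Admin API route objects with an inline `upstream`; routes that reference an upstream or service by ID would need that object resolved first, which is not shown. The sample routes are made up.

```python
def upstream_schemes(routes):
    """Return the set of schemes used by inline upstreams in the given routes."""
    schemes = set()
    for r in routes:
        upstream = r.get("upstream")
        if upstream is not None:
            # A missing "scheme" is treated as "http" (assumed default).
            schemes.add(upstream.get("scheme", "http"))
    return schemes


def mixes_http_and_https(routes):
    """True if the routes use both http and https upstreams."""
    return {"http", "https"} <= upstream_schemes(routes)


# Made-up sample: one plain-HTTP upstream and one HTTPS upstream.
SAMPLE = [
    {"uri": "/a/*", "upstream": {"type": "roundrobin",
                                 "nodes": {"127.0.0.1:1980": 1}}},
    {"uri": "/b/*", "upstream": {"type": "roundrobin", "scheme": "https",
                                 "nodes": {"127.0.0.1:1983": 1}}},
]
```

Here `mixes_http_and_https(SAMPLE)` is true, which is the situation reported as reproducing the mismatch on 2.14.1.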

@tzssangglass
Member

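Hi, do your configured upstream services include both HTTP and HTTPS? We use version 2.14.1, and when the upstreams include both HTTP and HTTPS, the route mismatching problem reproduces 100% of the time.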

maybe fixed by #7466

@github-actions

This issue has been marked as stale due to 350 days of inactivity. It will be closed in 2 weeks if no further activity occurs. If this issue is still relevant, please simply write any comment. Even if closed, you can still revive the issue at any time or discuss it on the [email protected] list. Thank you for your contributions.

@github-actions github-actions bot added the stale label Oct 26, 2023

This issue has been closed due to lack of activity. If you think that is incorrect, or the issue requires additional review, you can revive the issue at any time.

@github-actions github-actions bot closed this as not planned Nov 10, 2023