Random crash in natsConn_processMsg #637

Closed
fran6co opened this issue Feb 16, 2023 · 2 comments · Fixed by #638

fran6co commented Feb 16, 2023

We are getting these sporadic crashes. They seem to be correlated with a subscription timing out, but looking at the code I can't figure out why it would crash when trying to lock the connection mutex. Any ideas what could cause this? It looks like a race condition between the read loop and the lifetime of the connection.

/usr/lib/x86_64-linux-gnu/libpthread.so.0(+0x14420) [0x7fedfaf84420]
/usr/lib/x86_64-linux-gnu/libpthread.so.0(pthread_mutex_trylock+0x17) [0x7fedfaf7b257]
./app(natsMutex_Lock+0x34) [0x564abde25964]
./app(natsConn_processMsg+0x1ae) [0x564abddf56ee]
./app(natsParser_Parse+0x419) [0x564abde19989]
./app(+0x2093e93) [0x564abddf4e93]
./app(+0x20c4ca4) [0x564abde25ca4]
/usr/lib/x86_64-linux-gnu/libpthread.so.0(+0x8609) [0x7fedfaf78609]
/usr/lib/x86_64-linux-gnu/libc.so.6(clone+0x43) [0x7fedf538f133]
@kozlovic
Member

I suspect that this is not the lock of the connection that is failing, but the lock of the subscription. I see that there could be a race where the subscription is destroyed by the application while processing an inbound message for this subscription, which could lead to that crash.
Just to make sure, which library version are you using? What type of subscription is it, and how and where are you destroying it?
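
To make the suspected race concrete, here is a minimal, self-contained sketch of one common way to close this kind of window: the read loop takes a reference on the subscription while it still holds the connection lock, so a concurrent removal only drops a reference instead of freeing memory the read loop is about to lock. The names `sub_t`, `conn_t`, `conn_process_msg`, `conn_remove_sub` and `run_demo` are hypothetical stand-ins, not nats.c types or functions, and this is not necessarily how the library's fix is implemented.

```c
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical stand-ins for illustration; NOT the nats.c structures. */
typedef struct {
    pthread_mutex_t mu;
    int refs;       /* protected by mu */
    int delivered;  /* protected by mu */
} sub_t;

typedef struct {
    pthread_mutex_t mu;
    sub_t *sub;     /* protected by mu; NULL once the subscription is removed */
} conn_t;

static sub_t *sub_create(void) {
    sub_t *s = calloc(1, sizeof(*s));
    if (s == NULL) return NULL;
    pthread_mutex_init(&s->mu, NULL);
    s->refs = 1;    /* reference held by the connection */
    return s;
}

/* Drop one reference; free the subscription when the last one goes away. */
static void sub_release(sub_t *s) {
    pthread_mutex_lock(&s->mu);
    int refs = --s->refs;
    pthread_mutex_unlock(&s->mu);
    if (refs == 0) {
        pthread_mutex_destroy(&s->mu);
        free(s);
    }
}

/* Read-loop side: look up AND retain under the connection lock. */
static int conn_process_msg(conn_t *c) {
    pthread_mutex_lock(&c->mu);
    sub_t *s = c->sub;
    if (s != NULL) {
        pthread_mutex_lock(&s->mu);
        s->refs++;
        pthread_mutex_unlock(&s->mu);
    }
    pthread_mutex_unlock(&c->mu);
    if (s == NULL) return 0;    /* subscription already removed: drop message */

    /* Safe even if the application removes the subscription right now. */
    pthread_mutex_lock(&s->mu);
    s->delivered++;
    pthread_mutex_unlock(&s->mu);
    sub_release(s);
    return 1;
}

/* Application side: detach from the connection, then drop its reference. */
static void conn_remove_sub(conn_t *c) {
    pthread_mutex_lock(&c->mu);
    sub_t *s = c->sub;
    c->sub = NULL;
    pthread_mutex_unlock(&c->mu);
    if (s != NULL) sub_release(s);
}

static void *reader(void *arg) {
    conn_t *c = arg;
    for (int i = 0; i < 100000; i++) conn_process_msg(c);
    return NULL;
}

/* Runs a reader thread while removing the subscription; returns 0 on success. */
static int run_demo(void) {
    conn_t c;
    pthread_t t;
    pthread_mutex_init(&c.mu, NULL);
    c.sub = sub_create();
    if (c.sub == NULL) return -1;
    if (pthread_create(&t, NULL, reader, &c) != 0) return -1;
    conn_remove_sub(&c);        /* races with the reader on purpose */
    pthread_join(t, NULL);
    int gone = (conn_process_msg(&c) == 0);
    pthread_mutex_destroy(&c.mu);
    return gone ? 0 : -1;
}
```

Calling `run_demo()` from a `main` and building with `-pthread` is enough to exercise it. Without the retain step inside `conn_process_msg`, the reader could lock a mutex that lives in already-freed memory, which would be consistent with a crash inside `pthread_mutex_trylock` like the one in the trace above.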

kozlovic added a commit that referenced this issue Feb 16, 2023
A race could cause the read loop to crash when processing a message
for a subscription that is being removed in another thread.

Resolves #637

Signed-off-by: Ivan Kozlovic

fran6co commented Feb 16, 2023

We are using version 3.5.0 with natsConnection_SubscribeTimeout. We call natsSubscription_Destroy inside the callback when the message is NULL.
