Increase reference to session when handling SIP calls (see #2188) #2216
I was able to test out this patch and it did prevent the crashing, however I found two memory leaks. The first is line 4608 in The second issue I found is when the browser disconnects from Janus while a call is in progress. A Let me know if you need additional logs for these leaks or any other info. Thanks for the help with this issue.
Good catch! I'll fix that.
That's what I feared would happen, in my comment on the issue page: the loop being stopped before all events are shipped. Not sure what the proper way to handle this will be, since it will probably mean adding some complex state management. I'll have to think about it.
@zaltar can you check this update? I tried to use the I tested this briefly and it seems to be working as expected, but you may want to stress it a bit more, especially with helpers involved. On helpers, I wonder if the same issue that is happening with calls can happen with subscriptions as well (e.g., if the helper sent a subscribe or originated a transfer), but we can worry about that another day.
Unfortunately the crashes are now back. From what I can tell,
I guess one way to address that is by only unreffing if the session was in the list we're removing it from. That should solve the crash. Not sure if it will introduce a leak, though: maybe we shouldn't set
@zaltar pushed a new tentative fix.
I think we're very close to a fix. Locally I just had to modify one thing that AddressSanitizer caught. In Thanks for helping with this!
Something like this?
Ok, done, thanks for the feedback.
Since @zaltar mentioned that with that fix, he couldn't replicate crashes or leaks anymore, I'll merge and close the original issue. Thanks for the help!
This is supposed to help with the occasional crashes happening in the SIP plugin when using helpers, described in #2188.

The root cause there is that the `janus_sip_session` associated with a helper may be freed (helper closed) while it still "lives" in the Sofia SIP loop thread, e.g., because there are still events associated with it. This happens when a call is originated via `nua_invite` on a helper, or when we pass an incoming call to a helper and so bind it to the helper session using `nua_bind_handle`: this causes all events associated with that call to include a pointer to that session, as the `hmagic`. What seems to be happening is that the call still has some state that can trigger events: since the Sofia loop belongs to the master session, it still runs, and so can still ship those events; trying to dereference the now-freed helper session causes the crash.

This patch tries to address that by using reference counters to ensure the session is not destroyed until the calls it handles are gone. Specifically, we add a new reference when starting a new call, and when getting a new one: when helpers are involved, we make sure it's their reference that is updated. The reference is then only decreased when we get the `nua_i_terminated` callback, which means the call is definitely over as far as the Sofia SIP stack is concerned.

I tested briefly and it seems to be working. That said, I didn't test intensively: no time for that. As such, if you use the SIP plugin, make sure to test this properly. Notice that you should not only make sure it doesn't crash where it did before, but ideally also check whether this introduces leaks: to do that, please build with libasan support, which will print a summary of the leaked memory when you shut Janus down cleanly. You can also uncomment `REFCOUNT_DEBUG` in `refcount.h` to more specifically track if there is any reference still dangling when Janus is shut down cleanly.

The sooner you can provide me feedback on that (and why not, fixes if you find anything broken), the sooner we can fix this.
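
For readers less familiar with the pattern, here is a minimal, self-contained sketch of the reference-counting idea described above. It is not the code from this patch and does not use the Janus refcount helpers or the Sofia SIP API: `demo_session`, `demo_call_started`, `demo_helper_closed` and `demo_call_terminated` are hypothetical names made up for illustration, and the `freed_flag` field exists only so the example can observe when the memory is released.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for a helper session; NOT the real janus_sip_session. */
typedef struct demo_session {
	atomic_int refs;   /* number of outstanding references */
	bool *freed_flag;  /* illustration only: set to true when the session is freed */
} demo_session;

/* Create a session holding one reference, owned by whoever created it. */
demo_session *demo_session_new(bool *freed_flag) {
	demo_session *s = malloc(sizeof(*s));
	if(s == NULL)
		return NULL;
	atomic_init(&s->refs, 1);
	s->freed_flag = freed_flag;
	*freed_flag = false;
	return s;
}

/* Take one more reference on the session. */
void demo_session_ref(demo_session *s) {
	atomic_fetch_add(&s->refs, 1);
}

/* Drop one reference; the session is freed only when the last one goes away. */
void demo_session_unref(demo_session *s) {
	int prev = atomic_fetch_sub(&s->refs, 1);
	assert(prev > 0);
	if(prev == 1) {
		*s->freed_flag = true;
		free(s);
	}
}

/* A call is started or received on this session: the call takes a reference,
 * since stack events for the call will carry a pointer to the session. */
void demo_call_started(demo_session *s) {
	demo_session_ref(s);
}

/* The helper is closed: only the creator's reference is dropped. */
void demo_helper_closed(demo_session *s) {
	demo_session_unref(s);
}

/* Assumed equivalent of handling the "call terminated" event: the stack will
 * not deliver further events for the call, so its reference can be dropped. */
void demo_call_terminated(demo_session *s) {
	demo_session_unref(s);
}
```

In this sketch, closing the helper while a call is still active leaves the session allocated, and it is released only once the termination handler runs. Calling `demo_helper_closed` without a matching earlier `demo_call_started` frees the session immediately, which is the behaviour one would expect when no call is pending.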