Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Bugfix/crash web view #3787

Merged
merged 9 commits into from
Oct 13, 2023
Merged

Conversation

dmiales
Copy link
Contributor

@dmiales dmiales commented Aug 12, 2023

This PR fixes issue #3500. What was the issue:
I have a tablet hanging as a home control. The tablet is running home assistant. And about once a day the application crashes. And when I pass by it, I see the desktop. The reason is that the WebView is leaking memory. And in the event of the death of the WebView process (com.google.android.webview:sandboxed_process0:org.chromium.content.app.SandboxedProcessService), our application also died. It is unacceptable.
I immediately provide a reproduction of the problem:

  1. use adb: adb shell
  2. without permissions, we won't be able to kill the process: su
  3. find and kill the process: ps -A | grep sandboxed | awk '{print $2}' | xargs kill

For clarity, I will add two videos (before / after)

before.mp4
afrer.mp4

@jpelgrom jpelgrom linked an issue Aug 12, 2023 that may be closed by this pull request
@jpelgrom
Copy link
Member

The documentation specifically calls out that the actual view shouldn't be reused:

The given WebView can't be used, and should be removed from the view hierarchy, all references to it should be cleaned up, e.g any references in the Activity or other classes saved using View.findViewById(int) and similar calls, etc.

but this is reusing the view?

Clearing cache also should be removed as it isn't related to crash handling.

@dmiales
Copy link
Contributor Author

dmiales commented Aug 12, 2023

Removed unnecessary calls.

Because we're using a trick.
The documentation says:

Notify host application that the given WebView's render process has exited. Our goal is not to let it complete, we process the situation and reboot (a new sandbox process is created).

if the host application handled the situation that process has exited, otherwise, application will crash if render process crashed, or be killed if render process was killed by the system.

And in this case, a running application is clearly preferable to a falling one.

@jpelgrom
Copy link
Member

Now we've quoted all documentation for that function :)

I still don't understand your position. You're reusing the view. That is not a 'trick'? As I understand it, you can use this callback to tear out the view and handle it (which could be creating a new one).

view: WebView?,
handler: RenderProcessGoneDetail?
): Boolean {
webView.reload()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we add a logging statement here? not sure if any details will be helpful but could be useful for troubleshooting purposes?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also it looks like this method gets called for every open webview so maybe we need to match view to make sure its what we expected? view == webView before doing any recreation?

Multiple WebView instances may be associated with a single render process; onRenderProcessGone will be called for each WebView that was affected.

Not sure if we need to worry about an actual crash or not too?

https://developer.android.com/reference/android/webkit/RenderProcessGoneDetail#didCrash()

this is indeed a good find though, nice to know we can handle this.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@dshokouhi
I added logging. It's important, I agree.

didCrash() is a cool check. But there is a problem:

Added in API level 26

we have min 21

view == webView

wouldn't that be overkill? As far as I could check in webView is always the same as in the view parameter.

I keep you informed, the application on the tablet has been working for almost two days without failures

Copy link
Member

@dshokouhi dshokouhi Aug 13, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

didCrash() is a cool check. But there is a problem:\n\nAdded in API level 26

So is the entire method you are overriding

https://developer.android.com/reference/android/webkit/WebViewClient#onRenderProcessGone(android.webkit.WebView,%20android.webkit.RenderProcessGoneDetail)

Just add a conditional check we have plenty of code like that

wouldn't that be overkill? As far as I could check in webView is always the same as in the view parameter.

The documentation tells us it may not be the same, also user can have multiple instances created using shortcut or notification.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is strange behavior
изображение

Maybe then like this? Reload view from parameter

handler: RenderProcessGoneDetail?
): Boolean {
Log.e(TAG, "onRenderProcessGone: webView crashed")
view!!.reload()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will crash if view is null we should avoid crashing when possible

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

confirm

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reading the description again I dont think WebView can load in certain cases which is why the description says to remove the view from the hierarchy

The application's implementation of this callback should only attempt to clean up the specific WebView given as a parameter, and should not assume that other WebView instances are affected. The given WebView can't be used, and should be removed from the view hierarchy, all references to it should be cleaned up, e.g any references in the Activity or other classes saved using [View.findViewById(int)](https://developer.android.com/reference/android/view/View#findViewById(int)) and similar calls, etc.

have you tested this fix against the site Google recommended to test against?

To cause an render process crash for test purpose, the application can call loadUrl("chrome://crash") on the WebView.

Copy link
Member

@dshokouhi dshokouhi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need to make sure we follow Google's recommendation here and to clean up the WebView instance that is no longer available

The application's implementation of this callback should only attempt to clean up the specific WebView given as a parameter, and should not assume that other WebView instances are affected. The given WebView can't be used, and should be removed from the view hierarchy, all references to it should be cleaned up, e.g any references in the Activity or other classes saved using [View.findViewById(int)](https://developer.android.com/reference/android/view/View#findViewById(int)) and similar calls, etc.

@home-assistant home-assistant bot marked this pull request as draft September 19, 2023 15:34
@home-assistant
Copy link

Please take a look at the requested changes, and use the Ready for review button when you are done, thanks 👍

Learn more about our pull request process.

@dmiales
Copy link
Contributor Author

dmiales commented Sep 19, 2023

@dshokouhi

I was on vacation. But before that, I committed the changes and tested the hypotheses. Replying to the first comment:

have you tested this fix against the site Google recommended to test against?

Yes, I checked for chrome://crash, it works identical to killing the process.
Replying to the last comment:

We need to make sure we follow Google's recommendation here and to clean up the WebView instance that is no longer available

https://github.com/home-assistant/android/pull/3787/files#diff-76bbedb2918a5db79c7070e9bb2e01d9241388745f60b0719c0d1f74123d89bdR396 I corrected it here, we take exactly the instance that comes into the method.

While I was testing my solution, I found an unusual error that broke the application
home-assistant/core#71205 (the same error exists in neighboring projects.
#2076 (comment)
#2369 (comment)
The essence is the same: Disconnected: "Did not receive auth message within 10 seconds") I debugged and realized that when we recreate the WebView, addJavascriptInterface is not created again. Because of this, we lose communication with the internal API. As a result, I put it in a separate method; we use it during initial initialization and again when re-creating the WebView.

@dshokouhi
Copy link
Member

https://github.com/home-assistant/android/pull/3787/files#diff-76bbedb2918a5db79c7070e9bb2e01d9241388745f60b0719c0d1f74123d89bdR396 I corrected it here, we take exactly the instance that comes into the method.

so looking at the commit we are attempting to reload() the webview if the view is still there. However that is not following Google's recommendation of removing the view from the hierarchy.

Reloading does not do anything to the view hierarchy according to the docs.

https://developer.android.com/reference/android/webkit/WebView#reload()

@dmiales
Copy link
Contributor Author

dmiales commented Sep 21, 2023

I understand what you are talking about. Probably the ideal solution is to completely recreate the webview? But it’s extremely difficult to do this with this hook. It’s unlikely that I have enough competence to implement this.

The question here is why this is not recommended, and whether we should adhere to this recommendation in this case. After all, we are rebooting in a super-critical situation, the other way out of which is a complete crash of the application.
And if you look at which is worse, the choice is obvious.

At least my smart home is on this commit, which makes me happy

@dmiales dmiales requested a review from dshokouhi September 21, 2023 11:01
@dmiales dmiales force-pushed the bugfix/crashWebView branch from 5f74b63 to 523acc1 Compare October 2, 2023 10:45
@dmiales dmiales marked this pull request as ready for review October 3, 2023 10:30
@dmiales dmiales force-pushed the bugfix/crashWebView branch from 523acc1 to 118d532 Compare October 3, 2023 10:30
@dmiales
Copy link
Contributor Author

dmiales commented Oct 3, 2023

@dshokouhi check my comment

Copy link
Member

@dshokouhi dshokouhi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So I ran some testing here and using Googles recommended way of testing the issue I can see that it is working and the page is refreshed accordingly. Lets just make sure to clean up any new introduced warnings and I think this should be ok.

@@ -224,6 +225,7 @@ class WebViewActivity : BaseActivity(), io.homeassistant.companion.android.webvi
private var downloadFileUrl = ""
private var downloadFileContentDisposition = ""
private var downloadFileMimetype = ""
private var javascriptInterface = "externalApp"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this can be a val as its not going to be changed

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

Copy link
Member

@dshokouhi dshokouhi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So I tested the PR using the recommendation from Google (loading chrome://crash). Actually could only test using remote debugging. Sure enough webview crashed and the code reloaded and the page was fine so the fix looks good here.

@dmiales
Copy link
Contributor Author

dmiales commented Oct 6, 2023

I also tested using this method. And the main thing is that the tablet has been crash-free for almost two weeks now

@JBassett JBassett merged commit eba9295 into home-assistant:master Oct 13, 2023
3 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

App crashes several times a day
4 participants