ChromeDriver often needs to send JavaScript code to Chrome for execution. This document will explain the following:
- Sources of JavaScript code, user-supplied vs. embedded.
- The DevTools command for sending JavaScript code to Chrome.
- Preprocessing done by ChromeDriver to work around limitations in DevTools.
JavaScript code run by ChromeDriver can be divided into two broad categories, depending on where the code comes from.
- The client application can send JavaScript code for execution, as described in the Executing Script section of the WebDriver spec.
- ChromeDriver itself has a large number of embedded JavaScript fragments, which are used to implement certain features.
The WebDriver standard provides two commands for running JavaScript code, Execute Script and Execute Async Script. Both commands allow the user to provide a fragment of JavaScript code (as a string), with zero or more arguments. These commands have a number of features that make them complicated to implement:
- Both the arguments and the return value can be a variety of types, including objects, collections, and DOM elements. These values need to be converted to an appropriate JSON representation in order to be transported over the network.
- If the user-supplied code returns a Promise, ChromeDriver needs to wait for that Promise to be resolved or rejected before returning.
- In the case of the Execute Async Script command, the standard provides a mechanism for the script to notify ChromeDriver when it has finished execution.
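To make the shape of these two commands concrete, here is a minimal sketch of the HTTP request a client sends. The endpoint paths and the script and args body fields follow the WebDriver specification; the helper name buildExecuteRequest and the session ID are hypothetical and exist only for illustration.

```javascript
// Hypothetical helper: builds the HTTP request for Execute Script or
// Execute Async Script. Only the paths and the "script"/"args" body
// fields come from the WebDriver specification.
function buildExecuteRequest(sessionId, script, args, isAsync) {
  return {
    method: 'POST',
    path: `/session/${sessionId}/execute/${isAsync ? 'async' : 'sync'}`,
    // The script and its arguments must survive a JSON round trip.
    body: JSON.stringify({ script, args }),
  };
}

const request = buildExecuteRequest(
    'example-session', 'return arguments[0] + arguments[1];', [1, 2], false);
console.log(request.path);
console.log(request.body);
```

Note that the script is a function body rather than a function, which is why it can use return and read its inputs from arguments.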
ChromeDriver has a large number of JavaScript fragments that it can send to Chrome for a variety of reasons. Examples include:
- Finding out the page load status with "document.readyState".
- Ensuring that pending navigation is started by evaluating a simple expression of "1".
- Using one of the JavaScript fragments (called "atoms") supplied by Selenium, to find elements on a page, to get the text displayed by a DOM element, etc.
- Calling one of the JavaScript functions in the js directory, e.g., to get the location of an element on the page.
Chrome DevTools provides the Runtime.evaluate command for running JavaScript code. ChromeDriver uses this command to send all JavaScript code to Chrome.
The Runtime.evaluate command has a large number of parameters. Here are the ones important to ChromeDriver.
- string expression: This is the actual JavaScript code to execute. This is the only required parameter.
- ExecutionContextId contextId: A web page can have multiple JavaScript contexts. For example, there is a separate context for each frame. This parameter allows ChromeDriver to select the context to run the code. If omitted, the code runs in the context associated with the top-level document.
- boolean awaitPromise: If set to true and the JavaScript code returns a Promise object, DevTools should wait for the Promise to be resolved or rejected before returning. ChromeDriver usually sets this to true.
- boolean returnByValue: Controls how objects are returned from the script. See below for more details. ChromeDriver usually sets this to true.
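As a rough illustration of how these parameters fit together, the sketch below builds a DevTools protocol message for Runtime.evaluate. The id/method/params envelope and the parameter names come from the DevTools protocol; the helper name buildEvaluateMessage and the ID counter are hypothetical, and this is not how ChromeDriver itself constructs the message.

```javascript
// Hypothetical helper: builds a DevTools protocol message for
// Runtime.evaluate using the parameters described above.
let nextMessageId = 1;

function buildEvaluateMessage(expression, contextId) {
  // Mirror the values ChromeDriver usually uses for these two flags.
  const params = { expression, awaitPromise: true, returnByValue: true };
  // Omitting contextId targets the top-level document's context.
  if (contextId !== undefined) {
    params.contextId = contextId;
  }
  return { id: nextMessageId++, method: 'Runtime.evaluate', params };
}

const message = buildEvaluateMessage('document.readyState');
console.log(JSON.stringify(message, null, 2));
```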
The value generated by the last statement of the script becomes the result
of the script, and is returned by the Runtime.evaluate command.
If the last statement does not generate any result,
then the result of the script is undefined.
The script must not terminate by executing a return statement.
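JavaScript's built-in eval follows the same completion-value rule, so it can serve as a local stand-in for experimenting with this behavior. The sketch below is an analogy only: it uses indirect eval, not DevTools, and the helper name evaluate is hypothetical.

```javascript
// Indirect eval runs the code as a global script, which is similar in
// spirit to how Runtime.evaluate treats its expression parameter.
const evaluate = (code) => (0, eval)(code);

// The value of the last statement becomes the result.
console.log(evaluate('var x = 20; x * 2 + 2'));  // 42

// A declaration produces no value, so the result is undefined.
console.log(evaluate('var y = 1;'));  // undefined

// A top-level return statement is a syntax error in a script.
let returnError = null;
try {
  evaluate('return 42;');
} catch (e) {
  returnError = e;
}
console.log(returnError instanceof SyntaxError);  // true
```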
Like most DevTools commands,
Runtime.evaluate returns a JSON object to the caller.
This JSON object always has a property named result,
whose value is another JSON object of type RemoteObject.
The type of the result (e.g., string, number, object, etc.)
is given in the type property of the RemoteObject.
If the script generates a primitive result,
the result value is stored in the value property of the RemoteObject.
For example, if the script evaluates to "Hello", Runtime.evaluate returns:
{
  "result": {
    "type": "string",
    "value": "Hello"
  }
}
If the script generates a non-primitive result,
what happens depends on the returnByValue parameter passed to
the Runtime.evaluate command.
If returnByValue is true,
the result is serialized into JSON format and stored in the value property.
For example,
{
  "result": {
    "type": "object",
    "value": {
      "x": [ 1, 2, 3 ]
    }
  }
}
If returnByValue is true but the result object is not compatible with
JSON format, then information can be lost or an error can be returned.
ChromeDriver should be written to avoid this situation.
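Both failure modes can be reproduced locally with a plain JSON round trip. This is a sketch under the assumption that JSON.stringify is a reasonable stand-in; the serializer used by DevTools may differ in its details, and the helper name roundTrip is hypothetical.

```javascript
// Stand-in for by-value serialization: a JSON round trip.
// Assumption: DevTools' own serializer may behave differently in detail.
const roundTrip = (value) => JSON.parse(JSON.stringify(value));

// Information is lost: undefined and function members are dropped,
// and NaN becomes null.
const lossy = roundTrip({ a: undefined, f() {}, n: NaN, ok: 1 });
console.log(lossy);  // { n: null, ok: 1 }

// An error is raised: cyclic structures cannot be represented in JSON.
const cyclic = {};
cyclic.self = cyclic;
let cyclicError = null;
try {
  roundTrip(cyclic);
} catch (e) {
  cyclicError = e;
}
console.log(cyclicError instanceof TypeError);  // true
```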
If returnByValue is false or unspecified,
then the returned RemoteObject does not have a value property,
but has an objectId property instead.
This object ID can be used in other DevTools commands to query for
information about the result object.
ChromeDriver uses this feature only in a very small number of cases,
such as resolving the iframe associated with an element.
If the script ends by throwing an uncaught exception,
Runtime.evaluate returns a JSON object with two properties.
The exceptionDetails property contains details about the exception,
while the result property contains information about the value passed
to the throw statement.
If awaitPromise is true and the script returns a rejected Promise,
it is treated the same as an uncaught exception.
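The response shapes described above can be summarized in a small sketch that interprets a Runtime.evaluate response. The helper name unwrapEvaluateResponse is hypothetical and the logic is a simplification for illustration, not ChromeDriver's actual handling; the text field is part of the ExceptionDetails type in the DevTools protocol.

```javascript
// Hypothetical helper: interprets the JSON object returned by
// Runtime.evaluate according to the cases described above.
function unwrapEvaluateResponse(response) {
  if (response.exceptionDetails) {
    // Uncaught exception, or a rejected Promise when awaitPromise is true.
    throw new Error(response.exceptionDetails.text || 'JavaScript error');
  }
  const remoteObject = response.result;
  if ('value' in remoteObject) {
    // Primitive result, or an object serialized with returnByValue.
    return remoteObject.value;
  }
  if ('objectId' in remoteObject) {
    // Non-primitive result returned by reference.
    return { objectId: remoteObject.objectId };
  }
  // For example, type "undefined" carries neither value nor objectId.
  return undefined;
}

console.log(unwrapEvaluateResponse(
    { result: { type: 'string', value: 'Hello' } }));  // Hello
console.log(unwrapEvaluateResponse(
    { result: { type: 'object', value: { x: [1, 2, 3] } } }).x.length);  // 3
```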
The Runtime.evaluate command has a number of limitations. These limitations are listed here, and the rest of the document will explain how ChromeDriver works around these limitations.
- It does not allow passing any arguments to the JavaScript code.
- The return value must be compatible with JSON format.
- It does not allow await statements, unless they are inside async functions.
Usually JavaScript execution is single-threaded, so if any code is blocked by a modal dialog, no additional code can be executed until the dialog is dismissed. However, DevTools behaves specially while a modal dialog is shown: it runs a nested message loop to dispatch DevTools commands, including those that run additional JavaScript code.
However, promises are only resolved at the top message loop (even trivial ones,
such as await true). Thus, setting the awaitPromise parameter to true would
block the Runtime.evaluate command from returning if a modal dialog is shown.
ChromeDriver uses several different ways to run JavaScript code, depending on what features are required. These are explained in this section.
ChromeDriver code can directly use the Runtime.evaluate command provided by
the DevTools API.
This can be done by calling the WebView::SendCommand method or its variations,
or by calling the DevToolsClient::SendCommand method or its variations.
The advantage of this method is that it provides access to all the parameters supported by the Runtime.evaluate command. It is also the most efficient method. The disadvantage is that the caller is faced with all the limitations mentioned in the previous section. The caller also has to figure out how to route the script to the right frame.
The WebView::EvaluateScript method is a thin wrapper around the DevTools Runtime.evaluate command.
The only additional service it provides is routing the script to the
desired frame. The caller can provide a frame ID,
and WebView::EvaluateScript will make sure that the script is evaluated
in that frame.
Its main drawback is that it does not provide access to any of the additional parameters supported by the DevTools API.
The WebView::CallFunction method
(and its variation, the WebView::CallFunctionWithTimeout method)
wraps the supplied JavaScript code inside callFunction.
It requires that the supplied JavaScript code be a function.
This method provides the following functionality:
- It allows arguments to be passed to the JavaScript function.
- It converts arguments from JSON objects to JavaScript objects.
- Unless opt_unwrappedReturn is true, it converts the return value to something compatible with JSON.
- If the return value is not a Promise already, it is wrapped inside a Promise. This way, the return value is always a Promise, and can be waited for by the Runtime.evaluate command.
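The behavior in the list above can be approximated with a short sketch. The function below is a simplified, hypothetical stand-in and not the real callFunction from ChromeDriver's js directory: in particular, it uses a JSON round trip in place of the real conversion logic, which must also handle values such as DOM elements.

```javascript
// Simplified, hypothetical stand-in for the callFunction wrapper.
function callFunctionSketch(func, args, unwrappedReturn = false) {
  // Starting from a resolved Promise means the caller always gets a
  // Promise, whether func returns a value, returns a Promise, or throws.
  return Promise.resolve()
      .then(() => func.apply(null, args))
      .then((value) => {
        if (unwrappedReturn) {
          return value;
        }
        // Stand-in for the conversion to a JSON-compatible value.
        const text = JSON.stringify(value);
        return text === undefined ? null : JSON.parse(text);
      });
}

callFunctionSketch((a, b) => a + b, [1, 2])
    .then((value) => console.log(value));  // 3

callFunctionSketch(async (name) => ({ greeting: `Hello, ${name}` }), ['world'])
    .then((value) => console.log(value.greeting));  // Hello, world
```

Because the result is always a Promise, a caller can rely on awaitPromise to receive the settled value in both the synchronous and the asynchronous case.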
The WebView::CallUserSyncScript method is used by the Execute Script command in the WebDriver API.
It is responsible for wrapping the user-supplied script inside a function,
before passing it to WebView::CallFunctionWithTimeout.
This is necessary because the WebDriver standard requires that
the user-supplied script is not a function,
while WebView::CallFunctionWithTimeout requires its input to be a function.
Wrapping the script in a function has additional benefits:
- It allows passing arguments to the script, as required by the standard.
- It allows using await statements in the script.
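The effect of the wrapping can be sketched as follows, assuming the wrapper is an async function whose body is the user-supplied script. The helper name wrapUserScript is hypothetical, and the real wrapper used by ChromeDriver may be built differently.

```javascript
// AsyncFunction is not a global, but it can be reached through the
// prototype of any async function.
const AsyncFunction = Object.getPrototypeOf(async function() {}).constructor;

// Hypothetical wrapper: treats the user script as an async function body.
function wrapUserScript(script) {
  return new AsyncFunction(script);
}

// Inside a function, the script can read its inputs from `arguments`
// and hand back a result with `return`.
const addScript = wrapUserScript('return arguments[0] + arguments[1];');
addScript(1, 2).then((value) => console.log(value));  // 3

// Inside an async function, the script can also use `await`.
const awaitScript = wrapUserScript(
    'const n = await Promise.resolve(arguments[0]); return n * 2;');
awaitScript(21).then((value) => console.log(value));  // 42
```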
The WebView::CallUserAsyncFunction method is used by the Execute Async Script command in the WebDriver API.
It wraps the user-supplied script inside the executeAsyncScript function,
before passing it to WebView::CallFunctionWithTimeout.
The executeAsyncScript function is responsible for waiting for the
async script to finish, as required by the WebDriver standard.
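The waiting mechanism can be illustrated with a simplified sketch. In the WebDriver standard, an async script receives a completion callback as its last argument; the function below appends such a callback and returns a Promise that settles when the script calls it. The name executeAsyncScriptSketch, the timeout handling, and the error message are hypothetical and do not reflect the real executeAsyncScript implementation.

```javascript
// Simplified, hypothetical stand-in for executeAsyncScript.
function executeAsyncScriptSketch(script, args, timeoutMs) {
  return new Promise((resolve, reject) => {
    const timer = setTimeout(
        () => reject(new Error('script timeout')), timeoutMs);
    // The script signals completion by calling this callback.
    const done = (value) => {
      clearTimeout(timer);
      resolve(value);
    };
    try {
      // The callback is passed as the last argument of the script.
      new Function(script).apply(null, [...args, done]);
    } catch (e) {
      clearTimeout(timer);
      reject(e);
    }
  });
}

const asyncUserScript =
    'const done = arguments[arguments.length - 1];' +
    'setTimeout(() => done(arguments[0] * 2), 10);';

executeAsyncScriptSketch(asyncUserScript, [21], 1000)
    .then((value) => console.log(value));  // 42
```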