Thanks for contributing to the Docker-Selenium project! A PR well described will help maintainers to quickly review and merge it
Before submitting your PR, please check our contributing guidelines, applied for this repository.
Avoid large PRs, help reviewers by making them as simple and short as possible.
Description
feat(chart): probe checks for Distributor and Router
Motivation and Context
Distributor Probes
By default, startupProbe, readinessProbe and livenessProbe are enabled for this component in both full distributed and Hub-Nodes mode.
The chart script configs/distributor/distributorProbe.sh is loaded into a ConfigMap, mounted into the container, and used by the livenessProbe. You can customize the script via --set-file distributorConfigMap.extraScripts.distributorProbe\.sh=/path/to/your_script.sh or set it via YAML values.
There are reports of a rare, hard-to-reproduce scenario: the Grid UI is accessible but no nodes can be fetched or registered, or a few requests sit in the session queue but cannot be accepted. Restarting the Distributor resolves the issue. Based on that, the livenessProbe takes a proactive approach: it runs a condition check and restarts the Distributor automatically whenever the check finds it unhealthy. The script queries the GraphQL endpoint to get sessionCount and sessionQueueSize. If sessionQueueSize is greater than 0 and sessionCount is 0 for failureThreshold consecutive checks, the Distributor is restarted. You can adjust the threshold and the check interval via the probe settings.
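The condition check described above can be sketched as follows. This is a minimal illustration, not the chart's actual distributorProbe.sh: the `GRID_GRAPHQL_URL` variable, its default value, and the function names are assumptions made for the example, and the GraphQL query assumes the `grid` object exposes `sessionCount` and `sessionQueueSize` as named in the description.

```shell
#!/usr/bin/env bash
# Hypothetical endpoint variable; point it at wherever the Grid GraphQL API is served.
GRID_GRAPHQL_URL="${GRID_GRAPHQL_URL:-http://localhost:4444/graphql}"

# Pull one integer field out of a flat JSON response without requiring jq.
extract_counter() {
  printf '%s\n' "$1" | sed -n "s/.*\"$2\":[[:space:]]*\([0-9][0-9]*\).*/\1/p"
}

# Unhealthy only when requests are queued but no session is running.
is_distributor_healthy() {
  local session_count="$1" queue_size="$2"
  if [ "${queue_size}" -gt 0 ] && [ "${session_count}" -eq 0 ]; then
    return 1
  fi
  return 0
}

# Probe entry point: a non-zero exit status counts as one failed check.
distributor_probe() {
  local response session_count queue_size
  response="$(curl -s --max-time 5 -X POST -H 'Content-Type: application/json' \
    --data '{"query":"{ grid { sessionCount, sessionQueueSize } }"}' \
    "${GRID_GRAPHQL_URL}")" || return 1
  session_count="$(extract_counter "${response}" sessionCount)"
  queue_size="$(extract_counter "${response}" sessionQueueSize)"
  # Treat an unparsable response as a failed check.
  [ -n "${session_count}" ] && [ -n "${queue_size}" ] || return 1
  is_distributor_healthy "${session_count}" "${queue_size}"
}
```

A probe script built this way would end with a call to `distributor_probe`. Kubernetes counts each non-zero exit as one failure and restarts the container only after failureThreshold consecutive failures, so a single transient empty-session window does not trigger a restart.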
Router Probes
By default, startupProbe, readinessProbe and livenessProbe are enabled for this component in full distributed mode.
The chart script configs/router/routerProbe.sh is loaded into a ConfigMap, mounted into the container, and used by the livenessProbe. You can customize the script via --set-file routerConfigMap.extraScripts.routerProbe\.sh=/path/to/your_script.sh or set it via YAML values.
The script checks that the GraphQL endpoint is reachable. If the http_code is not 200 for failureThreshold consecutive checks, the Router is restarted. You can adjust the threshold and the check interval via the probe settings.
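A reachability check of this kind might look like the sketch below. It is an illustration under stated assumptions rather than the chart's routerProbe.sh: the `GRID_GRAPHQL_URL` variable, its default, the query body, and the function names are all hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical endpoint variable; the real chart may use a different name and value.
GRID_GRAPHQL_URL="${GRID_GRAPHQL_URL:-http://localhost:4444/graphql}"

# Return 0 when the HTTP status code is exactly 200, 1 otherwise.
is_http_ok() {
  [ "$1" = "200" ]
}

# Fetch only the HTTP status code of the endpoint passed as $1.
# curl reports 000 when no HTTP response was received.
fetch_http_code() {
  curl -s -o /dev/null -w '%{http_code}' --max-time 5 \
    -X POST -H 'Content-Type: application/json' \
    --data '{"query":"{ grid { sessionCount } }"}' \
    "$1"
}

# Probe entry point: a non-zero exit status counts as one failed check.
router_probe() {
  local code
  code="$(fetch_http_code "${GRID_GRAPHQL_URL}")"
  if is_http_ok "${code}"; then
    return 0
  fi
  echo "GraphQL endpoint returned http_code=${code}" >&2
  return 1
}
```

Keeping `--max-time` below the probe's timeoutSeconds lets the script report the failure itself instead of being killed by the kubelet.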
Types of changes
Bug fix (non-breaking change which fixes an issue)
New feature (non-breaking change which adds functionality)
Breaking change (fix or feature that would cause existing functionality to change)
Estimated effort to review [1-5]: 4, due to the extensive changes across multiple scripts and configurations, including new probe checks and environment variable adjustments. The complexity of the changes, especially around the probe scripts and deployment configurations, requires careful review to ensure they function as intended without side effects.
🧪 Relevant tests
No
⚡ Possible issues
Possible Bug: The use of environment variables like ROUTER_USERNAME and ROUTER_PASSWORD in the probe scripts without proper validation or default values could lead to issues if these variables are not set, potentially causing the script to fail or behave unexpectedly.
Performance Concern: The probe scripts make multiple external calls using curl which could impact the performance if not properly managed, especially under high load or in a production environment.
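One way to address the credentials concern is to pass basic-auth arguments only when both variables are non-empty. The sketch below assumes that approach; `probe_curl` and the `CURL_BIN` override (used here only so the wrapper can be exercised without a network) are hypothetical names, not part of the PR.

```shell
#!/usr/bin/env bash
# Wrap curl so that -u is added only when both credentials are set.
# ${VAR:-} expands to an empty string when VAR is unset, so the check
# is safe even under `set -u`.
probe_curl() {
  if [ -n "${ROUTER_USERNAME:-}" ] && [ -n "${ROUTER_PASSWORD:-}" ]; then
    "${CURL_BIN:-curl}" -s -u "${ROUTER_USERNAME}:${ROUTER_PASSWORD}" "$@"
  else
    "${CURL_BIN:-curl}" -s "$@"
  fi
}
```

With this wrapper, an unsecured Grid is probed without an empty `-u :` argument, and a secured Grid gets the credentials passed as a single quoted word.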
-while true; do
+timeout=300
+elapsed=0
+while [ $elapsed -lt $timeout ]; do
   terminating_pods=$(kubectl get pods -n ${SELENIUM_NAMESPACE} --no-headers | grep Terminating | wc -l)
   if [ $terminating_pods -eq 0 ]; then
     echo "No pods in 'Terminating' state."
     break
   else
     echo "Waiting for $terminating_pods pod(s) to terminate..."
     sleep 2
+    elapsed=$((elapsed + 2))
   fi
 done
+if [ $elapsed -ge $timeout ]; then
+  echo "Timeout reached while waiting for pods to terminate."
+fi
Suggestion importance[1-10]: 9
Why: Implementing a timeout mechanism is critical to avoid potential infinite loops in the script, which can lead to resource exhaustion or stalled operations, especially in production environments.
Add quotes around a variable to handle cases where the variable might be empty or contain spaces
Add quotes around the variable SELENIUM_GRID_HOST to prevent potential issues if the variable is empty or contains spaces.
Why: Adding quotes around the variable SELENIUM_GRID_HOST is crucial to prevent script errors in cases where the variable is empty or contains spaces, which could lead to unexpected behavior or security issues.
Suggestion importance[1-10]: 8
Add quotes around variables to handle cases where the variables might be empty or contain spaces
Add double quotes around the variable ${ROUTER_USERNAME} and ${ROUTER_PASSWORD} to prevent potential issues if the variables are empty or contain spaces.
Why: Adding quotes around the variables ${ROUTER_USERNAME} and ${ROUTER_PASSWORD} is important to ensure that the script handles cases where these variables might be empty or contain spaces, preventing script failures or unintended behavior.
Suggestion importance[1-10]: 8
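The effect of the two quoting suggestions above can be seen by counting the arguments a command receives. The values below are hypothetical and chosen only to show word splitting; `count_args` is a helper defined for this example.

```shell
#!/usr/bin/env bash
# Print how many arguments the function was called with.
count_args() { echo "$#"; }

ROUTER_USERNAME="grid admin"   # hypothetical value containing a space
ROUTER_PASSWORD=""             # hypothetical empty value

# Unquoted: the space splits the user name into two separate words.
unquoted_user="$(count_args -u ${ROUTER_USERNAME})"
# Quoted: the user name stays one word.
quoted_user="$(count_args -u "${ROUTER_USERNAME}")"
# Unquoted: an empty value disappears from the argument list entirely.
unquoted_empty="$(count_args ${ROUTER_PASSWORD})"
# Quoted: an empty value is still passed as one (empty) argument.
quoted_empty="$(count_args "${ROUTER_PASSWORD}")"

echo "user name: unquoted=${unquoted_user} quoted=${quoted_user}"
echo "empty password: unquoted=${unquoted_empty} quoted=${quoted_empty}"
```

In a real probe, the unquoted forms would shift every later curl argument by one position, which is why the failure can look unrelated to the credentials.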
Add a timeoutSeconds value to the livenessProbe configuration to prevent indefinite hangs
The livenessProbe configuration for the distributor should include a timeoutSeconds value to ensure that the probe does not hang indefinitely.
Why: The suggestion is valid as it addresses a potential issue where the liveness probe could hang indefinitely. However, the PR already includes timeoutSeconds in the probe configurations, making this suggestion slightly redundant but still valuable for clarity.
Suggestion importance[1-10]: 7
Add a default value for extraScriptsDirectory to avoid potential issues with undefined values
To ensure consistency and avoid potential issues with missing or incorrect values, consider adding a default value for $.Values.distributorConfigMap.extraScriptsDirectory in case it is not defined.
Why: The suggestion to add a default value for extraScriptsDirectory is valid and improves robustness by handling cases where the variable might not be defined.
Suggestion importance[1-10]: 7
Maintainability
Remove the duplicate entry for global.seleniumGrid.imagePullSecret in the global configuration table
The entry for global.seleniumGrid.imagePullSecret is duplicated in the global configuration table. This redundancy can cause confusion and should be removed.
-| `global.seleniumGrid.imagePullSecret` | `""` | Pull secret to be used for all images |
| `global.seleniumGrid.imagePullSecret` | `""` | Pull secret to be used for all images |
Suggestion importance[1-10]: 8
Why: The suggestion correctly identifies and resolves a redundancy issue in the documentation, which improves maintainability and reduces confusion.
Best practice
Change the default value of extraScripts to null to avoid potential script loading issues
The extraScripts field for distributorConfigMap and routerConfigMap should have a default value of null instead of an empty string to avoid potential issues with script loading.
Why: The suggestion to use null instead of an empty string for extraScripts could potentially avoid issues in script handling. This is a good practice but not critical.
Suggestion importance[1-10]: 6
Adjust indentation for env values to improve readability and consistency
To ensure that the env values are properly indented and formatted, consider using nindent 4 instead of nindent 2 for better readability and consistency with other YAML structures.
Why: Adjusting the indentation for better readability and consistency is a minor improvement, but it does enhance the maintainability of the YAML configuration.
Suggestion importance[1-10]: 5
Performance
Combine apt-get commands to reduce image size and improve security
To reduce the size of the Docker image and improve security, consider combining the apt-get commands into a single RUN statement and cleaning up unnecessary files in one step.
Why: Combining apt-get commands into a single RUN statement is a good practice for reducing layers in Docker images, which can enhance performance and security.
Suggestion importance[1-10]: 6
Clarity
Clarify the description of the distributorProbe.sh script to specify its use for the livenessProbe
The description for the distributorProbe.sh script in the Distributor Probes section should clarify that the script is used for the livenessProbe.
-There is a script in chart `configs/distributor/distributorProbe.sh` is loaded into ConfigMap and mounted to the container is used by `livenessProbe`.
+There is a script in chart `configs/distributor/distributorProbe.sh` that is loaded into ConfigMap and mounted to the container, which is used by the `livenessProbe`.
Suggestion importance[1-10]: 5
Why: The suggestion improves clarity in the documentation by specifying the use of the script. However, the existing description in the PR is already quite clear about the script's purpose, making this improvement minor.
PR Type
Enhancement, Tests
Changes walkthrough 📝
12 files
| File | Change | Path |
| --- | --- | --- |
| `distributorProbe.sh` | Add distributor liveness probe script for health checks. | `charts/selenium-grid/configs/distributor/distributorProbe.sh` |
| `routerProbe.sh` | Add router liveness probe script for health checks. | `charts/selenium-grid/configs/router/routerProbe.sh` |
| `generate_release_notes.sh` | Fix ChromeDriver version retrieval and update release notes. | `generate_release_notes.sh` |
| `_nameHelpers.tpl` | Add templates for distributor and router ConfigMap names. | `charts/selenium-grid/templates/_nameHelpers.tpl` |
| `Dockerfile` | Optimize Dockerfile to reduce image size (`--no-install-recommends` flag). | `NodeChromium/Dockerfile` |
| `distributor-configmap.yaml` | Add ConfigMap template for distributor probe scripts. | `charts/selenium-grid/templates/distributor-configmap.yaml` |
| `distributor-deployment.yaml` | Integrate distributor probe scripts into deployment configuration. | `charts/selenium-grid/templates/distributor-deployment.yaml` |
| `hub-deployment.yaml` | Integrate distributor probe scripts into hub deployment configuration. | `charts/selenium-grid/templates/hub-deployment.yaml` |
| `router-configmap.yaml` | Add ConfigMap template for router probe scripts. | `charts/selenium-grid/templates/router-configmap.yaml` |
| `router-deployment.yaml` | Integrate router probe scripts into deployment configuration. | `charts/selenium-grid/templates/router-deployment.yaml` |
| `server-configmap.yaml` | Add environment variable configuration to server ConfigMap. | `charts/selenium-grid/templates/server-configmap.yaml` |
| `values.yaml` | Add configuration for distributor and router probes. | `charts/selenium-grid/values.yaml` |
2 files
| File | Change | Path |
| --- | --- | --- |
| `chart_test.sh` | Improve test script robustness and platform handling. | `tests/charts/make/chart_test.sh` |
| `base-resources-values.yaml` | Add liveness probe failure threshold for router and distributor. | `tests/charts/ci/base-resources-values.yaml` |
1 file
| File | Change | Path |
| --- | --- | --- |
| `__init__.py` | Fix typo in `TEST_PLATFORMS` environment variable default value. | `tests/SeleniumTests/__init__.py` |
1 file
| File | Change | Path |
| --- | --- | --- |
| `README.md` | Update README with distributor and router probe configurations. | `charts/selenium-grid/README.md` |