[UI] Fix the UI bugs when there is an environment under the Terminating status #2954
Conversation
Hello chengyanjin, my role is to assist you with the merge of this pull request. Status report is not available.
Force-pushed from cfacd0f to 15dc174
Some things to change, but most importantly, I don't think we should implement any "locking" from the UI. If Salt cannot handle running two jobs in parallel, it should be responsible for saying so (or use some mechanism to queue one of them for later execution). UI cannot know everything (what if an incompatible job was triggered by CLI?), unfortunately.
ui/src/ducks/app/salt.js
Outdated
// Check if the job jid from localStorage exists. If not, we should remove it from localStorage
const jobNotFound = result?.return?.find(
  (job) => Object.values(job)[0]['Error'] === ERROR_NO_JOB_JID_FOUND,
);
yield call(removeJobFromLocalStorage, Object.keys(jobNotFound)[0]);
Oh, I think that's not going to work well.
Try this:
- create a dummy Node
- click Deploy
- refresh the page, and look for the printJob call in your browser's network capture tool

Personally, until the job is finished, I get:
{
  "return": [
    {
      "20201201120203793335": {
        "Result": {},
        "StartTime": "2020, Dec 01 12:02:03.793335",
        "Error": "Cannot contact returner or no job with this jid"
      }
    }
  ]
}
Which means the job would be cleared out with this code.
However, if I query for a non-existing JID:
{
  "return": [
    {
      "12345": {
        "Result": {},
        "StartTime": "",
        "Error": "Cannot contact returner or no job with this jid"
      }
    }
  ]
}
Maybe we should also check the StartTime...?
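Something like this, as a rough sketch (the function name is made up; ERROR_NO_JOB_JID_FOUND is the constant from the diff, assumed to hold the error string above):

// Sketch only: treat a job as really unknown when Salt reports the "no job"
// error AND cannot even give a StartTime for it, per the two payloads above.
const ERROR_NO_JOB_JID_FOUND = 'Cannot contact returner or no job with this jid';

function isJobReallyUnknown(jobResult) {
  // jobResult is the per-JID object, e.g. { Result: {}, StartTime: '', Error: '...' }
  return (
    jobResult['Error'] === ERROR_NO_JOB_JID_FOUND &&
    jobResult['StartTime'] === ''
  );
}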
Indeed, good catch! It will delete good jobs.
I don't know how we can check the StartTime: let's say I executed a Salt job from one instance and then switched to another instance; I would still have a StartTime.
Maybe we should not remove any JID from localStorage?
> Let's say I executed a Salt job from one instance and then switched to another instance; I would still have a StartTime.

Are you sure? If the job ID doesn't exist, then I don't see how Salt can give you a StartTime...
@gdemonet That's weird that you get an error key with "Cannot contact returner or no job with this jid"; it seems like something is wrong (either the call failed or the ID really doesn't exist).
The presence of the Error key is also what we use in the operator:
// Check for "Error" field.
if reason, found := info["Error"]; found {
    // Unknown Job ID: maybe the Salt server restarted or something like that.
    logger.Info("Salt job not found")
    return nil, fmt.Errorf("cannot get status for job %s: %v", jobID, reason)
}
If your job is really in progress, then you'll get some info about its status (and a Result key with an empty dict, since it's not finished yet):
// Extract results.
result, ok := info["Result"].(map[string]interface{})
if !ok {
    return nil, fmt.Errorf("missing 'Result' key in %v", info)
}
// No result yet, the job is still running.
if len(result) == 0 {
    logger.Info("Salt job is still running")
    return nil, nil
}
At least that's what happens for the states used by the storage-operator; I don't know if this holds for every Salt call.
OK, indeed there is something wrong somewhere (my bet is on Salt).
Tested the following:
- for a runner calling our deploy_node orchestrate, as the UI does:
# salt-run --async state.orch metalk8s.orchestrate.deploy_node saltenv=metalk8s-2.6.0-dev pillar='{orchestrate: {node_name: worker-1}}'
[WARNING ] Running in asynchronous mode. Results of this execution may be collected by attaching to the master event bus or by examing the master job cache, if configured. This execution is running under tag salt/run/20201201163643992851
# salt-run jobs.print_job 20201201163643992851
20201201163643992851:
    ----------
    Error:
        Cannot contact returner or no job with this jid
    Result:
        ----------
    StartTime:
        2020, Dec 01 16:36:43.992851
[INFO ] Runner completed: 20201201163704623108
- for a simple salt command, as the storage-operator does:
# salt --async 'worker-1' state.sls metalk8s.volumes saltenv=metalk8s-2.6.0-dev
Executed command with job ID: 20201201164240351912
# salt-run jobs.print_job 20201201164240351912
20201201164240351912:
    ----------
    Arguments:
        - metalk8s.volumes
        |_
          ----------
          __kwarg__:
              True
          saltenv:
              metalk8s-2.6.0-dev
    Function:
        state.sls
    Minions:
        - worker-1
    Result:
        ----------
    StartTime:
        2020, Dec 01 16:42:40.351912
    Target:
        worker-1
    Target-type:
        glob
    User:
        root
[INFO ] Runner completed: 20201201164247350096
Of course, if I wait long enough, I get a full Result for both jobs, meaning there is no actual error in the first case aside from the missing returner.
> > Let's say I executed a Salt job from one instance and then switched to another instance; I would still have a StartTime.
>
> Are you sure? If the job ID doesn't exist, then I don't see how Salt can give you a StartTime...
@gdemonet I think if you try with a real job ID, which embeds a real timestamp (not a random number), you can get a StartTime.
So basically we should not remove any job with the error "Cannot contact returner or no job with this jid", right? (It's not actually an error...) The job might just be in a transitional state.
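For context (illustrative helper, not in the codebase): Salt JIDs embed the submission timestamp, as the payloads above show (JID 20201201120203793335 decodes to StartTime "2020, Dec 01 12:02:03.793335"), which would explain why any well-formed JID yields a StartTime even when Salt has no record of the job:

// Hypothetical decoder: a Salt JID is a YYYYMMDDHHMMSSffffff timestamp.
// Timezone handling is glossed over here; this only shows the shape.
function jidToStartTime(jid) {
  const m = jid.match(/^(\d{4})(\d{2})(\d{2})(\d{2})(\d{2})(\d{2})(\d{6})$/);
  if (m === null) return null; // e.g. "12345": not a timestamp, hence the empty StartTime
  const [, Y, M, D, h, min, s, us] = m;
  return new Date(`${Y}-${M}-${D}T${h}:${min}:${s}.${us.slice(0, 3)}`);
}

jidToStartTime('20201201120203793335'); // Dec 01 2020, 12:02:03.793
jidToStartTime('12345'); // null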
OK. As discussed during the meeting, I will drop this commit, since the issue (Salt sharing the job JID because we store it in localStorage) will not happen in production.
ui/src/services/salt/utils.js
Outdated
-status.completed = true;
-const returner = Object.values(job['Result'])[0].return;
-status = { ...status, ...parseJobError(returner) };
+if (job['Error'] === ERROR_NO_JOB_JID_FOUND) {
+  status.completed = true;
+} else {
+  status.completed = true;
+  const returner = Object.values(job['Result'])[0].return;
+  status = { ...status, ...parseJobError(returner) };
+}
Could be simpler to just check if there is something in job['Result'].
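For instance (rough sketch; computeJobStatus is a made-up name, parseJobError is the existing helper from this file):

// Sketch: a job is completed only once job['Result'] actually contains data;
// an empty Result means still running (or an unknown JID), per the thread above.
function computeJobStatus(job, parseJobError) {
  const status = { completed: false };
  const result = job['Result'];
  if (result && Object.keys(result).length > 0) {
    const returner = Object.values(result)[0].return;
    return { ...status, completed: true, ...parseJobError(returner) };
  }
  return status;
}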
ui/src/containers/SolutionList.js
Outdated
@@ -193,6 +203,9 @@ const SolutionsList = (props) => {
            dispatch(deleteEnvironmentAction(environment.name));
          }}
          inverted={true}
+         disabled={
+           environment.namespaces[0].status.phase === STATUS_TERMINATING
What happens if we have more than one namespace? Could be:
environment.namespaces.some(ns => ns.status.phase === STATUS_TERMINATING)
ui/src/translations/fr.json
Outdated
@@ -141,7 +141,8 @@
   "env_preparation": "Préparation de l’environnement",
   "env_preparation_success": "L'environnement {envName} est en cours de préparation.",
   "env_preparation_failed": "La préparation de l'environnement {envName} a échoué. {step} - {reason}",
-  "preparing_environemnt": "Préparer l'environnement",
+  "preparing_environment": "Préparer l'environnement",
If we have both "prepare" and "preparing", the right translation for "preparing" would be "Environnement en cours de préparation".
What bugs me is that env_preparation_success already means "in progress", for some reason. Should we make sure all those translations are used appropriately?
Thanks for the French translation of "preparing"!
Actually, these two keys hold the same meaning indeed, but they are used by different components:
- "env_preparation_success": "Environment {envName} is being prepared." is used by the notification;
- "preparing_environment": "Preparing the environment" is used by the loader in the environment list when adding a solution, so it should be a shorter version.
So for me it may not be clear which one to use, but we should make sure to keep both.
ui/src/ducks/app/solutions.js
Outdated
  envName,
  solName,
  solVersion,
  // Need to check if Salt is under `deploy-node` before adding solution to environment
I don't think that's the issue with #2941: "solution import" means running ./solutions.sh import --archive /path/to/solution.iso from the Bootstrap node.
Adding a Solution to an environment only means performing K8s API calls (either from the UI directly, or from the Salt master), but it should not trigger any Salt master restart.
So I don't think this check is needed at all.
Absolutely right, I confused "solution import" with adding a solution to an environment.
I will remove this check...
-{intl.translate('preparing_environemnt', {
+{intl.translate('preparing_environment', {
Should be in the same commit as the one changing the key in src/translations/.
Force-pushed from cecc7c4 to 5bc2ebe
Force-pushed from 5bc2ebe to 6839eb8
The environment list is broken due to environments that are in Terminating status: since the ConfigMap has already been deleted, the readNamespacedConfigMap call returns an error, but we didn't catch it. Refs: #2949
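A minimal sketch of that fix (the wrapper name and client object are hypothetical; readNamespacedConfigMap is the call named above):

// Tolerate the error raised once the ConfigMap is gone (namespace Terminating)
// instead of letting it break the environment list rendering.
async function fetchEnvironmentConfigMap(k8sCoreApi, name, namespace) {
  try {
    const { body } = await k8sCoreApi.readNamespacedConfigMap(name, namespace);
    return body;
  } catch (error) {
    // ConfigMap already deleted: render the environment without it.
    return null;
  }
}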
Add the missing translation for preparing the environment and fix the typo. Refs: #2941
Will need to wait for #2959

/approve
Integration data created
I have created the integration data for the additional destination branches.
The following branches will NOT be impacted:
The following options are set: approve

In the queue
The changeset has received all authorizations and has been added to the queue. The changeset will be merged in:
The following branches will NOT be impacted:
There is no action required on your side. You will be notified here once the merge is complete.
IMPORTANT: Please do not attempt to modify this pull request. If you need this pull request to be removed from the queue, please contact a …
The following options are set: approve

I have successfully merged the changeset of this pull request.
The following branches have NOT changed:
Please check the status of the associated issue: None. Goodbye chengyanjin.
Component: UI, salt, solution

Context:
The Environment list does not display correctly when there is a namespace/environment in Terminating status.

Summary:
Fix the bug of #2949

Acceptance criteria:

Closes: #2949