Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

👍🏼 Update IF to dump logs and errors to yaml file #592

Closed
5 tasks done
Tracked by #629 ...
jmcook1186 opened this issue Apr 9, 2024 · 25 comments · Fixed by #691
Closed
5 tasks done
Tracked by #629 ...

👍🏼 Update IF to dump logs and errors to yaml file #592

jmcook1186 opened this issue Apr 9, 2024 · 25 comments · Fixed by #691
Assignees
Milestone

Comments

@jmcook1186
Copy link
Contributor

jmcook1186 commented Apr 9, 2024

Sub of: #629 -> #650

What

Update IF so that errors and logs are saved to file as well as being displayed in the console.

Why

As a core developer I want to be able to run negative tests, which requires errors and logs to be captured in a file we can snapshot against. As a user I want to be able to provide a log file as part of a bug report to help the core team guide me through debugging.

Context

Testing negative flows is important for us to make sure IF behaves as intended across a large set of scenarios. It is hard to do this without dumping errors and logs to files because the alternative is for IF to panic without any persistent result to test against. This can be resolved by capturing the errors and logs to a file.

SoW

Acceptance criteria

  • Manifests that causes critical exceptions (causing the tool to fail) output status and error detail in an execution node.

GIVEN the user is in a folder
WHEN they run this manifest file:

name: basic-error-demo
description:
tags:
initialize:
  plugins:
    teads-curve: 
      path: '@grnsft/if-unofficial-plugins'
      method: TeadsCurve
      global-config:
        interpolation: spline
  outputs:
    - yaml
tree:
  children:
    child-0:
      defaults:
        cpu/thermal-design-power: 100
      pipeline:
        - teads-curve
      inputs:
        - timestamp: 2023-07-06T00:00
          cpu/utilization: 20

THEN the tool logs the error, exits and outputs a manifest file which contains these additional nodes:

  • execution
    • status: (success/fail)
    • error: The exception that caused the failure and contents of the exception (to string())
name: basic-error-demo
description:
tags:
initialize:
  plugins:
    teads-curve: 
      path: '@grnsft/if-unofficial-plugins'
      method: TeadsCurve
      global-config:
        interpolation: spline
  outputs:
    - yaml
execution:
  status: fail
  error: 'InputValidationError:   "duration" parameter is required. Error code: invalid_type'.
tree:
  children:
    child-0:
      defaults:
        cpu/thermal-design-power: 100
      pipeline:
        - teads-curve
      inputs:
        - timestamp: 2023-07-06T00:00
          cpu/utilization: 20

The errors and logs should still be displayed in the console as usual, because the version int he manifest file will not contain the full stack trace.

QA

@jmcook1186 jmcook1186 added this to IF Apr 9, 2024
@jmcook1186 jmcook1186 changed the title Update IF to dump logs are errors to yaml file Update IF to dump logs and errors to yaml file Apr 9, 2024
@jmcook1186 jmcook1186 added this to the Sprint 11 / QA1 milestone Apr 11, 2024
@jmcook1186 jmcook1186 moved this to Backlog in IF Apr 11, 2024
@jawache
Copy link
Contributor

jawache commented Apr 16, 2024

@jmcook1186 this one is good to go from me.

@narekhovhannisyan
Copy link
Member

@jmcook1186 @jawache it's not clear output, logs and the errors should be stored in different files?

@jawache
Copy link
Contributor

jawache commented Apr 16, 2024

@narekhovhannisyan just anything that caused the manifest to fail and if to exit (critical error) not all the normal info, debug logs.

For negative testing the manifest file should contain the error so we can do both negative and positive testing with the if-verify tool.

Thinking about it the node should just be 'execution.error' instead of log since it will only contain the error that caused it to fail. Stringified exception that caused failure.

@jawache jawache changed the title Update IF to dump logs and errors to yaml file 👍🏼 Update IF to dump logs and errors to yaml file Apr 21, 2024
@jmcook1186 jmcook1186 removed the READY label Apr 22, 2024
@zanete zanete moved this from Backlog to In Refinement in IF Apr 22, 2024
@zanete
Copy link

zanete commented Apr 23, 2024

@MariamKhalatova in order to complete the acceptance criteria for this issue and get it ready for development, there are a couple of asks from @jawache to you in the description. Can you let me if they are clear to you and you can help get the needed information? This is an urgent issue that is blocking other work in the #650 feature 🙏
cc @narekhovhannisyan

@jmcook1186
Copy link
Contributor Author

jmcook1186 commented Apr 23, 2024

@zanete FYI I untagged @MariamKhalatova and added a failing manifest and expected output manifest to the ticket description

@jmcook1186 jmcook1186 self-assigned this Apr 23, 2024
@jmcook1186 jmcook1186 moved this from In Refinement to Ready in IF Apr 23, 2024
@jmcook1186 jmcook1186 removed their assignment Apr 24, 2024
@zanete zanete moved this from In Refinement to In Progress in IF Apr 25, 2024
@zanete
Copy link

zanete commented Apr 29, 2024

This needs some major restructuring of the index file to properly handle things, currently we just catch the error and print to the console, but we need to output in the manifest, and in order to do that we need to continue the execution of the manifest. Some of the errors may cause some variables to be undefined and that needs to be solved.

@jawache
Copy link
Contributor

jawache commented Apr 29, 2024

@narekhovhannisyan (assume that came from you 😄) isn't this as simple as this?

load Inputmanifestfile
Try
All our compute steps. .
Catch exception which we can't recover from
Add exception to inputmanifest, save, exit

@zanete
Copy link

zanete commented Apr 29, 2024

@jawache ys sorry for the lack of context, forgot to add that I was taking notes during standup and as @jmcook1186 wasn't there, didn't want to miss important technical points made.

@narekhovhannisyan
Copy link
Member

narekhovhannisyan commented Apr 29, 2024

@jawache

const impactEngine = async () => {
  try {
  const options = parseArgs();

  if (!options) {
    return;
  }

  logger.info(DISCLAIMER_MESSAGE);
  const {inputPath, paramPath, outputOptions} = options;

  const {tree, rawContext, parameters} = await load(inputPath!, paramPath);
  const context = await injectEnvironment(rawContext);
  parameterize.combine(context.params, parameters);
  const pluginStorage = await initalize(context.initialize.plugins);
  const computedTree = await compute(tree, {context, pluginStorage});
  const aggregatedTree = aggregate(computedTree, context.aggregation);
  context['if-version'] = packageJson.version;
  exhaust(aggregatedTree, context, outputOptions);

  return;
  } catch (e) {
     // here you have to collect all the data before the error happened.
     // f.ex. if failure happened on aggregation state, you have to save all the job done before that and write in file
     // however in that state, all the variables are inaccessible in catch block
     
     // or we have to keep only the initial version of the yaml and inject error message
  }
};

@jmcook1186
Copy link
Contributor Author

I'm not sure I'm following either - is this complication arising because the manifest is loaded into memory inside the try block? Meaning the catch doesn't have access to it?

So a fix would be to separate the call to load from the try/catch block that computes the manifest?

In either case, the intended outcome is only to catch the error message and insert it into the original manifest file as follows:

execution:
  status: fail
  error: 'InputValidationError:   "duration" parameter is required. Error code: invalid_type'.

I'm not seeing why this needs to access any other system state. Might be worth a call to explain @narekhovhannisyan ?

@jawache
Copy link
Contributor

jawache commented Apr 30, 2024

Yes exactly just have

const {tree, rawContext, parameters} = await load(inputPath!, paramPath);

Outside of the try block, if an exception is thrown loading a Yaml file then can just log and exit, no need to add that exception it to the Yaml file.

@narekhovhannisyan
Copy link
Member

@jawache if you don't want to keep yaml file in latest shape when writing down yeah that will work. F.ex. you error occurred in aggregation state, you want to save yaml file in computed state or just the initial version with error?

@jawache
Copy link
Contributor

jawache commented Apr 30, 2024

@narekhovhannisyan just in it's initial state is fine (plus exception & runtime info like versions from #591)

If you want to know hints re: state just prior to failure we can go to the logs and see what we logged (after we finish of #600)

And also, the way we're defining the functionality of the if-diff if there are extra tree nodes from a partial run, it would trigger a difference that is unnecessary to see for testing purposes. So just the initial state and the exception plus runtime is all we need for now.

@narekhovhannisyan narekhovhannisyan linked a pull request May 1, 2024 that will close this issue
9 tasks
@narekhovhannisyan narekhovhannisyan moved this from In Progress to Pending Review in IF May 1, 2024
@MariamKhalatova MariamKhalatova moved this from Pending Review to Testing in IF May 3, 2024
@MariamKhalatova
Copy link
Contributor

@jawache @jmcook1186 @narekhovhannisyan In the testing scope of this feature, came across with following behaviour:
When manifest structure is invalid (f.ex. initalize->plugin->method is missing), then error is thrown in console, without saving output manifest. Is it the behaviour we expect?

@jmcook1186
Copy link
Contributor Author

Oh interesting - nice catch @MariamKhalatova . We should capture these errors in a manifest file too. @narekhovhannisyan

@zanete zanete moved this from Testing to In Progress in IF May 7, 2024
@narekhovhannisyan
Copy link
Member

@MariamKhalatova @jmcook1186 for handling that we need to separate loading and validation lifecycles, and move the validation under the try catch block

@jawache
Copy link
Contributor

jawache commented May 7, 2024

@narekhovhannisyan and @MariamKhalatova generally validation errors that are so bad the file can't even be loaded are not important to include, you might be passing in a jpeg instead of a yaml or something else just as silly. If the input file can't even be parsed as a yaml file, then there is little chance we can add in an exceptions node so we just log to console.

Specifically from your example though @MariamKhalatova we do want to include errors raised when the plugin can't initialize due to bad config, that's a common problem.

@narekhovhannisyan to capture that why can't the try block be moved to just the line below? That's before plugin initialization. (In my mind this whole exercise should be very simple now we have refactored everything in some pure functional architecture, it's mostly a decision as to where we put the try keyword)

const impactEngine = async () => {

  const options = parseArgs();

  if (!options) {
    return;
  }

  logger.info(DISCLAIMER_MESSAGE);
  const {inputPath, paramPath, outputOptions} = options;

  const {tree, rawContext, parameters} = await load(inputPath!, paramPath);

  try {

  const context = await injectEnvironment(rawContext);
  parameterize.combine(context.params, parameters);
  const pluginStorage = await initalize(context.initialize.plugins);
  const computedTree = await compute(tree, {context, pluginStorage});
  const aggregatedTree = aggregate(computedTree, context.aggregation);
  context['if-version'] = packageJson.version;
  exhaust(aggregatedTree, context, outputOptions);

  return;
  } catch (e) {
     // here you have to collect all the data before the error happened.
     // f.ex. if failure happened on aggregation state, you have to save all the job done before that and write in file
     // however in that state, all the variables are inaccessible in catch block
     
     // or we have to keep only the initial version of the yaml and inject error message
  }
};

@narekhovhannisyan
Copy link
Member

@jawache Yeah, we discussed a case where the problem is not because of loading, but validating the structure. currently, load and validation are taking place in one function. So separating them into two different functions will solve the problem and it will be possible to move it to try/catch block

@jawache
Copy link
Contributor

jawache commented May 7, 2024

@narekhovhannisyan oh I see, so the raw validation of the manifest file structure is happening in const {tree, rawContext, parameters} = await load(inputPath!, paramPath); along side just loading the file?

@narekhovhannisyan
Copy link
Member

@jawache yeah, separating the validation from it will solve the problem

@narekhovhannisyan narekhovhannisyan moved this from In Progress to Pending Review in IF May 8, 2024
@jmcook1186
Copy link
Contributor Author

Docs partially addressed here
Green-Software-Foundation/if-docs#71

Also requires a PR to if-docs to update https://if.greensoftware.foundation/major-concepts/manifest-file with a new section titled execution where status and error fields can be explained.

@narekhovhannisyan
Copy link
Member

@MariamKhalatova MariamKhalatova moved this from Pending Review to Testing in IF May 9, 2024
@github-project-automation github-project-automation bot moved this from Testing to Done in IF May 12, 2024
@zanete zanete modified the milestones: Sprint 11 / QA1, Improve Trust Jun 5, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Status: Done
Development

Successfully merging a pull request may close this issue.

5 participants