Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: Add test for evaluate step #460

Merged
merged 12 commits into from
Aug 19, 2024

Conversation

creatorrr
Copy link
Contributor

@creatorrr creatorrr commented Aug 17, 2024

Signed-off-by: Diwank Tomer [email protected]


🚀 This description was created by Ellipsis for commit aead2ab

Summary:

This PR enhances evaluate_step, introduces switch_step, and integrates both with tests and exception handling for improved workflow robustness.

Key points:

  • Enhance evaluate_step with tests and exception handling.
  • Introduce switch_step function to evaluate cases in workflows.
  • Integrate switch_step into create_worker.
  • Update STEP_TO_ACTIVITY and STEP_TO_LOCAL_ACTIVITY to include SwitchStep.
  • Add tests for switch_step and evaluate_step.
  • Handle BaseException in steps and log errors.
  • Update dependencies in pyproject.toml.
  • Reflect changes in TypeScript and Python SDKs.

Generated with ❤️ by ellipsis.dev

Signed-off-by: Diwank Tomer <[email protected]>
Copy link
Contributor

@ellipsis-dev ellipsis-dev bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 Looks good to me! Reviewed everything up to ae112cb in 27 seconds

More details
  • Looked at 260 lines of code in 10 files
  • Skipped 0 files when reviewing.
  • Skipped posting 1 drafted comments based on config settings.
1. agents-api/agents_api/activities/task_steps/evaluate_step.py:17
  • Draft comment:
    The change from context.definition.arguments to context.current_step.evaluate is significant. Please ensure that there are adequate tests covering this new implementation to prevent potential bugs.
  • Reason this comment was not posted:
    Decided after close inspection that this draft comment was likely wrong and/or not actionable:
    The comment is asking for verification of tests, which is not actionable and does not point out a specific code change requirement. It is more of a reminder or suggestion rather than a necessary change. The rules specify not to ask for verification or ensure behavior is intended, which this comment does.
    I might be overlooking the potential impact of the code change, but the comment does not specify a definite issue or actionable change. It is more of a suggestion to ensure tests are in place.
    Even if the code change is significant, the comment does not provide a specific action or point out a definite issue, which is required by the rules.
    The comment should be removed as it does not adhere to the rules, specifically by asking for verification rather than pointing out a specific code change requirement.

Workflow ID: wflow_92AWOc7Sp5yr8hcg


You can customize Ellipsis with 👍 / 👎 feedback, review rules, user-specific overrides, quiet mode, and more.

Copy link
Contributor

@ellipsis-dev ellipsis-dev bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

❌ Changes requested. Incremental review on 43c03fe in 28 seconds

More details
  • Looked at 205 lines of code in 3 files
  • Skipped 0 files when reviewing.
  • Skipped posting 0 drafted comments based on config settings.

Workflow ID: wflow_g0iaDIqBxfbDdTLg


Want Ellipsis to fix these issues? Tag @ellipsis-dev in a comment. You can customize Ellipsis with 👍 / 👎 feedback, review rules, user-specific overrides, quiet mode, and more.

@@ -161,48 +162,5 @@ async def run(

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The removal of detailed step handling (lines 48-162) could break the workflow execution logic, especially since different step types like PromptStep, ToolCallStep, etc., are no longer individually processed. This change needs a thorough review to ensure that it doesn't introduce regressions or remove necessary functionalities.

Copy link
Contributor

@ellipsis-dev ellipsis-dev bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 Looks good to me! Incremental review on 345ddfe in 38 seconds

More details
  • Looked at 254 lines of code in 3 files
  • Skipped 0 files when reviewing.
  • Skipped posting 1 drafted comments based on config settings.
1. agents-api/agents_api/workflows/task_execution.py:162
  • Draft comment:
    The PR description mentions modifications to evaluate_step to use context.current_step, but this change is not visible in the provided code. Please ensure that the intended changes are correctly implemented and included in the PR.
  • Reason this comment was not posted:
    Decided after close inspection that this draft comment was likely wrong and/or not actionable:
    The comment is speculative and asks the author to ensure changes are implemented, which is not allowed. It also references the PR description, which is not relevant to the code diff being reviewed.
    I might be missing the context of the PR description, but the rules clearly state not to ask for confirmation or ensure behavior, and not to comment on the PR description.
    Even if the PR description is relevant, the comment still violates the rules by asking for confirmation and ensuring behavior.
    The comment should be removed as it violates the rules by being speculative and asking for confirmation.

Workflow ID: wflow_oV78mGotO6ol6wn1


You can customize Ellipsis with 👍 / 👎 feedback, review rules, user-specific overrides, quiet mode, and more.

Copy link
Contributor

@ellipsis-dev ellipsis-dev bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 Looks good to me! Incremental review on 2fbd877 in 34 seconds

More details
  • Looked at 252 lines of code in 3 files
  • Skipped 0 files when reviewing.
  • Skipped posting 1 drafted comments based on config settings.
1. agents-api/agents_api/workflows/task_execution.py:162
  • Draft comment:
    The PR description mentions modifications to evaluate_step to use context.current_step, but the changes are not visible in the provided code. Please ensure that the intended changes are correctly implemented or update the PR description accordingly.
  • Reason this comment was not posted:
    Decided after close inspection that this draft comment was likely wrong and/or not actionable:
    The comment violates the rule against asking the PR author to update the PR description or confirm intentions. It is not directly pointing out a code issue but rather a discrepancy between the PR description and the code, which is not actionable in terms of code changes.
    I might be missing the importance of ensuring the PR description matches the code, but according to the rules, this is not a valid reason to comment.
    The rules are clear that comments should focus on code changes, not on the PR description or intentions.
    The comment should be removed as it violates the rule against asking for PR description updates or confirmations.

Workflow ID: wflow_Ea10OXHSTFLZ2oDE


You can customize Ellipsis with 👍 / 👎 feedback, review rules, user-specific overrides, quiet mode, and more.

Diwank Tomer added 2 commits August 17, 2024 17:34
Copy link
Contributor

@ellipsis-dev ellipsis-dev bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 Looks good to me! Incremental review on 8e0c9c0 in 1 minute and 25 seconds

More details
  • Looked at 6159 lines of code in 77 files
  • Skipped 2 files when reviewing.
  • Skipped posting 1 drafted comments based on config settings.
1. sdks/ts/src/api/models/Tasks_BaseWorkflowStep.ts:9
  • Draft comment:
    The addition of new step kinds like embed, search, set, get, etc., is well-reflected in the TypeScript SDK. Ensure that all new step kinds are properly documented and that their behavior is clearly defined in the backend logic to prevent any runtime issues.
  • Reason this comment was not posted:
    Confidence changes required: 33%
    The PR adds a variety of new steps and modifies existing ones in the workflow system. The changes include new step kinds like embed, search, set, get, foreach, map_reduce, parallel, switch, sleep, return, and updates to existing steps to accommodate these changes. The PR also updates the TypeScript SDK to reflect these new and modified steps, ensuring that the types and schemas are consistent with the backend changes. The PR seems well-structured and follows the project's patterns for defining and handling workflow steps.

Workflow ID: wflow_VwXw1pPOimZl4wa1


You can customize Ellipsis with 👍 / 👎 feedback, review rules, user-specific overrides, quiet mode, and more.

Copy link
Contributor

@ellipsis-dev ellipsis-dev bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

❌ Changes requested. Incremental review on 6cd98ae in 1 minute and 36 seconds

More details
  • Looked at 587 lines of code in 20 files
  • Skipped 0 files when reviewing.
  • Skipped posting 0 drafted comments based on config settings.

Workflow ID: wflow_5i1ME6KRe8tMe0gX


Want Ellipsis to fix these issues? Tag @ellipsis-dev in a comment. You can customize Ellipsis with 👍 / 👎 feedback, review rules, user-specific overrides, quiet mode, and more.

) -> StepOutcome[dict[str, Any]]:
exprs = context.definition.arguments
output = simple_eval_dict(exprs, values=context.model_dump())
async def evaluate_step(context: StepContext) -> StepOutcome:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The evaluate_step function signature should include the specific step type EvaluateStep for the context parameter to ensure type safety and clarity in the function's usage.

Suggested change
async def evaluate_step(context: StepContext) -> StepOutcome:
async def evaluate_step(context: StepContext[EvaluateStep]) -> StepOutcome:

return result

except Exception as e:
logging.error(f"Error in log_step: {e}")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The error logging message incorrectly references log_step instead of wait_for_input_step. This could lead to confusion when debugging.

Suggested change
logging.error(f"Error in log_step: {e}")
logging.error(f"Error in wait_for_input_step: {e}")

return StepOutcome(output=arguments, transition_to=("step", transition_target))

except Exception as e:
logging.error(f"Error in log_step: {e}")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The error logging message incorrectly references log_step instead of yield_step. This could lead to confusion when debugging.

Suggested change
logging.error(f"Error in log_step: {e}")
logging.error(f"Error in yield_step: {e}")

Copy link
Contributor

@ellipsis-dev ellipsis-dev bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

❌ Changes requested. Incremental review on 2c6dada in 41 seconds

More details
  • Looked at 206 lines of code in 4 files
  • Skipped 0 files when reviewing.
  • Skipped posting 2 drafted comments based on config settings.
1. agents-api/agents_api/activities/task_steps/if_else_step.py:16
  • Draft comment:
    The if_else_step function is missing type annotations for its parameters. Consider adding these to enhance code clarity and error prevention.
async def if_else_step(context: StepContext[IfElseWorkflowStep]) -> StepOutcome:
  • Reason this comment was not posted:
    Marked as duplicate.
2. agents-api/agents_api/activities/task_steps/log_step.py:16
  • Draft comment:
    The log_step function is missing type annotations for its parameters. Consider adding these to enhance code clarity and error prevention.
async def log_step(context: StepContext[LogStep]) -> StepOutcome:
  • Reason this comment was not posted:
    Marked as duplicate.

Workflow ID: wflow_P8MY5sXEYAQJHYuW


Want Ellipsis to fix these issues? Tag @ellipsis-dev in a comment. You can customize Ellipsis with 👍 / 👎 feedback, review rules, user-specific overrides, quiet mode, and more.

) -> StepOutcome[dict[str, Any]]:
exprs = context.definition.arguments
output = simple_eval_dict(exprs, values=context.model_dump())
async def evaluate_step(context: StepContext) -> StepOutcome:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The evaluate_step function is missing type annotations for its parameters. Consider adding these to enhance code clarity and error prevention.

Suggested change
async def evaluate_step(context: StepContext) -> StepOutcome:
async def evaluate_step(context: StepContext[EvaluateStep]) -> StepOutcome:

Copy link
Contributor

@ellipsis-dev ellipsis-dev bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

❌ Changes requested. Incremental review on 9fdafaf in 1 minute and 13 seconds

More details
  • Looked at 768 lines of code in 17 files
  • Skipped 2 files when reviewing.
  • Skipped posting 4 drafted comments based on config settings.
1. agents-api/agents_api/activities/task_steps/if_else_step.py:29
  • Draft comment:
    Catching BaseException is generally not recommended as it can catch system-exiting exceptions such as SystemExit and KeyboardInterrupt, which should normally be allowed to propagate. Consider catching Exception instead, which does not include these system-exiting exceptions.
  • Reason this comment was not posted:
    Marked as duplicate.
2. agents-api/agents_api/activities/task_steps/log_step.py:28
  • Draft comment:
    Catching BaseException is generally not recommended as it can catch system-exiting exceptions such as SystemExit and KeyboardInterrupt, which should normally be allowed to propagate. Consider catching Exception instead, which does not include these system-exiting exceptions.
  • Reason this comment was not posted:
    Marked as duplicate.
3. agents-api/agents_api/activities/task_steps/return_step.py:27
  • Draft comment:
    Catching BaseException is generally not recommended as it can catch system-exiting exceptions such as SystemExit and KeyboardInterrupt, which should normally be allowed to propagate. Consider catching Exception instead, which does not include these system-exiting exceptions.
  • Reason this comment was not posted:
    Marked as duplicate.
4. agents-api/agents_api/activities/task_steps/yield_step.py:41
  • Draft comment:
    Catching BaseException is generally not recommended as it can catch system-exiting exceptions such as SystemExit and KeyboardInterrupt, which should normally be allowed to propagate. Consider catching Exception instead, which does not include these system-exiting exceptions.
  • Reason this comment was not posted:
    Marked as duplicate.

Workflow ID: wflow_sEkgyJl3lZASS1Cu


Want Ellipsis to fix these issues? Tag @ellipsis-dev in a comment. You can customize Ellipsis with 👍 / 👎 feedback, review rules, user-specific overrides, quiet mode, and more.


return StepOutcome(output=output)
except BaseException as e:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Catching BaseException is generally not recommended as it can catch system-exiting exceptions such as SystemExit and KeyboardInterrupt, which should normally be allowed to propagate. Consider catching Exception instead, which does not include these system-exiting exceptions.

Copy link
Contributor

@ellipsis-dev ellipsis-dev bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

❌ Changes requested. Incremental review on 69100c5 in 24 seconds

More details
  • Looked at 192 lines of code in 5 files
  • Skipped 0 files when reviewing.
  • Skipped posting 0 drafted comments based on config settings.

Workflow ID: wflow_HdtBKqyanbQguNZy


Want Ellipsis to fix these issues? Tag @ellipsis-dev in a comment. You can customize Ellipsis with 👍 / 👎 feedback, review rules, user-specific overrides, quiet mode, and more.

cases: list[str] = [c.case for c in context.current_step.switch]

for i, case in enumerate(cases):
result = simple_eval(case, names=context.model_dump())
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using simple_eval for evaluating expressions can be a security risk if user input is not properly sanitized. Consider using a more secure method or ensure that the input is strictly controlled.

Copy link
Contributor

@ellipsis-dev ellipsis-dev bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

❌ Changes requested. Incremental review on aead2ab in 30 seconds

More details
  • Looked at 201 lines of code in 5 files
  • Skipped 0 files when reviewing.
  • Skipped posting 0 drafted comments based on config settings.

Workflow ID: wflow_VnzNU2wtqzPVokzZ


Want Ellipsis to fix these issues? Tag @ellipsis-dev in a comment. You can customize Ellipsis with 👍 / 👎 feedback, review rules, user-specific overrides, quiet mode, and more.

cases: list[str] = [c.case for c in context.current_step.switch]

for i, case in enumerate(cases):
result = simple_eval(case, names=context.model_dump())
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using simple_eval for evaluating expressions can be a security risk if user input is directly evaluated. Consider using a more secure method or ensure proper sandboxing to prevent arbitrary code execution.

@whiterabbit1983 whiterabbit1983 merged commit 72da581 into dev-tasks-disable-routes Aug 19, 2024
2 of 5 checks passed
@whiterabbit1983 whiterabbit1983 deleted the f/workflow-tests branch August 19, 2024 13:48
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants