-
Notifications
You must be signed in to change notification settings - Fork 907
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: Add test for evaluate step #460
Conversation
Signed-off-by: Diwank Tomer <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍 Looks good to me! Reviewed everything up to ae112cb in 27 seconds
More details
- Looked at
260
lines of code in10
files - Skipped
0
files when reviewing. - Skipped posting
1
drafted comments based on config settings.
1. agents-api/agents_api/activities/task_steps/evaluate_step.py:17
- Draft comment:
The change fromcontext.definition.arguments
tocontext.current_step.evaluate
is significant. Please ensure that there are adequate tests covering this new implementation to prevent potential bugs. - Reason this comment was not posted:
Decided after close inspection that this draft comment was likely wrong and/or not actionable:
The comment is asking for verification of tests, which is not actionable and does not point out a specific code change requirement. It is more of a reminder or suggestion rather than a necessary change. The rules specify not to ask for verification or ensure behavior is intended, which this comment does.
I might be overlooking the potential impact of the code change, but the comment does not specify a definite issue or actionable change. It is more of a suggestion to ensure tests are in place.
Even if the code change is significant, the comment does not provide a specific action or point out a definite issue, which is required by the rules.
The comment should be removed as it does not adhere to the rules, specifically by asking for verification rather than pointing out a specific code change requirement.
Workflow ID: wflow_92AWOc7Sp5yr8hcg
You can customize Ellipsis with 👍 / 👎 feedback, review rules, user-specific overrides, quiet
mode, and more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
❌ Changes requested. Incremental review on 43c03fe in 28 seconds
More details
- Looked at
205
lines of code in3
files - Skipped
0
files when reviewing. - Skipped posting
0
drafted comments based on config settings.
Workflow ID: wflow_g0iaDIqBxfbDdTLg
Want Ellipsis to fix these issues? Tag @ellipsis-dev
in a comment. You can customize Ellipsis with 👍 / 👎 feedback, review rules, user-specific overrides, quiet
mode, and more.
@@ -161,48 +162,5 @@ async def run( | |||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The removal of detailed step handling (lines 48-162) could break the workflow execution logic, especially since different step types like PromptStep
, ToolCallStep
, etc., are no longer individually processed. This change needs a thorough review to ensure that it doesn't introduce regressions or remove necessary functionalities.
43c03fe
to
345ddfe
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍 Looks good to me! Incremental review on 345ddfe in 38 seconds
More details
- Looked at
254
lines of code in3
files - Skipped
0
files when reviewing. - Skipped posting
1
drafted comments based on config settings.
1. agents-api/agents_api/workflows/task_execution.py:162
- Draft comment:
The PR description mentions modifications toevaluate_step
to usecontext.current_step
, but this change is not visible in the provided code. Please ensure that the intended changes are correctly implemented and included in the PR. - Reason this comment was not posted:
Decided after close inspection that this draft comment was likely wrong and/or not actionable:
The comment is speculative and asks the author to ensure changes are implemented, which is not allowed. It also references the PR description, which is not relevant to the code diff being reviewed.
I might be missing the context of the PR description, but the rules clearly state not to ask for confirmation or ensure behavior, and not to comment on the PR description.
Even if the PR description is relevant, the comment still violates the rules by asking for confirmation and ensuring behavior.
The comment should be removed as it violates the rules by being speculative and asking for confirmation.
Workflow ID: wflow_oV78mGotO6ol6wn1
You can customize Ellipsis with 👍 / 👎 feedback, review rules, user-specific overrides, quiet
mode, and more.
Signed-off-by: Diwank Tomer <[email protected]>
345ddfe
to
2fbd877
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍 Looks good to me! Incremental review on 2fbd877 in 34 seconds
More details
- Looked at
252
lines of code in3
files - Skipped
0
files when reviewing. - Skipped posting
1
drafted comments based on config settings.
1. agents-api/agents_api/workflows/task_execution.py:162
- Draft comment:
The PR description mentions modifications toevaluate_step
to usecontext.current_step
, but the changes are not visible in the provided code. Please ensure that the intended changes are correctly implemented or update the PR description accordingly. - Reason this comment was not posted:
Decided after close inspection that this draft comment was likely wrong and/or not actionable:
The comment violates the rule against asking the PR author to update the PR description or confirm intentions. It is not directly pointing out a code issue but rather a discrepancy between the PR description and the code, which is not actionable in terms of code changes.
I might be missing the importance of ensuring the PR description matches the code, but according to the rules, this is not a valid reason to comment.
The rules are clear that comments should focus on code changes, not on the PR description or intentions.
The comment should be removed as it violates the rule against asking for PR description updates or confirmations.
Workflow ID: wflow_Ea10OXHSTFLZ2oDE
You can customize Ellipsis with 👍 / 👎 feedback, review rules, user-specific overrides, quiet
mode, and more.
Signed-off-by: Diwank Tomer <[email protected]>
Signed-off-by: Diwank Tomer <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍 Looks good to me! Incremental review on 8e0c9c0 in 1 minute and 25 seconds
More details
- Looked at
6159
lines of code in77
files - Skipped
2
files when reviewing. - Skipped posting
1
drafted comments based on config settings.
1. sdks/ts/src/api/models/Tasks_BaseWorkflowStep.ts:9
- Draft comment:
The addition of new step kinds likeembed
,search
,set
,get
, etc., is well-reflected in the TypeScript SDK. Ensure that all new step kinds are properly documented and that their behavior is clearly defined in the backend logic to prevent any runtime issues. - Reason this comment was not posted:
Confidence changes required:33%
The PR adds a variety of new steps and modifies existing ones in the workflow system. The changes include new step kinds likeembed
,search
,set
,get
,foreach
,map_reduce
,parallel
,switch
,sleep
,return
, and updates to existing steps to accommodate these changes. The PR also updates the TypeScript SDK to reflect these new and modified steps, ensuring that the types and schemas are consistent with the backend changes. The PR seems well-structured and follows the project's patterns for defining and handling workflow steps.
Workflow ID: wflow_VwXw1pPOimZl4wa1
You can customize Ellipsis with 👍 / 👎 feedback, review rules, user-specific overrides, quiet
mode, and more.
Signed-off-by: Diwank Tomer <[email protected]>
Signed-off-by: Diwank Tomer <[email protected]>
Signed-off-by: Diwank Tomer <[email protected]>
Signed-off-by: Diwank Tomer <[email protected]>
Signed-off-by: Diwank Tomer <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
❌ Changes requested. Incremental review on 6cd98ae in 1 minute and 36 seconds
More details
- Looked at
587
lines of code in20
files - Skipped
0
files when reviewing. - Skipped posting
0
drafted comments based on config settings.
Workflow ID: wflow_5i1ME6KRe8tMe0gX
Want Ellipsis to fix these issues? Tag @ellipsis-dev
in a comment. You can customize Ellipsis with 👍 / 👎 feedback, review rules, user-specific overrides, quiet
mode, and more.
) -> StepOutcome[dict[str, Any]]: | ||
exprs = context.definition.arguments | ||
output = simple_eval_dict(exprs, values=context.model_dump()) | ||
async def evaluate_step(context: StepContext) -> StepOutcome: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The evaluate_step
function signature should include the specific step type EvaluateStep
for the context
parameter to ensure type safety and clarity in the function's usage.
async def evaluate_step(context: StepContext) -> StepOutcome: | |
async def evaluate_step(context: StepContext[EvaluateStep]) -> StepOutcome: |
return result | ||
|
||
except Exception as e: | ||
logging.error(f"Error in log_step: {e}") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The error logging message incorrectly references log_step
instead of wait_for_input_step
. This could lead to confusion when debugging.
logging.error(f"Error in log_step: {e}") | |
logging.error(f"Error in wait_for_input_step: {e}") |
return StepOutcome(output=arguments, transition_to=("step", transition_target)) | ||
|
||
except Exception as e: | ||
logging.error(f"Error in log_step: {e}") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The error logging message incorrectly references log_step
instead of yield_step
. This could lead to confusion when debugging.
logging.error(f"Error in log_step: {e}") | |
logging.error(f"Error in yield_step: {e}") |
Signed-off-by: Diwank Tomer <[email protected]>
49ddd62
to
2c6dada
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
❌ Changes requested. Incremental review on 2c6dada in 41 seconds
More details
- Looked at
206
lines of code in4
files - Skipped
0
files when reviewing. - Skipped posting
2
drafted comments based on config settings.
1. agents-api/agents_api/activities/task_steps/if_else_step.py:16
- Draft comment:
Theif_else_step
function is missing type annotations for its parameters. Consider adding these to enhance code clarity and error prevention.
async def if_else_step(context: StepContext[IfElseWorkflowStep]) -> StepOutcome:
- Reason this comment was not posted:
Marked as duplicate.
2. agents-api/agents_api/activities/task_steps/log_step.py:16
- Draft comment:
Thelog_step
function is missing type annotations for its parameters. Consider adding these to enhance code clarity and error prevention.
async def log_step(context: StepContext[LogStep]) -> StepOutcome:
- Reason this comment was not posted:
Marked as duplicate.
Workflow ID: wflow_P8MY5sXEYAQJHYuW
Want Ellipsis to fix these issues? Tag @ellipsis-dev
in a comment. You can customize Ellipsis with 👍 / 👎 feedback, review rules, user-specific overrides, quiet
mode, and more.
) -> StepOutcome[dict[str, Any]]: | ||
exprs = context.definition.arguments | ||
output = simple_eval_dict(exprs, values=context.model_dump()) | ||
async def evaluate_step(context: StepContext) -> StepOutcome: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The evaluate_step
function is missing type annotations for its parameters. Consider adding these to enhance code clarity and error prevention.
async def evaluate_step(context: StepContext) -> StepOutcome: | |
async def evaluate_step(context: StepContext[EvaluateStep]) -> StepOutcome: |
Signed-off-by: Diwank Tomer <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
❌ Changes requested. Incremental review on 9fdafaf in 1 minute and 13 seconds
More details
- Looked at
768
lines of code in17
files - Skipped
2
files when reviewing. - Skipped posting
4
drafted comments based on config settings.
1. agents-api/agents_api/activities/task_steps/if_else_step.py:29
- Draft comment:
CatchingBaseException
is generally not recommended as it can catch system-exiting exceptions such asSystemExit
andKeyboardInterrupt
, which should normally be allowed to propagate. Consider catchingException
instead, which does not include these system-exiting exceptions. - Reason this comment was not posted:
Marked as duplicate.
2. agents-api/agents_api/activities/task_steps/log_step.py:28
- Draft comment:
CatchingBaseException
is generally not recommended as it can catch system-exiting exceptions such asSystemExit
andKeyboardInterrupt
, which should normally be allowed to propagate. Consider catchingException
instead, which does not include these system-exiting exceptions. - Reason this comment was not posted:
Marked as duplicate.
3. agents-api/agents_api/activities/task_steps/return_step.py:27
- Draft comment:
CatchingBaseException
is generally not recommended as it can catch system-exiting exceptions such asSystemExit
andKeyboardInterrupt
, which should normally be allowed to propagate. Consider catchingException
instead, which does not include these system-exiting exceptions. - Reason this comment was not posted:
Marked as duplicate.
4. agents-api/agents_api/activities/task_steps/yield_step.py:41
- Draft comment:
CatchingBaseException
is generally not recommended as it can catch system-exiting exceptions such asSystemExit
andKeyboardInterrupt
, which should normally be allowed to propagate. Consider catchingException
instead, which does not include these system-exiting exceptions. - Reason this comment was not posted:
Marked as duplicate.
Workflow ID: wflow_sEkgyJl3lZASS1Cu
Want Ellipsis to fix these issues? Tag @ellipsis-dev
in a comment. You can customize Ellipsis with 👍 / 👎 feedback, review rules, user-specific overrides, quiet
mode, and more.
|
||
return StepOutcome(output=output) | ||
except BaseException as e: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Catching BaseException
is generally not recommended as it can catch system-exiting exceptions such as SystemExit
and KeyboardInterrupt
, which should normally be allowed to propagate. Consider catching Exception
instead, which does not include these system-exiting exceptions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
❌ Changes requested. Incremental review on 69100c5 in 24 seconds
More details
- Looked at
192
lines of code in5
files - Skipped
0
files when reviewing. - Skipped posting
0
drafted comments based on config settings.
Workflow ID: wflow_HdtBKqyanbQguNZy
Want Ellipsis to fix these issues? Tag @ellipsis-dev
in a comment. You can customize Ellipsis with 👍 / 👎 feedback, review rules, user-specific overrides, quiet
mode, and more.
cases: list[str] = [c.case for c in context.current_step.switch] | ||
|
||
for i, case in enumerate(cases): | ||
result = simple_eval(case, names=context.model_dump()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Using simple_eval
for evaluating expressions can be a security risk if user input is not properly sanitized. Consider using a more secure method or ensure that the input is strictly controlled.
Signed-off-by: Diwank Tomer <[email protected]>
69100c5
to
aead2ab
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
❌ Changes requested. Incremental review on aead2ab in 30 seconds
More details
- Looked at
201
lines of code in5
files - Skipped
0
files when reviewing. - Skipped posting
0
drafted comments based on config settings.
Workflow ID: wflow_VnzNU2wtqzPVokzZ
Want Ellipsis to fix these issues? Tag @ellipsis-dev
in a comment. You can customize Ellipsis with 👍 / 👎 feedback, review rules, user-specific overrides, quiet
mode, and more.
cases: list[str] = [c.case for c in context.current_step.switch] | ||
|
||
for i, case in enumerate(cases): | ||
result = simple_eval(case, names=context.model_dump()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Using simple_eval
for evaluating expressions can be a security risk if user input is directly evaluated. Consider using a more secure method or ensure proper sandboxing to prevent arbitrary code execution.
72da581
into
dev-tasks-disable-routes
Signed-off-by: Diwank Tomer [email protected]
Summary:
This PR enhances
evaluate_step
, introducesswitch_step
, and integrates both with tests and exception handling for improved workflow robustness.Key points:
evaluate_step
with tests and exception handling.switch_step
function to evaluate cases in workflows.switch_step
intocreate_worker
.STEP_TO_ACTIVITY
andSTEP_TO_LOCAL_ACTIVITY
to includeSwitchStep
.switch_step
andevaluate_step
.BaseException
in steps and log errors.pyproject.toml
.Generated with ❤️ by ellipsis.dev