-
Notifications
You must be signed in to change notification settings - Fork 77
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Consolidating AgentState metadata #814
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for putting this together!
Have a few minor questions but overall looks good.
# Backwards compatibility for times | ||
if "start_time" in state: | ||
self.metadata.task_start = state["start_time"] | ||
self.metadata.task_end = state["end_time"] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks for this migration consideration
Class to track the first-class feature fields of info about an AgentState. | ||
|
||
AgentState subclasses may choose to track additional metadata, but should | ||
attach to the agent state subclass directly. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what is meant by "attach" here? is that some Python specific terminology?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah I just mean use their own attributes, this is leftover commentary from when I intended to standardize more. I'll update
|
||
task_start: Optional[float] = None | ||
task_end: Optional[float] = None | ||
# TODO other metadata fields can be initialized |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was imagining that adding the tips plugin would be as simple as doing yarn add mephisto-worker-addons
and using it without having to modify the back-end code much. Would this require end users to update the python code to use different subclasses for AgentStateMetadata?
Also what if they wanted to compose different plugins? For example, let's say in the future we introduce a plugin for gamification or leaderboards. It would be nice if adding those plugins just worked with the metadata storage, without having to define and compose the dataclasses.
Not sure if this is the right solution, but just wanted to articulate the use case.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Valid concern, but I'm wary of projecting too far here. Additional plugins could piggyback on metadata by adding directly to this data class on functionality we feel is standard or general enough. Anything additional or in development should probably just store on an AgentState+Blueprint if the functionality is too complex to simply be in this metadata dataclass.
if "times" in self.state: | ||
assert isinstance(self.state["times"], dict) | ||
self.metadata.task_start = self.state["times"]["task_start"] | ||
self.metadata.task_end = self.state["times"]["task_end"] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
tbh I'm happy keeping the times within the "times" top level field, as library managed metadata... and the new metadata field for plugins and extensions to use. it might make some of the compatibility stuff here easier to manage though i see you've already handled it. I don't have a strong opinion on this though
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the cost of standardization here for these simple fields is overall worth it. I likely should have done this before.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good, I don't see any problems and it ran as expected.
One thing that would make sense to think about (I could do this when I merge into add-tips-example):
There isn't a way to update the metadata generally. As of now this isn't a problem because the metadata only has a task_start and task_end field. There is no reason to update these outside of init and submit.
However, if there is a feature like Tips or something else, it might not make too much sense to create an update method for each new field in metadata.
Something like below can be created in agent_state.py:
def update_metadata(self, property_name: str, property_value):
if property_name not in self.metadata:
"""
put some message here to tell user that the metadata field does not exist
(could help user know that they have to add a field to the _AgentStateMetadata)
"""
return
else:
self.metadata[property_name] = property_value
@Etesam913 noted in the PR description:
Definitely would make sense to create an Rebasing #791 on this should let you clean up the code a bit while keeping relatively the same design. |
Agreed on the general edit part, that's sort of what I was sort of getting at with my comment here. Also, given (from the OP):
I think the idea is that edits are to be made directly to the metadata object and then persisted via a call to |
Correct - the I only didn't implement |
Codecov Report
@@ Coverage Diff @@
## main #814 +/- ##
==========================================
+ Coverage 64.37% 64.92% +0.55%
==========================================
Files 107 107
Lines 9178 9195 +17
==========================================
+ Hits 5908 5970 +62
+ Misses 3270 3225 -45
Continue to review full report at Codecov.
|
Overview
To unblock @Etesam913 on #791, this PR standardizes a flow for reading and writing metadata from the top-level
AgentState
classAfter much (internal) deliberation on what should go in the main
AgentState
class and what should still be fragmented out into the subclasses, I decided that only metadata would be considered a first-class feature ofAgentState
's (rather than all data types) but that the save+load flow would be standardized and updated towards the direction of #567. Happy to revisit this in discussions, but I couldn't find any other clean way to allow new agent types to be flexible with how they handle their data.Implementation
New functionality:
AgentState
abstract methods have been moved behind wrappers such that I can put the save+load lifecycle functions insideAgentState.save_metadata
andAgentState.load_metadata
have been created as ways to handle IO for the metadata, while direct edits can be made to theAgentState.metadata
object (a dataclass of type_AgentStateMetadata
).Towards future design, this allows the
update_metadata
method to exist inside of the baseAgentState
class rather than needing implementations throughout the similar classes.