-
Notifications
You must be signed in to change notification settings - Fork 77
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Adds a tips example #791
Adds a tips example #791
Conversation
❎ Added a drop_table method to help with testing when changing table configuration
ℹ️ get_tip() is useful for getting a single tip by its id. ✏️ Added doc strings for methods
🐛 Fixed bug with get_tip() not returning _get_tip() but rather get_tip() causing an infinite loop.
🗣 When a tip is added and submitted, it updates the tip field of the metadata of the agent. This property can then be used later in the reviewing process
ℹ️ The script allows for a researcher to review and remove a tip for a task that have accepted into the database. 🔍 The script can be useful for removing a tip that you accidently added.
🚨 Test was failing probably because of this
🔒 Added type-checking for review_tips_for_task.py when entering a bonus number.
…acebookresearch/Mephisto into add-tips-to-toxicity-detection
💩 The handle_tip_submit endpoint only prints as of know. There is no socket message sending currently.
✏️ Added some comments for functions.
🎨 This improves the structure of the code later on when feedback is implemented because feedback will use the same endpoint.
ℹ️ Metadata is set for taskConfig in the build_router.py file
…Mephisto into add-tips-to-toxicity-detection
ℹ️ This is done by using the mephisto-task lib inside of the mephisto-worker-experience lib
mephisto/abstractions/blueprints/remote_procedure/remote_procedure_agent_state.py
Outdated
Show resolved
Hide resolved
def main(): | ||
db = LocalMephistoDB() | ||
mephisto_data_browser = MephistoDataBrowser(db) | ||
task_names = mephisto_data_browser.get_task_name_list() | ||
print("\nTask Names:") | ||
for task_name in task_names: | ||
print(task_name) | ||
print("") | ||
task_name = input( | ||
"Enter the name of the task that you want to look at the tips of: \n" | ||
) | ||
print("") | ||
while task_name not in task_names: | ||
print("That task name is not valid\n") | ||
task_name = input( | ||
"Enter the name of the task that you want to look at the tips of: \n" | ||
) | ||
print("") | ||
units = mephisto_data_browser.get_all_units_for_task_name(task_name) | ||
if len(units) == 0: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This command line interface of this scripts gets improved in #813
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added simple comments
ℹ️ As the Tips component is in a shadow dom there is no style clashing so bulma can be used again.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for your patience on this one - I think there's one last round of changes to consolidate and clean up here, but afterwards it should be good to ship!
@@ -127,7 +127,7 @@ def set_init_state(self, data: Any) -> bool: | |||
raise NotImplementedError() | |||
|
|||
@abstractmethod | |||
def get_init_state(self) -> Optional[Any]: | |||
def get_init_state(self, get_all_state: Optional[bool] = False) -> Optional[Any]: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is this get_all_state
parameter still necessary now that there's direct access to metadata
?
task_name = self.task_run.to_dict()["task_name"] | ||
db = LocalMephistoDB() | ||
mephisto_data_browser = MephistoDataBrowser(db) | ||
tips = mephisto_data_browser.get_metadata_property_from_task_name( | ||
task_name=task_name, property_name="tips" | ||
) | ||
front_end_task_config_with_metadata = super().update_task_config_with_tips( | ||
tips, frontend_task_config | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Given the amount of interacting with super()
and the fact that we have similar code for this across most (if not all) Blueprint
s, shouldn't this just go in the main Blueprint
's get_frontend_args()
method?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I did do this in the past, but reverted it because there were some concerns about using the DataBrowser and LocalMephistoDB() in the base blueprint class.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was also one problem when I did that. I wasn't able to use the DataBrowser type for a parameter when I had the function in the base blueprint class. Some weird bug would emerge.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah okay I recall now what I had considered - Ideally we'd be pulling the tips
from the task folder on launch, which will be part of the hydra DictConfig
default arguments, rather than from a query by task name. To this end it may be necessary to implement the "storing tips in an accessible file" in this PR.
The implementation here would instead look something like:
tips = self._load_tips_from_dir(self.args.blueprint.tips_directory)
and that would be safe in the super
.
For the tips saving method, in your review script I imagine something like:
save_dir = unit.get_task_run().args.blueprint.tips_directory
# write tips out to file
And then the default value for tips_directory
should be interpolatable from ${task_dir}/tips
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am planning on keeping the tips file in the task's folder.
For example, the static_react_task_with_tips task should have its tips in the examples/static_react_task_with_tips
folder. The run_task.py file seems to have a property called cfg.task_dir that gets the appropriate task path.
Is there anyway to get this cfg.task_dir in the blueprint.py file?
I looked into the get_dir_for_task() method in the dirs.py file, but that doesn't seem to work for my purposes.
I could maybe store it in the data directory instead, as that is accessible from the blueprint.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
cfg.task_dir
is a configuration value that is only available from a TaskConfig
used from a run script. Adding the tips_directory
argument (or similar) to the blueprint and then autofilling with ${task_dir}/...
locks in that value at runtime and would ensure that future loads of the blueprint have the correct directory filled.
🔥 Removed get_all_state property
mephisto/tools/data_browser.py
Outdated
AssignmentState.REJECTED, | ||
AssignmentState.SOFT_REJECTED, | ||
] | ||
return self.collect_matching_units_from_task_runs(task_runs, statuses) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i think a method where you just pass in the negation (filtering) would feel better here, e.g. get all tasks except those the ones which are the following status, e.g. EXPIRED, - but I'm just nit-picking, this is fine
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I feel like that would just make the code harder to understand as you would have to go to assignment_state.py to figure out which AssignmentState's are being accepted.
It wouldn't really make it more concise either as the code would become
filtered_statuses=[
AssignmentState.CREATED,
AssignmentState.LAUNCHED,
AssignmentState.ASSIGNED,
AssignmentState.MIXED,
AssignmentState.EXPIRED
]
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Generally for legibility, it could also be OK to put these collections inside of the AssignmentState
class as class functions. AssignmentState.completed()
and AssignmentState.final_agent()
come to mind as relevant for these two lists at the moment.
✨ Allows for r easy transferring of tips from one task to another ⚡️ Task startup should be faster as no database queries are needed to get metadata
🔥 Removed unneeded comments
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I know this has gotten a lot of rounds of feedback, but I think this should be the last one! Thanks for your patience here as we've been working through a new design.
@@ -7,6 +7,7 @@ mephisto: | |||
blueprint: | |||
task_source: ${task_dir}/webapp/build/bundle.js | |||
link_task_source: true | |||
tips_location: ${task_dir}/outputs/tips.csv |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@pringshia any thoughts on the directory name? I feel like outputs
may push too far to the expectation that this is output data, but I do like having a container for possible assets to a run task.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, I do like this idea of having a "container" for possible assets. Outputs is sort of an overloaded term, since we use it elsewhere in the task json (i.e. the inputs and outputs field). So that's one reason I'd be hesitant to use outputs, unless you think it jives well with that?
I do like having a container for possible assets to a run task
Why not just call it "assets"? :D
Tangentially: I am a little confused by this direction.
Tips should be addable to any task, I'm not sure why the parlai blueprint needs to have "tips_location" defined as part of its blueprint.
This means that any other task that wants to add the Tips addon will need it's blueprint modified, add the Hydra config, etc. Does this seem modular enough? Ideally a mixin or something would be preferred? It should be easy to add Tips across blueprints.
Also this adds the Tips overhead to all parlai tasks. This doesn't seem like a lean and future-proof path.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I am adding it to the BlueprintArgs class instead to avoid the need to add the tips_location property itself to each blueprint.
This is causing a github actions testing error that I have to solve though.
mephisto/abstractions/blueprint.py
Outdated
@@ -229,7 +231,7 @@ def __init__( | |||
self.args = args | |||
self.shared_state = shared_state | |||
self.frontend_task_config = shared_state.task_config | |||
|
|||
self.task_run = task_run |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Any reason we're adding the task_run
here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: this is still here, but I don't see why it's added. Please remove.
@@ -127,6 +128,13 @@ class ParlAIChatBlueprintArgs(OnboardingRequiredArgs, BlueprintArgs): | |||
"help": "Optional count of conversations to have if no context provided" | |||
}, | |||
) | |||
tips_location: str = field( | |||
default=MISSING, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Rather than default to MISSING
which is giving the test error, use '${task_dir}/outputs/tips.csv'
if this is what you want the default to be for these. If something isn't required we should be providing a default value.
It should also be included just in the base BlueprintArgs
rather than needing to duplicate.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's what I originally kept it as, but I was getting the test error so I thought that setting it to MISSING would solve the error. It did not, I had to do something else. I will revert it back
mephisto/abstractions/blueprints/parlai_chat/parlai_chat_blueprint.py
Outdated
Show resolved
Hide resolved
# Use overrides provided downstream | ||
frontend_task_config.update(self.frontend_task_config) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We still need this, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't really know what it does. It looks completely redundant, but I'll add it back though.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
self.frontend_task_config
may not actually be captured by super().get_frontend_task_config()
, and importantly self.frontend_task_config
is something we can set values for directly from the command line. This means, if the user wants to, by doing this at the end we can always ensure they have a path to force override something set in here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
agree that the code looks a bit confusing. ideally if there's a way to capture the above explanation in source (whether via comments, more explicit variable names etc., or other) that would be good
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just one comment for a 1 line change. Otherwise this is good to go! Thanks for the patience and debugging on this one, I'm excited to see tips available in the coming release of Mephisto!
Summary
Backend Changes
Adds a new example named, "static_react_task_with_tips".
When running the task, using
python run_task.py
, there is a tips button that shows. Clicking this button will open a popup window that displays currently accepted tips and an area where a new tip can be submitted for review.Adds a new python script, "review_tips_for_task.py", that allows the user to accept or reject each of the submitted tips for a task. A tip can only show in the popup if it has been accepted. Accepted tips go into the tips.csv which is in the
outputs
folder of the task.Adds a new python script, "remove_accepted_tip.py", that allows the user to remove a tip that was already accepted. This is useful if you by accidentally accepted an undesirable tip.
Adds handleSubmitMetadata(), a new function in the mephisto-task package. This function makes a post request to the "/submit_metadata" endpoint as seen in the server.js file.
Data can be sent to the backend by the new handleSubmitMetadata method. This method is used as the default submit method in the Tips component to reduce the work that has to be done by the user when adding tips to their task. The data is sent to the new "/submit-metadata" endpoint.
Then the ClientIOHandler receives the packet from the server.js and operates the code required to save the metadata to the appropriate agent.
Metadata can be retrieved from the frontend using the taskConfig property from the useMephistoTask() hook that comes from the
mephisto-task
package. To retrieve tips you can dotaskConfig["tips"]
. Ideally you would not really have to do this as the Tips component does this for you behind the hood.The code that modified the taskConfig to add metadata to it is done in the build_router.py file. This is causing some problems with passing the python_build test so I might need some guidance on that
Frontend Changes
The tips component is put in a shadow dom so that outside styles do not conflict with the component
New Video
add-tips-example.mp4
In the video it is clear that the worker submits two tips. Then the assignment is closed and the tips are reviewed using the review script. The first tip is accepted and the second tip is rejected. When the task is relaunched, opening the popup modal shows only the accepted tip.
Architecture Stuff