Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adds a tips example #791

Merged
merged 108 commits into from
Jul 14, 2022
Merged

Adds a tips example #791

merged 108 commits into from
Jul 14, 2022

Conversation

Etesam913
Copy link
Contributor

@Etesam913 Etesam913 commented Jun 11, 2022

Summary

Backend Changes

Adds a new example named, "static_react_task_with_tips".

When running the task, using python run_task.py, there is a tips button that shows. Clicking this button will open a popup window that displays currently accepted tips and an area where a new tip can be submitted for review.

Adds a new python script, "review_tips_for_task.py", that allows the user to accept or reject each of the submitted tips for a task. A tip can only show in the popup if it has been accepted. Accepted tips go into the tips.csv which is in the outputs folder of the task.

Adds a new python script, "remove_accepted_tip.py", that allows the user to remove a tip that was already accepted. This is useful if you by accidentally accepted an undesirable tip.

Adds handleSubmitMetadata(), a new function in the mephisto-task package. This function makes a post request to the "/submit_metadata" endpoint as seen in the server.js file.

Data can be sent to the backend by the new handleSubmitMetadata method. This method is used as the default submit method in the Tips component to reduce the work that has to be done by the user when adding tips to their task. The data is sent to the new "/submit-metadata" endpoint.

Then the ClientIOHandler receives the packet from the server.js and operates the code required to save the metadata to the appropriate agent.

Metadata can be retrieved from the frontend using the taskConfig property from the useMephistoTask() hook that comes from the mephisto-task package. To retrieve tips you can do taskConfig["tips"] . Ideally you would not really have to do this as the Tips component does this for you behind the hood.

The code that modified the taskConfig to add metadata to it is done in the build_router.py file. This is causing some problems with passing the python_build test so I might need some guidance on that

Frontend Changes

The tips component is put in a shadow dom so that outside styles do not conflict with the component

New Video

add-tips-example.mp4

In the video it is clear that the worker submits two tips. Then the assignment is closed and the tips are reviewed using the review script. The first tip is accepted and the second tip is rejected. When the task is relaunched, opening the popup modal shows only the accepted tip.

Architecture Stuff

Screenshot 2022-07-05 at 11-31-57 tldraw

Etesam913 added 29 commits June 2, 2022 15:04
❎ Added a drop_table method to help with testing when changing table configuration
ℹ️  get_tip() is useful for getting a single tip by its id.

✏️  Added doc strings for methods
🐛 Fixed bug with get_tip() not returning _get_tip() but rather get_tip() causing an infinite loop.
🗣 When a tip is added and submitted, it updates the tip field of the metadata of the agent. This property can then be used later in the reviewing process
ℹ️ The script allows for a researcher to review and remove a tip for a task that have accepted into the database.

🔍 The script can be useful for removing a tip that you accidently added.
🚨 Test was failing probably because of this
🔒 Added type-checking for review_tips_for_task.py when entering a bonus number.
💩 The handle_tip_submit endpoint only prints as of know. There is no socket message sending currently.
✏️ Added some comments for functions.
🎨 This improves the structure of the code later on when feedback is implemented
because feedback will use the same endpoint.
ℹ️ Metadata is set for taskConfig in the build_router.py file
ℹ️ This is done by using the mephisto-task lib inside of the mephisto-worker-experience lib
@facebook-github-bot facebook-github-bot added the CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. label Jun 11, 2022
@Etesam913 Etesam913 requested a review from JackUrb July 8, 2022 21:09
Comment on lines +7 to +26
def main():
db = LocalMephistoDB()
mephisto_data_browser = MephistoDataBrowser(db)
task_names = mephisto_data_browser.get_task_name_list()
print("\nTask Names:")
for task_name in task_names:
print(task_name)
print("")
task_name = input(
"Enter the name of the task that you want to look at the tips of: \n"
)
print("")
while task_name not in task_names:
print("That task name is not valid\n")
task_name = input(
"Enter the name of the task that you want to look at the tips of: \n"
)
print("")
units = mephisto_data_browser.get_all_units_for_task_name(task_name)
if len(units) == 0:
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This command line interface of this scripts gets improved in #813

Copy link
Contributor Author

@Etesam913 Etesam913 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added simple comments

Copy link
Contributor

@JackUrb JackUrb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for your patience on this one - I think there's one last round of changes to consolidate and clean up here, but afterwards it should be good to ship!

@@ -127,7 +127,7 @@ def set_init_state(self, data: Any) -> bool:
raise NotImplementedError()

@abstractmethod
def get_init_state(self) -> Optional[Any]:
def get_init_state(self, get_all_state: Optional[bool] = False) -> Optional[Any]:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is this get_all_state parameter still necessary now that there's direct access to metadata?

mephisto/tools/data_browser.py Outdated Show resolved Hide resolved
Comment on lines 311 to 319
task_name = self.task_run.to_dict()["task_name"]
db = LocalMephistoDB()
mephisto_data_browser = MephistoDataBrowser(db)
tips = mephisto_data_browser.get_metadata_property_from_task_name(
task_name=task_name, property_name="tips"
)
front_end_task_config_with_metadata = super().update_task_config_with_tips(
tips, frontend_task_config
)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Given the amount of interacting with super() and the fact that we have similar code for this across most (if not all) Blueprints, shouldn't this just go in the main Blueprint's get_frontend_args() method?

Copy link
Contributor Author

@Etesam913 Etesam913 Jul 12, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I did do this in the past, but reverted it because there were some concerns about using the DataBrowser and LocalMephistoDB() in the base blueprint class.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There was also one problem when I did that. I wasn't able to use the DataBrowser type for a parameter when I had the function in the base blueprint class. Some weird bug would emerge.

Copy link
Contributor

@JackUrb JackUrb Jul 12, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah okay I recall now what I had considered - Ideally we'd be pulling the tips from the task folder on launch, which will be part of the hydra DictConfig default arguments, rather than from a query by task name. To this end it may be necessary to implement the "storing tips in an accessible file" in this PR.

The implementation here would instead look something like:

tips = self._load_tips_from_dir(self.args.blueprint.tips_directory)

and that would be safe in the super.

For the tips saving method, in your review script I imagine something like:

save_dir = unit.get_task_run().args.blueprint.tips_directory
# write tips out to file

And then the default value for tips_directory should be interpolatable from ${task_dir}/tips.

Copy link
Contributor Author

@Etesam913 Etesam913 Jul 13, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am planning on keeping the tips file in the task's folder.

For example, the static_react_task_with_tips task should have its tips in the examples/static_react_task_with_tips folder. The run_task.py file seems to have a property called cfg.task_dir that gets the appropriate task path.

Is there anyway to get this cfg.task_dir in the blueprint.py file?

I looked into the get_dir_for_task() method in the dirs.py file, but that doesn't seem to work for my purposes.

I could maybe store it in the data directory instead, as that is accessible from the blueprint.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cfg.task_dir is a configuration value that is only available from a TaskConfig used from a run script. Adding the tips_directory argument (or similar) to the blueprint and then autofilling with ${task_dir}/... locks in that value at runtime and would ensure that future loads of the blueprint have the correct directory filled.

🔥 Removed get_all_state property
AssignmentState.REJECTED,
AssignmentState.SOFT_REJECTED,
]
return self.collect_matching_units_from_task_runs(task_runs, statuses)
Copy link
Contributor

@pringshia pringshia Jul 12, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i think a method where you just pass in the negation (filtering) would feel better here, e.g. get all tasks except those the ones which are the following status, e.g. EXPIRED, - but I'm just nit-picking, this is fine

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel like that would just make the code harder to understand as you would have to go to assignment_state.py to figure out which AssignmentState's are being accepted.

It wouldn't really make it more concise either as the code would become

filtered_statuses=[
  AssignmentState.CREATED,
  AssignmentState.LAUNCHED,
  AssignmentState.ASSIGNED,
  AssignmentState.MIXED,
  AssignmentState.EXPIRED  
]

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Generally for legibility, it could also be OK to put these collections inside of the AssignmentState class as class functions. AssignmentState.completed() and AssignmentState.final_agent() come to mind as relevant for these two lists at the moment.

✨ Allows for r easy transferring of tips from one task to another

⚡️ Task startup should be faster as no database queries are needed to get metadata
@Etesam913 Etesam913 requested review from pringshia and JackUrb July 13, 2022 20:41
Copy link
Contributor

@JackUrb JackUrb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know this has gotten a lot of rounds of feedback, but I think this should be the last one! Thanks for your patience here as we've been working through a new design.

@@ -7,6 +7,7 @@ mephisto:
blueprint:
task_source: ${task_dir}/webapp/build/bundle.js
link_task_source: true
tips_location: ${task_dir}/outputs/tips.csv
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@pringshia any thoughts on the directory name? I feel like outputs may push too far to the expectation that this is output data, but I do like having a container for possible assets to a run task.

Copy link
Contributor

@pringshia pringshia Jul 13, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, I do like this idea of having a "container" for possible assets. Outputs is sort of an overloaded term, since we use it elsewhere in the task json (i.e. the inputs and outputs field). So that's one reason I'd be hesitant to use outputs, unless you think it jives well with that?

I do like having a container for possible assets to a run task

Why not just call it "assets"? :D


Tangentially: I am a little confused by this direction.

Tips should be addable to any task, I'm not sure why the parlai blueprint needs to have "tips_location" defined as part of its blueprint.

This means that any other task that wants to add the Tips addon will need it's blueprint modified, add the Hydra config, etc. Does this seem modular enough? Ideally a mixin or something would be preferred? It should be easy to add Tips across blueprints.

Also this adds the Tips overhead to all parlai tasks. This doesn't seem like a lean and future-proof path.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I am adding it to the BlueprintArgs class instead to avoid the need to add the tips_location property itself to each blueprint.

This is causing a github actions testing error that I have to solve though.

mephisto/abstractions/blueprint.py Outdated Show resolved Hide resolved
@@ -229,7 +231,7 @@ def __init__(
self.args = args
self.shared_state = shared_state
self.frontend_task_config = shared_state.task_config

self.task_run = task_run
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any reason we're adding the task_run here?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: this is still here, but I don't see why it's added. Please remove.

@@ -127,6 +128,13 @@ class ParlAIChatBlueprintArgs(OnboardingRequiredArgs, BlueprintArgs):
"help": "Optional count of conversations to have if no context provided"
},
)
tips_location: str = field(
default=MISSING,
Copy link
Contributor

@JackUrb JackUrb Jul 13, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rather than default to MISSING which is giving the test error, use '${task_dir}/outputs/tips.csv' if this is what you want the default to be for these. If something isn't required we should be providing a default value.

It should also be included just in the base BlueprintArgs rather than needing to duplicate.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's what I originally kept it as, but I was getting the test error so I thought that setting it to MISSING would solve the error. It did not, I had to do something else. I will revert it back

Comment on lines 309 to 310
# Use overrides provided downstream
frontend_task_config.update(self.frontend_task_config)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We still need this, right?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't really know what it does. It looks completely redundant, but I'll add it back though.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

self.frontend_task_config may not actually be captured by super().get_frontend_task_config(), and importantly self.frontend_task_config is something we can set values for directly from the command line. This means, if the user wants to, by doing this at the end we can always ensure they have a path to force override something set in here.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

agree that the code looks a bit confusing. ideally if there's a way to capture the above explanation in source (whether via comments, more explicit variable names etc., or other) that would be good

@Etesam913 Etesam913 requested a review from JackUrb July 13, 2022 21:40
Copy link
Contributor

@JackUrb JackUrb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just one comment for a 1 line change. Otherwise this is good to go! Thanks for the patience and debugging on this one, I'm excited to see tips available in the coming release of Mephisto!

@Etesam913 Etesam913 merged commit 20f87e1 into main Jul 14, 2022
@JackUrb JackUrb deleted the add-tips-example branch July 15, 2022 18:53
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants