Added example of GoldUnit usage, updated the docs
Showing 9 changed files with 540 additions and 62 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -11,7 +11,7 @@ import Link from '@docusaurus/Link'; | |
|
||
# Check against standards with Gold Labels | ||
|
||
Gold labeling is commonly used for ensuring worker quality over the full duration of a task. It's valuable as an automated measure to track the consistency of your workers. For this, Mephisto provides the `UseGoldUnit` blueprint mixin. | ||
Gold labeling is commonly used for ensuring worker quality over the full duration of a task. It's valuable as an automated measure to track the consistency of your workers. For this, Mephisto provides the `UseGoldUnit` blueprint mixin. | ||
|
||
|
||
## Basic configuration | ||
|
@@ -20,8 +20,10 @@ There are a few primary configuration parts for using gold units: | |
- Hydra args | ||
- `blueprint.gold_qualification_base`: A string used as the base name from which the qualifications that track gold unit success are built. | ||
- `blueprint.use_golds`: Set to `True` to enable the feature. | ||
- `min_golds`: An int for the minimum number of golds a worker needs to complete for the first time before receiving real units. | ||
- `max_incorrect_golds`: An int for the number of golds a worker can get incorrect before being disqualified from this task. | ||
- `blueprint.max_gold_units`: The maximum number of additional units you will pay out for evaluating on gold units. Note that you do pay for gold units; they are just like any other units. | ||
- `blueprint.min_golds`: An int for the minimum number of golds a worker needs to complete for the first time before receiving real units. | ||
- `blueprint.max_incorrect_golds`: An int for the number of golds a worker can get incorrect before being disqualified from this task. | ||
- `task.allowed_concurrent`: Must be `1`. This task type can only run with one concurrent unit at a time per worker, to ensure golds are completed in order. | ||
- `GoldUnitSharedState`: | ||
- `get_gold_for_worker`: A factory that generates input data for a gold unit for a worker. Explained in-depth below. | ||
|
||
|
@@ -36,10 +38,10 @@ def validate_gold_unit(unit: "Unit"): | |
data = agent.state.get_data() | ||
return data['outputs']['val'] == gold_ans[data['inputs']['ans_key']] | ||
|
||
shared_state = SharedTaskState( | ||
shared_state = SharedStaticTaskState( | ||
... | ||
get_gold_for_worker=get_gold_factory(gold_data) | ||
on_unit_submitted=UseGoldUnit.create_validation_function(cfg.mephisto, validate_gold_unit) | ||
get_gold_for_worker=get_gold_factory(gold_data), | ||
on_unit_submitted=UseGoldUnit.create_validation_function(cfg.mephisto, validate_gold_unit), | ||
) | ||
shared_state.qualifications += UseGoldUnit.get_mixin_qualifications(cfg.mephisto, shared_state) | ||
... | ||
|
@@ -51,12 +53,42 @@ The core functionality to provide to your `SharedTaskState` to enable gold units | |
|
||
We provide a helper `get_gold_factory` method which takes in a list of _all_ possible gold data inputs, and returns a factory that randomly selects a gold not yet completed by the given worker. This should be sufficient for most cases, though you can write your own factory if you want to be even more specific about how you assign golds. | ||
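To illustrate the idea behind such a factory, here is a minimal, self-contained sketch. It does not use Mephisto's API: `make_gold_factory`, the `completed_by_worker` dictionary, and the `worker_id` argument are hypothetical stand-ins, and the real `get_gold_factory` may track completed golds differently.

```python
import random
from typing import Any, Callable, Dict, List, Set


def make_gold_factory(
    golds: List[Dict[str, Any]],
    completed_by_worker: Dict[str, Set[int]],
) -> Callable[[str], Dict[str, Any]]:
    """Return a factory that picks a random gold the worker has not seen yet.

    Hypothetical sketch: `completed_by_worker` maps a worker id to the set of
    gold indices already handed to that worker.
    """

    def get_gold_for_worker(worker_id: str) -> Dict[str, Any]:
        seen = completed_by_worker.setdefault(worker_id, set())
        remaining = [i for i in range(len(golds)) if i not in seen]
        if not remaining:
            # Assumption: once every gold has been used, allow repeats.
            remaining = list(range(len(golds)))
        choice = random.choice(remaining)
        # Simplification: mark the gold as seen when it is handed out,
        # not when the worker actually submits it.
        seen.add(choice)
        return golds[choice]

    return get_gold_for_worker


gold_data = [{"ans_key": "a"}, {"ans_key": "b"}, {"ans_key": "c"}]
factory = make_gold_factory(gold_data, {})
first_three = [factory("worker_1")["ans_key"] for _ in range(3)]
```

With three golds available, the first three calls for the same worker return each gold exactly once, in random order. A custom factory could replace the random choice with any other assignment rule.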
|
||
## Example project | ||
|
||
You can run an example project to try gold units for yourself. | ||
|
||
```shell | ||
docker-compose -f docker/docker-compose.dev.yml up | ||
docker exec -it mephisto_dc bash | ||
cd /mephisto/examples/form_composer_demo | ||
python ./run_task_with_gold_unit.py | ||
``` | ||
|
||
The first unit that you will see will be the gold one. | ||
To get past these example gold units, provide these predefined values: | ||
|
||
- `First name` - type "First" | ||
- `Last name` - type "Last" | ||
- `Email address for Mephisto` - type "[email protected]" | ||
- `Country` - select "United States of America" | ||
- `Language` - select "English" and "Spanish" | ||
- `Biography since age of 18` - type a string that is at least 10 characters long, contains the word "Gold", and does not contain the word "Bad" | ||
|
||
### Understanding the code | ||
|
||
For an in-depth look at the code underlying this example, you can read these files in the `examples/form_composer_demo` directory: | ||
|
||
- `run_task_with_gold_unit.py` - script that configures and launches this Task | ||
- `hydra_configs/conf/example_local_mock_with_gold_unit.yaml` - YAML configuration for this Task | ||
- `data/simple/gold_units/gold_units_data.json` - configuration of the form used specifically for gold units | ||
- `data/simple/gold_units/gold_units_validation.py` - logic for validating the worker's output in the gold unit form | ||
|
||
## Advanced configuration | ||
|
||
There are additional arguments that you can use for more advanced configuration of gold units: | ||
There are a few primary configuration parts for using gold units: | ||
- `GoldUnitSharedState`: | ||
- `worker_needs_gold`: A function that, given the counts of completed, correct, and incorrect golds for a given worker, as well as the minimum number of required golds, returns whether or not the worker should be shown a gold task. | ||
- `worker_needs_gold`: A function that, given the counts of completed, correct, and incorrect golds for a given worker, as well as the minimum number of required golds, returns whether or not the worker should be shown a gold task. | ||
- `worker_qualifies`: A function that, given the counts of completed, correct, and incorrect golds for a given worker, as well as the maximum number of incorrect, returns whether or not the worker is eligible to work on the task. | ||
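As a rough illustration of the two hooks described above, the following sketch implements the simplest possible policies. The parameter names and the exact boundary conditions are assumptions made for this example, not Mephisto's defaults.

```python
def worker_needs_gold(
    num_completed: int,
    num_correct: int,
    num_incorrect: int,
    min_golds: int,
) -> bool:
    """Hypothetical policy: keep showing golds until `min_golds` are correct."""
    return num_correct < min_golds


def worker_qualifies(
    num_completed: int,
    num_correct: int,
    num_incorrect: int,
    max_incorrect_golds: int,
) -> bool:
    """Hypothetical policy: disqualify once incorrect golds exceed the limit.

    Assumption: a worker with exactly `max_incorrect_golds` mistakes is
    still allowed to continue.
    """
    return num_incorrect <= max_incorrect_golds
```

A more elaborate `worker_needs_gold` could, for instance, keep mixing in occasional golds as `num_completed` grows instead of stopping after the minimum is reached.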
|
||
### `worker_needs_gold` | ||
|
177 changes: 177 additions & 0 deletions
examples/form_composer_demo/data/simple/gold_units/gold_units_data.json
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,177 @@ | ||
[ | ||
{ | ||
"expecting_answers": { | ||
"name_first": "First", | ||
"name_last": "Last", | ||
"email": "[email protected]", | ||
"country": "USA", | ||
"language": ["en", "es"], | ||
"bio": "custom validation" | ||
}, | ||
"form": { | ||
"title": "Form example (Gold)", | ||
"instruction": "Please answer all questions to the best of your ability as part of our study.", | ||
"sections": [ | ||
{ | ||
"name": "section_about", | ||
"title": "About you", | ||
"instruction": "Please introduce yourself. We would like to know more about your background, personal information, etc.", | ||
"fieldsets": [ | ||
{ | ||
"title": "Personal information", | ||
"instruction": "", | ||
"rows": [ | ||
{ | ||
"fields": [ | ||
{ | ||
"help": "", | ||
"id": "id_name_first", | ||
"label": "First name", | ||
"name": "name_first", | ||
"placeholder": "Type first name", | ||
"tooltip": "Your first name", | ||
"type": "input", | ||
"validators": { | ||
"required": true, | ||
"minLength": 2, | ||
"maxLength": 20 | ||
}, | ||
"value": "" | ||
}, | ||
{ | ||
"help": "Optional", | ||
"id": "id_name_last", | ||
"label": "Last name", | ||
"name": "name_last", | ||
"placeholder": "Type last name", | ||
"tooltip": "Your last name", | ||
"type": "input", | ||
"validators": { "required": true }, | ||
"value": "" | ||
} | ||
], | ||
"help": "Please use your legal name" | ||
}, | ||
{ | ||
"fields": [ | ||
{ | ||
"help": "We may contact you later for additional information", | ||
"id": "id_email", | ||
"label": "Email address for Mephisto", | ||
"name": "email", | ||
"placeholder": "[email protected]", | ||
"tooltip": "Email address for Mephisto", | ||
"type": "email", | ||
"validators": { | ||
"required": true, | ||
"regexp": ["^[a-zA-Z0-9._-]+@mephisto\\.ai$", "ig"] | ||
}, | ||
"value": "" | ||
} | ||
] | ||
} | ||
] | ||
}, | ||
{ | ||
"title": "Cultural background", | ||
"instruction": "Please tell us about your cultural affiliations and values that you use in your daily life.", | ||
"rows": [ | ||
{ | ||
"fields": [ | ||
{ | ||
"help": "Select country of your residence", | ||
"id": "id_country", | ||
"label": "Country", | ||
"multiple": false, | ||
"name": "country", | ||
"options": [ | ||
{ | ||
"label": "---", | ||
"value": "" | ||
}, | ||
{ | ||
"label": "United States of America", | ||
"value": "USA" | ||
}, | ||
{ | ||
"label": "Canada", | ||
"value": "CAN" | ||
} | ||
], | ||
"placeholder": "", | ||
"tooltip": "Country", | ||
"type": "select", | ||
"validators": { "required": true }, | ||
"value": "" | ||
}, | ||
{ | ||
"help": "Select language spoken in your local community", | ||
"id": "id_language", | ||
"label": "Language", | ||
"multiple": true, | ||
"name": "language", | ||
"options": [ | ||
{ | ||
"label": "English", | ||
"value": "en" | ||
}, | ||
{ | ||
"label": "French", | ||
"value": "fr" | ||
}, | ||
{ | ||
"label": "Spanish", | ||
"value": "es" | ||
}, | ||
{ | ||
"label": "Chinese", | ||
"value": "ch" | ||
} | ||
], | ||
"placeholder": "", | ||
"tooltip": "Language", | ||
"type": "select", | ||
"validators": { | ||
"required": true, | ||
"minLength": 2, | ||
"maxLength": 3 | ||
}, | ||
"value": "" | ||
} | ||
] | ||
} | ||
], | ||
"help": "This information will help us compile study statistics" | ||
}, | ||
{ | ||
"title": "Additional information", | ||
"instruction": "Optional details about you. You can fill out what you are most comfortable with.", | ||
"rows": [ | ||
{ | ||
"fields": [ | ||
{ | ||
"help": "", | ||
"id": "id_bio", | ||
"label": "Biography since age of 18", | ||
"name": "bio", | ||
"placeholder": "", | ||
"tooltip": "Your bio in a few paragraphs", | ||
"type": "textarea", | ||
"validators": { "required": false }, | ||
"value": "" | ||
} | ||
] | ||
} | ||
], | ||
"help": "Some additional details about your persona" | ||
} | ||
] | ||
} | ||
], | ||
"submit_button": { | ||
"text": "Submit", | ||
"tooltip": "Submit form" | ||
} | ||
} | ||
} | ||
] |
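Every key in `expecting_answers` is meant to match the `name` of a field in the form. The following sketch shows one way to check that correspondence before launching a task. The helper names `collect_field_names` and `unknown_expected_keys` are hypothetical, and the sample data is a trimmed stand-in for the file above.

```python
from typing import Any, Dict, List, Set


def collect_field_names(form: Dict[str, Any]) -> Set[str]:
    """Walk sections -> fieldsets -> rows -> fields and gather field names."""
    names: Set[str] = set()
    for section in form.get("sections", []):
        for fieldset in section.get("fieldsets", []):
            for row in fieldset.get("rows", []):
                for field in row.get("fields", []):
                    names.add(field["name"])
    return names


def unknown_expected_keys(gold_unit: Dict[str, Any]) -> List[str]:
    """Return expected-answer keys that have no matching form field."""
    field_names = collect_field_names(gold_unit["form"])
    return sorted(k for k in gold_unit["expecting_answers"] if k not in field_names)


# Trimmed, illustrative gold unit; "nickname" has no matching field.
sample_gold_unit = {
    "expecting_answers": {"name_first": "First", "country": "USA", "nickname": "x"},
    "form": {
        "sections": [
            {"fieldsets": [{"rows": [{"fields": [
                {"name": "name_first"},
                {"name": "country"},
            ]}]}]}
        ]
    },
}
problems = unknown_expected_keys(sample_gold_unit)
```

Here `problems` contains only `"nickname"`, the one expected answer that the form cannot produce.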
95 changes: 95 additions & 0 deletions
examples/form_composer_demo/data/simple/gold_units/gold_units_validation.py
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,95 @@ | ||
from typing import Any
from typing import Callable
from typing import List
from typing import Optional

from mephisto.data_model.unit import Unit


ValidationFuncType = Callable[[Any, Optional[Any]], bool]


def _simple_comparing(worker_value: Any, correct_value: Optional[Any]) -> bool:
    if correct_value is None:
        # Just skip if there's no value, we do not validate this field at all
        return True

    return worker_value == correct_value


def _validate_name_first(worker_value: Any, correct_value: Optional[Any]) -> bool:
    return _simple_comparing(worker_value, correct_value)


def _validate_name_last(worker_value: Any, correct_value: Optional[Any]) -> bool:
    return _simple_comparing(worker_value, correct_value)


def _validate_email(worker_value: Any, correct_value: Optional[Any]) -> bool:
    return _simple_comparing(worker_value, correct_value)


def _validate_country(worker_value: Any, correct_value: Optional[Any]) -> bool:
    return _simple_comparing(worker_value, correct_value)


def _validate_language(worker_value: Any, correct_value: Optional[Any]) -> bool:
    return _simple_comparing(worker_value, correct_value)


def _validate_bio(worker_value: Any, correct_value: Optional[Any]) -> bool:
    # Custom more complicated logic
    if len(worker_value) < 10:
        return False

    if "Gold" not in worker_value:
        return False

    if "Bad" in worker_value:
        return False

    return True


FIELD_VALIDATOR_MAPPINGS = {
    "name_first": _validate_name_first,
    "name_last": _validate_name_last,
    "email": _validate_email,
    "country": _validate_country,
    "language": _validate_language,
    "bio": _validate_bio,
}


def validate_gold_unit(unit: "Unit") -> bool:
    agent = unit.get_assigned_agent()
    data = agent.state.get_data()

    worker_answers = data["outputs"]

    expecting_answers: dict = data["inputs"]["expecting_answers"]

    validated_fields: List[bool] = []

    for fieldname, correct_value in expecting_answers.items():
        # No correct value set for this field, they pass validation
        if correct_value is None:
            validated_fields.append(True)
            continue

        # No validation function set for this field, they pass validation
        validation_func: Optional[ValidationFuncType] = FIELD_VALIDATOR_MAPPINGS.get(fieldname)
        if not validation_func:
            validated_fields.append(True)
            continue

        # No worker answer for this field, they fail validation
        worker_value = worker_answers.get(fieldname)
        if not worker_value:
            validated_fields.append(False)
            continue

        validation_result = validation_func(worker_value, correct_value)
        validated_fields.append(validation_result)

    return all(validated_fields)
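To see how this per-field validation behaves without launching a task, the sketch below mirrors the same rules on plain dictionaries. It is a simplified stand-in, not the file above: `validate_answers` and the reduced `VALIDATORS` table are hypothetical names, and no Mephisto `Unit` is involved.

```python
from typing import Any, Callable, Dict, Optional


def simple_comparing(worker_value: Any, correct_value: Optional[Any]) -> bool:
    # A missing expected value means the field is not validated at all.
    return correct_value is None or worker_value == correct_value


def validate_bio(worker_value: Any, correct_value: Optional[Any]) -> bool:
    # At least 10 characters, must contain "Gold", must not contain "Bad".
    return len(worker_value) >= 10 and "Gold" in worker_value and "Bad" not in worker_value


VALIDATORS: Dict[str, Callable[[Any, Optional[Any]], bool]] = {
    "name_first": simple_comparing,
    "language": simple_comparing,
    "bio": validate_bio,
}


def validate_answers(outputs: Dict[str, Any], expecting_answers: Dict[str, Any]) -> bool:
    for fieldname, correct_value in expecting_answers.items():
        validator = VALIDATORS.get(fieldname)
        # Fields without an expected value or a validator pass by default.
        if correct_value is None or validator is None:
            continue
        worker_value = outputs.get(fieldname)
        # An empty or missing answer fails; otherwise defer to the validator.
        if not worker_value or not validator(worker_value, correct_value):
            return False
    return True


expected = {"name_first": "First", "language": ["en", "es"], "bio": "custom validation"}
good = {"name_first": "First", "language": ["en", "es"], "bio": "My Gold biography"}
bad_bio = dict(good, bio="Bad Gold biography")
reordered = dict(good, language=["es", "en"])
```

Note that list answers are compared with `==`, so in this sketch `reordered` fails even though it selects the same two languages. Whether the form always returns selections in option order is not something this example assumes.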
32 changes: 32 additions & 0 deletions
examples/form_composer_demo/hydra_configs/conf/example_local_mock_with_gold_unit.yaml
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,32 @@ | ||
#@package _global_

# Copyright (c) Meta Platforms and its affiliates.
# This source code is licensed under the MIT license found in the
# LICENSE file in the root directory of this source tree.

defaults:
  - /mephisto/blueprint: static_react_task
  - /mephisto/architect: local
  - /mephisto/provider: mock

mephisto:
  blueprint:
    data_json: ${task_dir}/data/simple/task_data.json
    task_source: ${task_dir}/webapp/build/bundle.js
    task_source_review: ${task_dir}/webapp/build/bundle.review.js
    link_task_source: false
    extra_source_dir: ${task_dir}/webapp/src/static
    units_per_assignment: 2
    gold_qualification_base: "gold_qualification" # Required for Gold Units
    use_golds: true # Required for Gold Units
    min_golds: 1 # Required for Gold Units
    max_incorrect_golds: 1 # Required for Gold Units
    max_gold_units: 1 # Required for Gold Units
  task:
    allowed_concurrent: 1 # Required for Gold Units
    task_name: "Sample Questionnaire"
    task_title: "Example how to easily create simple form-based Tasks"
    task_description: "In this Task, we use FormComposer feature."
    task_reward: 0
    task_tags: "test,simple,form,form-composer"
    force_rebuild: true
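The settings marked "Required for Gold Units" can be checked up front. The sketch below works on a plain dictionary shaped like the `mephisto` section of this config. The helper `gold_config_problems` is hypothetical and is not part of Mephisto or Hydra.

```python
from typing import Any, Dict, List

# Keys the config above marks as required for gold units.
REQUIRED_BLUEPRINT_KEYS = (
    "gold_qualification_base",
    "use_golds",
    "min_golds",
    "max_incorrect_golds",
    "max_gold_units",
)


def gold_config_problems(mephisto_cfg: Dict[str, Any]) -> List[str]:
    """Return human-readable problems with the gold unit settings."""
    problems: List[str] = []
    blueprint = mephisto_cfg.get("blueprint", {})
    task = mephisto_cfg.get("task", {})
    for key in REQUIRED_BLUEPRINT_KEYS:
        if key not in blueprint:
            problems.append(f"blueprint.{key} is missing")
    if blueprint.get("use_golds") is not True:
        problems.append("blueprint.use_golds must be true")
    if task.get("allowed_concurrent") != 1:
        problems.append("task.allowed_concurrent must be 1")
    return problems


example_cfg = {
    "blueprint": {
        "gold_qualification_base": "gold_qualification",
        "use_golds": True,
        "min_golds": 1,
        "max_incorrect_golds": 1,
        "max_gold_units": 1,
    },
    "task": {"allowed_concurrent": 1},
}
no_problems = gold_config_problems(example_cfg)
```

For the values used in this example config the list of problems is empty. Raising `allowed_concurrent` above 1 would be reported as a single problem.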