Add example of GoldUnit usage #1211

Merged: 1 commit, Jul 9, 2024
46 changes: 39 additions & 7 deletions docs/web/docs/guides/how_to_use/worker_quality/using_golds.md
@@ -11,7 +11,7 @@ import Link from '@docusaurus/Link';

# Check against standards with Gold Labels

Gold labeling is commonly used to ensure worker quality over the full duration of a task. It's valuable as an automated measure to track the consistency of your workers. For this, Mephisto provides the `UseGoldUnit` blueprint mixin.


## Basic configuration
@@ -20,8 +20,10 @@ There are a few primary configuration parts for using gold units:
- Hydra args
  - `blueprint.gold_qualification_base`: A string used as the base name from which the qualifications that track gold success are built.
  - `blueprint.use_golds`: Set to `True` to enable the feature.
  - `blueprint.max_gold_units`: The maximum number of additional units you will pay out for evaluating on gold units. Note that you do pay for gold units; they are just like any other units.
  - `blueprint.min_golds`: An int for the minimum number of golds a worker needs to complete for the first time before receiving real units.
  - `blueprint.max_incorrect_golds`: An int for the number of golds a worker can get incorrect before being disqualified from this task.
  - `task.allowed_concurrent`: Must be `1`. This task type can only run with one concurrent unit at a time per worker, to ensure golds are completed in order.
- `GoldUnitSharedState`:
  - `get_gold_for_worker`: A factory that generates input data for a gold unit for a worker. Explained in-depth below.

@@ -36,10 +38,10 @@ def validate_gold_unit(unit: "Unit"):
    data = agent.state.get_data()
    return data['outputs']['val'] == gold_ans[data['inputs']['ans_key']]

shared_state = SharedStaticTaskState(
    ...
    get_gold_for_worker=get_gold_factory(gold_data),
    on_unit_submitted=UseGoldUnit.create_validation_function(cfg.mephisto, validate_gold_unit),
)
shared_state.qualifications += UseGoldUnit.get_mixin_qualifications(cfg.mephisto, shared_state)
...
@@ -51,12 +53,42 @@ The core functionality to provide to your `SharedTaskState` to enable gold units

We provide a helper `get_gold_factory` method which takes in a list of _all_ possible gold data inputs, and returns a factory that randomly selects a gold not yet completed by the given worker. This should be sufficient for most cases, though you can write your own factory if you want to be even more specific about how you assign golds.
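
As a rough illustration of the selection behaviour described above, a factory of this kind could be sketched as follows. This is not Mephisto's implementation: `make_gold_factory`, the `seen_by_worker` bookkeeping, the plain string worker id, and the choice to start over once every gold has been used are all assumptions made so that the sketch is self-contained.

```python
import random
from typing import Any, Callable, Dict, List, Set


def make_gold_factory(golds: List[Dict[str, Any]]) -> Callable[[str], Dict[str, Any]]:
    """Return a factory that picks a random gold the worker has not seen yet."""
    # Hypothetical bookkeeping: indices of golds already handed to each worker.
    seen_by_worker: Dict[str, Set[int]] = {}

    def get_gold_for_worker(worker_id: str) -> Dict[str, Any]:
        seen = seen_by_worker.setdefault(worker_id, set())
        remaining = [i for i in range(len(golds)) if i not in seen]
        if not remaining:
            # Assumption: once every gold has been used, start over.
            seen.clear()
            remaining = list(range(len(golds)))
        index = random.choice(remaining)
        seen.add(index)
        return golds[index]

    return get_gold_for_worker
```

A custom factory written along these lines could, for example, weight harder golds more heavily instead of choosing uniformly at random.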

## Example project

You can run an example project to try gold units for yourself.

```shell
docker-compose -f docker/docker-compose.dev.yml up
docker exec -it mephisto_dc bash
cd /mephisto/examples/form_composer_demo
python ./run_task_with_gold_unit.py
```

The first unit that you will see will be the gold one.
To get past these example gold units, provide these predefined values:

- `First name` - type "First"
- `Last name` - type "Last"
- `Email address for Mephisto` - type "[email protected]"
- `Country` - select "United States of America"
- `Language` - select "English" and "Spanish"
- `Biography since age of 18` - type a string that is at least 10 characters long, contains the word "Gold" and does not contain the word "Bad"

### Understanding the code

For an in-depth look at the code underlying this example, see these files in the `examples/form_composer_demo` directory:

- `run_task_with_gold_unit.py` - script to configure and launch this Task
- `hydra_configs/conf/example_local_mock_with_gold_unit.yaml` - YAML configuration for this Task
- `data/simple/gold_units/gold_units_data.json` - form configuration used specifically for gold units
- `data/simple/gold_units/gold_units_validation.py` - logic for validating a worker's output on the gold unit form

## Advanced configuration

There are additional arguments that you can use for more advanced configuration of gold units:
- `GoldUnitSharedState`:
  - `worker_needs_gold`: A function that, given the counts of completed, correct, and incorrect golds for a given worker, as well as the minimum number of required golds, returns whether or not the worker should be shown a gold task.
  - `worker_qualifies`: A function that, given the counts of completed, correct, and incorrect golds for a given worker, as well as the maximum number of incorrect golds allowed, returns whether or not the worker is eligible to work on the task.
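
To make the two hooks above more concrete, here is a minimal sketch of what such functions could look like. The parameter names, the spot-check interval of one gold per 10 units, and the reading of the first count as the total number of units the worker has completed are assumptions for illustration, not Mephisto's defaults.

```python
def worker_needs_gold(completed: int, correct: int, incorrect: int, min_golds: int) -> bool:
    """Decide whether the next unit shown to this worker should be a gold."""
    # Keep showing golds until the worker has passed the minimum number.
    if correct < min_golds:
        return True
    # Assumed spot-check policy: one extra gold for every 10 completed units.
    golds_seen = correct + incorrect
    return golds_seen < min_golds + completed // 10


def worker_qualifies(completed: int, correct: int, incorrect: int, max_incorrect_golds: int) -> bool:
    """Decide whether the worker may keep working on the task."""
    # Assumption: the worker is disqualified only after exceeding the limit.
    return incorrect <= max_incorrect_golds
```

Under this policy a worker with one correct gold and `min_golds=1` sees no further golds until 10 units are completed, at which point one more gold is scheduled.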

### `worker_needs_gold`
1 change: 1 addition & 0 deletions examples/form_composer_demo/README.md
@@ -11,6 +11,7 @@ These form-based questionnaires are example of FormComposer task generator.
- Dynamic form: `cd /mephisto/examples/form_composer_demo && python ./run_task_dynamic.py`
- Dynamic form with Prolific on EC2: `cd /mephisto/examples/form_composer_demo && python ./run_task_dynamic_ec2_prolific.py`
- Dynamic form with Mturk on EC2: `cd /mephisto/examples/form_composer_demo && python ./run_task_dynamic_ec2_mturk_sandbox.py`
- Simple form with Gold Units: `cd /mephisto/examples/form_composer_demo && python ./run_task_with_gold_unit.py`

---

@@ -0,0 +1,177 @@
[
  {
    "expecting_answers": {
      "name_first": "First",
      "name_last": "Last",
      "email": "[email protected]",
      "country": "USA",
      "language": ["en", "es"],
      "bio": "custom validation"
    },
    "form": {
      "title": "Form example (Gold)",
      "instruction": "Please answer all questions to the best of your ability as part of our study.",
      "sections": [
        {
          "name": "section_about",
          "title": "About you",
          "instruction": "Please introduce yourself. We would like to know more about your background, personal information, etc.",
          "fieldsets": [
            {
              "title": "Personal information",
              "instruction": "",
              "rows": [
                {
                  "fields": [
                    {
                      "help": "",
                      "id": "id_name_first",
                      "label": "First name",
                      "name": "name_first",
                      "placeholder": "Type first name",
                      "tooltip": "Your first name",
                      "type": "input",
                      "validators": {
                        "required": true,
                        "minLength": 2,
                        "maxLength": 20
                      },
                      "value": ""
                    },
                    {
                      "help": "Optional",
                      "id": "id_name_last",
                      "label": "Last name",
                      "name": "name_last",
                      "placeholder": "Type last name",
                      "tooltip": "Your last name",
                      "type": "input",
                      "validators": { "required": true },
                      "value": ""
                    }
                  ],
                  "help": "Please use your legal name"
                },
                {
                  "fields": [
                    {
                      "help": "We may contact you later for additional information",
                      "id": "id_email",
                      "label": "Email address for Mephisto",
                      "name": "email",
                      "placeholder": "[email protected]",
                      "tooltip": "Email address for Mephisto",
                      "type": "email",
                      "validators": {
                        "required": true,
                        "regexp": ["^[a-zA-Z0-9._-]+@mephisto\\.ai$", "ig"]
                      },
                      "value": ""
                    }
                  ]
                }
              ]
            },
            {
              "title": "Cultural background",
              "instruction": "Please tell us about your cultural affiliations and values that you use in your daily life.",
              "rows": [
                {
                  "fields": [
                    {
                      "help": "Select country of your residence",
                      "id": "id_country",
                      "label": "Country",
                      "multiple": false,
                      "name": "country",
                      "options": [
                        {
                          "label": "---",
                          "value": ""
                        },
                        {
                          "label": "United States of America",
                          "value": "USA"
                        },
                        {
                          "label": "Canada",
                          "value": "CAN"
                        }
                      ],
                      "placeholder": "",
                      "tooltip": "Country",
                      "type": "select",
                      "validators": { "required": true },
                      "value": ""
                    },
                    {
                      "help": "Select language spoken in your local community",
                      "id": "id_language",
                      "label": "Language",
                      "multiple": true,
                      "name": "language",
                      "options": [
                        {
                          "label": "English",
                          "value": "en"
                        },
                        {
                          "label": "French",
                          "value": "fr"
                        },
                        {
                          "label": "Spanish",
                          "value": "es"
                        },
                        {
                          "label": "Chinese",
                          "value": "ch"
                        }
                      ],
                      "placeholder": "",
                      "tooltip": "Language",
                      "type": "select",
                      "validators": {
                        "required": true,
                        "minLength": 2,
                        "maxLength": 3
                      },
                      "value": ""
                    }
                  ]
                }
              ],
              "help": "This information will help us compile study statistics"
            },
            {
              "title": "Additional information",
              "instruction": "Optional details about you. You can fill out what you are most comfortable with.",
              "rows": [
                {
                  "fields": [
                    {
                      "help": "",
                      "id": "id_bio",
                      "label": "Biography since age of 18",
                      "name": "bio",
                      "placeholder": "",
                      "tooltip": "Your bio in a few paragraphs",
                      "type": "textarea",
                      "validators": { "required": false },
                      "value": ""
                    }
                  ]
                }
              ],
              "help": "Some additional details about your persona"
            }
          ]
        }
      ],
      "submit_button": {
        "text": "Submit",
        "tooltip": "Submit form"
      }
    }
  }
]
@@ -0,0 +1,95 @@
from typing import Any
from typing import Callable
from typing import List
from typing import Optional

from mephisto.data_model.unit import Unit


ValidationFuncType = Callable[[Any, Optional[Any]], bool]


def _simple_comparing(worker_value: Any, correct_value: Optional[Any]) -> bool:
    if correct_value is None:
        # Just skip if there's no value, we do not validate this field at all
        return True

    return worker_value == correct_value


def _validate_name_first(worker_value: Any, correct_value: Optional[Any]) -> bool:
    return _simple_comparing(worker_value, correct_value)


def _validate_name_last(worker_value: Any, correct_value: Optional[Any]) -> bool:
    return _simple_comparing(worker_value, correct_value)


def _validate_email(worker_value: Any, correct_value: Optional[Any]) -> bool:
    return _simple_comparing(worker_value, correct_value)


def _validate_country(worker_value: Any, correct_value: Optional[Any]) -> bool:
    return _simple_comparing(worker_value, correct_value)


def _validate_language(worker_value: Any, correct_value: Optional[Any]) -> bool:
    return _simple_comparing(worker_value, correct_value)


def _validate_bio(worker_value: Any, correct_value: Optional[Any]) -> bool:
    # Custom more complicated logic
    if len(worker_value) < 10:
        return False

    if "Gold" not in worker_value:
        return False

    if "Bad" in worker_value:
        return False

    return True


FIELD_VALIDATOR_MAPPINGS = {
    "name_first": _validate_name_first,
    "name_last": _validate_name_last,
    "email": _validate_email,
    "country": _validate_country,
    "language": _validate_language,
    "bio": _validate_bio,
}


def validate_gold_unit(unit: "Unit") -> bool:
    agent = unit.get_assigned_agent()
    data = agent.state.get_data()

    worker_answers = data["outputs"]

    expecting_answers: dict = data["inputs"]["expecting_answers"]

    validated_fields: List[bool] = []

    for fieldname, correct_value in expecting_answers.items():
        # No correct value set for this field, they pass validation
        if correct_value is None:
            validated_fields.append(True)
            continue

        # No validation function set for this field, they pass validation
        validation_func: Optional[ValidationFuncType] = FIELD_VALIDATOR_MAPPINGS.get(fieldname)
        if not validation_func:
            validated_fields.append(True)
            continue

        # No worker answer for this field, they fail validation
        worker_value = worker_answers.get(fieldname)
        if not worker_value:
            validated_fields.append(False)
            continue

        validation_result = validation_func(worker_value, correct_value)
        validated_fields.append(validation_result)

    return all(validated_fields)
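
The per-field dispatch used by this validator can be tried without a running task by passing plain dictionaries in place of the agent state. The sketch below is a simplified stand-in rather than the file itself: `validate_answers`, `VALIDATORS`, `_equals`, and the sample data in the usage note are hypothetical names introduced for illustration.

```python
from typing import Any, Callable, Dict

Validator = Callable[[Any, Any], bool]


def _equals(worker_value: Any, correct_value: Any) -> bool:
    return worker_value == correct_value


def _validate_bio(worker_value: Any, correct_value: Any) -> bool:
    # Same rule as the gold form: at least 10 chars, contains "Gold", no "Bad".
    return len(worker_value) >= 10 and "Gold" in worker_value and "Bad" not in worker_value


# Hypothetical mapping from field name to its validator.
VALIDATORS: Dict[str, Validator] = {
    "name_first": _equals,
    "language": _equals,
    "bio": _validate_bio,
}


def validate_answers(outputs: Dict[str, Any], expected: Dict[str, Any]) -> bool:
    for fieldname, correct_value in expected.items():
        if correct_value is None:
            continue  # nothing expected for this field, so it passes
        validator = VALIDATORS.get(fieldname)
        if validator is None:
            continue  # no validator registered, so it passes
        worker_value = outputs.get(fieldname)
        if not worker_value:
            return False  # a missing or empty answer fails the gold
        if not validator(worker_value, correct_value):
            return False
    return True
```

For instance, with `expected = {"name_first": "First", "bio": "custom validation"}`, an output of `{"name_first": "First", "bio": "Gold standard answer"}` passes, while the same output with the bio omitted fails because an expected field has no worker answer.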
@@ -0,0 +1,32 @@
#@package _global_

# Copyright (c) Meta Platforms and its affiliates.
# This source code is licensed under the MIT license found in the
# LICENSE file in the root directory of this source tree.

defaults:
  - /mephisto/blueprint: static_react_task
  - /mephisto/architect: local
  - /mephisto/provider: mock

mephisto:
  blueprint:
    data_json: ${task_dir}/data/simple/task_data.json
    task_source: ${task_dir}/webapp/build/bundle.js
    task_source_review: ${task_dir}/webapp/build/bundle.review.js
    link_task_source: false
    extra_source_dir: ${task_dir}/webapp/src/static
    units_per_assignment: 2
    gold_qualification_base: "gold_qualification" # Required for Gold Units
    use_golds: true # Required for Gold Units
    min_golds: 1 # Required for Gold Units
    max_incorrect_golds: 1 # Required for Gold Units
    max_gold_units: 1 # Required for Gold Units
  task:
    allowed_concurrent: 1 # Required for Gold Units
    task_name: "Sample Questionnaire"
    task_title: "Example how to easily create simple form-based Tasks"
    task_description: "In this Task, we use FormComposer feature."
    task_reward: 0
    task_tags: "test,simple,form,form-composer"
    force_rebuild: true
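
The settings marked "Required for Gold Units" above can be checked before launching a task. The following is only a sketch that works on a plain dictionary mirroring the `mephisto` section of this config; `find_gold_config_problems` and `REQUIRED_BLUEPRINT_KEYS` are hypothetical helpers, not part of Mephisto or Hydra.

```python
from typing import Any, Dict, List

# Blueprint keys that the config above marks as required for gold units.
REQUIRED_BLUEPRINT_KEYS = (
    "gold_qualification_base",
    "use_golds",
    "min_golds",
    "max_incorrect_golds",
    "max_gold_units",
)


def find_gold_config_problems(mephisto_cfg: Dict[str, Any]) -> List[str]:
    """Return human-readable problems; an empty list means the checks passed."""
    problems: List[str] = []
    blueprint = mephisto_cfg.get("blueprint", {})
    task = mephisto_cfg.get("task", {})
    for key in REQUIRED_BLUEPRINT_KEYS:
        if key not in blueprint:
            problems.append(f"missing blueprint.{key}")
    if blueprint.get("use_golds") is not True:
        problems.append("blueprint.use_golds must be true")
    if task.get("allowed_concurrent") != 1:
        problems.append("task.allowed_concurrent must be 1")
    return problems
```

A launch script could call such a helper on the resolved config and stop early if the returned list is not empty.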