Metadata Survey Updates #460

Merged
merged 10 commits into from
Oct 4, 2022
2 changes: 2 additions & 0 deletions microsetta_private_api/celery_worker.py
@@ -2,8 +2,10 @@
from microsetta_private_api.celery_utils import celery, init_celery
from microsetta_private_api.util.vioscreen import refresh_headers
from microsetta_private_api.admin.daklapack_polling import poll_dak_orders
from microsetta_private_api.tasks import update_qiita_metadata
init_celery(celery, app.app)

# Run any celery tasks that require initialization on worker start
refresh_headers.delay() # Initialize the vioscreen task with a token
poll_dak_orders.delay() # check for orders
update_qiita_metadata.delay() # run Qiita metadata push
36 changes: 26 additions & 10 deletions microsetta_private_api/repo/metadata_repo/_repo.py
@@ -18,8 +18,14 @@
# the vioscreen survey currently cannot be fetched from the database
# there seems to be some detached survey IDs -- see 000089779
# that account has a long and unusual history though
TEMPLATES_TO_IGNORE = {10001, None}

# Adding the MyFoodRepo, Polyphenol FFQ, and Spain FFQs to the
# ignore list.
TEMPLATES_TO_IGNORE = {10001, 10002, 10003, 10004, None}

# TODO 2022-10-03
# Adding questions from Cooking Oils & Oxalate-rich Foods survey
# to ignore list as they don't exist in Qiita (OILS_*). We're blocked on
# pushing them, pending an update to Qiita's API.
EBI_REMOVE = ['ABOUT_YOURSELF_TEXT', 'ANTIBIOTIC_CONDITION',
'ANTIBIOTIC_MED', 'PM_NAME', 'PM_EMAIL',
'BIRTH_MONTH', 'CAT_CONTACT', 'CAT_LOCATION',
@@ -37,7 +43,10 @@
'COVID_SYMPTOMS_OTHER', 'FERMENTED_CONSUMED_OTHER',
'FERMENTED_OTHER', 'FERMENTED_PRODUCE_COMMERCIAL_OTHER',
'FERMENTED_PRODUCE_PERSONAL_OTHER',
'OTHER_ANIMALS_FREE_TEXT']
'OTHER_ANIMALS_FREE_TEXT', 'OILS_FREQUENCY_VEGETABLE',
'OILS_FREQUENCY_ANIMAL', 'OILS_FREQUENCY_OTHER',
'OILS_FREQUENCY_MARGARINE', 'OILS_FREQUENCY_OXALATE',
'OILS_FREQUENCY_SOY']
Member

Can a TODO note be added regarding when these were added, and that we're blocked pending an update on Qiita's API?

Collaborator Author


Done.
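
For readers unfamiliar with how an ignore list such as EBI_REMOVE is typically applied, here is a minimal sketch. It is not the project's `drop_private_columns`, whose body is not shown in this diff. It assumes matching is case-insensitive and that any column starting with `pm_` is private; `REMOVE`, `columns_to_drop`, and `private_prefix` are hypothetical names used only for illustration.

```python
# Hypothetical stand-in for EBI_REMOVE; only a few entries from the diff above.
REMOVE = ['ABOUT_YOURSELF_TEXT', 'OILS_FREQUENCY_VEGETABLE',
          'OILS_FREQUENCY_SOY']


def columns_to_drop(columns, remove=REMOVE, private_prefix='pm_'):
    """Return the columns that should not be pushed, ignoring case.

    Assumption: matching is case-insensitive and any column whose name
    starts with `private_prefix` is treated as private.
    """
    blocked = {name.lower() for name in remove}
    return [c for c in columns
            if c.lower() in blocked or c.lower().startswith(private_prefix)]


cols = ['pM_foo', 'okay', 'ABOUT_yourSELF_TEXT', 'oils_frequency_soy']
dropped = columns_to_drop(cols)
kept = [c for c in cols if c not in dropped]
```

With a pandas DataFrame, the same list could be passed to `df.drop(columns=dropped)`.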



def drop_private_columns(df):
@@ -177,24 +186,31 @@ def _fetch_survey_template(template_id):
-------
dict
The survey structure as returned from the private API
dict or None
string or None
Any error information associated with the retrieval. If an error is
observed, the survey responses should not be considered valid.
"""
with Transaction() as t:
error = None

survey_template_repo = SurveyTemplateRepo(t)
info = survey_template_repo.get_survey_template_link_info(
template_id)

# For local surveys, we generate the json representing the survey
survey_template = survey_template_repo.get_survey_template(
template_id, "en_US")
survey_template_text = vue_adapter.to_vue_schema(survey_template)
try:
survey_template = survey_template_repo.get_survey_template(
template_id, "en_US")
except NotFound as e:
error = repr(e)

if error is None:
survey_template_text = vue_adapter.to_vue_schema(survey_template)

info = info.to_api(None, None)
info['survey_template_text'] = survey_template_text
info = info.to_api(None, None)
info['survey_template_text'] = survey_template_text

return info, None
return info, error
Member


Does it work to return type Exception here? I don't know how flexible the system is with the return types, but it would be good to make sure that this does work. I think other areas of the code return dict or None.

Collaborator Author


I cast the Exception to string, which goes into an array in the calling function (if an error is actually returned).
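
A minimal sketch of the pattern described in this reply, under stated assumptions: the fetch function returns an `(info, error)` pair in which `error` is a string produced with `repr()`, and the caller appends any non-None error to a list. `LOCAL_TEMPLATES`, `fetch_template`, and `fetch_all` are hypothetical names, and the `NotFound` class is a local stand-in, not the one the project imports.

```python
class NotFound(Exception):
    """Stand-in for the NotFound exception caught in the diff."""


# Hypothetical in-memory store; the real code reads from the database.
LOCAL_TEMPLATES = {1: {'title': 'Primary'}}


def fetch_template(template_id):
    """Return (template, error); error is a string or None."""
    try:
        if template_id not in LOCAL_TEMPLATES:
            raise NotFound(f"No local template {template_id}")
        return LOCAL_TEMPLATES[template_id], None
    except NotFound as e:
        # repr() keeps the exception type in the resulting string
        return None, repr(e)


def fetch_all(template_ids):
    """Collect usable templates and any error strings separately."""
    templates, errors = {}, []
    for template_id in template_ids:
        template, error = fetch_template(template_id)
        if error is not None:
            errors.append(error)
            continue
        templates[template_id] = template
    return templates, errors


templates, errors = fetch_all([1, 10001])
```

Returning a string rather than the exception object keeps the documented return type simple ("string or None") and makes the collected errors easy to log or serialize.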



def _to_pandas_dataframe(metadatas, survey_templates):
8 changes: 8 additions & 0 deletions microsetta_private_api/repo/metadata_repo/tests/test_repo.py
@@ -167,6 +167,14 @@ def test_fetch_survey_template(self):
self.assertEqual(survey, exp)
self.assertEqual(errors, None)

def test_fetch_survey_template_remote(self):
# attempt to fetch info for Vioscreen survey
survey, errors = _fetch_survey_template(10001)

# verify that _fetch_survey_template returns an error, reflecting
# that it's a remote survey for which we can't extract local data
self.assertNotEqual(errors, None)
Member


Shouldn't errors be something different than None?

Collaborator Author


I'm asserting that errors isn't None in this case using assertNotEqual, which represents the intended behavior. We don't strictly care what the content of the error is, but having a non-None value there will prevent the metadata push from attempting to use that survey.

Member


Oh, ya read that fast :) thanks
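
As a side note on the misreading in this thread, `unittest` also offers `assertIsNotNone`, which performs the same kind of check as `assertNotEqual(errors, None)` but states the intent more directly. The sketch below assumes a stub in place of `_fetch_survey_template`; `fetch_template_stub` and its return values are hypothetical and exist only to make the example self-contained.

```python
import unittest


def fetch_template_stub(template_id):
    """Hypothetical stub: the remote template (10001) yields an error string."""
    if template_id == 10001:
        return None, "NotFound('remote survey')"
    return {'survey_template_id': template_id}, None


class FetchTemplateTests(unittest.TestCase):
    def test_remote_template_reports_error(self):
        _, error = fetch_template_stub(10001)
        # Equivalent in effect to assertNotEqual(error, None) here,
        # but harder to misread when skimming.
        self.assertIsNotNone(error)

    def test_local_template_has_no_error(self):
        _, error = fetch_template_stub(1)
        self.assertIsNone(error)


suite = unittest.defaultTestLoader.loadTestsFromTestCase(FetchTemplateTests)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```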


def test_drop_private_columns(self):
df = pd.DataFrame([[1, 2, 3], [4, 5, 6]],
columns=['pM_foo', 'okay', 'ABOUT_yourSELF_TEXT'])