Metadata Survey Updates #460

cassidysymons · 2022-09-22T15:50:49Z

Updating the Metadata repo to reflect recent survey changes:

Adding MyFoodRepo, the Polyphenol FFQ, and the Spain FFQ to the ignored survey templates, as they're external surveys and can't be retrieved from the database.
Adding all of the fields in the locally hosted Cooking Oils & Oxalate-rich Foods survey to the EBI_REMOVE constant, as they don't exist in Qiita.

Updating the Metadata repo to reflect recent survey changes: 1) Adding MyFoodRepo, the Polyphenol FFQ, and the Spain FFQ to the ignored survey templates, as they're external surveys and can't be retrieved from the database. 2) Adding all of the fields in the locally hosted Cooking Oils & Oxalate-rich Foods survey to the EBI_REMOVE constant, as they don't exist in Qiita.

wasade · 2022-09-22T15:54:21Z

Can we add the fields to Qiita so they are accessible for analysis?

cassidysymons · 2022-09-22T16:00:44Z

Can we add the fields to Qiita so they are accessible for analysis?

I'm all for that as long as it won't cause any problems. Is that something I can/should do, or do I request it from someone else?

wasade · 2022-09-22T16:01:51Z

I thought the code was written to automatically add to qiita but may be misremembering

…

On Sep 22, 2022, at 9:00 AM, Cassidy Symons ***@***.***> wrote: Can we add the fields to Qiita so they are accessible for analysis? I'm all for that as long as it won't cause any problems. Is that something I can/should do, or do I request it from someone else? — Reply to this email directly, view it on GitHub <#460 (comment)>, or unsubscribe <https://github.com/notifications/unsubscribe-auth/AADTZMTU5YAWIY3IAG7ASYTV7R7DNANCNFSM6AAAAAAQTGEV5E>. You are receiving this because you commented.

wasade · 2022-09-22T16:04:06Z

It's not, see

microsetta-private-api/microsetta_private_api/repo/qiita_repo.py

Lines 74 to 77 in 3fd63e9

    
           # gather the categories currently used in qiita. we have to have parity 
        
           # with the categories when pushing 
        
           cats_in_qiita = qclient.get('/api/v1/study/10317/samples/info') 
        
           cats_in_qiita = set(cats_in_qiita['categories'])

@antgonza, is there an easy way to add metadata columns to a qiita study w/o actually needing to create and specify the metadata? In this case, we have variables that need to exist in qiita, but we dont yet have samples that express those variables

antgonza · 2022-09-22T16:20:30Z

Programmatically: You could add a fake sample with all the columns you need and then delete it; this will keep the extra columns around.

If this is only going to be performed once: You can open the DB and insert into QIITA_COLUMN_NAME the new columns you want to add, like this.

Note that this will add new and fully empty columns to all the samples - IMOO not useful for analysis.

Hope this helps.

wasade · 2022-09-22T16:25:32Z

Thanks! Is the programmatic method exposed and usable by an API? Is it possible to set a default value for historical samples where the variable is not collected (e.g., Missing: not collected or something)? I'm hoping there is an automated solution here rather than asking you to do it manually whenever new columns need to be specified

cc @sejsong, this surrounds new questions to the questionnaire

antgonza · 2022-09-22T16:31:07Z

Yes for adding/updating samples/metadata but not for deleting (I think deleting samples can be problematic). No way to define a default value, you would need to add that value for all historical samples.

IMOO the best would be that the TMI pushes whatever changes it requires directly, either automatically or by hand ... and if there is no end-point add one (as long as it makes sense for the rest of the system).

wasade · 2022-09-22T16:36:51Z

If I understand the existing API correctly, the metadata category has to already exist? See https://github.com/qiita-spots/qiita/blob/master/qiita_pet/handlers/rest/study_samples.py#L171-L179

Would the Qiita dev team be able to expose endpoints to allow for creation of metadata variables?

antgonza · 2022-09-22T19:15:42Z

Good point, I didn't remember that - I wonder who added that restriction 😜.

Anyway, I'll be happy to review/merge new PRs with endpoints.

sejsong · 2022-09-22T19:42:46Z

@antgonza is this something you might be able to do? I'm not sure we have the expertise and bandwidth on the TMI dev team to do this the most efficiently.
I'm assuming this is something that will be useful for Qiita more widely?

antgonza · 2022-09-22T20:19:07Z

I'm not sure if it will be useful for other Qiita users (as they do not interact directly with the API - they would need access to the UCSD systems to do it). Also, that original functionality was added specifically for and by the AGP/TMI-dev-team. Now IMOO the best will be that whoever implements that functionality for this repo should issue a PR to qiita to make sure the 2 PRs work properly; does that make sense?

Anyway, I think having an internal meeting to decide what and whom should add new code + timelines for TMI would be beneficial.

wasade · 2022-09-23T17:55:16Z

Just spoke with Se Jin about the motivation for this. It seems surprising that samples are failing from the other surveys, but plausible. Definitely recommend testing against qiita-rc. I thought the code was setup to fail on a per survey basis, not per barcode basis, but there may be edge cases I'm misremembering

cassidysymons · 2022-09-23T18:05:40Z

A "No such survey" error is causing a hardfail in the qiita metadata push:

qiita metadata push errors

[ { "hardfail": "Traceback (most recent call last):\n File "/Users/api_user/microsetta-private-api-2022.9/microsetta_private_api/tasks.py", line 76, in update_qiita_metadata\n n_pushed, error = qiita.push_metadata_to_qiita()\n File "/Users/api_user/microsetta-private-api-2022.9/microsetta_private_api/repo/qiita_repo.py", line 90, in push_metadata_to_qiita\n formatted, error = retrieve_metadata(to_push)\n File "/Users/api_user/microsetta-private-api-2022.9/microsetta_private_api/repo/metadata_repo/_repo.py", line 109, in retrieve_metadata\n survey_templates, st_errors = _fetch_observed_survey_templates(fetched)\n File "/Users/api_user/microsetta-private-api-2022.9/microsetta_private_api/repo/metadata_repo/_repo.py", line 159, in _fetch_observed_survey_templates\n survey, error = _fetch_survey_template(template_id)\n File "/Users/api_user/microsetta-private-api-2022.9/microsetta_private_api/repo/metadata_repo/_repo.py", line 190, in _fetch_survey_template\n survey_template = survey_template_repo.get_survey_template(\n File "/Users/api_user/microsetta-private-api-2022.9/microsetta_private_api/repo/survey_template_repo.py", line 136, in get_survey_template\n raise NotFound("No such survey")\nwerkzeug.exceptions.NotFound: 404 Not Found: No such survey\n" } ]

wasade · 2022-09-23T18:16:04Z

Thanks. It may be good to wrap _fetch_survey_template so errors in the future can be gracefully caught and reported, it looks like this is bringing down the entire metadata push which is bad

cassidysymons · 2022-09-28T16:17:39Z

@wasade I added a try/catch block in _fetch_survey_template to gracefully handle the NotFound exception. Given the urgency to get the metadata push back up and the timeline for having Qiita changes live and available to use, do you see any issues with merging this code as-written? It's admittedly a temporary solution, but it resolves our issue for the time being and we can revisit the Qiita changes in a few weeks.

wasade · 2022-09-28T16:19:45Z

I think that strategy works, I'm unsure if e needs to be str or Exception. However, it is going to be painful to recover the responses and update the metadata in Qiita so this is a bandaid and it will hurt when it gets ripped off.

cassidysymons · 2022-10-03T18:38:49Z

@wasade When you have a minute can you please do a final review on this?

wasade

I'm not sure if the error return is valid. It may be nice to assert it with a unit test

wasade · 2022-10-03T18:44:58Z

microsetta_private_api/repo/metadata_repo/_repo.py

+              'OTHER_ANIMALS_FREE_TEXT', 'OILS_FREQUENCY_VEGETABLE',
+              'OILS_FREQUENCY_ANIMAL', 'OILS_FREQUENCY_OTHER',
+              'OILS_FREQUENCY_MARGARINE', 'OILS_FREQUENCY_OXALATE'
+              'OILS_FREQUENCY_SOY']


Can a TODO note be added regarding when these were added, and that we're blocked pending an update on Qiita's API?

wasade · 2022-10-03T18:47:44Z

microsetta_private_api/repo/metadata_repo/_repo.py


-        return info, None
+        return info, error


Does it work to return type Exception here? I don't know how flexible the system is with the return types but it would be good to make sure that this does work. I think other areas of the code return dict or None

I cast the Exception to string, which goes into an array in the calling function (if an error is actually returned).

cassidysymons · 2022-10-04T18:36:53Z

I'm not sure if the error return is valid. It may be nice to assert it with a unit test

I added a unit test that intentionally calls the function with a remote survey ID to observe that an error is returned with a value other than "None." Since I've specifically cast the Exception to string before returning it, I don't think we need any further test coverage there. What are your thoughts?

wasade · 2022-10-04T18:51:07Z

microsetta_private_api/repo/metadata_repo/tests/test_repo.py

+
+        # verify that _fetch_survey_template returns an error, reflecting
+        # that it's a remote survey for which we can't extract local data
+        self.assertNotEqual(errors, None)


Shouldn't errors be somethign different than None?

I'm asserting that errors isn't None in this case using assertNotEqual, which represents the intended behavior. We don't strictly care what the content of the error is, but having a non-None value there will prevent the metadata push from attempting to use that survey.

Oh, ya read that fast :) thanks

Handle NotFound exception

bf53102

cassidysymons added 2 commits September 30, 2022 08:37

Pausing jobs to test on staging

9371ae4

Revert testing changes + run Qiita push on startup

944b021

wasade requested changes Oct 3, 2022

View reviewed changes

cassidysymons added 6 commits October 3, 2022 14:43

Requested changes

51b8f2d

Lint fix

4cd064c

Test fix

4863e3d

Test fix

5531763

Test fix

a579738

Lint fix

222c9ab

wasade reviewed Oct 4, 2022

View reviewed changes

wasade merged commit 9f86303 into master Oct 4, 2022

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Metadata Survey Updates #460

Metadata Survey Updates #460

cassidysymons commented Sep 22, 2022

wasade commented Sep 22, 2022

cassidysymons commented Sep 22, 2022

wasade commented Sep 22, 2022 via email

wasade commented Sep 22, 2022

antgonza commented Sep 22, 2022

wasade commented Sep 22, 2022

antgonza commented Sep 22, 2022

wasade commented Sep 22, 2022

antgonza commented Sep 22, 2022

sejsong commented Sep 22, 2022

antgonza commented Sep 22, 2022

wasade commented Sep 23, 2022

cassidysymons commented Sep 23, 2022

wasade commented Sep 23, 2022

cassidysymons commented Sep 28, 2022

wasade commented Sep 28, 2022

cassidysymons commented Oct 3, 2022

wasade left a comment

wasade Oct 3, 2022

cassidysymons Oct 4, 2022

wasade Oct 3, 2022

cassidysymons Oct 4, 2022

cassidysymons commented Oct 4, 2022

wasade Oct 4, 2022

cassidysymons Oct 4, 2022

wasade Oct 4, 2022

Metadata Survey Updates #460

Metadata Survey Updates #460

Conversation

cassidysymons commented Sep 22, 2022

wasade commented Sep 22, 2022

cassidysymons commented Sep 22, 2022

wasade commented Sep 22, 2022 via email

wasade commented Sep 22, 2022

antgonza commented Sep 22, 2022

wasade commented Sep 22, 2022

antgonza commented Sep 22, 2022

wasade commented Sep 22, 2022

antgonza commented Sep 22, 2022

sejsong commented Sep 22, 2022

antgonza commented Sep 22, 2022

wasade commented Sep 23, 2022

cassidysymons commented Sep 23, 2022

wasade commented Sep 23, 2022

cassidysymons commented Sep 28, 2022

wasade commented Sep 28, 2022

cassidysymons commented Oct 3, 2022

wasade left a comment

Choose a reason for hiding this comment

wasade Oct 3, 2022

Choose a reason for hiding this comment

cassidysymons Oct 4, 2022

Choose a reason for hiding this comment

wasade Oct 3, 2022

Choose a reason for hiding this comment

cassidysymons Oct 4, 2022

Choose a reason for hiding this comment

cassidysymons commented Oct 4, 2022

wasade Oct 4, 2022

Choose a reason for hiding this comment

cassidysymons Oct 4, 2022

Choose a reason for hiding this comment

wasade Oct 4, 2022

Choose a reason for hiding this comment