jbeck/AG-1145/transform overall scores testing #81

jaclynbeck-sage · 2023-07-01T00:00:15Z

This PR addresses 2 tickets:
AG-1116: Removes the "literaturescore" column from the overall_scores transform output.
AG-1145: Adds data-driven tests for transform_overall_scores. I wrote the test before modifying the transform and confirmed that it passed, then modified the transform / expected test output, and confirmed the test passed again.
There is no failure case for this test, just "good" data and "missing" data from key columns that both pass.

The "good" data has rows that meet at least one of these cases:

all <X>score values populated, and all isscored_<X> set to "Y"
all <X>score values populated, and all isscored_<X> set to "N"
all <X>score values missing, and all isscored_<X> set to "N"
There are two rows that are duplicates of each other except for different neuropathscores, which happens in the real data too and should be handled by the transform.
There are two rows with all <X>score values populated, but some isscored_<X> columns set to "Y" and some set to "N", just to make sure the transform selectively handles "N" cases.

The "missing" data has rows missing the following values:

One row each with a blank value in one of the isscored_<X> columns, where the corresponding <X>score value is filled in.
One row each with a blank value in one of the <X>score columns, where the corresponding isscored_<X> value is set to "Y".

I also confirmed that removing the literature score from the transform produces output that is identical to the old JSON file, except for the "literature_score: XXX" line is removed from each gene.

…output to match

jaclynbeck-sage · 2023-07-01T00:48:13Z

Something weird is happening with the file uploads to synapse (see the error output of this job: https://github.com/Sage-Bionetworks/agora-data-tools/actions/runs/5427967156/jobs/9871986708) where some of the JSON files successfully upload and some fail because of "access restrictions". I ran CI on the dev branch instead of this one and it had the same failure on the same subset of files so I think something must be up with Synapse right now.

BWMac

Nice! 🚀

JessterB

This looks good! I couldn't find a matching set of test files (from the same processing run) with the modified output - probably due to the synapse upload errors you commented on plus the fact that we don't upload duplicate files anymore. I'll make sure to double check the files when this gets merged, and when we next update Agora Live Data.

jaclynbeck-sage · 2023-08-01T22:12:27Z

I think the file from this branch's CI run is this one: https://www.synapse.org/#!Synapse:syn25740976.146
The timestamp in Synapse is June 30, 6:49 pm, and the timestamp in CI is July 1, 1:49 AM UTC which is 6:49 pm PST.

jaclynbeck-sage added 3 commits June 30, 2023 16:47

Added data-driven test for overall_scores transformation

077762d

Removed literature score from overall_scores transform, updated test …

623f258

…output to match

Fixed a typo in one of the comments in the test script

29f6dcb

jaclynbeck-sage requested review from BWMac and JessterB July 1, 2023 00:16

Removed agora_rename of literaturescore from overall_scores config

25e1303

BWMac approved these changes Jul 3, 2023

View reviewed changes

JessterB approved these changes Aug 1, 2023

View reviewed changes

jaclynbeck-sage merged commit 769baae into dev Aug 15, 2023

jaclynbeck-sage deleted the jbeck/AG-1145/transform_overall_scores_testing branch August 15, 2023 21:40

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

jbeck/AG-1145/transform overall scores testing #81

jbeck/AG-1145/transform overall scores testing #81

jaclynbeck-sage commented Jul 1, 2023 •

edited by jira bot

Loading

jaclynbeck-sage commented Jul 1, 2023

BWMac left a comment

JessterB left a comment

jaclynbeck-sage commented Aug 1, 2023

jbeck/AG-1145/transform overall scores testing #81

jbeck/AG-1145/transform overall scores testing #81

Conversation

jaclynbeck-sage commented Jul 1, 2023 • edited by jira bot Loading

jaclynbeck-sage commented Jul 1, 2023

BWMac left a comment

Choose a reason for hiding this comment

JessterB left a comment

Choose a reason for hiding this comment

jaclynbeck-sage commented Aug 1, 2023

jaclynbeck-sage commented Jul 1, 2023 •

edited by jira bot

Loading