[IBCDPE-983] Updates GX functionality to surface warnings #161
Looks good!
Quality Gate passed
Description:
For the purposes of this project, "warnings" are messages raised after GX data validation that flag fields where some values fail an expectation but the field still meets the threshold set by the GX `mostly` parameter.

This PR adds logic and does some refactoring in the `GreatExpectationsRunner` class and elsewhere to surface these warnings along with any failures. An example of what this looks like in the GX report table on Synapse can be seen here. In this example, I have also intentionally added a bad `genes_biodomains` file so we can see what it looks like when there are both failures and warnings. This second example shows what it looks like with warnings only, which is what we expect now that `mostly` is used in the `gene_info` and `network` datasets.

Notes:
- The `mostly` parameter was added to three expectations as requested here, but I was unsure what value to set. Since they were all passing at 100%, I started with `mostly=0.95`. Please let me know if this should change.
- Documents the `mostly` parameter and its usage in this package in `CONTRIBUTING.md`.
- Bumps the `synapseclient` version up to 4.4.1. This is the last stable version that supports Python 3.8, so until we deprecate support for 3.8 in `agora-data-tools` we can no longer upgrade `synapseclient` versions.
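
For reviewers who want the warning rule in concrete terms, here is a minimal sketch of how a validation result could be split into failures and warnings. It is not the code in `GreatExpectationsRunner`. It assumes the validation output is available as a plain dict shaped like a GX JSON result (`results`, `success`, `expectation_config`, `result.unexpected_count`). The function name `classify_results`, the column names, and the sample values are hypothetical.

```python
from typing import Any, Dict, List


def classify_results(validation: Dict[str, Any]) -> Dict[str, List[str]]:
    """Split expectation results into failures and warnings.

    Failure: the expectation did not succeed.
    Warning: the expectation succeeded, but some values were still
    unexpected, which can only happen when `mostly` allowed a pass
    below 100%.
    """
    failures: List[str] = []
    warnings: List[str] = []
    for res in validation.get("results", []):
        config = res.get("expectation_config", {})
        kwargs = config.get("kwargs", {})
        label = "{}: {}".format(
            kwargs.get("column", "<table>"),
            config.get("expectation_type", "<unknown>"),
        )
        # Treat a missing or null count as zero unexpected values.
        unexpected = res.get("result", {}).get("unexpected_count") or 0
        if not res.get("success", False):
            failures.append(label)
        elif unexpected > 0:
            warnings.append(label)
    return {"failures": failures, "warnings": warnings}


# Hypothetical sample shaped like a GX validation result; all values are made up.
sample = {
    "results": [
        {   # passes cleanly: neither failure nor warning
            "success": True,
            "expectation_config": {
                "expectation_type": "expect_column_values_to_be_unique",
                "kwargs": {"column": "col_a"},
            },
            "result": {"element_count": 100, "unexpected_count": 0},
        },
        {   # 2% unexpected, within mostly=0.95: warning
            "success": True,
            "expectation_config": {
                "expectation_type": "expect_column_values_to_be_in_set",
                "kwargs": {"column": "col_b", "mostly": 0.95},
            },
            "result": {"element_count": 100, "unexpected_count": 2},
        },
        {   # 10% unexpected, below mostly=0.95: failure
            "success": False,
            "expectation_config": {
                "expectation_type": "expect_column_values_to_not_be_null",
                "kwargs": {"column": "col_c", "mostly": 0.95},
            },
            "result": {"element_count": 100, "unexpected_count": 10},
        },
    ]
}

report = classify_results(sample)
print(report)
```

With `mostly=0.95`, an expectation passes as long as at least 95% of values meet it, so any passing result with a nonzero unexpected count is the kind of case this PR surfaces as a warning.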