isPhenotype bug(s?) #31

Closed
rgentlem opened this issue Jan 25, 2024 · 2 comments

@rgentlem

Hi Rafael and Jason,
In setting up the non-phenotype work, I had pointed you to the vignette VariableClassification.Rmd in the phonto package.
But the examples given there don't all seem to have been copied over.
I draw your attention to this code:
##comments - most variables end with an LC but not all...so we also
## look at the description - the SasLabel doesn't work as they abbreviate
##comments in weird ways
g1 = grep("LC$", xx$Variable)
g2 = grep("*[Cc]omment [Cc]ode$", xx$Description)
outPut[union(g1,g2)] = "Comment"

So any variable whose name ends in LC is a comment, not a phenotype. But in the isPhenotype column of the Metadata, these are all labeled as phenotypes.
I did not exhaustively check the other examples of non-phenotypes, but it would be good to ensure that these are synced up.
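
A minimal sketch of the kind of check I mean, assuming a data frame with Variable, Description and isPhenotype columns. The rows and names here are made up for illustration, and the leading `*` in the description pattern is dropped since it has nothing to repeat:

```r
# Hypothetical metadata rows; real variable names and descriptions will differ.
xx <- data.frame(
  Variable    = c("VARA", "VARALC", "VARB", "VARBCC"),
  Description = c("Measurement A", "Measurement A comment",
                  "Measurement B", "Measurement B Comment Code"),
  isPhenotype = c(TRUE, TRUE, TRUE, FALSE),
  stringsAsFactors = FALSE
)

# Same two rules as the vignette snippet: name ends in LC,
# or description ends in "comment code" (either capitalisation).
g1 <- grep("LC$", xx$Variable)
g2 <- grep("[Cc]omment [Cc]ode$", xx$Description)
comment_idx <- union(g1, g2)

# Variables classified as comments that are still flagged as phenotypes.
mismatch <- xx$Variable[intersect(comment_idx, which(xx$isPhenotype))]
print(mismatch)
```

Anything that shows up in `mismatch` is a variable where the two classifications disagree.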

Not sure if this is you or Sam, but in the table Metadata.QuestionnaireVariables, the column named UseConstraints should be named Target.

@rsgoncalves rsgoncalves added the bug Something isn't working label Jan 26, 2024
@rsgoncalves
Contributor

We added all (missing) regular expressions applicable to SasLabels to our blocklist_regexps.txt. Then, for the expressions in VariableClassification.md that apply to variable identifiers (rather than labels), we added to our blocklist_table.csv all the variable-table pairs that match those regular expressions. The tags in the R code are preserved in a Tag column in that table. A Comment column gives the reason each variable was included in the table; here we record the regular expressions that were used to derive the entries.

As I mentioned, there are some exceptions to one of the regular expressions: variables that end in "LC" but are not comments (for example, LB2ALC and VIXPLC). Such variables, which are phenotypes but match one of our non-phenotype patterns, are listed in the newly created allowlist.csv.
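
A rough sketch of how the three lists could combine, written in R to match the vignette. This is not the actual build code: the in-memory stand-ins, the allowlist column layout, the table names, the `is_phenotype` helper, and the assumption that the allowlist takes precedence over both blocklists are all illustrative.

```r
# In-memory stand-ins for blocklist_regexps.txt, blocklist_table.csv and
# allowlist.csv. Table names and all rows except the variable LB2ALC are
# hypothetical placeholders.
label_regexps <- c("[Cc]omment [Cc]ode$")

blocklist <- data.frame(
  Variable = c("VARALC", "LB2ALC"),
  Table    = c("TAB_A", "TAB_B"),
  Tag      = c("Comment", "Comment"),
  Comment  = c("matches LC$", "matches LC$"),
  stringsAsFactors = FALSE
)

allowlist <- data.frame(
  Variable = "LB2ALC",
  Table    = "TAB_B",
  stringsAsFactors = FALSE
)

# Hypothetical helper: a variable is a phenotype if it is allowlisted,
# or if neither the blocklist table nor any label regexp matches it.
is_phenotype <- function(variable, table, label) {
  key <- paste(variable, table)
  allowed <- key %in% paste(allowlist$Variable, allowlist$Table)
  blocked_by_table <- key %in% paste(blocklist$Variable, blocklist$Table)
  blocked_by_label <- grepl(paste(label_regexps, collapse = "|"), label)
  allowed | !(blocked_by_table | blocked_by_label)
}

vars <- data.frame(
  Variable = c("VARA", "VARALC", "LB2ALC", "VARC"),
  Table    = c("TAB_A", "TAB_A", "TAB_B", "TAB_C"),
  Label    = c("Measurement A", "Measurement A comment",
               "Measurement B", "Measurement C Comment Code"),
  stringsAsFactors = FALSE
)

result <- is_phenotype(vars$Variable, vars$Table, vars$Label)
print(data.frame(vars$Variable, result))
```

In this sketch LB2ALC matches the LC$ pattern but stays a phenotype because of its allowlist entry, while VARC is excluded on its label alone.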

The next release will include these updates.

Regarding the column names of Metadata.QuestionnaireVariables, it is an import error during the DB build.

@rsgoncalves rsgoncalves added this to the v3.10.0 milestone Jan 30, 2024
@rsgoncalves
Contributor

Closed via a8ba61a; the fix will be in release v3.10.0.

@rsgoncalves rsgoncalves self-assigned this Feb 7, 2024