-
Notifications
You must be signed in to change notification settings - Fork 1.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
mimic-iv/concepts: fix postgres-make-concepts
and minor updates
#1363
mimic-iv/concepts: fix postgres-make-concepts
and minor updates
#1363
Conversation
Currently, `REGEX_DATETIME_DIFF` doesn't match for all occurrences of `DATETIME_DIFF` and `postgres-make-concepts.sql` errors out. For example: ``` psql:demographics/age.sql:30: ERROR: column "year" does not exist LINE 22: ...mittime, DATETIME(pa.anchor_year, 1, 1, 0, 0, 0), YEAR) + pa.. ``` Also, `PERL_REGEX_ROUND` doesn't correctly match `ROUND(...)` with nested functions, for example with a nested `DATETIME_DIFF`: ``` psql:demographics/icustay_detail.sql:33: ERROR: syntax error at or near "as" LINE 17: , ROUND( CAST( DATETIME_DIFF(ie.outtime as numeric),ie.intim... ``` Update both regular expressions accordingly.
Since 8ed4060 the relation name prefix is `mimiciv_`.
The following auto generated files for PostgreSQL were not added to the repo yet: postgres/medication/milrinone.sql postgres/medication/norepinephrine_equivalent_dose.sql postgres/medication/vasoactive_agent.sql
Some concepts have a dependency on others, for example `norepinephrine_equivalent_dose.sql` expects `mimiciv_derived.vasoactive_agent` to exist. Fixup for 43a4245
export REGEX_DATETIME_TRUNC="s/DATETIME_TRUNC\(([^,]+), ?(DAY|MINUTE|SECOND|HOUR|YEAR)\)/DATE_TRUNC('\2', \1)/g" | ||
# Add necessary quotes to INTERVAL, e.g. "INTERVAL 5 hour" to "INTERVAL '5' hour" | ||
export REGEX_INTERVAL="s/interval ([[:digit:]]+) (hour|day|month|year)/INTERVAL '\1' \2/gI" | ||
# Add numeric cast to ROUND(), e.g. "ROUND(1.234, 2)" to "ROUND( CAST(1.234 as numeric), 2)". | ||
export PERL_REGEX_ROUND='s/ROUND\(((.|\n)*?)\, /ROUND\( CAST\( \1 as numeric\)\,/g' | ||
export PERL_REGEX_ROUND='s/ROUND\(((.|\n)*?)(\, [24]\))/ROUND\( CAST\( \1 as numeric\)\3/g' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The regex is rather fragile.
Would it be possible to always ROUND(CAST(... as numeric), ...)
in the BigQuery SQL code to make the transformation for postgres unnecessary? For example
, CASE WHEN uo_tm_6hr >= 6 THEN ROUND(CAST((ur.urineoutput_6hr/wd.weight/uo_tm_6hr) AS NUMERIC), 4) END AS uo_mlkghr_6hr |
CAST
. In the postgres code this leads to ROUND(CAST(CAST(... as numeric) as numeric), 4)
.
Ideas welcome.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah these regexes are in general terrifying.... since we're starting to come up with non-standard rules for the SQL, we should probably put them somewhere. First thought is the style guide might be a reasonable place (albeit not perfect).
I think the change you're suggesting makes sense, and I'd be happy to get rid of the perl regex, so I'll investigate by bringing this into a branch (tests won't run successfully on external PRs).
Thanks for the changes (and the other ones as well!). Which OS are you running this on? For the regexes, they are quite brittle as I'd like to support two common flavours of shell:
Trying to allow for this compatibility eliminates the possibility of using some common syntax. |
Fedora 36 with Postgres 14.3 (the latest version in dnf). |
Hey, any news? :)
If you want, I can test and push an alternative patch and drop the regex. |
I got way too distracted by other things and the MySQL container build - but yes especially with people running into issues (e.g. #1393) this should definitely be fixed... making some changes now |
OK this looks to work well - the checks will fail since they're running on a fork which is unprivileged.. it's a TODO of mine to make PRs trigger a privileged workflow rather than trying to run the tests themselves. I tested it locally and built all the tables successfully (BSD sed over here). @schu can you give convert_bigquery_to_postgres.sh and postgres-make-concepts.sql a run with the latest commits pulled down? If it works for you, then we should be good to merge. |
I'll merge this as I'll test the functionality with gh actions |
Please see the individual commits for context.