-
Notifications
You must be signed in to change notification settings - Fork 64
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Capgen cleanup #493
Capgen cleanup #493
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yow, lots of nice cleanup!
For the record, I shed no tears for Python2.
input_vars = [set(), set(), set()] # leaves, arrays, leaf elements | ||
inout_vars = [set(), set(), set()] # leaves, arrays, leaf elements | ||
output_vars = [set(), set(), set()] # leaves, arrays, leaf elements |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No more pirates?
@@ -12,7 +12,7 @@ | |||
|
|||
######################################################################## | |||
|
|||
_UNITS_RE = re.compile(r"^[^/@#$%^&*()\|<>\[\]{}?,.]+$") | |||
_UNITS_RE = re.compile(r"^[^/!@#$%^&*=()\|<>\[\]{}?,.]+$") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Really? What do examples that use these look like?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since it's impossible to decipher regular expressions like this, it might be good to add a few examples above what we want to match. This would also help addressing @gold2718's question.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
After staring at this for a minute, I think we can actually clean this up a bit if it makes sense with something like:
unitless_regex = "1"
unit_regex = "[a-zA-Z]+[-]?[0-9]+"
units_regex = f"^({unit_regex}(\s{unit_regex})?|{unitless_regex})$"
# >>> print(units_regex)
# ^([a-zA-Z]+[-]?[0-9]+(\s[a-zA-Z]+[-]?[0-9]+)?|1)$
_UNITS_RE = re.compile(units_regex)
Testing locally and on regex101 looks to allow only what we want minus the exponents with leading 0
's (like m s-01
) but that's easy enough to add in, but I was just curious if this was a reasonable replacement first?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@climbfuji @nusbaume thoughts on @mwaxmonsky 's regex cleanup?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Did a little more testing and I missed two common cases so I think a more flushed out version could be:
unitless_regex = "1"
non_leading_zero_num = "[1-9]\d*"
negative_non_leading_zero_num = f"[-]{non_leading_zero_num}"
unit_exponent = f"({negative_non_leading_zero_num}|{non_leading_zero_num})"
unit_regex = f"[a-zA-Z]+{unit_exponent}?"
units_regex = f"^({unit_regex}(\s{unit_regex})*|{unitless_regex})$"
# >>> print(units_regex)
# ^([a-zA-Z]+([-][1-9]\d*|[1-9]\d*)?(\s[a-zA-Z]+([-][1-9]\d*|[1-9]\d*)?)*|1)$
_UNITS_RE = re.compile(units_regex)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not as familiar with the backwards compatibility needed for the whitespace support but I think we can leave that out of the regex and pass the proposed units str
through python's strip()
function to remove all surrounding whitespace and then pass it to _UNITS_RE.match(...)
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like @mwaxmonsky's solution!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Update pushed. Just let me know if the new version has any issues!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also on the documentation front, I noticed that there's a few doctest
tests which pass in the units and error on invalid strings. Would we be duplicating documentation by putting in examples outside the doctest for check_units()
right below the regex declaration?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@mwaxmonsky has pushed his regex updates, which is both cleaner and easier to decipher!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@peverwhee Looks good to me. Thanks for this contribution.
(And good riddance to python v2!!!)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A few minor comments below. Since this PR removes Python2 support (yay), is this mentioned or encoded somewhere, for example on a top-level README.md
and/or CMakeLists.txt
?
scripts/ccpp_database_obj.py
Outdated
datatable.xml file created by capgen. | ||
""" | ||
metadata_tables = {} | ||
host_name = "cam" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see that host_name
is hardcoded to cam
. What is the plan for making this configurable in the future?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@peverwhee Can you comment on this please?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@climbfuji Since the creating a database object from a file is not yet implemented, this host name is not currently doing anything. I did change it to something more general in the interim, but I think long-term, it will be parsed from the datafile.
@@ -12,7 +12,7 @@ | |||
|
|||
######################################################################## | |||
|
|||
_UNITS_RE = re.compile(r"^[^/@#$%^&*()\|<>\[\]{}?,.]+$") | |||
_UNITS_RE = re.compile(r"^[^/!@#$%^&*=()\|<>\[\]{}?,.]+$") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since it's impossible to decipher regular expressions like this, it might be good to add a few examples above what we want to match. This would also help addressing @gold2718's question.
RE: python2 I'm pretty sure python 2 support was de-facto ended long ago (there are a lot of python-3 specific constructs like f-strings deep in the framework, and have been there for more than a year). I'm not sure that requires any specific notes. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks great! Just two small syntax things I noticed.
Merge main into feature/capgen (merge after #493)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for addressing my comments and for simplifying the regular expressions!
@@ -56,7 +56,7 @@ def db_from_file(self, run_env, database_file): | |||
datatable.xml file created by capgen. | |||
""" | |||
metadata_tables = {} | |||
host_name = "cam" | |||
host_name = "host" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good to me!
@peverwhee @nusbaume This has been approved by a bunch of people. @nusbaume Can you look at your previous comments and approve if ok, so that we can merge this? Thanks! |
Summary
This PR contains mostly clean-up for readability and clarity.
Description
User interface changes?: Yes
Can now include optional return_db argument to capgen to return a database for use by host model.
Addresses #380
Testing:
All tests passing except for unit conversion test that is expected to fail.