Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Glob code refactoring vol. 2 #2138

Closed
wants to merge 12 commits into from

Conversation

dmnks
Copy link
Contributor

@dmnks dmnks commented Aug 2, 2022

Some more fixes and refactoring of the glob logic.

Depends on #2137 so marking this one as a draft until it's merged.

dmnks added 12 commits August 2, 2022 10:11
Replacing our non-glob short-circuitry with the GLOB_NOMAGIC flag might
have seemed like a straightforward change in commit
c10e231 but it turns out it's not a 1:1
replacement due to the following differences from the original behavior:

  - Non-glob filenames still make it past glob(3) to the URL and
    lstat(2) code

  - Patterns that are not well-formed (e.g. no closing bracket or brace)
    are still treated as globs

  - Curly braces aren't treated as "magic" chars by GLOB_NOMAGIC

  - Trailing slash in a non-glob directory is stripped away by glob(3)

Avoid any surprise in the existing callers by re-adding the glob check,
only slightly simpler than the original rpmIsGlob() and
__glob_pattern_p() combo.

Treat the backslash as a "magic" char, though, as we want those to
always be discarded by glob(3) from the resulting filename, in order to
preserve the escape feature added in the above commit.
Instead of constructing a new list from scratch and returning that, just
extend the passed list.  This makes it easier to call this function
incrementally when expanding multiple patterns.

No functional change.
By definition, glob(3) matches pathnames on the file system, so no
pattern containing a URL protocol (e.g. file:// or http://) will ever
produce any meaningful results.

This wasn't always the case, we used to call a Glob() function in the
past, which knew how to handle URLs, but that was axed in commit
9cbf034 some 15 years ago.
Rather than listing all the rules, just refer the reader to the man
page.  Although brace expansion isn't part of the standard rules, we
still support it through glob(3), so make a note.
Do set *argcPtr even if the actual count is 0, rather than leaving it
unchanged and thus possibly undefined in the caller.  This is also
consistent with how glob(3) works.
As per GNU glob(3), GLOB_ONLYDIR does not guarantee the matches are in
fact directories, that's why we check them with lstat(2).

That may lead to the resulting match list of length 0, so for
consistency, return GLOB_NOMATCH in that case, just like we do for
missing files.

No functional change since we don't check for the exact return code in
the callers, only whether it's positive or not.
Rather than failing right away, try to match the glob literally, just
like we do it with regular files (see processBinaryFile) and just like a
typical shell like bash does it (unless the "failglob" shell option is
set).  This makes the behavior between regular and special entries in
%files consistent.
No functional change, just shuffle some lines around for better
readability.
Instead of trying to be clever and coming up with our own flags, re-use
some of those supported by glob(3), GLOB_NOMAGIC in particular, we'll
add more in the following commit.

Don't actually pass GLOB_NOMAGIC to glob(3), though, as we already
emulate this behavior ourselves.

Adapt the callers and while at it, don't use GLOB_NOMAGIC where we deal
with hard-coded patterns which are always globs, to make the code more
explicit.

No functional change.
Don't special-case a glob pattern with no matches, just use GLOB_NOCHECK
and process the pattern within the same loop as any other match.

In order to keep the debug message, don't actually pass GLOB_NOCHECK to
glob(3), just emulate it ourselves so we can detect a failure.

For simplicity, only use the "File" variant of the original message, we
have translations for it already.

No functional change (besides the message).
Metacharacters meant literally should always be properly escaped in the
spec file to avoid any surprises, so make the fallback case more visible
to the user, also adapt the tests accordingly and add a note to docs.
Less nesting, easier on the eyes.

No functional change.
@dmnks dmnks changed the title Glob code fixes and DRY clean-up Glob code refactoring vol. 2 Aug 2, 2022
@dmnks
Copy link
Contributor Author

dmnks commented Aug 17, 2022

Will be superseded by another PR.

@dmnks dmnks closed this Aug 17, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant