[Bug] Using `pattern` with "copyright" always yields a `(?i).*.*` regex #11218
Search before asking
Apache SkyWalking Component
License Tools (apache/skywalking-eyes)
What happened
TL;DR: if the `pattern` includes "Copyright", the whole regex is replaced with `""` by the `OneLineNormalizer` due to a wrong (IMHO) text replacement.
My copyright header is supposed to be:
In the `.licenserc.yaml` file, I have set `pattern: Copyright: (?:\d{4}-\d{4}|\d{4}) the Kubeapps contributors\.'`.
Then I run `license-eye -c .\.licenserc.yaml header check -v debug` and it returns OK, as expected.
However, when I play around with my header and change it to `Copyright whatever`, the "header check" result is still OK, when it clearly shouldn't be.
What you expected to happen
I'd expect the check to fail when the header is `Copyright whatever` and the config sets `pattern: Copyright: (?:\d{4}-\d{4}|\d{4}) the Kubeapps contributors\.'`.
How to reproduce
go install github.com/apache/skywalking-eyes/cmd/license-eye@latest
license-eye header check
in a directory containing the files below:
.licenserc.yaml
main.go
Anything else
Inspecting the code, I guess this is caused by a wrong regex replacement when normalizing the regex. Let me explain:
All the patterns are normalized here (this is where we add `"(?i).*" + pattern + ".*"`): https://github.com/apache/skywalking-eyes/blob/16b9726be37536a05279e061f0da02d205a2af77/pkg/header/config.go#L113
Then, `NormalizePattern` applies all the normalizers: https://github.com/apache/skywalking-eyes/blob/3a6d3090d78b7c104cb55ce4cc63a4333d66ecd0/pkg/license/norm.go#L268
One of them is the `OneLinerNormalizer`, which replaces the string with the corresponding replacement string: https://github.com/apache/skywalking-eyes/blob/3a6d3090d78b7c104cb55ce4cc63a4333d66ecd0/pkg/license/norm.go#L296-L301
And... here comes the issue: there are two replacements that match until the end of the line, which erases the whole regex in the pattern (change added in apache/skywalking-eyes@3a6d309):
https://github.com/apache/skywalking-eyes/blob/3a6d3090d78b7c104cb55ce4cc63a4333d66ecd0/pkg/license/norm.go#L237C8-L247
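To make the erasure concrete, here is a minimal Go sketch that applies the two expressions quoted in this issue using only the standard `regexp` package. It is not the code from `norm.go`: the helper name `stripCopyright` and the sample pattern are hypothetical and chosen purely for illustration.

```go
package main

import (
	"fmt"
	"regexp"
)

// The two expressions quoted in this issue. Both end in `.+$`, so a match
// runs from the word "Copyright" to the end of the line.
var copyrightNotices = []*regexp.Regexp{
	regexp.MustCompile(`(?m)^\s*([cC©])?\s*Copyright (\([cC©]\))?.+$`),
	regexp.MustCompile(`(?m)^\s*Portions Copyright (\([cC©]\))?.+$`),
}

// stripCopyright is a hypothetical helper that mimics the replacement step:
// every match of the expressions above is replaced with the empty string.
func stripCopyright(text string) string {
	for _, re := range copyrightNotices {
		text = re.ReplaceAllString(text, "")
	}
	return text
}

func main() {
	// Illustrative pattern (an assumption, not the one from the report).
	pattern := `Copyright \d{4} the Example contributors\.`
	fmt.Printf("pattern after replacement: %q\n", stripCopyright(pattern))

	// A header line from a source file is erased in the same way.
	fmt.Printf("header after replacement:  %q\n", stripCopyright("Copyright whatever"))
}
```

Both calls print an empty string, because `.+$` swallows everything that follows `Copyright ` on the line.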
So the regex now becomes `""`, and the resulting normalized regex is therefore `"(?i).*" + "" + ".*"`, that is, `(?i).*.*`, which would always return a match :S
See how this regex matches the whole line:
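The same behaviour can be checked with a few lines of Go. This is only a sketch built on the standard `regexp` package, not code from skywalking-eyes; the function name `wrapPattern` and the sample inputs are assumptions made for illustration.

```go
package main

import (
	"fmt"
	"regexp"
)

// wrapPattern mirrors the wrapping described in this issue:
// "(?i).*" + pattern + ".*". The function name is hypothetical.
func wrapPattern(pattern string) *regexp.Regexp {
	return regexp.MustCompile("(?i).*" + pattern + ".*")
}

func main() {
	headers := []string{
		"Copyright 2021 the Example contributors.",
		"Copyright whatever",
		"",
	}

	// What is left once the pattern has been erased to "".
	erased := wrapPattern("")
	fmt.Println("erased pattern compiles to:", erased.String())
	for _, h := range headers {
		fmt.Printf("  %q matches: %v\n", h, erased.MatchString(h))
	}

	// For contrast: an intact (illustrative) pattern still discriminates.
	intact := wrapPattern(`Copyright \d{4} the Example contributors\.`)
	fmt.Println("intact pattern compiles to:", intact.String())
	for _, h := range headers {
		fmt.Printf("  %q matches: %v\n", h, intact.MatchString(h))
	}
}
```

With the erased pattern every input matches, including the empty string, whereas the intact pattern accepts only the first header.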
Note that this `OneLinerNormalizer` affects not only the pattern but also the files (see how the copyright line disappears).
So, if I just replace `pattern: Copyright: foo'` with `pattern: foo'` (removing the word "copyright"), I'd have yet another problem: the check would always fail, since `foo` would have been erased.
In short, I think the `.+$` part should be removed from the regexes `(?m)^\s*([cC©])?\s*Copyright (\([cC©]\))?.+$` and `(?m)^\s*Portions Copyright (\([cC©]\))?.+$` to avoid undesired matches. What do you think? Am I missing something?
Are you willing to submit PR?
Code of Conduct