kaizen: add equals-ignore-case pattern #340

timbray · 2024-07-20T16:44:38Z

addresses: #186

This adds support for the equals-ignore-case pattern. The automaton-building is a bit complex but reasonably well tested.

What is controversial is that this PR contains Quamina's first generated code: the case mappings are pulled from the file CaseFolding.txt in the Unicode Character Database. The generated code is in case_folding.go. It is built by code_gen/build_casefolding_table.go, which is in package main and has a main() function.
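For readers unfamiliar with CaseFolding.txt, here is a minimal sketch of how its lines could be parsed. It is not the code in build_casefolding_table.go: the function name parseSimpleFoldings and the choice to keep only the C (common) and S (simple) statuses are assumptions made for illustration. The line format itself (code point; status; mapping; # name) is the one used by the Unicode file.

```go
package main

import (
	"bufio"
	"fmt"
	"strconv"
	"strings"
)

// parseSimpleFoldings (hypothetical name) reads CaseFolding.txt-style lines
// and returns the single-rune mappings. Keeping only statuses C and S is an
// assumption; F (full) and T (Turkic) entries are skipped.
func parseSimpleFoldings(data string) (map[rune]rune, error) {
	folds := make(map[rune]rune)
	scanner := bufio.NewScanner(strings.NewReader(data))
	for scanner.Scan() {
		line := scanner.Text()
		// Drop the trailing "# NAME" comment, if any.
		if i := strings.IndexByte(line, '#'); i >= 0 {
			line = line[:i]
		}
		fields := strings.Split(line, ";")
		if len(fields) < 3 {
			continue // blank or comment-only line
		}
		status := strings.TrimSpace(fields[1])
		if status != "C" && status != "S" {
			continue
		}
		from, err := strconv.ParseUint(strings.TrimSpace(fields[0]), 16, 32)
		if err != nil {
			return nil, fmt.Errorf("bad code point in %q: %w", line, err)
		}
		to, err := strconv.ParseUint(strings.TrimSpace(fields[2]), 16, 32)
		if err != nil {
			return nil, fmt.Errorf("bad mapping in %q: %w", line, err)
		}
		folds[rune(from)] = rune(to)
	}
	return folds, scanner.Err()
}

func main() {
	sample := "# sample lines in CaseFolding.txt format\n" +
		"0041; C; 0061; # LATIN CAPITAL LETTER A\n" +
		"00DF; F; 0073 0073; # LATIN SMALL LETTER SHARP S\n" +
		"1E9E; S; 00DF; # LATIN CAPITAL LETTER SHARP S\n"
	folds, err := parseSimpleFoldings(sample)
	if err != nil {
		panic(err)
	}
	// prints "2 mappings, U+0041 -> U+0061"
	fmt.Printf("%d mappings, U+0041 -> U+%04X\n", len(folds), folds[0x41])
}
```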

The problem is that CaseFolding.txt changes regularly with Unicode releases. So build_casefolding_table checks whether the case_folding.go file is older than 3 months and, if so, rebuilds it. There is a Makefile to take care of building and running the code generator. So if you're doing a PR, you should at some point type make, and if you get a new case_folding.go, check it in with your PR.

There are a couple of problems. First, I've never done generated Go code before, and maybe there's a much better, well-known approach. Second, I haven't figured out how to unit test the code generator. The coverage tools ignore it because it's not in the quamina package. Implicitly, it's tested by the tests in monocase_test.go and elsewhere producing the desired results, but I'm going to open an issue to figure out how to test it properly.

addresses: #186 Signed-off-by: Tim Bray <[email protected]>

codecov-commenter · 2024-07-20T16:46:27Z

⚠️ Please install the Codecov GitHub app to ensure uploads and comments are reliably processed by Codecov.

Codecov Report

Attention: Patch coverage is 97.50000% with 2 lines in your changes missing coverage. Please review.

Project coverage is 96.59%. Comparing base (8da5067) to head (0a03480).

Files	Patch %	Lines
monocase.go	96.07%	1 Missing and 1 partial ⚠️

❗ Your organization needs to install the Codecov GitHub app to enable full functionality.

Additional details and impacted files

@@            Coverage Diff             @@
##             main     #340      +/-   ##
==========================================
+ Coverage   96.56%   96.59%   +0.03%     
==========================================
  Files          18       19       +1     
  Lines        1863     1940      +77     
==========================================
+ Hits         1799     1874      +75     
- Misses         36       37       +1     
- Partials       28       29       +1


timbray · 2024-07-20T16:48:36Z

Hmm, surprised by the benchmark failures; I didn't think I changed anything that should affect an existing benchmark. Will investigate. Could be trailing fallout from the Q-numbers work.

arnehormann · 2024-07-20T17:03:42Z

@timbray Go itself (the x package is semi-official and comes close to the standard library) appears to do it this way:
https://cs.opensource.google/go/x/text/+/master:gen.go
with generated code like this: https://cs.opensource.google/go/x/text/+/master:cases/tables15.0.0.go

timbray · 2024-07-20T18:52:25Z

This is annoying; the same problem with the same benchmark showed up on the Q numbers PR, and I accepted the numeric-match slowdown in the interests of correctness. So I changed the CI/CD to warn but not fail, and then after everything was clean I put the CI/CD back to fail mode, assuming that would reset the cached previous value. Apparently this did not happen; I have poked around and failed (so far) to figure out how/where the benchmark tool caches previous results.

Anyhow, for now, I'm going to reset the CI/CD again not to fail on slowdown.

Signed-off-by: Tim Bray <[email protected]>

github-actions

⚠️ Performance Alert ⚠️

Possible performance regression was detected for benchmark 'Go Benchmark'.
Benchmark result of this commit is worse than the previous benchmark result exceeding threshold 1.20.

Benchmark suite	Current: `0a03480`	Previous: `78e2ec8`	Ratio
`BenchmarkCityLots`	`6894` ns/op 830 B/op 34 allocs/op	`5592` ns/op 773 B/op 31 allocs/op	`1.23`
`BenchmarkCityLots - ns/op`	`6894` ns/op	`5592` ns/op	`1.23`
`Benchmark_JsonFlattner_Evaluate_ContextFields`	`1247` ns/op 96 B/op 8 allocs/op	`726.2` ns/op 56 B/op 4 allocs/op	`1.72`
`Benchmark_JsonFlattner_Evaluate_ContextFields - ns/op`	`1247` ns/op	`726.2` ns/op	`1.72`
`Benchmark_JsonFlattner_Evaluate_ContextFields - B/op`	`96` B/op	`56` B/op	`1.71`
`Benchmark_JsonFlattner_Evaluate_ContextFields - allocs/op`	`8` allocs/op	`4` allocs/op	`2`

This comment was automatically generated by workflow using github-action-benchmark.

kaizen: add equals-ignore-case pattern

407f600

addresses: #186 Signed-off-by: Tim Bray <[email protected]>

timbray added 2 commits July 20, 2024 11:53

reset CI/CD to not fail on slowdown

306b034

Signed-off-by: Tim Bray <[email protected]>

enable github token for benchmarker

d574520

Signed-off-by: Tim Bray <[email protected]>

github-actions bot reviewed Jul 20, 2024


timbray added 2 commits July 20, 2024 14:30

Merge branch 'main' into ignore-case

323a403

Merge branch 'main' into ignore-case

0a03480

timbray merged commit 6d34146 into main Jul 22, 2024
7 checks passed

timbray deleted the ignore-case branch July 22, 2024 16:40

timbray mentioned this pull request Jul 22, 2024

pat: Add equals-ignore-case matching #186

Closed
