Commit

Squash of transliterator-compiler
commit abb91cc
Author: Niels Saurer <[email protected]>
Date:   Wed Aug 9 01:12:13 2023 +0200

    reformat tests

commit f6a10f5
Author: Niels Saurer <[email protected]>
Date:   Wed Aug 9 00:30:09 2023 +0200

    sizes => counts

commit 9ffc2f0
Author: Niels Saurer <[email protected]>
Date:   Wed Aug 9 00:26:27 2023 +0200

    add more docs

commit eae5748
Author: Niels Saurer <[email protected]>
Date:   Tue Aug 8 23:46:20 2023 +0200

    remove TODO

commit 6b09689
Author: Niels Saurer <[email protected]>
Date:   Tue Aug 8 23:28:42 2023 +0200

    improve docs

commit c9b16d5
Author: Niels Saurer <[email protected]>
Date:   Tue Aug 8 23:15:23 2023 +0200

    clippy

commit 020a677
Author: Niels Saurer <[email protected]>
Date:   Tue Aug 8 22:53:14 2023 +0200

    add result aggregation to first pass

commit 2d1bfd7
Author: Niels Saurer <[email protected]>
Date:   Tue Aug 8 16:28:23 2023 +0200

    add tests

commit 6f35ea5
Author: Niels Saurer <[email protected]>
Date:   Mon Aug 7 22:25:56 2023 +0200

    CI fixes

commit c6c4844
Author: Niels Saurer <[email protected]>
Date:   Sun Aug 6 20:06:31 2023 +0200

    first steps

commit fb68218
Author: Niels Saurer <[email protected]>
Date:   Wed Jul 19 16:21:33 2023 +0000

    Squash transliterator-parser

    structure for transliterator parser

    start parsing ':: ... ;' rules

    complete ::-rule parsing

    add more global filter tests

    add negative tests for '::'-rules, be more restrictive

    update error docs

    add comment about static UnicodeSet type alias

    add variable defs

    escaping and fix unicodeset handling

    fix unicodeset tests

    function calls

    add variable-inside-unicodesets

    update tests

    rewrite parse_section using parse_element

    fix unquoted literal handling

    add cursor/placeholder tests

    add cursor support

    add allow(unused) for this PR

    remove unused dependencies

    add todo about inefficient unicodeset variablemap handling

    allow usage of UnicodeSet's VariableMap directly in TransliteratorParser

    avoid one allocation per parsed unicodeset

    remove done todo about allocation-free unicodeset parser hook

    avoid allocations for number parsing

    invalid num err with offset

    update comment

    switch to allocation free hex parsing (and support for multi escapes)

    fix main merge conflict

    support \p unicodesets

    remove todo for \p unicodeset parsing

    turn low-prio todo about avoiding clones into note

    turn non-memory-safety safety comments into regular comments

    add issue number to TODOs

    add transliteration component crate
skius committed Aug 8, 2023
1 parent ff3c7e1 commit c6cbb0a
Showing 9 changed files with 3,107 additions and 1 deletion.
12 changes: 12 additions & 0 deletions Cargo.lock


1 change: 1 addition & 0 deletions Cargo.toml
@@ -43,6 +43,7 @@ members = [
"experimental/relativetime",
"experimental/relativetime/data",
"experimental/transliteration",
"experimental/transliterator_parser",
"experimental/unicodeset_parser",
"ffi/capi_cdylib",
"ffi/capi_staticlib",
2 changes: 1 addition & 1 deletion experimental/transliteration/Cargo.toml
@@ -31,4 +31,4 @@ icu_collections = { version = "1.2.0", path = "../../components/collections", fe
serde = { version = "1.0", features = ["derive"] }
zerovec = { version = "0.9.4", path = "../../utils/zerovec", features = ["derive"] }

-# TODO: Add serde, datagen, compiled_data features
+# TODO: Add serde, datagen, compiled_data features
37 changes: 37 additions & 0 deletions experimental/transliterator_parser/Cargo.toml
@@ -0,0 +1,37 @@
# This file is part of ICU4X. For terms of use, please see the file
# called LICENSE at the top level of the ICU4X source tree
# (online at: https://github.com/unicode-org/icu4x/blob/main/LICENSE ).

[package]
name = "icu_transliterator_parser"
description = "API to parse transform rules into transliterators as defined in UTS35"
version = "0.0.0"
authors = ["The ICU4X Project Developers"]
edition = "2021"
readme = "README.md"
repository = "https://github.com/unicode-org/icu4x"
license = "Unicode-DFS-2016"
categories = ["internationalization"]
# Keep this in sync with other crates unless there are exceptions
include = [
"src/**/*",
"tests/**/*",
"Cargo.toml",
"LICENSE",
"README.md"
]

[package.metadata.docs.rs]
all-features = true

[dependencies]
icu_collections = { path = "../../components/collections" }
icu_properties = { path = "../../components/properties", default-features = false }
icu_provider = { path = "../../provider/core" }
icu_unicodeset_parser = { path = "../unicodeset_parser" }
icu_transliteration = { path = "../transliteration" }

log = "0.4"

[features]
compiled_data = ["icu_properties/compiled_data"]
51 changes: 51 additions & 0 deletions experimental/transliterator_parser/LICENSE
@@ -0,0 +1,51 @@
UNICODE, INC. LICENSE AGREEMENT - DATA FILES AND SOFTWARE

See Terms of Use <https://www.unicode.org/copyright.html>
for definitions of Unicode Inc.’s Data Files and Software.

NOTICE TO USER: Carefully read the following legal agreement.
BY DOWNLOADING, INSTALLING, COPYING OR OTHERWISE USING UNICODE INC.'S
DATA FILES ("DATA FILES"), AND/OR SOFTWARE ("SOFTWARE"),
YOU UNEQUIVOCALLY ACCEPT, AND AGREE TO BE BOUND BY, ALL OF THE
TERMS AND CONDITIONS OF THIS AGREEMENT.
IF YOU DO NOT AGREE, DO NOT DOWNLOAD, INSTALL, COPY, DISTRIBUTE OR USE
THE DATA FILES OR SOFTWARE.

COPYRIGHT AND PERMISSION NOTICE

Copyright © 1991-2022 Unicode, Inc. All rights reserved.
Distributed under the Terms of Use in https://www.unicode.org/copyright.html.

Permission is hereby granted, free of charge, to any person obtaining
a copy of the Unicode data files and any associated documentation
(the "Data Files") or Unicode software and any associated documentation
(the "Software") to deal in the Data Files or Software
without restriction, including without limitation the rights to use,
copy, modify, merge, publish, distribute, and/or sell copies of
the Data Files or Software, and to permit persons to whom the Data Files
or Software are furnished to do so, provided that either
(a) this copyright and permission notice appear with all copies
of the Data Files or Software, or
(b) this copyright and permission notice appear in associated
Documentation.

THE DATA FILES AND SOFTWARE ARE PROVIDED "AS IS", WITHOUT WARRANTY OF
ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE
WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
NONINFRINGEMENT OF THIRD PARTY RIGHTS.
IN NO EVENT SHALL THE COPYRIGHT HOLDER OR HOLDERS INCLUDED IN THIS
NOTICE BE LIABLE FOR ANY CLAIM, OR ANY SPECIAL INDIRECT OR CONSEQUENTIAL
DAMAGES, OR ANY DAMAGES WHATSOEVER RESULTING FROM LOSS OF USE,
DATA OR PROFITS, WHETHER IN AN ACTION OF CONTRACT, NEGLIGENCE OR OTHER
TORTIOUS ACTION, ARISING OUT OF OR IN CONNECTION WITH THE USE OR
PERFORMANCE OF THE DATA FILES OR SOFTWARE.

Except as contained in this notice, the name of a copyright holder
shall not be used in advertising or otherwise to promote the sale,
use or other dealings in these Data Files or Software without prior
written authorization of the copyright holder.


Portions of ICU4X may have been adapted from ICU4C and/or ICU4J.
ICU 1.8.1 to ICU 57.1 © 1995-2016 International Business Machines Corporation and others.
13 changes: 13 additions & 0 deletions experimental/transliterator_parser/README.md
