[fix] accept tabs as tokens after : and - #211

Merged 5 commits on Feb 10, 2022

78 changes: 58 additions & 20 deletions .github/workflows/clang.yml
@@ -42,7 +42,55 @@ jobs:
- {std: 20, cxx: clang++-10 , bt: Release, os: ubuntu-18.04, bitlinks: shared64 static32}
- {std: 11, cxx: clang++-6.0, bt: Debug , os: ubuntu-18.04, bitlinks: shared64 static32}
- {std: 11, cxx: clang++-6.0, bt: Release, os: ubuntu-18.04, bitlinks: shared64 static32}
env: {STD: "${{matrix.std}}", CXX_: "${{matrix.cxx}}", BT: "${{matrix.bt}}", BITLINKS: "${{matrix.bitlinks}}", VG: "${{matrix.vg}}", SAN: "${{matrix.san}}", LINT: "${{matrix.lint}}", OS: "${{matrix.os}}"}
env: {STD: "${{matrix.std}}", CXX_: "${{matrix.cxx}}", BT: "${{matrix.bt}}", BITLINKS: "${{matrix.bitlinks}}",
CMAKE_FLAGS: "${{matrix.cmkflags}}",
VG: "${{matrix.vg}}", SAN: "${{matrix.san}}", LINT: "${{matrix.lint}}", OS: "${{matrix.os}}"}
steps:
- {name: checkout, uses: actions/checkout@v2, with: {submodules: recursive}}
- {name: install requirements, run: source .github/reqs.sh && c4_install_test_requirements $OS}
- {name: show info, run: source .github/setenv.sh && c4_show_info}
- name: shared64-configure---------------------------------------------------
run: source .github/setenv.sh && c4_cfg_test shared64
- {name: shared64-build, run: source .github/setenv.sh && c4_build_test shared64}
- {name: shared64-run, run: source .github/setenv.sh && c4_run_test shared64}
- {name: shared64-pack, run: source .github/setenv.sh && c4_package shared64}
- name: static64-configure---------------------------------------------------
run: source .github/setenv.sh && c4_cfg_test static64
- {name: static64-build, run: source .github/setenv.sh && c4_build_test static64}
- {name: static64-run, run: source .github/setenv.sh && c4_run_test static64}
- {name: static64-pack, run: source .github/setenv.sh && c4_package static64}
- name: static32-configure---------------------------------------------------
run: source .github/setenv.sh && c4_cfg_test static32
- {name: static32-build, run: source .github/setenv.sh && c4_build_test static32}
- {name: static32-run, run: source .github/setenv.sh && c4_run_test static32}
- {name: static32-pack, run: source .github/setenv.sh && c4_package static32}
- name: shared32-configure---------------------------------------------------
run: source .github/setenv.sh && c4_cfg_test shared32
- {name: shared32-build, run: source .github/setenv.sh && c4_build_test shared32}
- {name: shared32-run, run: source .github/setenv.sh && c4_run_test shared32}
- {name: shared32-pack, run: source .github/setenv.sh && c4_package shared32}

clang_canary_tabtokens:
name: tabtokens/${{matrix.cxx}}/canary/c++${{matrix.std}}/${{matrix.bt}}
if: |
(!contains(github.event.head_commit.message, 'skip all')) ||
(!contains(github.event.head_commit.message, 'skip clang')) ||
contains(github.event.head_commit.message, 'only clang')
continue-on-error: true
runs-on: ${{matrix.os}}
strategy:
fail-fast: false
matrix:
include:
- {std: 17, cxx: clang++-10 , bt: Debug , os: ubuntu-18.04, bitlinks: static64, cmkflags: "-DRYML_WITH_TAB_TOKENS=ON"}
- {std: 17, cxx: clang++-10 , bt: Release, os: ubuntu-18.04, bitlinks: static64, cmkflags: "-DRYML_WITH_TAB_TOKENS=ON"}
- {std: 20, cxx: clang++-10 , bt: Debug , os: ubuntu-18.04, bitlinks: static64, cmkflags: "-DRYML_WITH_TAB_TOKENS=ON"}
- {std: 20, cxx: clang++-10 , bt: Release, os: ubuntu-18.04, bitlinks: static64, cmkflags: "-DRYML_WITH_TAB_TOKENS=ON"}
- {std: 11, cxx: clang++-6.0, bt: Debug , os: ubuntu-18.04, bitlinks: static64, cmkflags: "-DRYML_WITH_TAB_TOKENS=ON"}
- {std: 11, cxx: clang++-6.0, bt: Release, os: ubuntu-18.04, bitlinks: static64, cmkflags: "-DRYML_WITH_TAB_TOKENS=ON"}
env: {STD: "${{matrix.std}}", CXX_: "${{matrix.cxx}}", BT: "${{matrix.bt}}", BITLINKS: "${{matrix.bitlinks}}",
CMAKE_FLAGS: "${{matrix.cmkflags}}",
VG: "${{matrix.vg}}", SAN: "${{matrix.san}}", LINT: "${{matrix.lint}}", OS: "${{matrix.os}}"}
steps:
- {name: checkout, uses: actions/checkout@v2, with: {submodules: recursive}}
- {name: install requirements, run: source .github/reqs.sh && c4_install_test_requirements $OS}
@@ -82,21 +130,11 @@ jobs:
fail-fast: false
matrix:
include:
- {std: 11, cxx: clang++-9 , bt: Debug , vg: on, os: ubuntu-18.04}
- {std: 11, cxx: clang++-9 , bt: Release, vg: on, os: ubuntu-18.04}
- {std: 11, cxx: clang++-8 , bt: Debug , vg: on, os: ubuntu-18.04}
- {std: 11, cxx: clang++-8 , bt: Release, vg: on, os: ubuntu-18.04}
- {std: 11, cxx: clang++-7 , bt: Debug , vg: on, os: ubuntu-18.04}
- {std: 11, cxx: clang++-7 , bt: Release, vg: on, os: ubuntu-18.04}
- {std: 11, cxx: clang++-6.0, bt: Debug , vg: on, os: ubuntu-18.04}
- {std: 11, cxx: clang++-6.0, bt: Release, vg: on, os: ubuntu-18.04}
- {std: 11, cxx: clang++-5.0, bt: Debug , vg: on, os: ubuntu-18.04}
- {std: 11, cxx: clang++-5.0, bt: Release, vg: on, os: ubuntu-18.04}
- {std: 11, cxx: clang++-4.0, bt: Debug , vg: on, os: ubuntu-18.04}
- {std: 11, cxx: clang++-4.0, bt: Release, vg: on, os: ubuntu-18.04}
- {std: 11, cxx: clang++-3.9, bt: Debug , vg: on, os: ubuntu-18.04}
- {std: 11, cxx: clang++-3.9, bt: Release, vg: on, os: ubuntu-18.04}
env: {STD: "${{matrix.std}}", CXX_: "${{matrix.cxx}}", BT: "${{matrix.bt}}", BITLINKS: "${{matrix.bitlinks}}", VG: "${{matrix.vg}}", SAN: "${{matrix.san}}", LINT: "${{matrix.lint}}", OS: "${{matrix.os}}"}
- {std: 11, cxx: clang++-10 , bt: Debug , vg: on, os: ubuntu-18.04}
- {std: 11, cxx: clang++-10 , bt: Release, vg: on, os: ubuntu-18.04}
env: {STD: "${{matrix.std}}", CXX_: "${{matrix.cxx}}", BT: "${{matrix.bt}}", BITLINKS: "${{matrix.bitlinks}}",
CMAKE_FLAGS: "${{matrix.cmkflags}}",
VG: "${{matrix.vg}}", SAN: "${{matrix.san}}", LINT: "${{matrix.lint}}", OS: "${{matrix.os}}"}
steps:
- {name: checkout, uses: actions/checkout@v2, with: {submodules: recursive}}
- {name: install requirements, run: source .github/reqs.sh && c4_install_test_requirements $OS}
@@ -137,10 +175,10 @@ jobs:
matrix:
include:
# clang tidy takes a long time, so don't do multiple bits/linktypes
- {std: 11, cxx: clang++-9, bt: Debug , lint: clang-tidy, bitlinks: shared64 static64, os: ubuntu-18.04}
- {std: 11, cxx: clang++-9, bt: Debug , lint: clang-tidy, bitlinks: shared32 static32, os: ubuntu-18.04}
- {std: 11, cxx: clang++-9, bt: ReleaseWithDebInfo, lint: clang-tidy, bitlinks: shared64 static64, os: ubuntu-18.04}
- {std: 11, cxx: clang++-9, bt: ReleaseWithDebInfo, lint: clang-tidy, bitlinks: shared32 static32, os: ubuntu-18.04}
- {std: 11, cxx: clang++-10, bt: Debug , lint: clang-tidy, bitlinks: shared64 static64, os: ubuntu-18.04}
- {std: 11, cxx: clang++-10, bt: Debug , lint: clang-tidy, bitlinks: shared32 static32, os: ubuntu-18.04}
- {std: 11, cxx: clang++-10, bt: ReleaseWithDebInfo, lint: clang-tidy, bitlinks: shared64 static64, os: ubuntu-18.04}
- {std: 11, cxx: clang++-10, bt: ReleaseWithDebInfo, lint: clang-tidy, bitlinks: shared32 static32, os: ubuntu-18.04}
env: {STD: "${{matrix.std}}", CXX_: "${{matrix.cxx}}", BT: "${{matrix.bt}}", BITLINKS: "${{matrix.bitlinks}}", VG: "${{matrix.vg}}", SAN: "${{matrix.san}}", LINT: "${{matrix.lint}}", OS: "${{matrix.os}}"}
steps:
- {name: checkout, uses: actions/checkout@v2, with: {submodules: recursive}}
57 changes: 46 additions & 11 deletions .github/workflows/gcc.yml
@@ -68,6 +68,52 @@ jobs:
- {name: shared32-run, run: source .github/setenv.sh && c4_run_test shared32}
- {name: shared32-pack, run: source .github/setenv.sh && c4_package shared32}

gcc_tabtokens:
name: tabtokens/${{matrix.cxx}}/canary/${{matrix.bt}}
if: |
(!contains(github.event.head_commit.message, 'skip all')) ||
(!contains(github.event.head_commit.message, 'skip gcc')) ||
contains(github.event.head_commit.message, 'only gcc')
continue-on-error: true
runs-on: ${{matrix.os}}
strategy:
fail-fast: false
matrix:
include:
- {std: 11, cxx: g++-7 , bt: Debug , os: ubuntu-18.04, bitlinks: shared64 static32, cmkflags: "-DRYML_WITH_TAB_TOKENS=ON"}
- {std: 11, cxx: g++-7 , bt: Release, os: ubuntu-18.04, bitlinks: shared64 static32, cmkflags: "-DRYML_WITH_TAB_TOKENS=ON"}
- {std: 20, cxx: g++-10 , bt: Debug , os: ubuntu-18.04, bitlinks: shared64 static32, cmkflags: "-DRYML_WITH_TAB_TOKENS=ON"}
- {std: 20, cxx: g++-10 , bt: Release, os: ubuntu-18.04, bitlinks: shared64 static32, cmkflags: "-DRYML_WITH_TAB_TOKENS=ON"}
- {std: 11, cxx: g++-5 , bt: Debug , os: ubuntu-18.04, bitlinks: shared64 static32, cmkflags: "-DRYML_WITH_TAB_TOKENS=ON"}
- {std: 11, cxx: g++-5 , bt: Release, os: ubuntu-18.04, bitlinks: shared64 static32, cmkflags: "-DRYML_WITH_TAB_TOKENS=ON"}
env: {STD: "${{matrix.std}}", CXX_: "${{matrix.cxx}}", BT: "${{matrix.bt}}", BITLINKS: "${{matrix.bitlinks}}",
CMAKE_FLAGS: "${{matrix.cmkflags}}",
VG: "${{matrix.vg}}", SAN: "${{matrix.san}}", LINT: "${{matrix.lint}}", OS: "${{matrix.os}}"}
steps:
- {name: checkout, uses: actions/checkout@v2, with: {submodules: recursive}}
- {name: install requirements, run: source .github/reqs.sh && c4_install_test_requirements $OS}
- {name: show info, run: source .github/setenv.sh && c4_show_info}
- name: shared64-configure---------------------------------------------------
run: source .github/setenv.sh && c4_cfg_test shared64
- {name: shared64-build, run: source .github/setenv.sh && c4_build_test shared64}
- {name: shared64-run, run: source .github/setenv.sh && c4_run_test shared64}
- {name: shared64-pack, run: source .github/setenv.sh && c4_package shared64}
- name: static64-configure---------------------------------------------------
run: source .github/setenv.sh && c4_cfg_test static64
- {name: static64-build, run: source .github/setenv.sh && c4_build_test static64}
- {name: static64-run, run: source .github/setenv.sh && c4_run_test static64}
- {name: static64-pack, run: source .github/setenv.sh && c4_package static64}
- name: static32-configure---------------------------------------------------
run: source .github/setenv.sh && c4_cfg_test static32
- {name: static32-build, run: source .github/setenv.sh && c4_build_test static32}
- {name: static32-run, run: source .github/setenv.sh && c4_run_test static32}
- {name: static32-pack, run: source .github/setenv.sh && c4_package static32}
- name: shared32-configure---------------------------------------------------
run: source .github/setenv.sh && c4_cfg_test shared32
- {name: shared32-build, run: source .github/setenv.sh && c4_build_test shared32}
- {name: shared32-run, run: source .github/setenv.sh && c4_run_test shared32}
- {name: shared32-pack, run: source .github/setenv.sh && c4_package shared32}

#----------------------------------------------------------------------------
gcc_extended:
name: ${{matrix.cxx}}/extended/${{matrix.bt}}
@@ -91,17 +137,6 @@ jobs:
- {std: 17, cxx: g++-10, bt: Release, vg: ON, os: ubuntu-18.04}
- {std: 20, cxx: g++-10, bt: Debug , vg: ON, os: ubuntu-18.04}
- {std: 20, cxx: g++-10, bt: Release, vg: ON, os: ubuntu-18.04}
#
- {std: 11, cxx: g++-9, bt: Debug , os: ubuntu-18.04}
- {std: 11, cxx: g++-9, bt: Release, os: ubuntu-18.04}
- {std: 11, cxx: g++-8, bt: Debug , os: ubuntu-18.04}
- {std: 11, cxx: g++-8, bt: Release, os: ubuntu-18.04}
- {std: 11, cxx: g++-7, bt: Debug , os: ubuntu-18.04}
- {std: 11, cxx: g++-7, bt: Release, os: ubuntu-18.04}
- {std: 11, cxx: g++-6, bt: Debug , os: ubuntu-18.04}
- {std: 11, cxx: g++-6, bt: Release, os: ubuntu-18.04}
- {std: 11, cxx: g++-5, bt: Debug , os: ubuntu-18.04}
- {std: 11, cxx: g++-5, bt: Release, os: ubuntu-18.04}
env: {STD: "${{matrix.std}}", CXX_: "${{matrix.cxx}}", BT: "${{matrix.bt}}", BITLINKS: "${{matrix.bitlinks}}", VG: "${{matrix.vg}}", SAN: "${{matrix.san}}", LINT: "${{matrix.lint}}", OS: "${{matrix.os}}"}
steps:
- {name: checkout, uses: actions/checkout@v2, with: {submodules: recursive}}
5 changes: 5 additions & 0 deletions CMakeLists.txt
@@ -10,6 +10,7 @@ c4_project(VERSION 0.3.0 STANDALONE

#-------------------------------------------------------

option(RYML_WITH_TAB_TOKENS "Enable parsing of tabs after ':' and '-'. This is costly and disabled by default." OFF)
option(RYML_DEFAULT_CALLBACKS "Enable ryml's default implementation of callbacks: allocate(), free(), error()" ON)
option(RYML_BUILD_TOOLS "build tools" OFF)
option(RYML_BUILD_API "Enable API generation (python, etc)" OFF)
@@ -57,6 +58,10 @@ c4_add_library(ryml
INCORPORATE c4core
)

if(RYML_WITH_TAB_TOKENS)
target_compile_definitions(ryml PUBLIC RYML_WITH_TAB_TOKENS)
endif()

if(NOT RYML_DEFAULT_CALLBACKS)
target_compile_definitions(ryml PRIVATE RYML_NO_DEFAULT_CALLBACKS)
endif()
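
Because the definition above is `PUBLIC`, any target that links against `ryml` is compiled with `RYML_WITH_TAB_TOKENS` as well. The following is a minimal sketch of how a consuming project might switch the option on, not something taken from this PR. The project name `tab_tokens_demo`, the vendored path `ext/rapidyaml`, the target `demo`, and the source file `main.cpp` are all hypothetical placeholders.

```cmake
cmake_minimum_required(VERSION 3.12)
project(tab_tokens_demo LANGUAGES CXX)

# Assumption: rapidyaml is vendored at ext/rapidyaml (hypothetical path).
# Set the cache entry before add_subdirectory(), so that ryml's own
# option() call finds an existing value and does not apply its OFF default.
set(RYML_WITH_TAB_TOKENS ON CACHE BOOL "Enable tab tokens in ryml" FORCE)
add_subdirectory(ext/rapidyaml ryml)

# main.cpp is a placeholder for the consumer's own source file.
add_executable(demo main.cpp)

# The compile definition is PUBLIC on the ryml target, so it propagates
# to demo through this link dependency.
target_link_libraries(demo PRIVATE ryml)
```

When ryml is configured on its own, passing `-DRYML_WITH_TAB_TOKENS=ON` on the cmake command line has the same effect, which is what the `cmkflags` entries in the workflow matrices above do.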
37 changes: 23 additions & 14 deletions README.md
@@ -657,7 +657,8 @@ sample_location_tracking(); ///< track node locations in the parsed source tr

### Package managers

If you opt for package managers, here's where ryml is available so far (thanks to all the contributors!):
If you opt for package managers, here's where ryml is available so far
(thanks to all the contributors!):
* [vcpkg](https://vcpkg.io/en/packages.html): `vcpkg install ryml`
* Arch Linux/Manjaro:
* [rapidyaml-git (AUR)](https://aur.archlinux.org/packages/rapidyaml-git/)
@@ -766,6 +767,9 @@ more about each sample:
The following cmake variables can be used to control the build behavior of
ryml:

* `RYML_WITH_TAB_TOKENS=ON/OFF`. Enable/disable support for tabs as
valid container tokens after `:` and `-`. Defaults to `OFF`,
because this may cost up to 10% in processing time.
* `RYML_DEFAULT_CALLBACKS=ON/OFF`. Enable/disable ryml's default
implementation of error and allocation callbacks. Defaults to `ON`.
* `RYML_STANDALONE=ON/OFF`. ryml uses
@@ -787,7 +791,8 @@ ryml is strongly coupled to c4core, and this is reinforced by the fact
that c4core is a submodule of the current repo. However, it is still
possible to use a c4core version different from the one in the repo
(of course, only if there are no incompatibilities between the
versions). You can find out how to achieve this by looking at the [`custom_c4core` sample](./samples/custom_c4core/CMakeLists.txt).
versions). You can find out how to achieve this by looking at the
[`custom_c4core` sample](./samples/custom_c4core/CMakeLists.txt).


------
@@ -814,8 +819,8 @@ be changed.) With that said, here's an example of the Python API:
```python
import ryml

# because ryml does not take ownership of the source buffer
# ryml cannot accept strings; only bytes or bytearrays
# ryml cannot accept strings because it does not take ownership of the
# source buffer; only bytes or bytearrays are accepted.
src = b"{HELLO: a, foo: b, bar: c, baz: d, seq: [0, 1, 2, 3]}"

def check(tree):
@@ -914,17 +919,20 @@ See also [the roadmap](./ROADMAP.md) for a list of future work.

ryml deliberately makes no effort to follow the standard in the following situations:

* Tab characters after `:` and `-` are not accepted tokens, unless
ryml is compiled with the macro `RYML_WITH_TAB_TOKENS`. This
requirement exists because checking for tabs introduces branching
into the parser's hot code and in some cases costs as much as 10%
in parsing time.
* Containers are not accepted as mapping keys: keys must be scalar strings.
* Tags are parsed as-is; tag lookup is not supported.
* Anchor names must not end with a terminating colon: eg `&anchor: key: val`.
* Tabs after `:` or `-` are not supported.
* `%TAG` directives have no effect and are ignored. All schemas are assumed
to be the default YAML 2002 schema.
* `%YAML` directives have no effect and are ignored.

Some of the limitations above will be worked on (tag lookups, tab
tokens). Others (notably container keys) absolutely will not, not in
the data tree at least.
Some of the limitations above will be worked on, (eg tag
lookups). Others (notably container keys) most likely will not.

Also, ryml tends to be on the permissive side where the YAML standard
dictates there should be an error; in many of these cases, ryml will
@@ -937,12 +945,13 @@ problems, which is a good practice anyway.
If you do run into trouble and would like to investigate conformance
of your YAML code, beware of existing online YAML linters, many of
which are not fully conformant; instead, try using
[https://play.yaml.io](https://play.yaml.io), an amazing tool from the
YAML people which lets you dynamically input your YAML and continuously
see the results from all the existing parsers (kudos to
@ingydotnet). And of course, if you detect anything bad with ryml,
please [open an issue](https://github.com/biojppm/rapidyaml/issues) so
that we can improve.
[https://play.yaml.io](https://play.yaml.io), an amazing tool which
lets you dynamically input your YAML and continuously see the results
from all the existing parsers (kudos to @ingydotnet and the people
from the YAML test suite). And of course, if you detect anything wrong
with ryml, please [open an
issue](https://github.com/biojppm/rapidyaml/issues) so that we can
improve.


### Test suite status
3 changes: 2 additions & 1 deletion changelog/current.md
@@ -109,7 +109,8 @@ As part of the [new feature to track source locations](https://github.com/biojpp
? explicit key # this comment was not parsed correctly
? # trailing empty key was not added to the map
```
- ryml now parses successfully compact JSON code `{"like":"this"}` without any need for preprocessing. So the `preprocess_json()` functions and utilities are no longer necessary and have been removed. If you were using these functions, just remove the calls and pass the original source directly to ryml ([PR#210](https://github.com/biojppm/rapidyaml/pulls/210)).
- Fixed parsing of tabs used as whitespace tokens after `:` or `-`. This feature [is costly (see some benchmark results here)](https://github.com/biojppm/rapidyaml/pull/211#issuecomment-1030688035) and thus it is disabled by default, and requires defining a macro or cmake option `RYML_WITH_TAB_TOKENS` to enable ([PR#211](https://github.com/biojppm/rapidyaml/pulls/211)).
- ryml now parses successfully compact JSON code `{"like":"this"}` without any need for preprocessing. This code was not valid YAML 1.1, but was made valid in YAML 1.2. So the `preprocess_json()` functions, used to insert spaces after `:` are no longer necessary and have been removed. If you were using these functions, remove the calls and just pass the original source directly to ryml's parser ([PR#210](https://github.com/biojppm/rapidyaml/pulls/210)).
- Fix handling of indentation when parsing block scalars ([PR#210](https://github.com/biojppm/rapidyaml/pulls/210)):
```yaml
---