Add bundled-vectorscan feature #28

Closed
bradlarsen wants to merge 4 commits


bradlarsen

The new bundled-vectorscan feature addresses #20, allowing rust-hyperscan to build and run on non-Intel architectures, including Apple Silicon. It works by building against a bundled copy of vectorscan, a fork of Hyperscan that supports additional architectures while keeping the same API.

This feature is not enabled by default. The bundled version of vectorscan is added via a Git submodule.

Building vectorscan (or hyperscan) from source requires several build-time dependencies, including Boost, CMake, Ragel, and Python. Isolating the vectorscan (or hyperscan) code to eliminate these dependencies would be significantly more work.

I also updated the README to describe this new feature and expanded slightly on the existing features.
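
As a rough illustration of how an opt-in feature like this can be detected in a build script: Cargo exposes each enabled feature to `build.rs` as an environment variable named `CARGO_FEATURE_<NAME>`. The sketch below is not the code from this PR; `LibrarySource`, `feature_env_var`, and `choose_source` are hypothetical names used only for illustration.

```rust
/// Where the build script gets the Hyperscan-compatible library from.
/// (Hypothetical type, not taken from hyperscan-sys.)
#[derive(Debug, PartialEq)]
enum LibrarySource {
    /// Build the vectorscan sources checked out in the Git submodule.
    BundledVectorscan,
    /// Use a Hyperscan installation that already exists on the system.
    System,
}

/// Cargo passes each enabled feature to build scripts as an environment
/// variable named CARGO_FEATURE_<NAME>: upper-cased, with '-' turned into '_'.
fn feature_env_var(feature: &str) -> String {
    format!("CARGO_FEATURE_{}", feature.to_uppercase().replace('-', "_"))
}

/// Pick the library source. `is_set` reports whether an environment
/// variable exists, which keeps the decision testable without real env vars.
fn choose_source(is_set: impl Fn(&str) -> bool) -> LibrarySource {
    if is_set(&feature_env_var("bundled-vectorscan")) {
        LibrarySource::BundledVectorscan
    } else {
        LibrarySource::System
    }
}
```

In a real build script the closure would typically be `|k| std::env::var_os(k).is_some()`, so the bundled build only runs when the feature is requested.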

@bradlarsen
Author

@flier It would likely be very little additional effort to add a similar bundled-hyperscan feature that would build Hyperscan from source, if that's something you're interested in.

@bradlarsen
Author

Also, I see that there is some appveyor-based CI, though it seems to be broken. Would you be interested in a simple GitHub Actions setup for the rust-hyperscan project?

@bradlarsen
Author

Some benchmark runs from my M1 Max using the bundled-vectorscan feature:

cargo criterion --features bundled-vectorscan,gen,static,contained
warning: unused import: `anyhow`
 --> hyperscan-sys/build.rs:4:14
  |
4 | use anyhow::{anyhow, bail, Context, Result};
  |              ^^^^^^
  |
  = note: `#[warn(unused_imports)]` on by default

warning: `hyperscan-sys` (build script) generated 1 warning
warning: building bundled vectorscan at bundled-vectorscan
warning: building with Hyperscan with static library @ "bundled-vectorscan", link_paths=["/Users/blarsen/projects/rust-hyperscan/target/release/build/hyperscan-sys-980142e75ccd9015/out/lib"], include_paths=["/Users/blarsen/projects/rust-hyperscan/target/release/build/hyperscan-sys-980142e75ccd9015/out/include/hs"]
warning: generating raw Hyperscan binding file @ /Users/blarsen/projects/rust-hyperscan/target/release/build/hyperscan-sys-980142e75ccd9015/out/hyperscan.rs
   Compiling hyperscan-sys v0.3.2 (/Users/blarsen/projects/rust-hyperscan/hyperscan-sys)
   Compiling hyperscan v0.3.2 (/Users/blarsen/projects/rust-hyperscan/hyperscan)
warning: unused variable: `c`
  --> hyperscan/benches/bench.rs:92:18
   |
92 | fn chimera_bench(c: &mut Criterion) {}
   |                  ^ help: if this is intentional, prefix it with an underscore: `_c`
   |
   = note: `#[warn(unused_variables)]` on by default

warning: `hyperscan` (bench "bench") generated 1 warning
    Finished bench [optimized] target(s) in 1.25s
hyperscan/Easy0/16      time:   [4.0350 ns 4.0371 ns 4.0397 ns]
                        thrpt:  [3.6886 GiB/s 3.6910 GiB/s 3.6930 GiB/s]
hyperscan/Easy0/32      time:   [38.377 ns 38.400 ns 38.422 ns]
                        thrpt:  [794.26 MiB/s 794.74 MiB/s 795.21 MiB/s]
hyperscan/Easy0/1024    time:   [94.624 ns 94.678 ns 94.743 ns]
                        thrpt:  [10.066 GiB/s 10.073 GiB/s 10.079 GiB/s]
hyperscan/Easy0/32768   time:   [1.9961 µs 1.9974 µs 1.9990 µs]
                        thrpt:  [15.266 GiB/s 15.278 GiB/s 15.288 GiB/s]
hyperscan/Easy0/1048576 time:   [62.666 µs 62.741 µs 62.844 µs]
                        thrpt:  [15.540 GiB/s 15.565 GiB/s 15.584 GiB/s]
hyperscan/Easy0/33554432
                        time:   [2.0216 ms 2.0242 ms 2.0274 ms]
                        thrpt:  [15.414 GiB/s 15.438 GiB/s 15.458 GiB/s]
hyperscan/Easy0i/16     time:   [4.0718 ns 4.0810 ns 4.0896 ns]
                        thrpt:  [3.6437 GiB/s 3.6514 GiB/s 3.6596 GiB/s]
hyperscan/Easy0i/32     time:   [38.174 ns 38.199 ns 38.229 ns]
                        thrpt:  [798.29 MiB/s 798.91 MiB/s 799.43 MiB/s]
hyperscan/Easy0i/1024   time:   [94.619 ns 94.680 ns 94.752 ns]
                        thrpt:  [10.065 GiB/s 10.073 GiB/s 10.079 GiB/s]
hyperscan/Easy0i/32768  time:   [1.9954 µs 1.9964 µs 1.9977 µs]
                        thrpt:  [15.276 GiB/s 15.286 GiB/s 15.294 GiB/s]
hyperscan/Easy0i/1048576
                        time:   [62.596 µs 62.641 µs 62.693 µs]
                        thrpt:  [15.577 GiB/s 15.590 GiB/s 15.601 GiB/s]
hyperscan/Easy0i/33554432
                        time:   [2.0249 ms 2.0262 ms 2.0278 ms]
                        thrpt:  [15.411 GiB/s 15.423 GiB/s 15.433 GiB/s]
hyperscan/Easy1/16      time:   [4.0587 ns 4.0609 ns 4.0633 ns]
                        thrpt:  [3.6673 GiB/s 3.6694 GiB/s 3.6714 GiB/s]
hyperscan/Easy1/32      time:   [47.254 ns 47.299 ns 47.344 ns]
                        thrpt:  [644.60 MiB/s 645.20 MiB/s 645.82 MiB/s]
hyperscan/Easy1/1024    time:   [91.493 ns 91.553 ns 91.621 ns]
                        thrpt:  [10.409 GiB/s 10.417 GiB/s 10.424 GiB/s]
hyperscan/Easy1/32768   time:   [1.4973 µs 1.5028 µs 1.5092 µs]
                        thrpt:  [20.222 GiB/s 20.307 GiB/s 20.382 GiB/s]
hyperscan/Easy1/1048576 time:   [46.460 µs 46.831 µs 47.181 µs]
                        thrpt:  [20.698 GiB/s 20.853 GiB/s 21.019 GiB/s]
Benchmarking hyperscan/Easy1/33554432: Warming up for 3.0000 s
Warning: Unable to complete 100 samples in 5.0s. You may wish to increase target time to 7.9s, enable flat sampling, or reduce sample count to 50.
hyperscan/Easy1/33554432
                        time:   [1.5571 ms 1.5593 ms 1.5618 ms]
                        thrpt:  [20.009 GiB/s 20.040 GiB/s 20.069 GiB/s]
hyperscan/Hard1/16      time:   [21.134 ns 21.146 ns 21.158 ns]
                        thrpt:  [721.19 MiB/s 721.61 MiB/s 722.00 MiB/s]
hyperscan/Hard1/32      time:   [21.805 ns 21.816 ns 21.827 ns]
                        thrpt:  [1.3654 GiB/s 1.3661 GiB/s 1.3667 GiB/s]
hyperscan/Hard1/1024    time:   [65.389 ns 65.428 ns 65.473 ns]
                        thrpt:  [14.566 GiB/s 14.576 GiB/s 14.585 GiB/s]
hyperscan/Hard1/32768   time:   [1.4709 µs 1.4719 µs 1.4730 µs]
                        thrpt:  [20.719 GiB/s 20.733 GiB/s 20.747 GiB/s]
hyperscan/Hard1/1048576 time:   [45.038 µs 45.172 µs 45.329 µs]
                        thrpt:  [21.544 GiB/s 21.619 GiB/s 21.683 GiB/s]
Benchmarking hyperscan/Hard1/33554432: Warming up for 3.0000 s
Warning: Unable to complete 100 samples in 5.0s. You may wish to increase target time to 7.5s, enable flat sampling, or reduce sample count to 50.
hyperscan/Hard1/33554432
                        time:   [1.4771 ms 1.4778 ms 1.4785 ms]
                        thrpt:  [21.136 GiB/s 21.147 GiB/s 21.157 GiB/s]
hyperscan/Medium/16     time:   [4.2929 ns 4.2951 ns 4.2973 ns]
                        thrpt:  [3.4676 GiB/s 3.4694 GiB/s 3.4711 GiB/s]
hyperscan/Medium/32     time:   [45.269 ns 45.292 ns 45.315 ns]
                        thrpt:  [673.45 MiB/s 673.80 MiB/s 674.13 MiB/s]
hyperscan/Medium/1024   time:   [105.69 ns 105.75 ns 105.83 ns]
                        thrpt:  [9.0116 GiB/s 9.0180 GiB/s 9.0234 GiB/s]
hyperscan/Medium/32768  time:   [2.1265 µs 2.1316 µs 2.1367 µs]
                        thrpt:  [14.283 GiB/s 14.317 GiB/s 14.351 GiB/s]
hyperscan/Medium/1048576
                        time:   [66.180 µs 66.444 µs 66.701 µs]
                        thrpt:  [14.641 GiB/s 14.698 GiB/s 14.756 GiB/s]
hyperscan/Medium/33554432
                        time:   [2.1319 ms 2.1333 ms 2.1350 ms]
                        thrpt:  [14.637 GiB/s 14.649 GiB/s 14.658 GiB/s]
hyperscan/Hard/16       time:   [4.3193 ns 4.3393 ns 4.3642 ns]
                        thrpt:  [3.4144 GiB/s 3.4340 GiB/s 3.4499 GiB/s]
hyperscan/Hard/32       time:   [33.107 ns 33.142 ns 33.182 ns]
                        thrpt:  [919.71 MiB/s 920.80 MiB/s 921.79 MiB/s]
hyperscan/Hard/1024     time:   [92.542 ns 92.650 ns 92.771 ns]
                        thrpt:  [10.280 GiB/s 10.293 GiB/s 10.305 GiB/s]
hyperscan/Hard/32768    time:   [2.1157 µs 2.1172 µs 2.1194 µs]
                        thrpt:  [14.399 GiB/s 14.414 GiB/s 14.425 GiB/s]
hyperscan/Hard/1048576  time:   [66.431 µs 66.489 µs 66.538 µs]
                        thrpt:  [14.677 GiB/s 14.688 GiB/s 14.700 GiB/s]
hyperscan/Hard/33554432 time:   [2.1369 ms 2.1387 ms 2.1406 ms]
                        thrpt:  [14.598 GiB/s 14.612 GiB/s 14.624 GiB/s]

regex/Easy0/16          time:   [18.393 ns 18.422 ns 18.456 ns]
                        thrpt:  [826.79 MiB/s 828.30 MiB/s 829.58 MiB/s]
regex/Easy0/32          time:   [28.066 ns 28.089 ns 28.114 ns]
                        thrpt:  [1.0601 GiB/s 1.0610 GiB/s 1.0619 GiB/s]
regex/Easy0/1024        time:   [71.856 ns 71.954 ns 72.060 ns]
                        thrpt:  [13.234 GiB/s 13.254 GiB/s 13.272 GiB/s]
regex/Easy0/32768       time:   [1.5336 µs 1.5419 µs 1.5509 µs]
                        thrpt:  [19.677 GiB/s 19.793 GiB/s 19.899 GiB/s]
regex/Easy0/1048576     time:   [46.521 µs 46.552 µs 46.590 µs]
                        thrpt:  [20.961 GiB/s 20.978 GiB/s 20.992 GiB/s]
Benchmarking regex/Easy0/33554432: Warming up for 3.0000 s
Warning: Unable to complete 100 samples in 5.0s. You may wish to increase target time to 7.6s, enable flat sampling, or reduce sample count to 50.
regex/Easy0/33554432    time:   [1.5150 ms 1.5162 ms 1.5176 ms]
                        thrpt:  [20.592 GiB/s 20.611 GiB/s 20.627 GiB/s]
regex/Easy0i/16         time:   [25.760 ns 25.777 ns 25.794 ns]
                        thrpt:  [591.56 MiB/s 591.95 MiB/s 592.34 MiB/s]
regex/Easy0i/32         time:   [27.175 ns 27.221 ns 27.289 ns]
                        thrpt:  [1.0921 GiB/s 1.0948 GiB/s 1.0967 GiB/s]
regex/Easy0i/1024       time:   [139.18 ns 139.32 ns 139.48 ns]
                        thrpt:  [6.8373 GiB/s 6.8453 GiB/s 6.8522 GiB/s]
regex/Easy0i/32768      time:   [3.4513 µs 3.4561 µs 3.4607 µs]
                        thrpt:  [8.8184 GiB/s 8.8302 GiB/s 8.8424 GiB/s]
regex/Easy0i/1048576    time:   [108.50 µs 108.57 µs 108.66 µs]
                        thrpt:  [8.9870 GiB/s 8.9946 GiB/s 9.0009 GiB/s]
regex/Easy0i/33554432   time:   [3.5146 ms 3.5161 ms 3.5177 ms]
                        thrpt:  [8.8836 GiB/s 8.8877 GiB/s 8.8915 GiB/s]
regex/Easy1/16          time:   [26.911 ns 26.964 ns 27.021 ns]
                        thrpt:  [564.71 MiB/s 565.89 MiB/s 567.00 MiB/s]
regex/Easy1/32          time:   [27.474 ns 27.574 ns 27.675 ns]
                        thrpt:  [1.0769 GiB/s 1.0808 GiB/s 1.0848 GiB/s]
regex/Easy1/1024        time:   [68.223 ns 68.306 ns 68.394 ns]
                        thrpt:  [13.944 GiB/s 13.962 GiB/s 13.979 GiB/s]
regex/Easy1/32768       time:   [1.4740 µs 1.4748 µs 1.4759 µs]
                        thrpt:  [20.677 GiB/s 20.692 GiB/s 20.704 GiB/s]
regex/Easy1/1048576     time:   [45.745 µs 45.792 µs 45.840 µs]
                        thrpt:  [21.304 GiB/s 21.326 GiB/s 21.348 GiB/s]
Benchmarking regex/Easy1/33554432: Warming up for 3.0000 s
Warning: Unable to complete 100 samples in 5.0s. You may wish to increase target time to 7.5s, enable flat sampling, or reduce sample count to 50.
regex/Easy1/33554432    time:   [1.4879 ms 1.4888 ms 1.4900 ms]
                        thrpt:  [20.973 GiB/s 20.990 GiB/s 21.003 GiB/s]
regex/Hard1/16          time:   [26.948 ns 26.998 ns 27.055 ns]
                        thrpt:  [563.99 MiB/s 565.17 MiB/s 566.22 MiB/s]
regex/Hard1/32          time:   [45.607 ns 45.712 ns 45.827 ns]
                        thrpt:  [665.93 MiB/s 667.61 MiB/s 669.14 MiB/s]
regex/Hard1/1024        time:   [1.9496 µs 1.9529 µs 1.9568 µs]
                        thrpt:  [499.07 MiB/s 500.06 MiB/s 500.90 MiB/s]
regex/Hard1/32768       time:   [62.416 µs 62.594 µs 62.772 µs]
                        thrpt:  [497.84 MiB/s 499.25 MiB/s 500.68 MiB/s]
regex/Hard1/1048576     time:   [1.9822 ms 1.9838 ms 1.9856 ms]
                        thrpt:  [503.62 MiB/s 504.08 MiB/s 504.50 MiB/s]
Benchmarking regex/Hard1/33554432: Warming up for 3.0000 s
Warning: Unable to complete 100 samples in 5.0s. You may wish to increase target time to 6.4s, or reduce sample count to 70.
regex/Hard1/33554432    time:   [63.588 ms 63.642 ms 63.700 ms]
                        thrpt:  [502.35 MiB/s 502.81 MiB/s 503.24 MiB/s]
regex/Medium/16         time:   [26.970 ns 27.025 ns 27.085 ns]
                        thrpt:  [563.36 MiB/s 564.62 MiB/s 565.76 MiB/s]
regex/Medium/32         time:   [27.053 ns 27.097 ns 27.146 ns]
                        thrpt:  [1.0979 GiB/s 1.0999 GiB/s 1.1016 GiB/s]
regex/Medium/1024       time:   [68.923 ns 68.971 ns 69.031 ns]
                        thrpt:  [13.815 GiB/s 13.827 GiB/s 13.837 GiB/s]
regex/Medium/32768      time:   [1.4870 µs 1.4903 µs 1.4937 µs]
                        thrpt:  [20.431 GiB/s 20.477 GiB/s 20.522 GiB/s]
regex/Medium/1048576    time:   [45.837 µs 45.889 µs 45.941 µs]
                        thrpt:  [21.257 GiB/s 21.281 GiB/s 21.305 GiB/s]
Benchmarking regex/Medium/33554432: Warming up for 3.0000 s
Warning: Unable to complete 100 samples in 5.0s. You may wish to increase target time to 7.5s, enable flat sampling, or reduce sample count to 50.
regex/Medium/33554432   time:   [1.4867 ms 1.4872 ms 1.4876 ms]
                        thrpt:  [21.007 GiB/s 21.013 GiB/s 21.020 GiB/s]
regex/Hard/16           time:   [31.056 ns 31.106 ns 31.169 ns]
                        thrpt:  [489.55 MiB/s 490.54 MiB/s 491.33 MiB/s]
regex/Hard/32           time:   [49.316 ns 49.374 ns 49.433 ns]
                        thrpt:  [617.35 MiB/s 618.10 MiB/s 618.81 MiB/s]
regex/Hard/1024         time:   [1.8268 µs 1.8288 µs 1.8313 µs]
                        thrpt:  [533.26 MiB/s 533.98 MiB/s 534.58 MiB/s]
regex/Hard/32768        time:   [56.483 µs 56.526 µs 56.578 µs]
                        thrpt:  [552.33 MiB/s 552.84 MiB/s 553.27 MiB/s]
Benchmarking regex/Hard/1048576: Warming up for 3.0000 s
Warning: Unable to complete 100 samples in 5.0s. You may wish to increase target time to 9.1s, enable flat sampling, or reduce sample count to 50.
regex/Hard/1048576      time:   [1.8094 ms 1.8106 ms 1.8118 ms]
                        thrpt:  [551.95 MiB/s 552.31 MiB/s 552.67 MiB/s]
Benchmarking regex/Hard/33554432: Warming up for 3.0000 s
Warning: Unable to complete 100 samples in 5.0s. You may wish to increase target time to 5.8s, or reduce sample count to 80.
regex/Hard/33554432     time:   [58.188 ms 58.228 ms 58.272 ms]
                        thrpt:  [549.15 MiB/s 549.57 MiB/s 549.95 MiB/s]


running 20 tests
test common::database::tests::test_database ... ignored
test common::serialized::tests::test_database_deserialize ... ignored
test common::serialized::tests::test_database_deserialize_at ... ignored
test common::serialized::tests::test_database_serialize ... ignored
test compile::builder::tests::test_database_compile ... ignored
test compile::literal::tests::test_compile_flags ... ignored
test compile::literal::tests::test_literal ... ignored
test compile::literal::tests::test_pattern_build ... ignored
test compile::literal::tests::test_pattern_build_with_flags ... ignored
test compile::literal::tests::test_patterns_build ... ignored
test compile::literal::tests::test_patterns_build_with_flags ... ignored
test compile::pattern::tests::test_compile_flags ... ignored
test compile::pattern::tests::test_pattern ... ignored
test compile::pattern::tests::test_pattern_build ... ignored
test compile::pattern::tests::test_pattern_build_with_flags ... ignored
test compile::pattern::tests::test_patterns_build ... ignored
test compile::pattern::tests::test_patterns_build_with_flags ... ignored
test compile::platform::tests::test_platform ... ignored
test regex::re::tests::test_find_iter ... ignored
test runtime::scratch::tests::test_scratch ... ignored

test result: ok. 0 passed; 0 failed; 20 ignored; 0 measured; 0 filtered out; finished in 0.00s


running 4 tests
test bindgen_test_layout_hs_compile_error ... ignored
test bindgen_test_layout_hs_expr_ext ... ignored
test bindgen_test_layout_hs_expr_info ... ignored
test bindgen_test_layout_hs_platform_info ... ignored

test result: ok. 0 passed; 0 failed; 4 ignored; 0 measured; 0 filtered out; finished in 0.00s
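
For reading the numbers above: each `thrpt` figure is the input size in bytes (the last path component of the benchmark name) divided by the measured time, expressed in binary units. A minimal sketch of that arithmetic follows; the helper names `gib_per_sec` and `speedup` are made up for illustration and are not part of Criterion.

```rust
/// Bytes in one gibibyte (2^30), the unit behind the "GiB/s" figures.
const GIB: f64 = 1024.0 * 1024.0 * 1024.0;

/// Throughput in GiB/s for `bytes` of input scanned in `seconds`.
fn gib_per_sec(bytes: u64, seconds: f64) -> f64 {
    bytes as f64 / seconds / GIB
}

/// How many times faster the second measurement is than the first,
/// given per-iteration times for the same input size.
fn speedup(slow_seconds: f64, fast_seconds: f64) -> f64 {
    slow_seconds / fast_seconds
}
```

Plugging in values from the log: `gib_per_sec(1024, 94.678e-9)` comes out at about 10.07, in line with the reported `hyperscan/Easy0/1024` throughput, and `speedup(63.642e-3, 1.4778e-3)` is roughly 43 for `Hard1/33554432` (regex versus hyperscan).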

@flier
Owner

flier commented Jan 5, 2023

> The new bundled-vectorscan feature addresses #20, allowing rust-hyperscan to build and run on non-Intel architectures, including Apple Silicon. It works by building against a bundled copy of vectorscan, a fork of Hyperscan that supports additional architectures while keeping the same API.
>
> This feature is not enabled by default. The bundled version of vectorscan is added via a Git submodule.
>
> Building vectorscan (or hyperscan) from source requires several build-time dependencies, including Boost, CMake, Ragel, and Python. Isolating the vectorscan (or hyperscan) code to eliminate these dependencies would be significantly more work.
>
> I also updated the README to describe this new feature and expanded slightly on the existing features.

I think we could move the bundled version of vectorscan into a new crate, as openssl-src does.
An optional vendored or bundled feature could then be enabled at build time, instead of relying on the Git submodule.
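
To make the suggestion concrete, here is a minimal sketch of what a separate source crate might hand back to `hyperscan-sys` after building vectorscan. Everything in it is an assumption for illustration: the `Artifacts` type and its methods are hypothetical, the `lib` and `include/hs` layout mirrors the paths in the build log earlier in this thread, and the library name `hs` is assumed.

```rust
use std::path::PathBuf;

/// Hypothetical result of building vectorscan from source in a `-src` crate.
struct Artifacts {
    include_dir: PathBuf,
    lib_dir: PathBuf,
    libs: Vec<String>,
}

impl Artifacts {
    /// Assumed install layout under Cargo's OUT_DIR: `lib/` and `include/hs/`.
    fn under(out_dir: &str) -> Self {
        let root = PathBuf::from(out_dir);
        Artifacts {
            include_dir: root.join("include").join("hs"),
            lib_dir: root.join("lib"),
            // Assumption: the static library is named `hs` (libhs.a).
            libs: vec!["hs".to_string()],
        }
    }

    /// Lines a build script would print so that Cargo links the static
    /// library and dependent crates can locate the headers.
    fn cargo_metadata(&self) -> Vec<String> {
        let mut lines = vec![format!(
            "cargo:rustc-link-search=native={}",
            self.lib_dir.display()
        )];
        for lib in &self.libs {
            lines.push(format!("cargo:rustc-link-lib=static={}", lib));
        }
        lines.push(format!("cargo:include={}", self.include_dir.display()));
        lines
    }
}
```

With this split, the `-sys` crate would only call into the source crate when the vendored or bundled feature is enabled, and would print the returned lines from its own build script. A real build would also need to link a C++ standard library, which this sketch leaves out.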

@flier
Owner

flier commented Jan 5, 2023

> @flier It would likely be very little additional effort to add a similar bundled-hyperscan feature that would build Hyperscan from source, if that's something you're interested in.

I'm not sure, because building Hyperscan from source takes a very long time, and several steps are needed to prepare it.

@flier
Owner

flier commented Jan 5, 2023

> Also, I see that there is some appveyor-based CI, though it seems to be broken. Would you be interested in a simple GitHub Actions setup for the rust-hyperscan project?

We already have a CI workflow, but it may not work on ARM and Windows platforms. If you have time, you can finish it. Thanks!

@bradlarsen
Author

> We already have a CI workflow, but it may not work on ARM and Windows platforms. If you have time, you can finish it. Thanks!

Oh! Somehow I missed that. Never mind :)

@bradlarsen
Author

> I think we could move the bundled version of vectorscan into a new crate, as openssl-src does.
> An optional vendored or bundled feature could then be enabled at build time, instead of relying on the Git submodule.

Thank you; I will give this a try.

@bradlarsen
Author

In Nosey Parker, I ended up putting together my own minimal bindings for Vectorscan specific to that project's use case: it only links statically against Vectorscan (not Hyperscan), builds that from source, and skips building parts of the Vectorscan project that aren't needed by Nosey Parker.

See here:

It would still be useful to support bundled vectorscan in this rust-hyperscan crate, but making it work for more general use cases would certainly take more effort.

@bradlarsen bradlarsen closed this Apr 5, 2023