Update fork with latest changes from Google master #16

Open
wants to merge 257 commits into
base: master
f6d4a93
fix appveyor python 3.9 build
lunixbochs Dec 11, 2020
e3e271c
Merge pull request #596 from lunixbochs/master
taku910 Dec 12, 2020
152a87f
Upgrade protobuf to 3.14.0
taku910 Dec 28, 2020
b8a4e75
Upgrade protobuf to 3.14.0
taku910 Dec 28, 2020
4c411a1
remove i386 to fix build break.
taku910 Dec 28, 2020
345e64f
merge the bug fix around byte fallback
taku910 Dec 28, 2020
7e87d68
update version
taku910 Dec 28, 2020
f0d3d78
fix build break when using external protobuf
taku910 Dec 30, 2020
c4fba49
fix build break when using external protobuf
taku910 Dec 30, 2020
844ac06
prepare for external absl library
taku910 Jan 2, 2021
4c34ed6
fixed build break.
taku910 Jan 2, 2021
36c3d56
Update README.md
boba-and-beer Jan 4, 2021
6e9bbec
fixed build break.
taku910 Jan 6, 2021
faf73f1
Merge pull request #606 from boba-and-beer/feature/add_gentle_introdu…
taku910 Jan 6, 2021
7e7913f
support to build spm with external absl
taku910 Jan 8, 2021
da6f3a8
Merge branch 'master' of https://github.com/google/sentencepiece
taku910 Jan 8, 2021
8eaa672
change the type of input_sentence_size from int32 to uint64
taku910 Jan 8, 2021
fe046e1
change the type of input_sentence_size from int32 to uint64
taku910 Jan 8, 2021
0e03b57
change the type of input_sentence_size from int32 to uint64
taku910 Jan 8, 2021
8083d4f
checks the range of id in Decode method
taku910 Jan 9, 2021
3589bfb
fixed build break.
taku910 Jan 9, 2021
0e6dfbf
fixed python module to check the id range.
taku910 Jan 10, 2021
d9a0892
add ubuntu focal image to travis target
taku910 Jan 10, 2021
4bc9ae0
Add missing #include for BYTE_ORDER
mhsmith Jan 12, 2021
ba7e11a
Merge pull request #617 from mhsmith/master
taku910 Jan 12, 2021
e03761f
add spm proto headers to install
erasaur Feb 10, 2021
f78087f
only install proto headers if not using builtin proto
erasaur Feb 10, 2021
c970ded
Removed codes where Zero Width Joiner replaced with whitespace.
sarubi Feb 23, 2021
d429804
Merge pull request #623 from erasaur/master
taku910 Feb 23, 2021
bc53923
Merge pull request #630 from sarubi/zwj_fix
taku910 Feb 25, 2021
72be462
Fix typo in readme
brettfazio Feb 26, 2021
9f544a6
Restore the sentence boundary marker insertion for the unigram
AdolfVonKleist Mar 22, 2021
35cc60b
Merge pull request #632 from brettfazio/patch-1
taku910 Apr 20, 2021
2f4da31
Merge pull request #642 from AdolfVonKleist/restore_boundaries
taku910 Apr 20, 2021
351600c
Update README.md
taku910 Apr 20, 2021
c9ea0f3
fix typo
wanchichen Apr 22, 2021
6256ef2
Merge pull request #651 from wanchichen/master
taku910 May 8, 2021
7316ef1
Fixed mistake in README
George-Ogden May 25, 2021
6a9706c
Merge pull request #658 from George-Ogden/patch-1
taku910 May 26, 2021
8a176d8
use latest ubuntu image
taku910 Jun 14, 2021
687885b
update the mac pywhell script to support py3.5
taku910 Jun 15, 2021
faa76a0
fixed build error on mac
taku910 Jun 15, 2021
897fe9d
fixed build error on mac
taku910 Jun 15, 2021
05db089
sync from internal
taku910 Jun 15, 2021
3a5bc58
Revert "sync from internal"
taku910 Jun 16, 2021
fab966a
sync from internal
taku910 Jun 16, 2021
5c194ec
fixed link error
taku910 Jun 16, 2021
d1e3333
fixed build error
taku910 Jun 16, 2021
cc2d2c8
fixed build error
taku910 Jun 16, 2021
705cdc1
added endian.h
taku910 Jun 17, 2021
f4f8309
fixed build error.
taku910 Jun 17, 2021
a61f985
fixed build error.
taku910 Jun 17, 2021
62eafec
updated the comment.
taku910 Jun 17, 2021
cab2e3c
Strip build directory prefix in the __FILE__ macro
danieldk Jun 17, 2021
cbfc6b3
updated *.tsv file.
taku910 Jun 17, 2021
d8711f5
Merge pull request #664 from danieldk/remove-build-path
taku910 Jun 17, 2021
31505e0
Update options.md
felixdae Jun 30, 2021
9954603
Merge pull request #666 from felixdae/patch-1
taku910 Jul 1, 2021
cefb97b
Fix link to nfkc.tsv in normalization.md
reiyw Jul 2, 2021
8420f21
Merge pull request #669 from reiyw/patch-1
taku910 Jul 2, 2021
142662e
Add FreeBSD support
apraga Sep 10, 2021
38278a0
Fix typo in doc
monologg Oct 3, 2021
d972932
Fix typo in cmakelist
zy566 Oct 28, 2021
f144921
fix address sanitizers on clang problem
xiefangqi Dec 21, 2021
dbc8341
Fixed typo error
matteobaccan Jan 25, 2022
4c8a377
Merge pull request #721 from matteobaccan/patch-1
taku910 Feb 16, 2022
27103a9
Merge pull request #701 from monologg/docs/typo
taku910 Feb 16, 2022
9ecdc91
Merge pull request #693 from apraga/freebsd
taku910 Feb 16, 2022
43cd284
Merge pull request #705 from zy566/minor
taku910 Feb 16, 2022
0c09e31
Merge pull request #715 from xiefangqi/fix_address_sanitizers_on_clang
taku910 Feb 16, 2022
ea1c0ae
Update README.md
akashhansda Mar 31, 2022
3a6021f
add python 3.10 support for windows
taku910 Apr 11, 2022
a25db30
Fix incorrect handling of "Inherited" script characters
ddaspit Apr 11, 2022
bb6d442
Merge pull request #743 from akashhansda/patch-1
taku910 Apr 12, 2022
b47b10c
Merge pull request #745 from ddaspit/inherited-script
taku910 Apr 12, 2022
7a9cb24
Fix compile error in clang 13
jaepil May 12, 2022
b56a1d6
Fix compile error in clang 13
jaepil May 12, 2022
24e925b
Update setup.cfg to use underscore for 'description_file' instead of …
sidpagariya May 13, 2022
8b44d9d
Merge pull request #753 from jaepil/master
taku910 May 17, 2022
d06cd79
Merge pull request #750 from sidpagariya/update-setup-cfg
taku910 May 17, 2022
9fe52f1
Sync internal to github. DP related features are added.
taku910 May 25, 2022
9f3ed99
Sync internal to github. DP related features are added.
taku910 May 25, 2022
9ca65fa
add optional NFKD support.
taku910 May 29, 2022
7d8fabe
1) override logging stream in training, 2) Makes 1-best and viterbi d…
taku910 May 29, 2022
c86a8a6
addd nbest|sample encoding method to python wrapper
taku910 May 30, 2022
60bb206
updated test case
taku910 May 30, 2022
b108472
fixed CI errors
taku910 May 30, 2022
1a425bd
Create cmake.yml
taku910 May 31, 2022
e9507a3
Update cmake.yml
taku910 May 31, 2022
892bbbe
Update cmake.yml
taku910 Jun 1, 2022
82b3804
Update cmake.yml
taku910 Jun 1, 2022
4ba845d
add python build
taku910 Jun 1, 2022
7b326eb
updated the python setup script for github actions
taku910 Jun 1, 2022
e90e90c
Update cmake.yml
taku910 Jun 1, 2022
3bdfd88
Update cmake.yml
taku910 Jun 1, 2022
db0339c
Update cmake.yml
taku910 Jun 1, 2022
a61584b
update setup.py
taku910 Jun 1, 2022
c1e40b7
update setup.py
taku910 Jun 1, 2022
bc28729
update setup.py
taku910 Jun 1, 2022
188f8ce
update setup.py
taku910 Jun 1, 2022
98ce877
Update cmake.yml
taku910 Jun 1, 2022
fa74a79
Update cmake.yml
taku910 Jun 1, 2022
70af5d9
Update cmake.yml
taku910 Jun 1, 2022
e7f1616
Create wheel.yml
taku910 Jun 2, 2022
d4b1f5b
Update wheel.yml
taku910 Jun 2, 2022
a7602de
Update wheel.yml
taku910 Jun 2, 2022
2202148
Update wheel.yml
taku910 Jun 2, 2022
5379e88
Update wheel.yml
taku910 Jun 2, 2022
3028663
update setup.py
taku910 Jun 2, 2022
0502146
Update wheel.yml
taku910 Jun 2, 2022
b2d56f0
Update wheel.yml
taku910 Jun 2, 2022
4534697
Update wheel.yml
taku910 Jun 2, 2022
7a5d14c
update setup.py
taku910 Jun 2, 2022
4b3d6bf
update setup.py
taku910 Jun 2, 2022
574167f
Update wheel.yml
taku910 Jun 2, 2022
e749e94
test windows wheel
taku910 Jun 3, 2022
d30375a
test windows wheel
taku910 Jun 3, 2022
a57b326
update setup.py
taku910 Jun 3, 2022
4f55d8f
update setup.py
taku910 Jun 3, 2022
5c3c048
Update cmake.yml
taku910 Jun 3, 2022
5de949d
Update cmake.yml
taku910 Jun 3, 2022
a401a5e
Fixed build error on Mac
taku910 Jun 4, 2022
2f44ee4
Uses build/root dir to make python wrapper
taku910 Jun 4, 2022
b5d8ea6
Update cmake.yml
taku910 Jun 4, 2022
5e7bf85
Update wheel.yml
taku910 Jun 4, 2022
f10a3bc
Add support univresal binary
taku910 Jun 4, 2022
477ae07
Update wheel.yml
taku910 Jun 4, 2022
9fb2536
Update wheel.yml
taku910 Jun 4, 2022
2881e23
Update wheel.yml
taku910 Jun 4, 2022
d19648e
update cmake.yml
taku910 Jun 4, 2022
b0e3e03
update cmake.yml
taku910 Jun 4, 2022
dad3682
update cmake.yml
taku910 Jun 4, 2022
173fe97
update cmake.yml
taku910 Jun 5, 2022
f2312a0
Update README.md
taku910 Jun 5, 2022
cfa1355
update cmake.yml
taku910 Jun 5, 2022
e20c695
update cmake.yml
taku910 Jun 5, 2022
3f63c59
update cmake.yml
taku910 Jun 5, 2022
5e5adf2
update cmake.yml
taku910 Jun 6, 2022
b2fd284
update python wrapper.
taku910 Jun 7, 2022
39b902a
update python wrapper.
taku910 Jun 8, 2022
c6aca03
remove debug symbols from wheel package
taku910 Jun 8, 2022
91809e5
remove debug symbols from wheel package
taku910 Jun 8, 2022
5b8fd00
allow tab character to be used in user_defined_symbols.
taku910 Jun 12, 2022
1abd836
add test to use tab as user defined symbols..
taku910 Jun 13, 2022
5b21ad7
Uses C++17 by default
taku910 Jun 13, 2022
68034f9
Uses std::atomic to define global variable
taku910 Jun 13, 2022
59b89b6
Fix a typo
kenhys Jun 14, 2022
631420b
Uses absl::string_view as much as possible
taku910 Jun 14, 2022
83a3505
Fixed build break.
taku910 Jun 14, 2022
8b02b2c
Merge pull request #756 from kenhys/fix-typo-gurantees
taku910 Jun 14, 2022
02555b8
Added ImmutableSentencePiece class
taku910 Jun 19, 2022
d7839fc
add verbose option
taku910 Jun 19, 2022
4999b56
add verbose option
taku910 Jun 19, 2022
f470b93
add verbose option
taku910 Jun 19, 2022
6c97b4d
add verbose option
taku910 Jun 19, 2022
901368e
add verbose option
taku910 Jun 20, 2022
13a8771
Supports ImmutableSentencePieceText from python module
taku910 Aug 1, 2022
6e6add5
Adds more unittests
taku910 Aug 2, 2022
69b88da
Adds more unittests
taku910 Aug 2, 2022
1f21d38
Adds SWIGPYTHON flag
taku910 Aug 3, 2022
b82f9b9
Upgraded to MacOS-11
taku910 Aug 3, 2022
005ad28
remove unused ifdef SWIG macro
taku910 Aug 3, 2022
497ee76
Fixed test failure.
taku910 Aug 3, 2022
b738153
Uses property in immutable proto
taku910 Aug 4, 2022
c14eb2e
automatically detect the number of CPUs in batch processing.
taku910 Aug 5, 2022
5a53be2
support slice in pieces/nbests objects
taku910 Aug 5, 2022
881229a
Updated the document
taku910 Aug 5, 2022
655b944
Updated the document.
taku910 Aug 6, 2022
58f256c
Updated the document
taku910 Aug 6, 2022
df5f7fd
Fixed errors in example notebook
amrzv Aug 9, 2022
f122fb3
Fix dead links
amrzv Aug 9, 2022
9edde78
Merge pull request #771 from amrzv/fix-example-notebook
taku910 Aug 9, 2022
9576f24
update
laurentsimon Aug 12, 2022
0292e58
update
laurentsimon Aug 12, 2022
00b7df6
Merge pull request #772 from laurentsimon/feat/slsa3
taku910 Aug 14, 2022
7b789ee
update
laurentsimon Aug 15, 2022
fda3411
Merge pull request #773 from laurentsimon/doc/slsa
taku910 Aug 17, 2022
de15050
added ShutdownLibrary function to uninitialize global variables
taku910 Aug 20, 2022
460d15b
Fixed the issue of concatinating paths for pkg-config
taku910 Aug 21, 2022
c2c21d4
Enable iOS builds
jplu Sep 7, 2022
1aafd8d
Merge pull request #780 from jplu/add-ios-build
taku910 Sep 9, 2022
57a6d12
add CIFuzz GitHub action
DavidKorczynski Nov 21, 2022
570fb13
Disable shared build on windows
A2va Nov 21, 2022
e2161d5
Merge pull request #792 from DavidKorczynski/cifuzz-int
taku910 Nov 28, 2022
f9dac76
Merge pull request #793 from A2va/master
taku910 Nov 28, 2022
225fb19
CMake need endif
A2va Nov 28, 2022
77a65e0
Merge pull request #795 from A2va/master
taku910 Nov 29, 2022
2ba0a5a
fix the path in add_new_vocab.ipynb
kyoto7250 Dec 12, 2022
31656da
Merge pull request #799 from kyoto7250/fix_tutorial
taku910 Dec 12, 2022
14b67a4
Update README.md
jacek-michalak Jan 17, 2023
c5a49eb
Merge pull request #808 from jacek-michalak/patch-1
taku910 Jan 24, 2023
de2fabe
Update wheel.yml
juliusfrost Feb 15, 2023
4c2a713
Use latest setup-python==4.5
juliusfrost Feb 15, 2023
9c211b6
Merge pull request #819 from juliusfrost/patch-1
taku910 Feb 17, 2023
4de04cc
Fix setup-python version not detected
juliusfrost Feb 17, 2023
f2dacdf
setup-python@v4 parity
juliusfrost Feb 17, 2023
9ffb33a
Merge pull request #820 from juliusfrost/patch-2
taku910 Feb 21, 2023
1983663
Removed replacing of /MD with /MT for MSVC
ilya-lavrenov Mar 26, 2023
8772159
Merge pull request #837 from ilya-lavrenov/msvc-remove-static-runtime
taku910 Mar 29, 2023
7e0137c
added option to /MT flag
taku910 Apr 2, 2023
c0766c9
added option to /MT flag
taku910 Apr 2, 2023
ba466a6
prepare for 0.1.98
taku910 Apr 2, 2023
d4c58fc
handle the exception of std::random_device
taku910 Apr 2, 2023
359c043
handle the exception of std::random_device
taku910 Apr 2, 2023
573cc39
make the error message more descriptive. null termnate string in Utf8…
taku910 Apr 3, 2023
d0d1066
use /MD to build wheel package on windows
taku910 Apr 3, 2023
f54d8ba
includes the sentencepiece source files in python source package
taku910 Apr 4, 2023
59d84ba
Ubuntu 18.04 to 20.04 migration
taku910 Apr 4, 2023
799c025
creates sdist with build_sdist.sh
taku910 Apr 4, 2023
c945229
updated set-output commands
taku910 Apr 4, 2023
5489c0a
add -latomic in static linking
taku910 Apr 4, 2023
c032c26
automatically detect -latomic linker option
taku910 Apr 5, 2023
9b53e21
Update sentencepiece_python_module_example.ipynb
chris-ha458 Apr 8, 2023
6c9fd79
Merge pull request #845 from chris-ha458/patch-1
taku910 Apr 9, 2023
e58bb68
add pretokenization_delimiter options. Initialize seed pieces more ac…
taku910 Apr 10, 2023
2b07137
fixes IS_BIGENDIAN macro places
taku910 Apr 10, 2023
119e58d
Fixes include path when using external protobuf
taku910 Apr 10, 2023
e07ebf7
support pretokenization in BPE mode.
taku910 Apr 11, 2023
8fd5c6b
test loacl sdist build on github actions
taku910 Apr 12, 2023
609a2b7
test loacl sdist build on github actions
taku910 Apr 12, 2023
f2884a1
test loacl sdist build on github actions
taku910 Apr 12, 2023
d6e597b
build wheel from sdist for testing
taku910 Apr 12, 2023
fabfe30
build wheel from sdist for testing
taku910 Apr 12, 2023
518c57c
build wheel from sdist for testing
taku910 Apr 12, 2023
d9a2b21
Fix bugs the seed score computation.
taku910 Apr 15, 2023
69d34c7
prepare for v0.1.99
taku910 Apr 15, 2023
ba44ab1
Fix bugs in the handling of duplicated bigrams
taku910 Apr 24, 2023
bb0b610
Fix the ULM training bugs
taku910 Apr 27, 2023
25b64fc
Fix the test error on windows
taku910 Apr 28, 2023
3863f76
increases the max number of threads
taku910 Apr 30, 2023
827591a
Fixes build test errors in big-endian machines
taku910 May 14, 2023
17f9c6b
Fixes build test errors in big-endian machines
taku910 May 14, 2023
6c901b0
Fixes build test errors in big-endian machines
taku910 May 14, 2023
f2fcd85
Fixes cross build yaml
taku910 May 14, 2023
fad8ae6
Added fail first flag
taku910 May 14, 2023
b857ba9
Split build and test
taku910 May 14, 2023
6693e7e
Fixes test workpath
taku910 May 14, 2023
2f66fbf
Added arm architecture
taku910 May 14, 2023
0b344d0
Added arm architecture
taku910 May 14, 2023
f2219b5
prepare for 0.2.00
taku910 May 14, 2023
3805cbb
Fix nasty bug in BPE position encoding
vmarkovtsev May 18, 2023
e081c67
Remove empty placeholders in pkg-config file
ryandesign May 21, 2023
4183597
Fix pkg-config file to avoid overlinking
ryandesign May 21, 2023
cb22883
Merge pull request #870 from ryandesign/ryandesign-protobuf-lite
taku910 May 24, 2023
7b694e4
Merge pull request #867 from vmarkovtsev/patch-1
taku910 May 25, 2023
635fe84
Upgrade the sentencepiece_model_pb2.py and sentencepiece.py
taku910 Jul 1, 2023
8cbdf13
Improves the thread utilization in batch encoding/decoding
taku910 Aug 5, 2023
362d1c2
Merge branch 'gmaster' into rjai/update
Sep 18, 2023
4ca953c
Recompile protobuf files with protoc
Sep 18, 2023
159efa2
Make it compile
Sep 18, 2023
26 changes: 26 additions & 0 deletions .github/workflows/cifuzz.yml
@@ -0,0 +1,26 @@
name: CIFuzz
on: [pull_request]
jobs:
  Fuzzing:
    runs-on: ubuntu-latest
    steps:
    - name: Build Fuzzers
      id: build
      uses: google/oss-fuzz/infra/cifuzz/actions/build_fuzzers@master
      with:
        oss-fuzz-project-name: 'sentencepiece'
        dry-run: false
        language: c++
    - name: Run Fuzzers
      uses: google/oss-fuzz/infra/cifuzz/actions/run_fuzzers@master
      with:
        oss-fuzz-project-name: 'sentencepiece'
        fuzz-seconds: 300
        dry-run: false
        language: c++
    - name: Upload Crash
      uses: actions/upload-artifact@v3
      if: failure() && steps.build.outcome == 'success'
      with:
        name: artifacts
        path: ./out/artifacts
77 changes: 77 additions & 0 deletions .github/workflows/cmake.yml
@@ -0,0 +1,77 @@
name: CI for general build

on:
  push:
    branches: [ master ]
    tags:
      - 'v*'
  pull_request:
    branches: [ master ]

jobs:
  build:
    strategy:
      matrix:
        os: [ ubuntu-latest, ubuntu-20.04, windows-latest, macOS-11 ]
        arch: [ x64 ]
        include:
          - os: windows-latest
            arch: x86
    runs-on: ${{ matrix.os }}

    steps:
      - uses: actions/checkout@v3
      - uses: actions/setup-python@v4
        with:
          python-version: '3.x'
          architecture: ${{matrix.arch}}

      - name: Config for Windows
        if: runner.os == 'Windows'
        run: |
          if ("${{matrix.arch}}" -eq "x64") {
            $msbuildPlatform = "x64"
          } else {
            $msbuildPlatform = "Win32"
          }
          cmake -A $msbuildPlatform -B ${{github.workspace}}/build -DSPM_BUILD_TEST=ON -DSPM_ENABLE_SHARED=OFF -DCMAKE_INSTALL_PREFIX=${{github.workspace}}/build/root

      - name: Config for Unix
        if: runner.os != 'Windows'
        run: cmake -B ${{github.workspace}}/build -DSPM_BUILD_TEST=ON -DCMAKE_INSTALL_PREFIX=${{github.workspace}}/build/root

      - name: Build
        run: cmake --build ${{github.workspace}}/build --config Release --target install --parallel 8

      - name: Test
        working-directory: ${{github.workspace}}/build
        run: ctest -C Release --output-on-failure

      - name: Package
        working-directory: ${{github.workspace}}/build
        run: cpack

      - name: Build Python wrapper
        working-directory: ${{github.workspace}}/python
        run: |
          python -m pip install --upgrade pip
          pip install setuptools wheel twine
          python setup.py test
          python setup.py bdist_wheel

      - name: Upload artifacts
        uses: actions/upload-artifact@v3
        with:
          path: ./build/*.7z

      - name: Upload Release Assets
        if: startsWith(github.ref, 'refs/tags/')
        uses: svenstaro/upload-release-action@v2
        with:
          repo_token: ${{ secrets.GITHUB_TOKEN }}
          file: ./build/*.7z
          tag: ${{ github.ref }}
          overwrite: true
          prerelease: true
          file_glob: true
          body: "This is my release text"
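
Reviewer note: the two "Config" steps in cmake.yml differ only in the generator platform and the shared-library switch. The following is a minimal Python sketch of that branching, written to make the matrix behaviour easier to check by eye. The function name `cmake_configure_args` and its parameters are hypothetical and are not part of the workflow or of sentencepiece.

```python
from typing import List


def cmake_configure_args(runner_os: str, arch: str, workspace: str) -> List[str]:
    """Hypothetical helper: rebuild the configure command each matrix entry runs."""
    build_dir = f"{workspace}/build"
    common = ["-B", build_dir, "-DSPM_BUILD_TEST=ON"]
    prefix = f"-DCMAKE_INSTALL_PREFIX={build_dir}/root"
    if runner_os == "Windows":
        # The PowerShell step maps the matrix arch "x64" to the generator
        # platform "x64" and any other value (here "x86") to "Win32".
        platform = "x64" if arch == "x64" else "Win32"
        return ["cmake", "-A", platform, *common, "-DSPM_ENABLE_SHARED=OFF", prefix]
    # Non-Windows runners pass no -A flag and leave SPM_ENABLE_SHARED unset.
    return ["cmake", *common, prefix]


if __name__ == "__main__":
    for os_name, arch in [("Windows", "x64"), ("Windows", "x86"), ("Linux", "x64")]:
        print(os_name, arch, " ".join(cmake_configure_args(os_name, arch, "/ws")))
```

Under these assumptions, only the `windows-latest` / `x86` entry added through `include` ends up with the `Win32` platform.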
41 changes: 41 additions & 0 deletions .github/workflows/cross_build.yml
@@ -0,0 +1,41 @@
name: CrossBuild

on:
  push:
    branches: [ master ]
    tags:
      - 'v*'
  pull_request:
    branches: [ master ]

jobs:
  build:
    runs-on: ubuntu-latest
    strategy:
      fail-fast: false
      matrix:
        arch: [ i686, arm, aarch64, riscv64, powerpc, powerpc64, powerpc64le, s390x, sparc64, m68k, sh4, alpha ]

    steps:
      - uses: actions/checkout@v3

      - name: Install cross tools
        run: |
          sudo apt-get install -y sudo qemu-user gdb zstd dwarfdump {gcc,g++}-10-{i686,aarch64,riscv64,powerpc,powerpc64,powerpc64le,s390x,sparc64,m68k,sh4,alpha}-linux-gnu {gcc,g++}-10-arm-linux-gnueabihf
          sudo ln -sf /usr/bin/arm-linux-gnueabihf-gcc-10 /usr/bin/arm-linux-gnu-gcc-10
          sudo ln -sf /usr/bin/arm-linux-gnueabihf-g++-10 /usr/bin/arm-linux-gnu-g++-10
          sudo ln -sf /usr/arm-linux-gnueabihf /usr/arm-linux-gnu

      - name: Build
        run: |
          mkdir -p ${{github.workspace}}/build
          cd ${{github.workspace}}/build
          env CXX=/usr/bin/${{matrix.arch}}-linux-gnu-g++-10 CC=/usr/bin/${{matrix.arch}}-linux-gnu-gcc-10 cmake .. -DSPM_BUILD_TEST=ON -DSPM_ENABLE_SHARED=OFF -DCMAKE_FIND_ROOT_PATH=/usr/${{matrix.arch}}-linux-gnu -DSPM_CROSS_SYSTEM_PROCESSOR=${{matrix.arch}}
          make -j$(nproc)

      - name: Test on QEMU
        if: matrix.arch != 'sparc64' && matrix.arch != 'm68k' && matrix.arch != 'sh4'
        run: |
          cd ${{github.workspace}}/build
          qemu_arch=`echo ${{matrix.arch}} | sed -e s/powerpc/ppc/ -e s/686/386/`
          qemu-${qemu_arch} -L /usr/${{matrix.arch}}-linux-gnu src/spm_test
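
Reviewer note: the "Test on QEMU" step derives the emulator name from the matrix arch with two `sed` substitutions and skips three architectures. Below is a small Python sketch of that mapping, assuming `sed`'s default behaviour of rewriting only the first match per expression. The names `qemu_binary`, `runs_on_qemu`, and `SKIPPED_ARCHS` are hypothetical illustrations and do not appear in the workflow.

```python
# Architectures the workflow's `if:` condition excludes from the QEMU test step.
SKIPPED_ARCHS = {"sparc64", "m68k", "sh4"}


def qemu_binary(arch: str) -> str:
    """Hypothetical helper: mirror `sed -e s/powerpc/ppc/ -e s/686/386/`."""
    # Without the `g` flag, sed replaces only the first occurrence,
    # which str.replace(..., 1) reproduces.
    name = arch.replace("powerpc", "ppc", 1).replace("686", "386", 1)
    return f"qemu-{name}"


def runs_on_qemu(arch: str) -> bool:
    """Hypothetical helper: True when the test step would run for this arch."""
    return arch not in SKIPPED_ARCHS


if __name__ == "__main__":
    for arch in ["i686", "arm", "powerpc64le", "s390x", "sparc64"]:
        status = qemu_binary(arch) if runs_on_qemu(arch) else "skipped"
        print(f"{arch}: {status}")
```

Architectures that match neither pattern, such as `aarch64` or `riscv64`, pass through unchanged.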
147 changes: 147 additions & 0 deletions .github/workflows/wheel.yml
@@ -0,0 +1,147 @@
name: Build Wheels

on:
  push:
    branches: [ master ]
    tags:
      - 'v*'
  pull_request:
    branches: [ master ]

jobs:
  build_wheels:
    outputs:
      digests-linux: ${{ steps.hash-linux.outputs.digests }}
      digests-macos: ${{ steps.hash-macos.outputs.digests }}
      digests-windows: ${{ steps.hash-windows.outputs.digests }}
    strategy:
      matrix:
        os: [ubuntu-latest, windows-latest, macOS-11]
    runs-on: ${{ matrix.os }}
    name: Build wheels on ${{ matrix.os }}

    steps:
      - uses: actions/checkout@v3
      - uses: actions/setup-python@v4
        with:
          python-version: "3.x"

      - name: Set up QEMU
        if: runner.os == 'Linux'
        uses: docker/setup-qemu-action@v2
        with:
          platforms: arm64

      - name: Build for Windows
        if: runner.os == 'Windows'
        run: |
          cmake -A Win32 -B ${{github.workspace}}/build_win32 -DSPM_ENABLE_SHARED=OFF -DCMAKE_INSTALL_PREFIX=${{github.workspace}}/build/root_win32
          cmake --build ${{github.workspace}}/build_win32 --config Release --target install --parallel 8
          cmake -A x64 -B ${{github.workspace}}/build_amd64 -DSPM_ENABLE_SHARED=OFF -DCMAKE_INSTALL_PREFIX=${{github.workspace}}/build/root_amd64
          cmake --build ${{github.workspace}}/build_amd64 --config Release --target install --parallel 8

      - name: Build for Mac
        if: runner.os == 'macOS'
        run: |
          cmake -B ${{github.workspace}}/build -DSPM_ENABLE_SHARED=OFF -DCMAKE_INSTALL_PREFIX=${{github.workspace}}/build/root
          cmake --build ${{github.workspace}}/build --config Release --target install --parallel 8
        env:
          CMAKE_OSX_ARCHITECTURES: arm64;x86_64

      - name: Install cibuildwheel
        working-directory: ${{github.workspace}}/python
        run: |
          python -m pip install --upgrade pip
          pip install setuptools wheel twine
          python -m pip install cibuildwheel==2.12.0

      - name: Build wheels
        working-directory: ${{github.workspace}}/python
        run: python -m cibuildwheel --output-dir wheelhouse
        env:
          CIBW_ARCHS_LINUX: auto aarch64
          CIBW_ARCHS_MACOS: x86_64 universal2 arm64
          CIBW_SKIP: "pp* *-musllinux_*"
          CIBW_BUILD_VERBOSITY: 1

      - name: Build sdist archive
        working-directory: ${{github.workspace}}/python
        run: sh build_sdist.sh

      - name: Fetch sdist archive
        uses: tj-actions/glob@v17
        id: sdist
        with:
          files: ./python/dist/*.tar.gz

      - name: Build wheel from sdist
        run: python -m pip wheel "${{ steps.sdist.outputs.paths }}" --verbose

      - name: Copy sdist
        working-directory: ${{github.workspace}}/python
        if: runner.os == 'macOS'
        run: cp -f dist/*.tar.gz wheelhouse/

      - name: Upload artifact
        uses: actions/upload-artifact@v3
        with:
          path: |
            ./python/wheelhouse/*.whl
            ./python/wheelhouse/*.tar.gz

      - name: Upload wheel release
        if: startsWith(github.ref, 'refs/tags/')
        uses: svenstaro/upload-release-action@v2
        with:
          repo_token: ${{ secrets.GITHUB_TOKEN }}
          file: ./python/wheelhouse/*
          tag: ${{ github.ref }}
          overwrite: true
          prerelease: true
          file_glob: true

      - name: Generate SLSA subjects - Macos
        id: hash-macos
        if: runner.os == 'macOS'
        run: echo "digests=$(shasum -a 256 ./python/wheelhouse/* | base64)" >> $GITHUB_OUTPUT

      - name: Generate SLSA subjects - Linux
        id: hash-linux
        if: runner.os == 'Linux'
        run: echo "digests=$(sha256sum ./python/wheelhouse/* | base64 -w0)" >> $GITHUB_OUTPUT

      - name: Generate SLSA subjects - Windows
        id: hash-windows
        if: runner.os == 'Windows'
        run: echo "digests=$(sha256sum ./python/wheelhouse/* | base64 -w0)" >> $GITHUB_OUTPUT

  gather-disgests:
    needs: [build_wheels]
    outputs:
      digests: ${{ steps.hash.outputs.digests }}
    runs-on: ubuntu-latest
    steps:
      - name: Merge results
        id: hash
        env:
          LINUX_DIGESTS: "${{ needs.build_wheels.outputs.digests-linux }}"
          MACOS_DIGESTS: "${{ needs.build_wheels.outputs.digests-macos }}"
          WINDOWS_DIGESTS: "${{ needs.build_wheels.outputs.digests-windows }}"
        run: |
          set -euo pipefail
          echo "$LINUX_DIGESTS" | base64 -d > checksums.txt
          echo "$MACOS_DIGESTS" | base64 -d >> checksums.txt
          echo "$WINDOWS_DIGESTS" | base64 -d >> checksums.txt
          echo "digests=$(cat checksums.txt | base64 -w0)" >> $GITHUB_OUTPUT

  provenance:
    if: startsWith(github.ref, 'refs/tags/')
    needs: [build_wheels, gather-disgests]
    permissions:
      actions: read # To read the workflow path.
      id-token: write # To sign the provenance.
      contents: write # To add assets to a release.
    uses: slsa-framework/slsa-github-generator/.github/workflows/[email protected]
    with:
      base64-subjects: "${{ needs.gather-disgests.outputs.digests }}"
      upload-assets: true # Optional: Upload to a new release
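
Reviewer note: the three "Generate SLSA subjects" steps and the "Merge results" step form a small encode/decode/re-encode pipeline. The Python sketch below imitates it with in-memory data so the data flow can be checked without running the workflow. It assumes the `sha256sum` text format of `<hex digest>`, two spaces, then the file name. All function names and the sample file names are hypothetical.

```python
import base64
import hashlib
from typing import Dict


def sha256sum_listing(files: Dict[str, bytes]) -> str:
    """Hypothetical stand-in for `sha256sum ./python/wheelhouse/*` output."""
    return "".join(
        f"{hashlib.sha256(data).hexdigest()}  {name}\n"
        for name, data in sorted(files.items())
    )


def encode_digests(listing: str) -> str:
    """Per-OS step: base64-encode the listing onto one line (like `base64 -w0`)."""
    return base64.b64encode(listing.encode("utf-8")).decode("ascii")


def merge_digests(*encoded: str) -> str:
    """Merge step: decode each per-OS value, concatenate, and re-encode."""
    merged = "".join(base64.b64decode(e).decode("utf-8") for e in encoded)
    return encode_digests(merged)


if __name__ == "__main__":
    # Made-up artifact names and contents, for illustration only.
    linux = encode_digests(sha256sum_listing({"pkg-linux.whl": b"linux"}))
    macos = encode_digests(sha256sum_listing({"pkg-macos.whl": b"macos"}))
    subjects = merge_digests(linux, macos)
    print(base64.b64decode(subjects).decode("utf-8"), end="")
```

The merged value plays the role of what the `provenance` job receives as `base64-subjects`: one base64 string that decodes to a checksum line per artifact.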
3 changes: 3 additions & 0 deletions .gitignore
@@ -72,3 +72,6 @@ libsentencepiece.so*
libsentencepiece_train.so*
python/bundled
_sentencepiece.*.so
third_party/abseil-cpp

python/sentencepiece