Commit

Merge branch 'main' into fix-empty-bbox-serialization
PonteIneptique authored Jun 10, 2024
2 parents 4a1ef5c + 0309bc1 commit cbf1cd1
Showing 74 changed files with 7,098 additions and 6,702 deletions.
81 changes: 40 additions & 41 deletions .github/workflows/test.yml
@@ -1,8 +1,8 @@
name: Lint, test, build, and publish

on:
on:
push:


jobs:
lint_and_test:
@@ -13,9 +13,9 @@ jobs:
python-version: [3.8, 3.9, '3.10', '3.11']

steps:
- uses: actions/checkout@v3
- uses: actions/checkout@v4
- name: Set up Python ${{ matrix.python-version }}
uses: actions/setup-python@v4
uses: actions/setup-python@v5
with:
python-version: ${{ matrix.python-version }}
- name: Install dependencies and kraken
@@ -30,7 +30,7 @@ jobs:
flake8 . --count --exit-zero --max-complexity=10 --max-line-length=127 --statistics
- name: Run tests, except training tests
run: |
pytest -k 'not test_train and not test_pageseg'
pytest -k 'not test_train'
build-n-publish-pypi:
name: Build and publish Python 🐍 distributions 📦 to PyPI and TestPyPI
@@ -39,11 +39,11 @@ jobs:
if: startsWith(github.ref, 'refs/tags/')

steps:
- uses: actions/checkout@v3
- uses: actions/checkout@v4
with:
fetch-depth: 0
- name: Set up Python 3.9
uses: actions/setup-python@v4
uses: actions/setup-python@v5
with:
python-version: 3.9
- name: Build a binary wheel and a source tarball
@@ -68,55 +68,54 @@ jobs:
if: startsWith(github.ref, 'refs/tags/')

steps:
- uses: actions/checkout@v3
- uses: actions/checkout@v4
with:
fetch-depth: 0
- uses: conda-incubator/setup-miniconda@v2
- uses: conda-incubator/setup-miniconda@v3
with:
python-version: 3.9
miniforge-variant: Mambaforge
- name: install dependencies build
shell: bash -l {0}
run: mamba install "conda-build>=3.20" colorama pip ruamel ruamel.yaml rich jsonschema conda-verify anaconda-client mamba
# Runs the action with the following inputs or defaults if not specified.
- name: install boa
shell: bash -l {0}
run: pip install https://github.com/mamba-org/boa/archive/refs/tags/0.14.0.zip
- name: validate recipe
shell: bash -l {0}
id: conda_validation
run: |
PACKAGE_PATHS=$(conda mambabuild . --output --check -c conda-forge | tail -n 1)
echo "package_paths=$PACKAGE_PATHS" >> $GITHUB_OUTPUT
- name: run build
shell: bash -l {0}
run: conda mambabuild . -c conda-forge
- name: convert packages
shell: bash -l {0}
run: mamba install colorama pip ruamel ruamel.yaml rich jsonschema conda-verify anaconda-client
- name: Build linux-64 conda package
uses: prefix-dev/[email protected]
with:
recipe-path: "conda/recipe.yaml"
build-args: "--experimental --target-platform linux-64"
- name: Build osx-64 conda package
uses: prefix-dev/[email protected]
with:
recipe-path: "conda/recipe.yaml"
build-args: "--experimental --target-platform osx-64"
# - name: Build osx-arm64 conda package
# uses: prefix-dev/[email protected]
# with:
# recipe-path: "conda/recipe.yaml"
# build-args: "--experimental --target-platform osx-arm64"
- name: Upload conda package
run: |
conda convert -p osx-arm64 -p osx-64 -o conda_convert ${{ steps.conda_validation.outputs.package_paths }}
mkdir conda_convert/linux-64
cp -f ${{ steps.conda_validation.outputs.package_paths }} conda_convert/linux-64
- name: upload to anaconda
shell: bash -l {0}
run: anaconda -t ${{ secrets.ANACONDA_TOKEN }} upload --no-progress --force conda_convert/*/*
for pkg in $(find output -type f \( -name "*.conda" -o -name "*.tar.bz2" \) ); do
echo "Uploading ${pkg}"
rattler-build upload anaconda -o mittagessen -a ${{ secrets.ANACONDA_TOKEN }} "${pkg}"
done
- name: Upload conda artifacts to GH storage
uses: actions/upload-artifact@v3
uses: actions/upload-artifact@v4
with:
name: conda_packages
path: conda_convert/*/*.tar.bz2
path: output/*/*.conda

autodraft-gh-release:
name: Create github release
needs: [build-n-publish-anaconda, build-n-publish-pypi]
runs-on: ubuntu-latest

steps:
- uses: actions/download-artifact@v3
- uses: actions/download-artifact@v4
with:
name: conda_packages
path: conda
- uses: actions/download-artifact@v3
- uses: actions/download-artifact@v4
with:
name: pypi_packages
path: pypi
@@ -126,33 +125,33 @@ jobs:
prerelease: false
draft: true
files: |
conda/*/*.tar.bz2
output/*/*.conda
pypi/*
publish-gh-pages:
name: Update kraken.re github pages
name: Update kraken.re github pages
needs: lint_and_test
runs-on: ubuntu-latest
if: |
github.ref == 'refs/heads/main' ||
startsWith(github.ref, 'refs/tags/')
steps:
- uses: actions/checkout@v3
- uses: actions/checkout@v4
with:
fetch-depth: 0
- name: Set up Python 3.9
uses: actions/setup-python@v4
uses: actions/setup-python@v5
with:
python-version: 3.9
- name: Install sphinx-multiversion
run: python -m pip install sphinx-multiversion sphinx-autoapi
- name: Create docs
- name: Create docs
run: sphinx-multiversion docs build/html
- name: Create redirect
run: cp docs/redirect.html build/html/index.html
- name: Push gh-pages
uses: crazy-max/ghaction-github-pages@v3
uses: crazy-max/ghaction-github-pages@v4
with:
target_branch: gh-pages
build_dir: build/html
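
The new upload step in the workflow above collects build outputs with a shell `find` over the `output` directory. A minimal Python sketch of the same file discovery follows. The function name `find_conda_packages` is hypothetical and is not part of kraken or rattler-build; it only lists files and does not upload anything.

```python
from pathlib import Path


def find_conda_packages(output_dir):
    """Return built conda packages (.conda or .tar.bz2) under output_dir, sorted.

    Hypothetical helper mirroring the `find output -type f ...` call in the
    workflow; the directory layout is whatever the build step produced.
    """
    root = Path(output_dir)
    suffixes = (".conda", ".tar.bz2")
    return sorted(
        path
        for path in root.rglob("*")
        if path.is_file() and path.name.endswith(suffixes)
    )
```

Calling `find_conda_packages("output")` after a build would return every package file regardless of platform subdirectory. Note that the shell loop matches both formats, while the artifact step's `path: output/*/*.conda` only matches `.conda` files.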
26 changes: 13 additions & 13 deletions README.rst
@@ -9,7 +9,7 @@ material.

kraken's main features are:

- Fully trainable layout analysis and character recognition
- Fully trainable layout analysis, reading order, and character recognition
- `Right-to-Left <https://en.wikipedia.org/wiki/Right-to-left>`_, `BiDi
<https://en.wikipedia.org/wiki/Bi-directional_text>`_, and Top-to-Bottom
script support
@@ -44,38 +44,38 @@ install the `pdf` extras package for PyPi:

$ pip install kraken[pdf]

or install `pyvips` manually with conda:
or install `pyvips` manually with pip:

::

$ conda install -c conda-forge pyvips
$ pip install pyvips

Conda environment files are provided which for the seamless installation of the
main branch as well:
Conda environment files are provided for the seamless installation of the main
branch as well:

::

$ git clone https://github.com/mittagessen/kraken.git
$ git clone https://github.com/mittagessen/kraken.git
$ cd kraken
$ conda env create -f environment.yml

or:

::

$ git clone https://github.com/mittagessen/kraken.git
$ git clone https://github.com/mittagessen/kraken.git
$ cd kraken
$ conda env create -f environment_cuda.yml

for CUDA acceleration with the appropriate hardware.

Finally you'll have to scrounge up a model to do the actual recognition of
characters. To download the default model for printed English text and place it
characters. To download the default model for printed French text and place it
in the kraken directory for the current user:

::

$ kraken get 10.5281/zenodo.2577813
$ kraken get 10.5281/zenodo.10592716

A list of libre models available in the central repository can be retrieved by
running:
@@ -105,13 +105,13 @@ To segment an image (binarized or not) with the new baseline segmenter:
::

$ kraken -i image.tif lines.json segment -bl


To segment and OCR an image using the default model(s):

::

$ kraken -i image.tif image.txt segment -bl ocr
$ kraken -i image.tif image.txt segment -bl ocr -m catmus-print-fondue-large.mlmodel

All subcommands and options are documented. Use the ``help`` option to get more
information.
@@ -124,8 +124,8 @@ Have a look at the `docs <https://kraken.re>`_.
Related Software
================

These days kraken is quite closely linked to the `escriptorium
<https://escriptorium.fr>`_ project developed in the same eScripta research
These days kraken is quite closely linked to the `eScriptorium
<https://gitlab.com/scripta/escriptorium/>`_ project developed in the same eScripta research
group. eScriptorium provides a user-friendly interface for annotating data,
training models, and inference (but also much more). There is a `gitter channel
<https://gitter.im/escripta/escriptorium>`_ that is mostly intended for
1 change: 0 additions & 1 deletion conda/build.sh

This file was deleted.

4 changes: 0 additions & 4 deletions conda/conda_build_config.yaml

This file was deleted.

41 changes: 0 additions & 41 deletions conda/meta.yaml

This file was deleted.

55 changes: 55 additions & 0 deletions conda/recipe.yaml
@@ -0,0 +1,55 @@
context:
git_url: .
git_tag: ${{ git.latest_tag(git_url) }}

package:
name: kraken
version: ${{ git_tag }}

source:
git: ${{ git_url }}
tag: ${{ git_tag }}

build:
script: pip install --no-deps .

requirements:
build:
- python>=3.8,<3.12
- setuptools>=36.6.0,<70.0.0
- pbr
host:
- python>=3.8,<3.12
run:
- python>=3.8,<3.12
- python-bidi
- lxml
- regex
- requests
- click>=8.1
- numpy~=1.23.0
- pillow>=9.2.0
- scipy~=1.11.0
- jinja2~=3.0
- torchvision
- pytorch~=2.1.0
- cudatoolkit
- jsonschema
- scikit-image~=0.21.0
- scikit-learn~=1.2.1
- shapely~=1.8.5
- pyvips
- coremltools
- pyarrow
- lightning~=2.2
- torchmetrics>=1.1.0
- conda-forge::threadpoolctl~=3.4.0
- albumentations
- rich

about:
homepage: https://kraken.re
license: Apache-2.0
summary: 'OCR/HTR engine for all the languages'
repository: https://github.com/mittagessen/kraken
documentation: https://kraken.re
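
Several run requirements in the recipe above use `~=` pins such as `numpy~=1.23.0` or `jinja2~=3.0`. The sketch below illustrates the "compatible release" rule as defined in PEP 440 for purely numeric versions. It is an assumption that the conda solver treats these pins the same way in every edge case, and the helper names `parse_release` and `is_compatible` are hypothetical.

```python
def parse_release(version):
    """Split a purely numeric version such as '1.23.0' into a tuple of ints."""
    return tuple(int(part) for part in version.split("."))


def is_compatible(candidate, spec):
    """Check `candidate` against a `~=spec` pin (numeric releases only).

    PEP 440 reading: `~=1.23.0` means `>=1.23.0` and `==1.23.*`.
    """
    spec_parts = parse_release(spec)
    if len(spec_parts) < 2:
        raise ValueError("~= needs at least two release segments")
    cand_parts = parse_release(candidate)
    # Pad both sides with zeros so '1.23' compares like '1.23.0'.
    width = max(len(spec_parts), len(cand_parts))
    cand_padded = cand_parts + (0,) * (width - len(cand_parts))
    spec_padded = spec_parts + (0,) * (width - len(spec_parts))
    # All but the last segment of the spec must match exactly.
    prefix_len = len(spec_parts) - 1
    return (
        cand_padded >= spec_padded
        and cand_padded[:prefix_len] == spec_parts[:prefix_len]
    )
```

Under this reading, `numpy~=1.23.0` accepts 1.23.5 but rejects 1.24.0, while `jinja2~=3.0` accepts any 3.x release at or above 3.0.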