This package implements the crc32c checksum algorithm. It automatically chooses between a hardware-based implementation (using the CRC32C SSE 4.2 instruction of Intel CPUs, and the crc32* instructions on ARMv8 CPUs) and a software-based one, which is used when no hardware support can be found.
Because crc32c is on PyPI, you can install it with:
pip install crc32c
Supported platforms are Linux and OSX using the gcc and clang compilers, and Windows using the Visual Studio compiler. Other compilers on Windows (MinGW for instance) might work. Binary wheels are also provided on PyPI for major platforms/architectures.
The project uses certain gcc/clang compiler extensions to build hardware-specific functions; older compiler versions might not support these extensions.
The core function exposed by this module is crc32c(data, value=0, gil_release_mode=-1). It computes the CRC32C checksum of data starting with an initial value checksum, similarly to how the built-in binascii.crc32 works. It can thus be used like this:
print(crc32c.crc32c(b'hello world'))
# 3381945770
crc = crc32c.crc32c(b'hello')
print(crc32c.crc32c(b' world', value=crc))
# 3381945770
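To cross-check values like the ones above without the compiled extension, a minimal pure-Python sketch of CRC32C can help. It is a plain table-driven, byte-at-a-time loop over the reflected Castagnoli polynomial 0x82F63B78, not the slice-by-8 or hardware code this package uses, and the name crc32c_reference is hypothetical. It mirrors the (data, value=0) calling convention described above.

```python
# Reflected form of the CRC32C (Castagnoli) polynomial.
_POLY = 0x82F63B78


def _build_table():
    """Precompute the CRC remainder for every possible byte value."""
    table = []
    for byte in range(256):
        crc = byte
        for _ in range(8):
            crc = (crc >> 1) ^ _POLY if crc & 1 else crc >> 1
        table.append(crc)
    return table


_TABLE = _build_table()


def crc32c_reference(data, value=0):
    """Slow reference CRC32C; `value` is a previous checksum to continue from."""
    crc = value ^ 0xFFFFFFFF
    for byte in data:
        crc = _TABLE[(crc ^ byte) & 0xFF] ^ (crc >> 8)
    return crc ^ 0xFFFFFFFF


# Checksumming in two steps gives the same result as a single call.
whole = crc32c_reference(b'hello world')
chained = crc32c_reference(b' world', value=crc32c_reference(b'hello'))
print(whole == chained)
# True
```

A loop like this is far slower than the compiled implementations, so it is only suitable as a test oracle.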
In older versions, the function exposed by this module was called crc32. That name is still present but deprecated, and will be removed in a future version of the library.
The gil_release_mode keyword argument specifies whether a call of this library shall release or keep the Global Interpreter Lock. It can be set to the following values:
- Negative: Only release the GIL when data >= 32KiB
- 0: Never release the GIL
- Positive: Always release the GIL
The gil_release_mode parameter doesn't have any effect on free-threaded Python builds.
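Releasing the GIL only pays off when several threads compute checksums at the same time. The sketch below shows one way to arrange that; checksum_buffers is a hypothetical helper, not part of this package, and it takes the checksum function as an argument so that the GIL policy is chosen by the caller.

```python
from concurrent.futures import ThreadPoolExecutor


def checksum_buffers(buffers, checksum_func, max_workers=4):
    """Checksum each buffer in its own task, returning results in input order.

    `checksum_func` is any callable taking a bytes-like object, for example
    functools.partial(crc32c.crc32c, gil_release_mode=1) to always release
    the GIL while a checksum is being computed.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(checksum_func, buffers))
```

With this package installed, a call could look like checksum_buffers(chunks, functools.partial(crc32c.crc32c, gil_release_mode=1)). Whether this beats a sequential loop depends on buffer sizes and the number of cores, so it is worth measuring.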
On top of the crc32c function, a CRC32CHash(data=b"", gil_release_mode=-1) class is also offered. It is modelled after the "hash objects" of the hashlib module of the standard library. It also offers a checksum property:
crc32c_hash = crc32c.CRC32CHash()
crc32c_hash.update(b'hello')
crc32c_hash.update(b' world')
print(crc32c_hash.checksum == crc32c.crc32c(b'hello world'))
# True
print(crc32c_hash.digest())
# b'\xc9\x94e\xaa'
digest_as_int = int.from_bytes(crc32c_hash.digest(), "big")
print(digest_as_int == crc32c.crc32c(b'hello world'))
# True
For more details see the documentation on hash objects.
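Because the class follows the hashlib interface, code written against hash objects should accept it unchanged. As a minimal sketch, the hypothetical helper below (not part of this package) feeds a binary stream to any object with an update() method in fixed-size chunks, so large inputs never need to be held in memory at once. The demonstration uses hashlib.sha256 as a stand-in so that it runs without this package installed.

```python
import hashlib
import io


def update_from_stream(hash_obj, stream, chunk_size=1024 * 1024):
    """Feed `stream` to `hash_obj` chunk by chunk and return `hash_obj`.

    `hash_obj` can be any hashlib-style object; with this package installed,
    crc32c.CRC32CHash() would be one candidate.
    """
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        hash_obj.update(chunk)
    return hash_obj


# Stand-in demonstration: chunked updates match a one-shot digest.
demo = update_from_stream(hashlib.sha256(), io.BytesIO(b'hello world'), chunk_size=4)
print(demo.digest() == hashlib.sha256(b'hello world').digest())
# True
```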
Additionally one can consult the following module-level values:
- hardware_based indicates if the algorithm in use is software- or hardware-based.
- big_endian indicates whether the platform is big endian or not.
A benchmarking utility is available by executing the crc32c.benchmark module. Consult its help with the -h flag for options.
By default, if your CPU doesn't have CRC32C hardware support, the package will fall back to a software implementation of the crc32c checksum algorithm. This behavior can be changed by setting the CRC32C_SW_MODE environment variable to one of the following values:
- auto: same as if unset, will eventually be discontinued.
- force: use software implementation regardless of hardware support.
- none: issue a RuntimeWarning when importing the module, and a RuntimeError when executing the crc32c function, if no hardware support is found. In versions of this package up to 2.6 an ImportError was raised when importing the module instead. In 1.x versions this was the default behaviour.
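Since the none mode is described as taking effect when the module is imported, the variable presumably has to be set before the interpreter loads crc32c. One way to guarantee that is to start a child interpreter with the variable already in its environment. The helper run_with_sw_mode below is hypothetical, a sketch under that assumption rather than part of this package.

```python
import os
import subprocess
import sys

# The three values listed above.
VALID_SW_MODES = ("auto", "force", "none")


def run_with_sw_mode(mode, code):
    """Run `code` in a child Python process with CRC32C_SW_MODE set to `mode`.

    Returns the child's standard output with surrounding whitespace removed.
    """
    if mode not in VALID_SW_MODES:
        raise ValueError(f"unknown CRC32C_SW_MODE value: {mode!r}")
    # Copy the current environment and add the variable for the child only.
    env = dict(os.environ, CRC32C_SW_MODE=mode)
    result = subprocess.run(
        [sys.executable, "-c", code],
        env=env, capture_output=True, text=True, check=True,
    )
    return result.stdout.strip()
```

For instance, assuming the package is installed, run_with_sw_mode("force", "import crc32c; print(crc32c.hardware_based)") would print the value of hardware_based in a process where the software implementation was forced.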
Setting the CRC32C_SKIP_HW_PROBE environment variable to 1 simulates platforms without hardware support. This is available mostly for internal testing purposes.
The software algorithm is based on Intel's slice-by-8 package, with some adaptations done by Evan Jones and packaging provided by Ferry Toth. Further adaptations were required to make the code more portable and fit for inclusion within this Python package.
The Intel SSE 4.2 algorithm is based on Mark Adler's code, with some modifications required to make the code more portable and fit for inclusion within this Python package.
The ARMv8 hardware implementation is based on Google's crc32c C++ library.
This package is copyrighted:
ICRAR - International Centre for Radio Astronomy Research (c) UWA - The University of Western Australia, 2017 Copyright by UWA (in the framework of the ICRAR)
The original slice-by-8 software algorithm is copyrighted by:
Copyright (c) 2004-2006 Intel Corporation - All Rights Reserved
Further adaptations to the slice-by-8 algorithm previous to the inclusion in this package are copyrighted by:
Copyright 2008,2009,2010 Massachusetts Institute of Technology.
The original Intel SSE 4.2 crc32c algorithm is copyrighted by:
Copyright (C) 2013 Mark Adler
The crc32c ARMv8 hardware code is copyrighted by:
Copyright 2017 The CRC32C Authors
A copy of the AUTHORS file from Google's crc32c project as it was at the time of copying the code is included in this repository.
This package is licensed under the LGPL-2.1 license.
The original slice-by-8 software algorithm is licensed under the 2-clause BSD license.
Further modifications to the slice-by-8 software algorithm are licensed under a 3-clause BSD license.
The original Intel SSE 4.2 crc32c algorithm's code is licensed under a custom license embedded in the crc32c_adler.c file.
The original crc32c ARMv8 hardware code is licensed under a 3-clause BSD license.