Fast GF(256) Cauchy MDS Block Erasure Codec in C++
This is the rewrite in (as much as possible) clean C++ of cm256. In some contexts like Qt programs and plugins the original cm256 library does not work.
cm256cc performance is on par or even better than cm256. This is particularly true for armv7 architecture (Raspberry Pi 2 and 3) and is the most significant with Raspberry Pi 2.
cm256cc is a simple library for erasure codes. From given data it generates redundant data that can be used to recover the originals.
Currently only g++ is supported, other versions of MSVC than Visual Studio 2013 may work. Optimizations for both SSE3 (x86_64) and Neon (armv7) are available.
The original data should be split up into equally-sized chunks. If one of these chunks is erased, the redundant data can fill in the gap through decoding.
The erasure code is parameterized by three values (OriginalCount
, RecoveryCount
, BlockBytes
). These are:
- The number of blocks of original data (
OriginalCount
), which must be less than 256. - The number of blocks of redundant data (
RecoveryCount
), which must be no more than256 - OriginalCount
.
For example, if a file is split into 3 equal pieces and sent over a network, OriginalCount
is 3.
And if 2 additional redundant packets are generated, RecoveryCount
is 2.
In this case up to 256 - 3 = 253 additional redundant packets can be generated.
This is a classical cmake project. Make sure cmake and g++ is installed in your system. create a build
directory and cd into it. If you install the library in a custom location say opt/install/cm256cc
use the following command line for cmake:
cmake -Wno-dev -DCMAKE_INSTALL_PREFIX=/opt/install/cm256cc ..
Result:
- Library will be installed as
/opt/install/cm256cc/lib/libcm256cc.so
- Include files will be installed in
/opt/install/cm256cc/include/cm256cc
- Binary test programs will be installed in
/opt/install/cm256cc/bin
Include the cm256cc library in your project and cm256.h header in your program. Have a look at example programs cm256_test.cpp
, transmit.cpp
and receive.cpp
in the unit_test
folder for usage. Consult the cm256.h header
for details on the encoding / decoding method.
This is a classical cmake project. You may install the software anywhere you like with the -DCMAKE_INSTALL_PREFIX
definition on the cmake command line.
The cmake file will try to find the best compiler optimization options depending on the hardware you are compiling this project. This may not be suitable if you intend to distribute the software or include it in a distribution. In this case you can use the -DENABLE_DISTRIBUTION=1
define on the command line to have just SSSE3 optimization for the x86 based systems and still NEON optimization for arm or arm64.
Documentation is provided in the header file cm256.h.
When your application starts up it should call isInitialized()
to verify that the library is constructed properly:
#include "cm256.h"
CM256 cm256;
if (!cm256.isInitialized()) {
// library not initialized
exit(1);
}
To generate redundancy, use the cm256_encode
function. To solve for the original data use the cm256_decode
function.
Example usage:
bool ExampleFileUsage()
{
CM256 cm256;
if (!cm256.isInitialized()) {
// library not initialized
exit(1);
}
CM256::cm256_encoder_params params;
// Number of bytes per file block
params.BlockBytes = 4321;
// Number of blocks
params.OriginalCount = 33;
// Number of additional recovery blocks generated by encoder
params.RecoveryCount = 12;
// Size of the original file
static const int OriginalFileBytes = params.OriginalCount * params.BlockBytes;
// Allocate and fill the original file data
uint8_t* originalFileData = new uint8_t[OriginalFileBytes];
memset(originalFileData, 1, OriginalFileBytes);
// Pointers to data
CM256::cm256_block blocks[256];
for (int i = 0; i < params.OriginalCount; ++i)
{
blocks[i].Block = originalFileData + i * params.BlockBytes;
}
// Recovery data
uint8_t* recoveryBlocks = new uint8_t[params.RecoveryCount * params.BlockBytes];
// Generate recovery data
if (cm256.cm256_encode(params, blocks, recoveryBlocks))
{
exit(1);
}
// Initialize the indices
for (int i = 0; i < params.OriginalCount; ++i)
{
blocks[i].Index = CM256::cm256_get_original_block_index(params, i);
}
//// Simulate loss of data, subsituting a recovery block in its place ////
blocks[0].Block = recoveryBlocks; // First recovery block
blocks[0].Index = cm256_get_recovery_block_index(params, 0); // First recovery block index
//// Simulate loss of data, subsituting a recovery block in its place ////
if (cm256.cm256_decode(params, blocks))
{
exit(1);
}
// blocks[0].Index will now be 0.
delete[] originalFileData;
delete[] recoveryBlocks;
return true;
}
The example above is just one way to use the cm256_decode
function.
This API was designed to be flexible enough for UDP/IP-based file transfer where the blocks arrive out of order.
The approach taken in CM256 is similar to the Intel Storage Acceleration Library (ISA-L) available here:
https://01.org/intel%C2%AE-storage-acceleration-library-open-source-version/downloads
ISA-L more aggressively optimizes the matrix multiplication operation, which is the most expensive step of encoding.
CM256 takes better advantage of the m=1 case and the first recovery symbol, which is also possible with the Vandermonde matrices supported by ISA-L.
ISA-L uses a O(N^3) Gaussian elimination solver for decoding. The CM256 decoder solves the linear system using a fast O(N^2) LDU-decomposition algorithm from "Pivoting and Backward Stability of Fast Algorithms for Solving Cauchy Linear Equations" (T. Boros, T. Kailath, V. Olshevsky), which was hand-optimized for memory accesses.
This software was written entirely by Christopher A. Taylor [email protected] and converted to clean C++ code by myself Edouard M. Griffiths [email protected].