-
Notifications
You must be signed in to change notification settings - Fork 1
bwt: An ANSI C implementation of the Burrows-Wheeler transformation (BWT)
License
LGPL-3.0, GPL-3.0 licenses found
Licenses found
LGPL-3.0
COPYING.LESSER
GPL-3.0
COPYING
MichaelDipperstein/bwt
Folders and files
Name | Name | Last commit message | Last commit date | |
---|---|---|---|---|
Repository files navigation
DESCRIPTION ----------- This archive contains a simple and readable ANSI C implementation of the Burrows-Wheeler transformation (BWT) and its reverse transformation. The code for this implementation is derived directly from "A Block-sorting Lossless Data Compression Algorithm" by M. Burrows and D.J. Wheeler. A PDF copy of the document may be found at: ftp://apotheca.hpl.hp.com/pub/dec/SRC/research-reports/SRC-124.pdf More information on Burrows-Wheeler Transform may be found at: https://michaeldipperstein.github.io/bwt.html FILES ----- bwxform.c - Library of Burrows-Wheeler transform (BWT) routines. bwxform.h - Header containing prototypes for library functions. COPYING - Rules for copying and distributing GPL software COPYING.LESSER - Rules for copying and distributing LGPL software Makefile - makefile for this project (assumes gcc compiler and GNU make) README - this file sample.c - Demonstration of how to use BWT library functions optlist/ - Subtree containing optlist command line option parser library BUILDING -------- To build these files with GNU make and gcc, simply enter "make" from the command line. The executable will be named sample (or sample.exe). GIT NOTE: Updates to the subtree optlist don't get pulled by "git pull" Use the following commands to pull their updates: git subtree pull --prefix optlist https://github.com/MichaelDipperstein/optlist.git master --squash USAGE ----- Usage: sample <options> options: -c : Encode input file to output file. -d : Decode input file to output file. -m : Perform the Move-to-Front coding. -i <filename> : Name of input file. -o <filename> : Name of output file. -h|? : Print out command line options. -c Generate a probability range list for the specified input file (see -i) then use arithmetic coding compresses the file using the range list to the specified output file (see -o). -d Decompresses the specified input file (see -i) writing the results to the specified output file (see -o). Only files compressed by this program may be decompressed. -m Perform move to front encoding/decoding on each block. -i <filename> The name of the input file. There is no valid usage of this program without a specified input file. -o <filename> The name of the output file. If no file is specified, stdout will be used. NOTE: Sending compressed output to stdout may produce undesirable results. LIBRARY API ----------- Transforming Data: int BWXform(FILE *fpIn, FILE *fpOut, const xform_t method); fpIn The file stream to be transformed. It must non-NULL and opened. fpOut The file stream receiving the transformed data. It must non-NULL and opened. method xform_t type value indicating whether indicate whether or not MTF is used. Return Value Zero for success, non-zero for failure. Reverse Transforming Data: int BWXform(FILE *fpIn, FILE *fpOut, const xform_t method); fpIn The file stream to be reverse transformed. It must non-NULL and opened. fpOut The file stream receiving the reverse transformed data. It must non-NULL and opened. method xform_t type value indicating whether indicate whether or not MTF is used. Return Value Zero for success, non-zero for failure. HISTORY ------- 08/20/04 - Initial Release 08/25/04 - Don't prefix blocks with block size. Use value returned by fread() to recognize partial blocks instead. 05/02/05 - Allocating large arrays on heap instead of stack so that gcc can handle larger blocks. 11/03/05 - Sort algorithm is now similar to the "faster method" in the Burrows and Wheeler paper. 08/29/07 - Explicitly licensed under LGPL version 3. - Replaces getopt() with optlist library 10/29/14 - Changed the API so that encode and decode routines accept opened file streams instead of file names. - Upgrade to latest Oplist library - Tighter adherence to Michael Barr's "Top 10 Bug-Killing Coding Standard Rules" (http://www.barrgroup.com/webinars/10rules). 07/15/17 - Directory stucture changes for ease of use with GitHub - Included a test script that I've always used to test things 09/19/19 - Update e-mail address - pull the latest optlist TODO ---- - Use "known" efficient Move-To-Front algorithm - Investigate suffix tree based sorting solutions - There are several papers claiming speed improvements AUTHOR ------ Michael Dipperstein ([email protected]) https://michaeldipperstein.github.io/
About
bwt: An ANSI C implementation of the Burrows-Wheeler transformation (BWT)
Resources
License
LGPL-3.0, GPL-3.0 licenses found
Licenses found
LGPL-3.0
COPYING.LESSER
GPL-3.0
COPYING
Stars
Watchers
Forks
Releases
No releases published
Packages 0
No packages published