Added rle_fast C extension to improve speed #2
Changes Made
Why?
While the algorithm for the encode and decode operations in rle/__init__.py is efficient, the operations themselves do not run very fast. The reason is that a pure-Python loop over every element carries a lot of interpreter overhead.
Python provides a C API for writing extension modules. This way, we get the speed of C with the flexibility of Python.
Here, I have used the C API to write the same algorithm as in rle/__init__.py in C, provided code for building this extension, and written a few tests. For the few input values that I tried, the speed seems to have improved by at least 4x.
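To check a speedup figure like this on other inputs, a small timing helper is enough. The sketch below is only an illustration: encode_groupby is a stand-in pure-Python encoder written for this example (it is not the code in the repository), and the input data is made up. Any other encoder can be passed to best_time in its place.

```python
import timeit
from itertools import groupby


def encode_groupby(seq):
    """Stand-in pure-Python encoder: returns parallel (values, counts) lists."""
    values, counts = [], []
    for value, run in groupby(seq):
        values.append(value)
        counts.append(sum(1 for _ in run))
    return values, counts


def best_time(func, data, repeat=5, number=20):
    """Best observed time per call, in seconds, over several repeats."""
    timer = timeit.Timer(lambda: func(data))
    return min(timer.repeat(repeat=repeat, number=number)) / number


# Made-up input: 10,000 items arranged in runs of 50 equal values.
data = [i // 50 for i in range(10_000)]
baseline = best_time(encode_groupby, data)
print(f"pure-Python stand-in: {baseline * 1e3:.3f} ms per call")
```

Dividing the time of one encoder by the time of another on the same data gives the speedup ratio for that input.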
How?
The code for this extension is present in the folder rle_fast.
It contains three files:
Wrapper for extension
The wrapper code for the extension is present in the file rle_fast/rle_fast_extension.c.
This file contains two functions, encode_c and decode_c, which are called for the encode and decode operations respectively. They are responsible for taking the arguments, parsing them, performing type checking, and raising the appropriate exceptions.
In short, they act as an interface between Python and the C code.
Besides these functions, there are function-level and module-level definitions, where we define the names of the functions as called from a Python script, the number of arguments they take, the docstrings for the functions and the module, and the name of the module.
Encode and Decode Operations
The file rle_fast/rle_utils.h contains the actual algorithm for performing the encode and decode operations.
It contains two functions, encode_sequence and decode_sequence.
The algorithm used in these two functions is the same as that in rle/__init__.py.
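For readers who have not looked at the existing code, here is a minimal Python sketch of run-length encoding and decoding. It borrows the names encode_sequence and decode_sequence from the C functions, but the return format (two parallel lists, values and counts) and the error handling are assumptions made for illustration, not a copy of either implementation.

```python
def encode_sequence(seq):
    """Collapse runs of consecutive equal items into (values, counts)."""
    values, counts = [], []
    for item in seq:
        if values and values[-1] == item:
            counts[-1] += 1          # same as the previous item: extend the run
        else:
            values.append(item)      # new item: start a run of length 1
            counts.append(1)
    return values, counts


def decode_sequence(values, counts):
    """Expand (values, counts) back into the original sequence as a list."""
    if len(values) != len(counts):
        raise ValueError("values and counts must have the same length")
    out = []
    for value, count in zip(values, counts):
        out.extend([value] * count)
    return out


values, counts = encode_sequence([1, 1, 2, 3, 3, 3])
print(values, counts)
print(decode_sequence(values, counts))
```

Each run costs one comparison per element, so both directions are linear in the length of the input.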
Documentation
The docstrings are present in the file rle_fast/rle_docs.h.
These are merely variables containing strings describing the module and the methods in it.
In rle_fast_extension.c, these docstrings are used in the module and function definitions. After building and installing the extension, they can be accessed with the built-in help function or the __doc__ attribute, just like with a normal Python package.
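Docstrings defined in C show up in Python the same way as ordinary ones. The sketch below uses the standard library's math module, which is also implemented in C, purely as a stand-in; the same attribute access should work on the installed extension and its functions.

```python
import math

# Module-level and function-level docstrings of a C-implemented module.
module_doc = math.__doc__
function_doc = math.sqrt.__doc__

print(module_doc)
print(function_doc)
```

Calling help(math.sqrt) in an interactive session displays the same text in a pager.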
Installation
The code for installing the rle_fast extension is present in the setup.py file.
To build the extension,
python setup.py build
To install the extension,
python setup.py install
Usage
To import the package,
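A minimal sketch of picking the extension at import time and falling back to the pure-Python package when it has not been built. The module names rle_fast (taken from the folder name) and rle are assumptions here, as is the helper load_rle_backend, which exists only for this example.

```python
import importlib
import importlib.util


def load_rle_backend(candidates=("rle_fast", "rle")):
    """Return the first importable module from candidates, or None."""
    for name in candidates:
        if importlib.util.find_spec(name) is None:
            continue                      # not installed: try the next name
        try:
            return importlib.import_module(name)
        except ImportError:
            continue                      # installed but broken: try the next name
    return None


backend = load_rle_backend()
print("using:", backend.__name__ if backend is not None else "no RLE backend found")
```

If both modules expose the same function names, the rest of the calling code does not need to know which one was loaded.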
Changes as compared to PR #1
In my previous pull request, I had mentioned that the extension fails for non-integer values. I have fixed that bug.
Previously, the encode_sequence function converted the elements of the input sequence to integers before comparing them. This is why the code failed for non-integer inputs.
In this version, I use the C API function PyObject_RichCompareBool, which compares two Python objects directly.
Now, the code works for almost all data types, including integers, floats, complex numbers, characters, etc.
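The difference between the two comparison strategies can be illustrated in Python. This is a rough analogue, not the C code: equal_after_int_cast stands in for the earlier cast-then-compare approach, and equal_rich mimics PyObject_RichCompareBool called with Py_EQ, which treats identical objects as equal and otherwise uses ==. Both helper names are made up for this example.

```python
def equal_after_int_cast(a, b):
    """Stand-in for the earlier approach: convert to int, then compare."""
    return int(a) == int(b)


def equal_rich(a, b):
    """Rough analogue of PyObject_RichCompareBool(a, b, Py_EQ)."""
    return a is b or bool(a == b)


def try_compare(compare, a, b):
    """Run a comparison and report the exception name if it fails."""
    try:
        return compare(a, b)
    except (TypeError, ValueError) as exc:
        return type(exc).__name__


samples = [(1.2, 1.9), ("a", "b"), (1 + 2j, 1 + 2j), (3, 3)]
for a, b in samples:
    print(a, b,
          try_compare(equal_after_int_cast, a, b),
          try_compare(equal_rich, a, b))
```

The cast-based version either raises an error (strings, complex numbers) or merges distinct floats that truncate to the same integer, while the rich comparison handles each type with its own notion of equality.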
TO-DO