XLSParser is reasonably fast PHP library intended to parse Microsoft Excel legacy binary XLS formats. It was written because all PHP implementations where too slow. Code is more or less direct port of python's excellent xlrd package. Library is very early alpha. I made it a long time ago.
Issues and pull requests are accepted.
- PHP 5.4.0 or newer.
- Multibyte string extension (mbstring) to handle UTF-16LE encoding, used in newer Excel files.
- Little endian system because of PHP's unpack/pack. If you're not on Sparc, you should be covered.
Via command line: composer require laacz/xls-parser
.
Install dependencies with composer install --dev
, then run tests with vendor/bin/phpunit
.
KISS. Provide filename and it gets loaded or parsed.
$book = new laacz\XLSParser\Book(file_get_contents('workbook.xls'));
Sheets can be accessed via their numeric index or name. Since Sheet
object implements ArrayAccess and IteratorAggregate, you
can do that too.
$sheet = $book[0];
$sheet = $book['Page1'];
$sheet = $book['Vājprāts'];
Cells also can be accessed as with sheets. Index starts from zero.
$row = $sheet[0];
$cell = $sheet[$sheet->nrows - 1][1];
To get value of a cell, cast it to string (or use it in such context) or get value
attribute:
$val1 = $cell->value;
$val2 = (string)$cell;
Or, if you wish...
$val1 = $book[0][0][0]->value;
By now formatting can be accessed raw. In short - sheet contains mapping array rich_text_runlist_map[][]
, which has
arrays with two elements - position and font reference. First is position where style is being applied from, second is
number which refers to book's font_list[]
, which on its part contains format description.
Library does its best to parse dates found within cells. It returns string in common date format: 'yyyy-mm-dd hh:mi:ss'. For example: 2014-12-31 12:59:59.
- Memory efficiency does not exist in context of this library.
- Performance might be better.
- Abstract formatting.
- Add helper methods for common tasks - returning columns, ranges, etc.