PDF parser

smalot/pdfparser is a standalone PHP package that provides tools to extract data from PDF files.

This library is actively maintained. The original author is not developing new features at the moment, but pull requests that add or extend functionality are welcome!

Features

Load/parse objects and headers
Extract metadata (author, description, ...)
Extract text from ordered pages
Support of compressed PDFs
Support of Mac OS Roman charset encoding
Handling of hexadecimal and octal encoding in text sections
Create custom configurations (see CustomConfig.md)

Currently, secured documents and extracting form data are not supported.
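
To illustrate what hexadecimal and octal encoding in text sections means, here is a minimal, self-contained sketch in plain PHP. It is not the library's own implementation; the function names `decodePdfHexString` and `decodePdfOctalEscapes` are hypothetical and exist only for this example. It assumes the usual PDF conventions: a hex string is a run of hex digits between `<` and `>` (whitespace ignored, an odd trailing digit padded with `0`), and a literal string may contain `\ddd` escapes with one to three octal digits.

```php
<?php

/**
 * Decode the body of a PDF hexadecimal string (the part between < and >).
 * Hypothetical helper for illustration only.
 */
function decodePdfHexString(string $hex): string
{
    // Whitespace inside a hex string is ignored.
    $hex = preg_replace('/\s+/', '', $hex);

    if (!preg_match('/^[0-9A-Fa-f]*$/', $hex)) {
        throw new InvalidArgumentException('Not a hexadecimal string.');
    }

    // An odd number of digits means the last digit is followed by an implicit 0.
    if (strlen($hex) % 2 !== 0) {
        $hex .= '0';
    }

    return $hex === '' ? '' : hex2bin($hex);
}

/**
 * Replace \ddd octal escapes (1 to 3 octal digits) with the byte they encode.
 * Simplification: other escapes such as \n, \( or an escaped backslash
 * are not handled here.
 */
function decodePdfOctalEscapes(string $text): string
{
    return preg_replace_callback(
        '/\\\\([0-7]{1,3})/',
        static function (array $match): string {
            // Keep only the low-order byte of the decoded value.
            return chr(octdec($match[1]) & 0xFF);
        },
        $text
    );
}

echo decodePdfHexString('48656C6C6F'), "\n";       // Hello
echo decodePdfOctalEscapes('\\110\\145llo'), "\n"; // Hello
```

The parser performs this kind of decoding internally, so calling code normally never has to deal with it; the sketch only shows why raw PDF content can look unreadable before extraction.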

License

This library is under the LGPLv3 license.

Install

Since v1, this library requires PHP 7.1 or later. You can install it via Composer:

composer require smalot/pdfparser

In case you can't use Composer, you can include alt_autoload.php-dist. It will include all required files automatically.

Quick example

<?php

// Parse PDF file and build necessary objects.
$parser = new \Smalot\PdfParser\Parser();
$pdf = $parser->parseFile('/path/to/document.pdf');

$text = $pdf->getText();
echo $text;

Further usage information can be found here.
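
Besides the full text, the features listed above also mention metadata and page-ordered text. The following sketch shows one way to read both. It assumes a Composer install (so `vendor/autoload.php` exists) and uses a placeholder file path. It also assumes that metadata values are either scalars or flat arrays; nested values would need extra handling.

```php
<?php

// Assumption: the library was installed with Composer.
require 'vendor/autoload.php';

$parser = new \Smalot\PdfParser\Parser();
$pdf = $parser->parseFile('/path/to/document.pdf');

// Metadata (author, title, ...) is returned as an associative array.
foreach ($pdf->getDetails() as $property => $value) {
    if (is_array($value)) {
        // Assumption: array values are flat lists of scalars.
        $value = implode(', ', $value);
    }
    echo $property, ' => ', $value, "\n";
}

// Extract the text page by page instead of all at once.
foreach ($pdf->getPages() as $index => $page) {
    echo '--- Page ', $index + 1, " ---\n";
    echo $page->getText(), "\n";
}
```

Reading page by page can be handy when only part of a document is needed, or when the output should keep page boundaries.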

Documentation

Documentation can be found in the doc folder.
