revived #257: Properly decode ANSI encodings #349

Merged on Oct 8, 2020 (37 commits)
Changes from 30 commits
Commits
a74f5a5
Get correct default font
Sep 17, 2019
49ba4ee
Create header elements with it's respective class
Sep 17, 2019
33a1746
Properly decode ANSI encodings
Sep 17, 2019
b5ddbf7
allow for line breaks when splitting xrefs for id and position
Connum Sep 26, 2020
c800c65
extend TestCase.php with functionality to "catch" E_NOTICE and E_WARNING
Connum Sep 29, 2020
d87b51a
added test case for this fix
Connum Sep 29, 2020
b12df3b
only reset error handler when the current handler is the handler we h…
Connum Sep 29, 2020
e1673b2
work around for failing CI build with PHP 5.6
Connum Sep 29, 2020
5403abb
added comment and link to the workaround getting the current error ha…
Connum Sep 29, 2020
5b5b480
removed unnecessary ini_set call
Connum Sep 29, 2020
2319f85
remove error level constant name before error message
Connum Sep 29, 2020
51b8ea3
restore error from the error handler itself, to prevent PHPUnit's "TH…
Connum Sep 29, 2020
86525f6
reverse the changes made to the TestCase class and the code in the te…
Connum Sep 29, 2020
4e4b3e2
simplified test case, now checking if object has been parsed correctly
Connum Sep 29, 2020
8cb6ed2
Merge branch 'master' into fix-19
Connum Sep 29, 2020
c2cb436
code linting
Connum Sep 29, 2020
8416c42
Merge branch 'fix_encoding' of https://github.com/skyfms/pdfparser in…
Connum Sep 29, 2020
637eae3
applied linting
Connum Sep 29, 2020
ce70a4d
Merge branch 'skyfms-fix_encoding' into 257-revived
Connum Sep 29, 2020
48c472b
handle failed font lookup
Connum Sep 29, 2020
4042e74
look up unfiltered font resource name first, then fall back to filter…
Connum Sep 29, 2020
4994a46
Merge branch 'master' into 257-revived
Connum Sep 30, 2020
6c74459
added unit test for #202 bugfix, code linting
Connum Sep 30, 2020
cb1299b
mb_convert_encoding does not support 'Mac', replace with iconv()
Connum Sep 30, 2020
4f8e812
fallback for decoding single-byte ANSI characters that are not in the…
Connum Sep 30, 2020
d9b1c1c
added test file and unit test for international unicode characters
Connum Sep 30, 2020
9811fac
don't double-encode strings already in UTF-8
Connum Sep 30, 2020
7203d42
code linting
Connum Sep 30, 2020
1c0208d
removed remnants from old decodeContent() function signature
Connum Sep 30, 2020
c620aeb
parseHeaderElement() should not return a PDFObject
Connum Sep 30, 2020
7957624
some minor changes as requested by the review
Connum Oct 1, 2020
1f41ccc
keep $unicode as deprecated parameter in decodeContent function signa…
Connum Oct 3, 2020
c8e05fc
forgot to add default value for $unicode to make it optional
Connum Oct 3, 2020
9e41fbb
added proper doc blocks to PostScriptGlyphs.php
Connum Oct 3, 2020
a9c33b4
return array from PostScriptGlyphs::getGlyphs() directly instead of u…
Connum Oct 3, 2020
09d718d
changed @deprecated to parameter description
Connum Oct 3, 2020
54e953f
Merge branch 'master' into 257-revived
Connum Oct 6, 2020
Binary file added samples/InternationalChars.pdf
Binary file not shown.
Binary file added samples/bugs/Issue202.pdf
Binary file not shown.
1 change: 1 addition & 0 deletions src/Smalot/PdfParser/Document.php
@@ -77,6 +77,7 @@ public function init()

         // Propagate init to objects.
         foreach ($this->objects as $object) {
+            $object->getHeader()->init();
             $object->init();
         }
     }
9 changes: 4 additions & 5 deletions src/Smalot/PdfParser/Encoding.php
@@ -31,6 +31,7 @@
 namespace Smalot\PdfParser;

 use Smalot\PdfParser\Element\ElementNumeric;
+use Smalot\PdfParser\Encoding\PostScriptGlyphs;

 /**
  * Class Encoding
@@ -95,12 +96,10 @@ public function init()
                 ++$code;
             }

-            // Build final mapping (custom => standard).
-            $table = array_flip(array_reverse($this->encoding, true));
-
+            $this->mapping = $this->encoding;
             foreach ($this->differences as $code => $difference) {
                 /* @var string $difference */
-                $this->mapping[$code] = (isset($table[$difference]) ? $table[$difference] : Font::MISSING);
+                $this->mapping[$code] = $difference;
             }
         }
     }
@@ -129,6 +128,6 @@ public function translateChar($dec)
             $dec = $this->mapping[$dec];
         }

-        return $dec;
+        return PostScriptGlyphs::getCodePoint($dec);
     }
 }
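
To illustrate the Encoding.php change: entries from the differences table now appear to be stored under their own name instead of being looked up in the flipped base encoding, and translateChar() passes the result to PostScriptGlyphs::getCodePoint(). The following is a minimal sketch under assumptions, not the library's code. GlyphTableSketch and EncodingSketch are hypothetical stand-ins, the three-entry glyph table is illustrative only, and returning unknown values unchanged is an assumed behaviour, since PostScriptGlyphs itself is not shown in this diff.

```php
<?php
// Hypothetical stand-in for PostScriptGlyphs; the real glyph table is not shown in this diff.
final class GlyphTableSketch
{
    // Illustrative subset only: glyph name => Unicode code point.
    private static $glyphs = [
        'Euro' => 0x20AC,
        'adieresis' => 0x00E4,
        'bullet' => 0x2022,
    ];

    // Assumption: values without a table entry are returned unchanged.
    public static function getCodePoint($glyph)
    {
        return isset(self::$glyphs[$glyph]) ? self::$glyphs[$glyph] : $glyph;
    }
}

// Hypothetical, reduced version of the mapping logic touched by the diff.
final class EncodingSketch
{
    private $mapping;

    public function __construct(array $baseEncoding, array $differences)
    {
        // Start from the base encoding (code => glyph name) ...
        $this->mapping = $baseEncoding;
        // ... and let each difference override its code with its own glyph name.
        foreach ($differences as $code => $glyphName) {
            $this->mapping[$code] = $glyphName;
        }
    }

    public function translateChar($dec)
    {
        if (isset($this->mapping[$dec])) {
            $dec = $this->mapping[$dec];
        }

        // Resolve the glyph name (or the untouched code) to a code point.
        return GlyphTableSketch::getCodePoint($dec);
    }
}

$encoding = new EncodingSketch(
    [128 => 'Euro', 228 => 'adieresis'], // base encoding: code => glyph name
    [1 => 'bullet']                      // differences: code => glyph name
);

$euro = $encoding->translateChar(128);  // found via the base encoding
$bullet = $encoding->translateChar(1);  // found via the differences table
$plain = $encoding->translateChar(65);  // no mapping entry, passed through
```

In this sketch 'bullet' is not part of the base encoding. The removed lookup in the flipped base table would have replaced such a name with Font::MISSING, whereas storing the name itself leaves it available for the glyph-to-code-point step.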