Encoding issue: PyPDF2.utils.PdfReadError: Illegal character in Name Object #438
Comments
the same problem |
I have the same problem and your fix works for me. |
Hello to my friend in Taiwan:
Also, lines 238-241 in utils.py
should be replaced with:
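try:
    # Original behaviour: PDF strings are encoded as latin-1
    r = s.encode('latin-1')
except UnicodeEncodeError:
    # Characters outside latin-1 (e.g. Chinese) fall back to UTF-8
    r = s.encode('utf-8')
if len(s) < 2:
    bc[s] = r
return r
|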
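For anyone who wants to try the idea outside the library, here is a minimal standalone sketch of the same fallback. The function name `to_bytes` is hypothetical and is not part of PyPDF2; the cache (`bc`) from the original helper is left out to keep the example short.

```python
def to_bytes(s):
    """Hypothetical stand-in for the patched helper in utils.py."""
    if isinstance(s, bytes):
        # Already bytes: nothing to encode
        return s
    try:
        # Works for any text whose code points are all below 256
        return s.encode("latin-1")
    except UnicodeEncodeError:
        # Anything outside latin-1 (e.g. Chinese) is written as UTF-8
        return s.encode("utf-8")


print(to_bytes("Name"))
print(to_bytes("水印"))
```

Note that the output then mixes two encodings without recording which one was used, so whatever reads the bytes back has to guess. That may be why the original poster expects similar errors to show up elsewhere.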
I ran into the same problem, so I rebuilt the package and posted it here |
Do you still get the same issue with the latest PyPDF2? Can somebody share a pdf that causes it? |
I'm closing this issue now as it might have been solved with the latest improvements. Please let me know if it wasn't solved by the latest PyPDF2 version. Also, please share a PDF which causes issues! |
I ran into the same problem again, the same as the original author |
You can take a look at my fork of the project
|
@michelle-chou25 |
Here, use mine |
@lwdsw Please open a new issue for it with your code and the PDF file as well as an English description. Note that PyPDF2 is deprecated and should be migrated to pypdf. |
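(Resolved) While adding a watermark to a PDF, I got the following error, which says that my Name Object contains an illegal character:
From the code I can see that the file streams had already been merged, so in theory my watermark had been applied, but an exception was thrown while writing to the file.
I found that it is line 484 of generic.py: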
return NameObject(name.decode('utf-8'))
that raises the exception. My PDF is in Chinese, so I suspected an encoding problem and changed utf-8 to GBK,
but then a different exception appeared:
UnicodeEncodeError: 'latin-1' codec can't encode characters in position 8-9: ordinal not in range(256)
I traced this exception to line 238 of utils:
r = s.encode('latin-1')
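This is another encoding problem. At first I replaced latin-1 with utf-8 and the file could be written, but the text layout was scrambled and a lot of text was missing. I then suspected that the PDF contains text in more than one encoding, so I changed the code here to:
The problem is solved, but I expect other similar exceptions to turn up. I hope the maintainers will look into compatibility with the different character encodings found in PDFs.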
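The read side of the problem (decoding the raw bytes of a name) can be sketched in the same spirit. This is only an illustration: `decode_name` and its `encodings` parameter are hypothetical names, not PyPDF2 API, and the choice of GBK as the second candidate is an assumption taken from the Chinese PDF described above.

```python
def decode_name(raw, encodings=("utf-8", "gbk")):
    """Hypothetical helper: try several encodings for PDF name bytes."""
    for enc in encodings:
        try:
            return raw.decode(enc)
        except UnicodeDecodeError:
            # Not valid in this encoding, try the next candidate
            continue
    # latin-1 maps every byte to a code point, so this never fails
    return raw.decode("latin-1")


# The same text stored in two different encodings
print(decode_name("中文".encode("utf-8")))
print(decode_name("中文".encode("gbk")))
```

Trying UTF-8 first keeps ordinary files unchanged, but some GBK byte sequences are also valid UTF-8, so a fallback chain like this can still pick the wrong encoding.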