UnicodeDecodeError #49

Nando-bog · 2014-05-21T18:07:34Z

I am getting a UnicodeDecodeError when decrypting a text that was encrypted using the library. However, the text is decrypted properly despite the error.

Details:
OS: Mac OS Mavericks 10.9.2
Python: 2.7.6
gnupg: 1.2.5

Sample error from my Terminal:

from gnupg import GPG
g=GPG(homedir='MY GPG HOME DIR')
c=g.encrypt('hola', 'KEY ID')
p=g.decrypt(str(c), passphrase='MYPASSWORD')
Exception in thread Thread-8:
Traceback (most recent call last):
File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/threading.py", line 810, in __bootstrap_inner
self.run()
File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/threading.py", line 763, in run
self.__target(*self.__args, **self.__kwargs)
File "/Library/Python/2.7/site-packages/gnupg/_meta.py", line 532, in _read_response
line = stream.readline()
File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/codecs.py", line 530, in readline
data = self.read(readsize, firstline=True)
File "/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/codecs.py", line 477, in read
newchars, decodedbytes = self.decode(data, self.errors)
UnicodeDecodeError: 'utf8' codec can't decode byte 0xed in position 0: invalid continuation byte
print(p)
hola

As the last line shows, decryption worked, but it still threw the error.

Thanks!

isislovecruft · 2014-07-09T12:51:26Z

Hello @Nando-bog! Thanks for reporting this bug.

I don't have a Mac to test it on, but on a Linux machine I get the following:

>>> from gnupg import GPG
>>> g = GPG(homedir='foobar')
>>> c = g.encrypt('hallo', '50CC7744')
>>> print c.data
-----BEGIN PGP MESSAGE-----

hIwDz4uqK8zd5zkBA/962lezKEAsh157nZsiR+KYd/PW1jdxPG2u1RD4BaSEpkGF
cUlIkJmpliC0qiYvjA2ssnP4DPQ582z4rYAWVmbGjbrBIuQ3FBJBWxWbCkbDqCyu
tFzoCFkmILRQo6DLNgjNtXZPHiqYrP9ll5BaeteE1ooroJ0x3YSDMxbayX61OtJA
Aa2ST3t7iBU6xe6vRr8+4n3stbAwYB2H0RDh5/S/buVJQCI0tbmVSwLxdLZwadFF
XEq8W1X7iWPcGEmKlOkEag==
=9Rge
-----END PGP MESSAGE-----
>>> c.status
'encryption ok'
>>> d = g.decrypt(c.data)
>>> d.data
'hallo'

This is with Python 2.7.6 and python-gnupg-1.2.6.

It could have something to do with locale settings in your terminal. python-gnupg tries to be smart and respect them when necessary, otherwise it defaults to one with utf8. Or, it could be because you did str(c). I'm not sure.
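To make the encoding hypothesis concrete, here is a minimal Python 3 sketch (the thread itself uses Python 2.7). It assumes, without confirmation from the reporter, that gpg emitted Latin-1 text beginning with "í"; the sample bytes are invented for illustration and are not taken from the actual gpg output.

```python
# Hypothetical sample: "índice" encoded as Latin-1, so it starts with byte 0xed.
raw = "\u00edndice".encode("latin-1")

# Decoding with the codec the bytes were written in works.
as_latin1 = raw.decode("latin-1")

# Decoding the same bytes as UTF-8 fails under the default "strict" handler:
# 0xed announces a three-byte sequence, and "n" is not a continuation byte.
reason = None
position = None
try:
    raw.decode("utf-8")
except UnicodeDecodeError as exc:
    reason = exc.reason
    position = exc.start

print(repr(as_latin1))
print(reason, position)
```

If the reported bytes came from a similar source, the message in the traceback ("byte 0xed in position 0: invalid continuation byte") is what a strict UTF-8 decoder would be expected to produce.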

@Nando-bog, do you think you could try again, using c.data and d.data, rather than str(c) and str(d), please?

* CHANGE gnupg._meta.GPGBase.__init__() to register the builtin `codecs.replace_errors` handler and a global codecs "strict" error handler.
* FIXES Issue #49: #49

isislovecruft · 2014-10-28T01:08:57Z

I believe this issue was fixed in my fix/49-unicode-decode-on-readline branch, which I've merged into develop. The fix will be available in the next version (1.3.3).

Please reopen if the issue persists.

extempore · 2014-10-30T04:11:35Z

I got a lot of UnicodeDecode errors while doing gpg.recv_keys() for a bunch of keys. It seems they were imported in the pubring successfully though.

Here is a key that gave me one of these errors: A0D180F35F45D0A0FBED9CD36E68F80607AF1977

I was using the current pypi version, should I switch to the develop branch or is it unstable?

Exception in thread Thread-683:
Traceback (most recent call last):
  File "/usr/lib/python2.7/threading.py", line 808, in __bootstrap_inner
    self.run()
  File "/usr/lib/python2.7/threading.py", line 761, in run
    self.__target(*self.__args, **self.__kwargs)
  File "/home/ba/env_deed/local/lib/python2.7/site-packages/gnupg/_meta.py", line 564, in _read_response
    line = stream.readline()
  File "/home/ba/env_deed/lib/python2.7/codecs.py", line 530, in readline
    data = self.read(readsize, firstline=True)
  File "/home/ba/env_deed/lib/python2.7/codecs.py", line 477, in read
    newchars, decodedbytes = self.decode(data, self.errors)
UnicodeDecodeError: 'utf8' codec can't decode byte 0xf6 in position 0: invalid start byte

pwfff · 2016-05-17T21:09:49Z

This solution needs to be rethought. Changing this global callback is affecting other code, most notably the DataStax Cassandra driver. The exception in the following code is never raised, so what is already valid UTF-8 has its characters replaced with garbage: https://github.com/datastax/python-driver/blob/master/cassandra/cqltypes.py#L675
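The failure mode described here can be illustrated with a simplified analog. This is a Python 3 sketch, not the Cassandra driver's code: `decode_with_fallback` is a hypothetical helper, and its `errors` parameter stands in for the globally patched handler so that the example does not touch the interpreter's real "strict" registration.

```python
def decode_with_fallback(data, errors="strict"):
    """Try UTF-8 first; fall back to Latin-1 if UTF-8 decoding raises."""
    try:
        return data.decode("utf-8", errors)
    except UnicodeDecodeError:
        return data.decode("latin-1")


latin1_bytes = "caf\u00e9".encode("latin-1")  # b"caf\xe9", not valid UTF-8

# Normal behaviour: the strict handler raises, so the fallback branch runs.
expected = decode_with_fallback(latin1_bytes, errors="strict")

# Stand-in for the patched interpreter: errors are replaced instead of raised,
# the except branch is never reached, and U+FFFD ends up in the result.
corrupted = decode_with_fallback(latin1_bytes, errors="replace")

print(repr(expected))
print(repr(corrupted))
```

Any code that relies on the exception as a signal, as in this sketch, silently takes the wrong path once "strict" no longer raises.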

sawall · 2016-10-24T20:30:16Z

This solution definitely needs to be rethought. It breaks MIME encodings of attachments even if I am not applying PGP to them. For example, the base64 representation of an image will be munged here:

import gnupg
gpg = gnupg.GPG('/path/to/gpg')

import smtplib
from email.mime.multipart import MIMEMultipart
from email.mime.image import MIMEImage

def send_my_email():
    msg = MIMEMultipart()
    msg['Subject'] = 'subject'
    msg['From'] = '[email protected]'
    msg['To'] = '[email protected]'
    with open('/tmp/image.jpg', mode='rb') as image_file:
        image = MIMEImage(image_file.read())
    msg.attach(image)
    s = smtplib.SMTP('smtp.gmail.com', 587)
    s.starttls()
    s.login('[email protected]', 'password')
    s.send_message(msg)
    s.quit()

sawall · 2016-10-25T15:59:24Z

Note that a workaround for the monkey-patch is to define gpgon and gpgoff functions and call them around any gpg calls. Presumably an approach like this could be used in a decorator that could be injected into python-gnupg.

If code like this is used when a package is loaded, it will wrangle this situation:

import codecs
default_strict_func = codecs.lookup_error('strict')
import gnupg
gpg = gnupg.GPG('/path/to/gpg')
gpg_strict_func = codecs.lookup_error('strict')
def gpgon(): codecs.register_error('strict', gpg_strict_func)
def gpgoff(): codecs.register_error('strict', default_strict_func)
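The same on/off idea could be packaged as a context manager so the previous handler is restored even if a call raises. This is only a sketch: `error_handler` and the name "demo-handler" are hypothetical, and the demonstration deliberately uses a made-up handler name rather than "strict" so that it leaves the interpreter's defaults alone.

```python
import codecs
from contextlib import contextmanager


@contextmanager
def error_handler(name, handler):
    """Temporarily register `handler` under `name`, then restore the old one."""
    previous = codecs.lookup_error(name)
    codecs.register_error(name, handler)
    try:
        yield
    finally:
        codecs.register_error(name, previous)


# Demonstration on an invented handler name; it starts out ignoring bad bytes.
codecs.register_error("demo-handler", codecs.ignore_errors)

with error_handler("demo-handler", codecs.replace_errors):
    inside = b"a\xffb".decode("utf-8", "demo-handler")   # handler swapped in
outside = b"a\xffb".decode("utf-8", "demo-handler")      # original restored

print(repr(inside), repr(outside))
```

With the names from the snippet above, the equivalent usage would presumably be `with error_handler('strict', gpg_strict_func):` around each gpg call; that variant is not exercised here.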

e3rd · 2018-01-04T16:24:44Z

Thanks for posting this workaround!! I've expanded it so that you may use GPGSafe class instead of gnupg.GPG without having to manage anything.

gpg = GPGSafe(use_agent=False, homedir="~/.gnupg/") # (instead of gpg = gnupg.GPG(...))
gpg.sign(text)
...

https://gist.github.com/e3rd/45aed2e93ac20843b6790b6b642da396

(Since this issue remains closed, I've also created a pull request so that it is noted by project visitors.)

This removes the monkey-patch from isislovecruft/python-gnupg@d9116ba and instead uses a local modification of the StreamReader by switching from ›strict‹ error handlers (the default) to ›replace‹ error handlers. This should resolve isislovecruft#219 and isislovecruft#49, as well as email attachments.
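As a rough illustration of the local approach described in that pull request text, the following Python 3 sketch configures a single `StreamReader` with the "replace" handler instead of changing anything globally. It is not the pull request's code; the input bytes are invented to resemble the reported failure.

```python
import codecs
import io

raw = b"\xed bad byte\n"  # invented sample: invalid UTF-8 at position 0

# Default reader: errors='strict', so readline() raises on the bad byte.
strict_reader = codecs.getreader("utf-8")(io.BytesIO(raw))
try:
    strict_reader.readline()
    strict_failed = False
except UnicodeDecodeError:
    strict_failed = True

# Reader configured locally with errors='replace'; other readers and all
# other code in the process keep the normal strict behaviour.
lenient_reader = codecs.getreader("utf-8")(io.BytesIO(raw), errors="replace")
line = lenient_reader.readline()

print(strict_failed)
print(repr(line))
```

The design point is scope: the lenient behaviour applies only to the stream that reads gpg's output, so unrelated code such as email encoding is unaffected.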

isislovecruft added this to the 1.2.8 milestone Jul 9, 2014

isislovecruft self-assigned this Jul 9, 2014

isislovecruft added the question label Jul 9, 2014

evilaliv3 mentioned this issue Sep 13, 2014

Some PGP key cannot be uploaded with error: "The PGP key cannot be imported" globaleaks/globaleaks-whistleblowing-software#948

Closed

isislovecruft modified the milestones: 1.3.2, 1.3.x, 1.3.3 Oct 28, 2014

isislovecruft closed this as completed Oct 28, 2014

This was referenced Oct 7, 2018

Fix global unicode/codecs monkey-patch #244

Open

DO NOT USE THIS LIBRARY: Includes critical bug (global monkey-patch which breaks unicode and email sending!) #246

Open
