Unicode regexp for finding header field name and slightly updated rakefile #7

Closed
maxmeyer opened this issue Dec 31, 2015 · 10 comments

Comments

@maxmeyer

@terceiro Please port this PR into your fork: sup-heliotrope#3. Thanks.

@maxmeyer
Author

maxmeyer commented Jan 9, 2016

Ping.

@terceiro
Owner

terceiro commented Jan 9, 2016

Can you please provide a test case?

@maxmeyer
Author

Yes, I can. I hope that the string "survives" GitHub copy and paste. The string abge�ndertes Angebot makes rmail fail. In German it's normally written as abgeändertes Angebot, with an ä.

require 'rmail'

message = <<-EOS123
Subject: abge�ndertes Angebot
EOS123

payload = RMail::Parser.read(message)
puts payload.header

That fails with the following error message:

/home/user/.gem/2.3.0/gems/rmail-1.1.1.666/lib/rmail/header.rb:81:in `=~': incompatible encoding regexp match (ASCII-8BIT regexp with UTF-8 string) (Encoding::CompatibilityError)
from /home/user/.gem/2.3.0/gems/rmail-1.1.1.666/lib/rmail/header.rb:81:in `parse'
from /home/user/.gem/2.3.0/gems/rmail-1.1.1.666/lib/rmail/parser.rb:235:in `block in parse_header'
from /home/user/.gem/2.3.0/gems/rmail-1.1.1.666/lib/rmail/parser.rb:231:in `each'
from /home/user/.gem/2.3.0/gems/rmail-1.1.1.666/lib/rmail/parser.rb:231:in `parse_header'
from /home/user/.gem/2.3.0/gems/rmail-1.1.1.666/lib/rmail/parser.rb:194:in `parse_low'
from /home/user/.gem/2.3.0/gems/rmail-1.1.1.666/lib/rmail/parser.rb:183:in `parse'
from /home/user/.gem/2.3.0/gems/rmail-1.1.1.666/lib/rmail/parser.rb:333:in `parse'
from /home/user/.gem/2.3.0/gems/rmail-1.1.1.666/lib/rmail/parser.rb:347:in `read'
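
For readers without rmail installed, the same error class can be reproduced in plain Ruby. This is only a sketch: `BINARY_PATTERN` and `try_match` are hypothetical names, and the pattern is a stand-in, since the actual regexp at header.rb:81 is not shown in this thread. It assumes the failure comes from a fixed ASCII-8BIT regexp being matched against a UTF-8 string that contains a non-ASCII character.

```ruby
# Hypothetical stand-in for a binary header regexp. The high-byte escapes
# together with the /n flag give it a fixed ASCII-8BIT encoding.
BINARY_PATTERN = /[\x80-\xff]/n

# A valid UTF-8 string containing a non-ASCII character.
subject = "Subject: abgeändertes Angebot"

# Returns the match index, or the exception class if the encodings clash.
def try_match(pattern, string)
  pattern =~ string
rescue Encoding::CompatibilityError => e
  e.class
end

# UTF-8 string against the binary regexp: the encodings are incompatible.
utf8_result = try_match(BINARY_PATTERN, subject)

# The same bytes relabelled as binary (String#b) match without error.
binary_result = try_match(BINARY_PATTERN, subject.b)

puts utf8_result    # Encoding::CompatibilityError
puts binary_result  # 13, the byte offset of the first high byte
```

Relabelling the input with `String#b` before matching is one possible workaround; whether it suits rmail's parser is an open question here.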

@maxmeyer
Author

I hope that helps you fix the error. If you need more information, please ping me.

@maxmeyer
Author

BTW: I don't know which user agent the sender used. There's no header field for it in the mail.

@maxmeyer
Author

I dug deeper to find out more. I would normally expect the ä to be encoded by the user agent. I've got two files in my maildir. One seems to be encoded correctly; the other one contains the ä in "plain text". I even checked the file encodings.

Mail 1

File encoding: latin1
Subject: abgeändertes Angebot

Mail 2

File encoding: utf-8
Subject: =?iso-8859-1?Q?AW:_abge=E4ndertes_Angebot?=
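
The Subject in Mail 2 is an RFC 2047 "encoded word": charset `iso-8859-1`, `Q` encoding, with `=E4` standing for the byte of ä. As a rough illustration of how such a header decodes, here is a minimal sketch. `decode_encoded_words` is a hypothetical helper, not part of rmail or sup, and it skips parts of RFC 2047 such as whitespace handling between adjacent encoded words.

```ruby
# Hypothetical helper: decodes RFC 2047 encoded words to UTF-8.
# Raises ArgumentError if the charset name is unknown to Ruby.
def decode_encoded_words(value)
  value.gsub(/=\?([^?\s]+)\?([QqBb])\?([^?\s]*)\?=/) do
    charset = $1
    kind = $2.upcase
    payload = $3
    bytes =
      if kind == "Q"
        # Q encoding: "_" means space, "=XX" is a hex-encoded byte.
        payload.tr("_", " ").b.gsub(/=(\h\h)/) { $1.hex.chr }
      else
        # B encoding: plain base64.
        payload.unpack1("m")
      end
    # Label the raw bytes with the declared charset, then transcode.
    bytes.force_encoding(charset).encode("UTF-8", invalid: :replace, undef: :replace)
  end
end

decoded = decode_encoded_words("=?iso-8859-1?Q?AW:_abge=E4ndertes_Angebot?=")
puts decoded  # AW: abgeändertes Angebot
```

Mail 1, by contrast, carries the raw byte in the header with no charset declared, so a reader has to guess the encoding.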

It's very strange to find different encodings in different e-mails. Since I'm just an ordinary e-mail user, I'm not sure how mail servers normally store mails.

@gauteh How does sup read in mails? Does it matter which file encoding the maildir files have?

So maybe there are two different places to fix?

  1. Sup - Read in mails with different encodings
  2. Rmail - Gracefully handle encoding errors
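
One way to picture the first point is a sketch, under stated assumptions, of normalizing raw header bytes before they are parsed. `normalize_header_bytes` is a hypothetical name and this is not how sup or rmail currently behave. It assumes that bytes which are not valid UTF-8 are Latin-1, which happens to fit Mail 1 above but is only a guess in general.

```ruby
# Hypothetical helper: returns a valid UTF-8 string for raw header bytes.
# Bytes that are already valid UTF-8 are kept; anything else is assumed
# to be in the fallback encoding (Latin-1 by default) and transcoded.
def normalize_header_bytes(raw, fallback: Encoding::ISO_8859_1)
  utf8 = raw.dup.force_encoding(Encoding::UTF_8)
  return utf8 if utf8.valid_encoding?

  raw.dup.force_encoding(fallback).encode(Encoding::UTF_8)
end

latin1_bytes = "Subject: abge\xE4ndertes Angebot".b  # raw Latin-1 bytes
normalized = normalize_header_bytes(latin1_bytes)
puts normalized                  # Subject: abgeändertes Angebot
puts normalized.valid_encoding?  # true
```

Every byte sequence is valid Latin-1, so the fallback never raises, but it can silently produce wrong characters when the real encoding is something else.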

@maxmeyer
Author

To fix my local setup, I overwrote the Subject header for that particular mail.

@gauteh

gauteh commented Jan 17, 2016

Note that the comment in my patch is misleading: the regexp matches more than what the RFC (and my comment) says.
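
For comparison, RFC 5322 (like RFC 2822 before it) restricts header field names to printable US-ASCII characters, codes 33 to 126, excluding the colon. A strict matcher along those lines might look like the sketch below. `STRICT_FIELD` and `split_field` are hypothetical names and this is not the regexp from the patch, which is not quoted in this thread.

```ruby
# Field name: one or more printable US-ASCII characters (33..126) except
# ":" (58), followed by a colon, optional blanks, and the field body.
STRICT_FIELD = /\A([\x21-\x39\x3b-\x7e]+):[ \t]*(.*)\z/m

# Hypothetical helper: returns [name, body], or nil if the line does not
# start with an RFC-conformant field name. The line is treated as raw
# bytes so that undeclared 8-bit data in the body cannot raise.
def split_field(line)
  m = STRICT_FIELD.match(line.b)
  m && [m[1], m[2]]
end

name, body = split_field("Subject: abge\xE4ndertes Angebot")
puts name           # Subject
puts body.encoding  # ASCII-8BIT (shown as BINARY on newer Rubies)
```

Matching on bytes keeps the field name check strict while leaving the body untouched for a later decoding step.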

@terceiro
Owner

terceiro commented Jan 17, 2016 via email

@maxmeyer
Author

Thanks.
