Commit

insane nonsense to make decode_utf8 barf as needed
rjbs committed Sep 8, 2013
1 parent 113c8ac commit 5ec644d
Showing 1 changed file with 9 additions and 1 deletion.
10 changes: 9 additions & 1 deletion lib/Dist/Zilla/Plugin/MetaYAML.pm
@@ -102,7 +102,15 @@ sub gather_files {
   if (
     $yaml =~ /[^\x00-\xFF]/
     or
-    ! eval { decode_utf8($yaml, FB_CROAK); 1 }
+    ! eval {
+      # Why do I need to do this completely idiotic thing??
+      # No other cajoling got the croak to occur on Latin-1-but-not-UTF-8
+      # input, including manual diddling of the utf8 flag. Probably I
+      # missed something, but this works, and the whole mess is
+      # temporary. -- rjbs, 2013-09-08
+      my $copy = join q{}, map {; chr ord } (split '', $yaml);
+      decode_utf8($copy, FB_CROAK); 1
+    }
   ) {
     # Characters over \xFF or not a valid UTF-8 buffer:
     # assume it's all text.
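
The effect of the `chr ord` round trip can be checked in isolation. The following is a minimal sketch, not code from MetaYAML.pm: the sample string `"L\x{e9}on"` is an assumption chosen for illustration. It suggests that rebuilding a string from code points at or below `\xFF` yields a copy with the same characters but without the internal UTF-8 flag, which is probably why `decode_utf8` then treats the copy as raw bytes.

```perl
use strict;
use warnings;

# Assumed sample input: Latin-1 "Leon" with e-acute, upgraded so that the
# internal UTF-8 flag is set (the situation the commit is working around).
my $yaml = "L\x{e9}on";
utf8::upgrade($yaml);

# The round trip from the commit: rebuild the string one character at a time.
# chr() of a code point below 256 returns an unflagged one-byte string, so the
# joined result carries no UTF-8 flag.
my $copy = join q{}, map {; chr ord } (split '', $yaml);

printf "original flagged: %s\n", utf8::is_utf8($yaml) ? 'yes' : 'no';
printf "copy flagged:     %s\n", utf8::is_utf8($copy) ? 'yes' : 'no';
printf "same characters:  %s\n", $copy eq $yaml       ? 'yes' : 'no';
```

Because the first branch of the condition already catches any character above `\xFF`, the round trip only ever sees characters that fit in one byte.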

3 comments on commit 5ec644d

@miyagawa
Contributor

Which version of Encode.pm were you testing with? Encode before 2.53 used to handle the argument differently in decode_utf8 depending on the utf-8 flag. With the current version, it croaks correctly even when $yaml has the utf-8 flag set (with a Latin-1 string in it).

To make it work with earlier versions of Encode, you might need to use decode("utf-8", $yaml, FB_CROAK|LEAVE_SRC).
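
A minimal sketch of that suggestion, wrapped in a validity check. The helper name `looks_like_utf8` and the sample strings are hypothetical and are not part of Dist::Zilla or Encode. Only `decode`, `FB_CROAK`, and `LEAVE_SRC` come from Encode.

```perl
use strict;
use warnings;
use Encode qw(decode FB_CROAK LEAVE_SRC);

# Hypothetical helper: returns 1 if $buffer decodes as strict UTF-8, else 0.
# FB_CROAK makes decode() die on malformed input; LEAVE_SRC asks it not to
# modify the buffer it is given.
sub looks_like_utf8 {
    my ($buffer) = @_;
    return eval { decode('utf-8', $buffer, FB_CROAK | LEAVE_SRC); 1 } ? 1 : 0;
}

my $latin1 = "L\x{e9}on";         # Latin-1 bytes, not valid UTF-8
my $utf8   = "L\x{c3}\x{a9}on";   # the same name as UTF-8 encoded bytes

printf "latin1: %d\n", looks_like_utf8($latin1);
printf "utf8:   %d\n", looks_like_utf8($utf8);
```

The `eval { ...; 1 }` idiom mirrors the one in the diff: the block returns 1 only if `decode` did not die.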

@rjbs
Owner Author

@rjbs rjbs commented on 5ec644d Sep 8, 2013

I'm using 2.52, which is what was in 5.19.3. I will give it a go.

FWIW, I tried futzing with the utf8 flag, and couldn't get that, alone, to change the croaking-ness.

Anyway, this is a temporary measure, so maybe I won't do much about it yet. Sure did make me 😕 though.

@miyagawa
Contributor

With Encode 2.52, I can make it croak either by a) calling utf8::downgrade to remove the utf-8 flag or b) using decode("utf-8") instead of decode_utf8. HTH.

> perl -MEncode -e '$x="L\x{e9}on"; utf8::upgrade $x; warn Encode->VERSION; decode_utf8($x, Encode::FB_CROAK); warn 1'
2.52 at -e line 1.
1 at -e line 1.

> perl -MEncode -e '$x="L\x{e9}on"; warn Encode->VERSION; decode_utf8($x, Encode::FB_CROAK); warn 1'
2.52 at -e line 1.
utf8 "\xE9" does not map to Unicode at /Users/miyagawa/.plenv/versions/5.19.3/lib/perl5/5.19.3/darwin-2level/Encode.pm line 217.

> perl -MEncode -e '$x="L\x{e9}on"; utf8::upgrade $x; warn Encode->VERSION; decode("utf-8", $x, Encode::FB_CROAK); warn 1'
2.52 at -e line 1.
utf8 "\xE9" does not map to Unicode at /Users/miyagawa/.plenv/versions/5.19.3/lib/perl5/5.19.3/darwin-2level/Encode.pm line 176.
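
Option a) can also be written as a short script. This is a sketch under the same assumptions as the one-liners above (the sample string `"L\x{e9}on"`); the variable names `$bytes` and `$ok` are illustrative only.

```perl
use strict;
use warnings;
use Encode qw(decode_utf8 FB_CROAK);

my $x = "L\x{e9}on";
utf8::upgrade($x);          # set the UTF-8 flag, as in the first one-liner

my $bytes = $x;             # work on a copy so $x is left untouched
utf8::downgrade($bytes);    # clear the flag; dies if a character is above \xFF

# With the flag cleared, decode_utf8 sees raw bytes and rejects the lone \xE9.
my $ok = eval { decode_utf8($bytes, FB_CROAK); 1 };
print $ok ? "decoded\n" : "croaked: $@";
```

Downgrading a copy is a more direct way to get the same unflagged string that the `chr ord` round trip in the commit produces.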
