This repository has been archived by the owner on Nov 30, 2024. It is now read-only.

Handle exception message with invalid UTF-8; For rbx #1760

Merged
merged 2 commits into from
Feb 18, 2015

bf4
Contributor

@bf4 bf4 commented Nov 3, 2014

See https://travis-ci.org/mikel/mail/jobs/33707769#L129

An exception occurred running /home/travis/build/mikel/mail/gemfiles/vendor/bundle/rbx/2.1/gems/rspec-core-3.0.4/exe/rspec:
    invalid byte sequence in UTF-8 (ArgumentError)
Backtrace:
    Rubinius::Splitter.valid_encoding? at kernel/common/splitter.rb:17
    Rubinius::Splitter.split at kernel/common/splitter.rb:22
    String#split at kernel/common/string.rb:515
    RSpec::Core::Notifications::FailedExampleNotification#failure_lines at \
          gemfiles/vendor/bundle/rbx/2.1/gems/rspec-core-3.0.4/lib/rspec/core/notifications.rb:220

To reproduce with just the one test

 rspec spec/rspec/core/formatters/base_text_formatter_spec.rb:108

Update: Ultimately, the encoding bug was fixed in rspec/rspec-support#172.

@JonRowe
Member

JonRowe commented Nov 3, 2014

Can you add a spec for this? I'd like to see one exposing/demonstrating the problem here.

@JonRowe
Member

JonRowe commented Nov 3, 2014

Also, this won't work on Ruby 1.8 (we typically define methods like this conditionally, based on the existence of the encode method), and it won't work for charsets that aren't UTF-8.
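
The conditional definition described here could look roughly like the sketch below. The module and method names are hypothetical and are not taken from rspec; only `String#encode` and its `:invalid`, `:undef`, and `:replace` options are real Ruby API. It addresses the Ruby 1.8 concern only, not the point about non-UTF-8 charsets.

```ruby
# Hypothetical helper for illustration; not rspec code.
module MessageSanitizer
  if String.method_defined?(:encode)
    # Ruby 1.9+: round-trip through UTF-16LE so that bytes which are invalid
    # in the source encoding are replaced with '?'.
    def self.sanitize(string)
      string.
        encode(Encoding::UTF_16LE, :invalid => :replace, :undef => :replace, :replace => '?').
        encode(Encoding::UTF_8)
    end
  else
    # Ruby 1.8: strings carry no encoding, so return the input unchanged.
    def self.sanitize(string)
      string
    end
  end
end
```

Because the check runs once, when the module body is evaluated, callers pay no per-call cost for the feature detection.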

@bf4
Contributor Author

bf4 commented Nov 3, 2014

@JonRowe I'm trying to figure out how to write a spec for it. My first goal was to prove I didn't break anything :)

I'll fix the 1.8 compatibility with next push.

@myronmarston
Member

@JonRowe I'm trying to figure out how to write a spec for it.

It's tricky to write a string literal with an invalid byte sequence. My suggestion is to use \x notation in the string to explicitly specify the hexadecimal value of particular bytes.
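
As a minimal sketch of the \x approach (the method names are hypothetical, chosen only for this illustration): the byte 0x80 is a UTF-8 continuation byte, so it is never valid on its own, while 0xC3 0xA9 is the valid two-byte encoding of "é".

```ruby
# Hypothetical fixtures for illustration; not taken from the rspec specs.

# "\x80" is a continuation byte with no lead byte before it, so the
# resulting string is tagged as UTF-8 but is not valid UTF-8.
def invalid_utf8_message
  "invalid byte: \x80".dup.force_encoding(Encoding::UTF_8)
end

# "\xC3\xA9" spells out the two bytes of "é", which is valid UTF-8.
def valid_utf8_message
  "valid bytes: \xC3\xA9".dup.force_encoding(Encoding::UTF_8)
end
```

`String#valid_encoding?` can then distinguish the two strings without depending on how a particular Ruby implementation reacts to the bad byte.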

@bf4
Contributor Author

bf4 commented Nov 3, 2014

@myronmarston I can write such a string; the hard thing is to figure out why exception.message contains valid UTF-8 from MRI's perspective but invalid UTF-8 from rbx's. Or rather, how to model that.

@myronmarston
Member

@myronmarston I can write such a string; the hard thing is to figure out why exception.message contains valid UTF-8 from MRI's perspective but invalid UTF-8 from rbx's. Or rather, how to model that.

That's trickier. However, while the bug only manifested for you on rbx, I expect that the same bug would manifest on MRI given the "right" string. I'd rather have a spec that demonstrates the bug on MRI than a spec that only demonstrates it on RBX, particularly since we have RBX in the "allowed failures" section of our travis builds. (We've invested countless hours in getting RBX builds to pass but it continues to cause us problems...)

@JonRowe
Member

JonRowe commented Nov 3, 2014

I'd also like to see a spec demonstrating how this reacts with a UTF-8 incompatible encoding scheme (we have similar specs for this in our differ specs, which is now in the support gem)
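
One way to picture the incompatible-encoding case, as a sketch only: the helper names below are hypothetical, and using a binary (ASCII-8BIT) string as the incompatible side is an assumption, not necessarily what the differ specs use.

```ruby
# Hypothetical helpers for illustration; not taken from the rspec specs.

# UTF-8 text containing a non-ASCII character.
def utf8_text
  "caf\u00e9"
end

# A single high byte tagged as binary (ASCII-8BIT).
def binary_text
  "\xFF".b
end

# Concatenating non-ASCII strings with incompatible encodings raises
# Encoding::CompatibilityError; report the error class instead of failing.
def safely_concatenate(left, right)
  left + right
rescue Encoding::CompatibilityError => error
  "<#{error.class}>"
end
```

`Encoding.compatible?(utf8_text, binary_text)` returns nil for this pair, which is the condition such a spec would need to exercise.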

@bf4
Contributor Author

bf4 commented Nov 3, 2014

@myronmarston I understand. Any tricks to getting all specs to pass in dev? Do you mostly follow the steps in the .travis.yml? Specifically getting DRb-related local failures.

Also, zounds, rbx is green with failures! https://travis-ci.org/rspec/rspec-core/jobs/39887273#L125

@myronmarston
Member

@myronmarston I understand. Any tricks to getting all specs to pass in dev? Do you mostly follow the steps in the .travis.yml? Specifically getting DRb-related local failures.

.travis.yml does a lot of extra stuff outside running the specs that you don't need to bother with. In general, the specs should pass if you bundle install and run bundle exec rspec.

Do you have a ~/.rspec file? If so, that could affect things, and could explain the failures. Feel free to post the failures you're seeing.

@bf4
Contributor Author

bf4 commented Nov 3, 2014

@myronmarston All specs pass now that I've blanked out my ~/.rspec. I didn't think I had anything important in there. Thanks for nudging me to look there. Was:

--colour
--drb
--format documentation

Now I can work on trying to reproduce the original bug and test drive the fix.
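
For anyone hitting the same local failures, a quick way to see what a global options file contributes might look like this sketch. The helper name is hypothetical; it only reads the file and does not reproduce how rspec itself parses options.

```ruby
# Hypothetical helper for illustration: returns the non-blank lines of an
# rspec options file, or an empty array when the file does not exist.
def rspec_options_in(path)
  return [] unless File.file?(path)

  File.readlines(path).map(&:strip).reject(&:empty?)
end

# Example call (the result depends on the local machine):
#   rspec_options_in(File.expand_path("~/.rspec"))
```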

@bf4
Contributor Author

bf4 commented Nov 3, 2014

Ok, I give up. @brixen, how do I disable rbx-(VM?)-specific stack traces so that they'll be similar to MRI? I assume it's a Rubinius::CONFIG['rbx.something']. I've read a good chunk of source code but haven't found it yet. I know how to do it in JRuby.

e.g. failures

         RuntimeError:
               foo
       -     # (erb):1:in `<main>'
       -     # ./spec/rspec/core/resources/formatter_specs.rb:39:in `block (2 levels) in <top (required)>'
       +     # (erb):1:in `__script__'
       +     # kernel/common/block_environment.rb:53:in `call_on_instance'
       +     # kernel/common/eval.rb:176:in `eval'
       +     # /Users/benjamin/.rvm/rubies/rbx-2.2.6/gems/gems/rubysl-erb-2.0.1/lib/rubysl/erb/erb.rb:849:in `result'
       +     # ./spec/rspec/core/resources/formatter_specs.rb:39:in `__script__'
       +     # kernel/common/eval.rb:101:in `instance_exec'
       +     # kernel/bootstrap/array.rb:87:in `map'
             # ./spec/support/sandboxing.rb:31:in `run'
       +     # kernel/bootstrap/array.rb:87:in `map'
             # ./spec/support/formatter_support.rb:13:in `run_example_specs_with_formatter'
       -     # ./spec/support/sandboxing.rb:2:in `block (3 levels) in <top (required)>'
       -     # ./spec/support/sandboxing.rb:36:in `instance_exec'
       -     # ./spec/support/sandboxing.rb:36:in `block in sandboxed'
       +     # kernel/common/eval.rb:101:in `instance_exec'
       +     # kernel/bootstrap/proc.rb:20:in `call'
       +     # ./spec/support/sandboxing.rb:2:in `__script__'
       +     # kernel/common/eval.rb:101:in `instance_exec'
       +     # ./spec/support/sandboxing.rb:36:in `sandboxed'
             # ./spec/support/sandboxing.rb:35:in `sandboxed'
       -     # ./spec/support/sandboxing.rb:2:in `block (2 levels) in <top (required)>'
       +     # ./spec/support/sandboxing.rb:2:in `__script__'
       +     # kernel/common/eval.rb:101:in `instance_exec'
       +     # kernel/bootstrap/proc.rb:20:in `call'
       +     # kernel/bootstrap/array.rb:87:in `map'
       +     # kernel/bootstrap/array.rb:87:in `map'
       +     # kernel/common/kernel.rb:447:in `load'
       +     # kernel/delta/code_loader.rb:66:in `load_script'
       +     # kernel/delta/code_loader.rb:152:in `load_script'
       +     # kernel/loader.rb:649:in `script'
       +     # kernel/loader.rb:831:in `main'

@myronmarston
Member

@bf4 -- I think we can just skip those specs on RBX (or write a different fixture for RBX that uses the RBX backtraces).

BTW, we should probably fix our specs so they are not prone to failing when contributors have stuff in ~/.rspec. Care to take a stab at that?

@brixen

brixen commented Nov 5, 2014

@bf4 we would strongly encourage not filtering backtraces. It removes valuable information.

MRI backtraces are an artifact of being written in C where no frame information is available. If the backtrace runs through some Ruby code in eg standard library, those frames would be included.

Related, instead of mining backtrace string text for information, we should be making proper exception objects with attributes.

In both these cases (ie filtering and accessing attributes in an object-oriented fashion), if you want to help us create APIs, we'd be happy to implement them.
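
As a point of reference, MRI 2.0 and later already expose structured frames through `Exception#backtrace_locations`. Whether that matches the API being proposed here is an assumption; the sketch below only shows what attribute-based access looks like, with hypothetical helper names.

```ruby
# Hypothetical helpers for illustration; not rspec or Rubinius code.

# Turns an exception's frames into hashes using attribute readers instead of
# parsing "file:line:in `label'" strings. backtrace_locations is nil for an
# exception that was never raised, hence the fallback to an empty array.
def frames_for(error)
  (error.backtrace_locations || []).map do |location|
    { :path => location.path, :line => location.lineno, :label => location.label }
  end
end

# Raises and rescues an error so that it carries a real backtrace.
def raise_and_capture
  raise "foo"
rescue => error
  error
end
```

With this, `frames_for(raise_and_capture).first` gives the path, line number, and label of the frame that raised, with no string matching involved.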

@JonRowe
Member

JonRowe commented Nov 5, 2014

@brixen I suspect @bf4 is just attempting to bring RBX into alignment with MRI for the purposes of this spec, the details of the underlying ruby failure not being important as we know it's an encoding issue.

@bf4
Contributor Author

bf4 commented Nov 19, 2014

RSpec fixtures are way too magical. It took me a bit* to track down where they were coming from. I think I'm going to just ignore rbx for those specs.

* Not the 14 days since the last comment here. That is just me 'multi-tasking'. It was more like 30 minutes :)

@myronmarston
Member

RSpec fixtures are way too magical.

What fixtures are you referring to?

@bf4
Contributor Author

bf4 commented Nov 19, 2014

where a failure like I mentioned above on
https://github.com/rspec/rspec-core/blob/master/spec/rspec/core/resources/formatter_specs.rb#L39
has no obvious expectation,

it "fails with a backtrace that has no file" do
  require 'erb'

  ERB.new("<%= raise 'foo' %>").result
end

the expectation is yadda yadda in a HEREDOC in
https://github.com/rspec/rspec-core/blob/master/spec/support/formatter_support.rb#L121

def expected_summary_output_for_example_specs
  <<-EOS.gsub(/^\s+\|/, '').chomp
  | 3) a failing spec with odd backtraces fails with a backtrace that has no file
  | Failure/Error: ERB.new("<%= raise 'foo' %>").result
  | RuntimeError:
  | foo
  | # (erb):1:in `<main>'
  | # ./spec/rspec/core/resources/formatter_specs.rb:39:in `block (2 levels) in <top (required)>'
  EOS
end
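
The heredoc idiom used in that support method can be seen in isolation in the following sketch (the method name is hypothetical). The `gsub` strips each line's leading whitespace up to and including the `|` margin marker, and `chomp` drops the final newline.

```ruby
# Hypothetical example of the margin-stripping heredoc idiom.
def margin_stripped_example
  <<-EOS.gsub(/^\s+\|/, '').chomp
    |first line
    |  indented line
  EOS
end
```

Anything after the `|` is preserved, so indentation that is part of the expected output survives while the indentation of the Ruby source does not.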

@bf4 bf4 force-pushed the handle_non_utf8_exception_messages branch from 5640d3d to b1f019d Compare November 19, 2014 03:42
@bf4
Contributor Author

bf4 commented Nov 19, 2014

@JonRowe sorry, apparently once you reply via email, you can never format it. I just pushed the code I spiked last week, but I still haven't written the actual test.

# Converting it to a higher character set (UTF-16) and then
# back (to UTF-8) ensures that you will strip away invalid or undefined byte sequences.
encode(Encoding::UTF_16LE, :invalid => :replace, :undef => :replace, :replace => '?').
  encode(Encoding::UTF_8)
Member

You still haven't fixed this, it will break on 1.8.7.

@JonRowe
Member

JonRowe commented Nov 19, 2014

Yeah no dramas, it's a known flaw with emailed comments :)

It's still not apparent what you mean. That method just returns a string, so something is comparing it somewhere, but the setup for the formatter specs is fairly complex, so it has been reduced down into support methods. Maybe if you added your spec to this we could give you some feedback; we do need a spec to assert this works OK on normal Ruby.

@bf4 bf4 force-pushed the handle_non_utf8_exception_messages branch 2 times, most recently from 3249f3e to 8bd3911 Compare November 19, 2014 03:52
@bf4
Contributor Author

bf4 commented Nov 19, 2014

@JonRowe

  1. re: email and formatting bug, just first time for me
  2. formatter setup was hard to track down. The setup certainly surprised me. I did try post-processing the HEREDOC to remove Rubinius code, but didn't have much luck
  3. Encoding spec... now that I'm futzing with it again, I'll give it another shot.

@bf4 bf4 force-pushed the handle_non_utf8_exception_messages branch from 8bd3911 to e5efd86 Compare November 19, 2014 05:12
@bf4
Contributor Author

bf4 commented Nov 19, 2014

I pushed each commit separately. I'm not sure how you want to handle co-ordinating the changes that would go in rspec-support. If you're okay with this here, I'll make a PR there and do whatever you think is best. (Modify the Gemfile?)

@@ -5,6 +5,8 @@ Feature: Use rspec-core without rspec-mocks or rspec-expectations
available, but rspec-core can be used just fine without either of those
gems installed.

# Rubinius stacktrace includes kernel/loader.rb etc.
@unsupported-on-rbx
Contributor Author

Let me know if there's a better way to do this. That tag was already there :)

Member

This is the normal way we skip a scenario on rbx, but I don't understand why you chose to skip this one. It doesn't appear to be related in any way to the encoding bug you ran into, as far as I can see. Can you explain?

Contributor Author

So, there were a bunch of failing rbx specs due to backtraces that I fixed in 23f911b ("Skip specs with non-mri-compatible backtrace. Also see rspec-support …"). It's technically out of scope of the PR, so I could put that in a second one, if you like.

def exception_message
  @exception_message ||= begin
    string = exception.message.to_s
    RSpec::Support::EncodedString.new(string, encoding_of(""))
Member

Yeah use encoding_of(string), that should suffice.

bf4 added a commit to bf4/rspec-support that referenced this pull request Feb 8, 2015
Map string char with invalid encoding to '?'
Format identical string expectation to read easier

Refs:
- rspec/rspec-core#1760
- via rspec#134
bf4 added a commit to bf4/rspec-support that referenced this pull request Feb 8, 2015
Map string char with invalid encoding to '?'
Format identical string expectation to read easier

Refs:
- rspec/rspec-core#1760
- via rspec#134

Set to pending for JRuby, opened issue
jruby/jruby#2580
that JRuby is the only Ruby that returns "\x80"
in place of "?"
@bf4 bf4 force-pushed the handle_non_utf8_exception_messages branch from 166bbd9 to b234ad4 Compare February 9, 2015 03:09
@bf4
Contributor Author

bf4 commented Feb 9, 2015

Updated now that rspec/rspec-support#172 is merged.

@JonRowe I'll need more guidance on the changes you proposed in the encoding_spec.rb re: using a real exception, using group, using example_notification(example), and which file, if any, those specs should be put in.

I made a separate PR for the skipping backtrace specs on non-mri rubies

@bf4 bf4 force-pushed the handle_non_utf8_exception_messages branch from b234ad4 to 64e3278 Compare February 9, 2015 16:08
@bf4
Contributor Author

bf4 commented Feb 15, 2015

@JonRowe @myronmarston re: #1760 (comment) thoughts?

@JonRowe
Member

JonRowe commented Feb 16, 2015

@bf4 just look at the contents of FormatterSupport and use the methods there rather than your manual implementations?

bf4 added 2 commits February 17, 2015 20:54
Add spec for exception when failure_lines has a bad encoding
"".split("\n")       # => []
nil.to_s.split("\n") # => []

If exception.message is nil, the inner block will
not be reached. If it is non-nil, then there's no need
to check for nil inside the block.
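
The reasoning in this commit message can be checked with a small sketch. The helper name is hypothetical and stands in for the relevant part of the notification code: calling `to_s` first means a nil message and an empty message both produce no lines, so no separate nil check is needed.

```ruby
# Hypothetical helper for illustration; not the rspec implementation.
# nil.to_s is "", and "".split("\n") is [], so a nil message yields no lines.
def message_lines(message)
  message.to_s.split("\n")
end
```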
@bf4 bf4 force-pushed the handle_non_utf8_exception_messages branch from 64e3278 to 62204c8 Compare February 18, 2015 02:58
@bf4
Contributor Author

bf4 commented Feb 18, 2015

Simplified it, I think for the better.

@JonRowe
Member

JonRowe commented Feb 18, 2015

Looks good, I'm going to merge despite the lack of an Appveyor build because it seems to be playing up.

JonRowe added a commit that referenced this pull request Feb 18, 2015
Handle exception message with invalid UTF-8; For rbx
@JonRowe JonRowe merged commit 4152945 into rspec:master Feb 18, 2015
@JonRowe
Member

JonRowe commented Feb 18, 2015

Should we backport this to 3-2-maintenance @myronmarston ?

@myronmarston
Member

Should we backport this to 3-2-maintenance @myronmarston ?

Yes, if all the rspec-support changes this depends upon are in rspec-support 3.2.0. (I forget.)

@bf4 bf4 deleted the handle_non_utf8_exception_messages branch February 19, 2015 00:48
@bf4
Contributor Author

bf4 commented Feb 19, 2015

🌈 🎆 👯

The EncodedString#split fix was in rspec/rspec-support#172

@JonRowe
Member

JonRowe commented Feb 19, 2015

Hmm, I don't think they have, and I'm not sure they all should be, so perhaps better to leave this for 3.3.

@myronmarston
Member

Hmm, I don't think they have, and I'm not sure they all should be, so perhaps better to leave this for 3.3.

Sounds like the right choice.

MatheusRich pushed a commit to MatheusRich/rspec-core that referenced this pull request Oct 30, 2020
…ages

Handle exception message with invalid UTF-8; For rbx
MatheusRich pushed a commit to MatheusRich/rspec-core that referenced this pull request Oct 30, 2020