Handle escaping characters in single-quoted dyna_symbol expressions #370

Merged (2 commits) on Nov 28, 2022
10 changes: 10 additions & 0 deletions fixtures/small/dyna_symbol_with_escapes_actual.rb
@@ -0,0 +1,10 @@
+:'"foo"'
+
+:"'#{<<~LOL}'"
+I'm so sorry for \n writing this
+#{:'"雷神の"'}
+少し響みて
+さし曇り
+雨も降らぬか
+君を留めむ
+LOL
10 changes: 10 additions & 0 deletions fixtures/small/dyna_symbol_with_escapes_expected.rb
@@ -0,0 +1,10 @@
+:"\"foo\""
+
+:"'#{<<~LOL}'"
+I'm so sorry for \n writing this
+#{:"\"\u96F7\u795E\u306E\""}
+少し響みて
+さし曇り
+雨も降らぬか
+君を留めむ
+LOL
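
For illustration, the single-quote handling that these fixtures exercise can be approximated outside rubyfmt. The sketch below is hypothetical: `requote_single_quoted` is not a rubyfmt method, and it assumes an ASCII-only body with no unescaped single quote. It re-evaluates the body as a single-quoted literal, uses `String#inspect` to obtain double-quote escaping, and then restores real newlines with the same `gsub` pattern that appears in the diff.

```ruby
# Hypothetical helper (not part of rubyfmt): converts the body of a
# single-quoted literal into the body of an equivalent double-quoted one.
# Assumes the body is ASCII and contains no unescaped single quote.
def requote_single_quoted(body)
  # Let Ruby itself resolve the single-quote escapes (\' and \\).
  evaluated = eval("'#{body}'")
  # inspect produces a double-quoted literal; drop its surrounding quotes.
  escaped = evaluated.inspect[1..-2]
  # inspect renders a real newline as the two characters `\n`; turn those
  # back into real newlines, but leave an escaped backslash followed by `n`.
  escaped.gsub(/(?<!\\)\\n/, "\n")
end

puts requote_single_quoted('"foo"')  # prints: \"foo\"
```

The printed body matches the first expected fixture line, `:"\"foo\""`, once the `:"` and `"` delimiters are put back around it.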
114 changes: 60 additions & 54 deletions librubyfmt/rubyfmt_lib.rb
@@ -433,58 +433,7 @@ def on_string_literal(*args, &blk)

args << [start_line, end_line]

-if start_delim && end_delim && start_delim != "\""
-if start_delim == "'" || start_delim.start_with?("%q")
-# re-evaluate the string with its own quotes to handle escaping.
-if args[0][1]
-es = eval("#{start_delim}#{args[0][1][1]}#{end_delim}")
-# did the original contain \u's?
-have_source_slash_u = args[0][1][1].include?("\\u")
-# if all chars are unicode definitionally none of them are delimiters so we
-# can skip inspect
-have_all_unicode = es.chars.all? { |x| x.bytes.first >= 128 }
-
-if have_all_unicode && !have_source_slash_u
-"#{start_delim}#{args[0][1][1]}#{end_delim}"
-else
-args[0][1][1] = es.inspect[1..-2]
-end
-# Match at word boundaries and beginning/end of the string
-# so that things like `'\n'` correctly become `"\\n"`
-# instead of rendering as actual whitespace
-#
-# About this regex: `(?<!\\)` does a negative lookup for slashes
-# before the newline escape, which will only match instances
-# like `\\n` and not `\\\\n`
-args[0][1][1].gsub!(/(?<!\\)\\n/, "\n")
-# This matches a special edge case where the last character on the line of a
-# single-quoted string is "\".
-args[0][1][1].gsub!(/\\\\\\n/, "\\\\\\\n")
-end
-else
-# find delimiters after an odd number of backslashes, or quotes after even number.
-pattern = /(?<!\\)(\\\\)*(\\#{Regexp.escape(start_delim[-1])}|\\#{Regexp.escape(end_delim)}|")/
-
-(args[0][1..-1] || []).each do |part|
-next if part.nil?
-case part[0]
-when :@tstring_content
-part[1] = part[1].gsub(pattern) do |str|
-if str.end_with?('"')
-# insert needed escape
-"#{str[0..-2]}\\\""
-else
-# drop unnecessary escape
-"#{str[0..-3]}#{str[-1]}"
-end
-end
-when :string_embexpr, :string_dvar
-else
-raise "got #{part[0]} in a #{start_delim}...#{end_delim} string"
-end
-end
-end
-end
+clean_string_content(start_delim, end_delim, args[0])
end

super
@@ -525,8 +474,10 @@ def on_dyna_symbol(*args)
# on_tstring_end, which will append the closing
# quote to @string_stack. We want to ignore this,
# so remove it from the stack.
-start_line = @string_stack.pop[1]
-super + [[start_line, lineno]]
+delim, start_line = @string_stack.pop
+res = super
+clean_string_content(delim, delim, res[1])
+res + [[start_line, lineno]]
end

def on_regexp_beg(re_part)
@@ -558,6 +509,61 @@ def on___end__(val)
end_line = lineno
res + [[start_line, end_line]]
end

+private def clean_string_content(start_delim, end_delim, string_contents)
+if start_delim && end_delim && start_delim != "\""
+if start_delim == "'" || start_delim.start_with?("%q")
+# re-evaluate the string with its own quotes to handle escaping.
+if string_contents[1]
+es = eval("#{start_delim}#{string_contents[1][1]}#{end_delim}")
+# did the original contain \u's?
+have_source_slash_u = string_contents[1][1].include?("\\u")
+# if all chars are unicode definitionally none of them are delimiters so we
+# can skip inspect
+have_all_unicode = es.chars.all? { |x| x.bytes.first >= 128 }
+
+if have_all_unicode && !have_source_slash_u
+"#{start_delim}#{string_contents[1][1]}#{end_delim}"
+else
+string_contents[1][1] = es.inspect[1..-2]
+end
+# Match at word boundaries and beginning/end of the string
+# so that things like `'\n'` correctly become `"\\n"`
+# instead of rendering as actual whitespace
+#
+# About this regex: `(?<!\\)` does a negative lookup for slashes
+# before the newline escape, which will only match instances
+# like `\\n` and not `\\\\n`
+string_contents[1][1].gsub!(/(?<!\\)\\n/, "\n")
+# This matches a special edge case where the last character on the line of a
+# single-quoted string is "\".
+string_contents[1][1].gsub!(/\\\\\\n/, "\\\\\\\n")
+end
+else
+# find delimiters after an odd number of backslashes, or quotes after even number.
+pattern = /(?<!\\)(\\\\)*(\\#{Regexp.escape(start_delim[-1])}|\\#{Regexp.escape(end_delim)}|")/
+
+(string_contents[1..-1] || []).each do |part|
+next if part.nil?
+case part[0]
+when :@tstring_content
+part[1] = part[1].gsub(pattern) do |str|
+if str.end_with?('"')
+# insert needed escape
+"#{str[0..-2]}\\\""
+else
+# drop unnecessary escape
+"#{str[0..-3]}#{str[-1]}"
+end
+end
+when :string_embexpr, :string_dvar
+else
+raise "got #{part[0]} in a #{start_delim}...#{end_delim} string"
+end
+end
+end
+end
+end
end

GC.disable
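
The second branch of `clean_string_content` (delimiters other than `'` and `%q`, for example `%Q(` ... `)`) rests on a single regular expression. A minimal sketch of its effect follows; it reuses the pattern from the diff inside a hypothetical `normalize_escapes` helper that takes a plain string rather than Ripper nodes, so it is an approximation of the behavior, not the rubyfmt code path itself.

```ruby
# Hypothetical wrapper around the pattern used in clean_string_content.
# It operates on a plain string instead of :@tstring_content nodes.
def normalize_escapes(content, start_delim, end_delim)
  # Matches a backslash-escaped delimiter, or a bare double quote, when it
  # is preceded by an even number of backslashes (possibly zero).
  pattern = /(?<!\\)(\\\\)*(\\#{Regexp.escape(start_delim[-1])}|\\#{Regexp.escape(end_delim)}|")/
  content.gsub(pattern) do |str|
    if str.end_with?('"')
      # A double-quoted target needs the quote escaped.
      "#{str[0..-2]}\\\""
    else
      # The old delimiter no longer needs its backslash.
      "#{str[0..-3]}#{str[-1]}"
    end
  end
end

puts normalize_escapes('say "hi" \(ok\)', "%Q(", ")")  # prints: say \"hi\" (ok)
```

A quote that is already escaped (`\"`) is left alone, because the negative lookbehind rejects a match that starts right after a single backslash.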