[SPARK-39677][SQL][DOCS] Fix args formatting of the regexp and like functions

### What changes were proposed in this pull request?
In this PR, I propose to fix the args formatting of some regexp and like functions by adding explicit line breaks (`<br><br>`). This fixes how the following items are rendered in the argument lists.

Before:

<img width="745" alt="Screenshot 2022-07-05 at 09 48 28" src="https://user-images.githubusercontent.com/1580697/177274234-04209d43-a542-4c71-b5ca-6f3239208015.png">

After:

<img width="704" alt="Screenshot 2022-07-05 at 11 06 13" src="https://user-images.githubusercontent.com/1580697/177280718-cb05184c-8559-4461-b94d-dfaaafda7dd2.png">

### Why are the changes needed?
To improve readability of Spark SQL docs.

### Does this PR introduce _any_ user-facing change?
No.

### How was this patch tested?
By building docs and checking manually:
```
$ SKIP_SCALADOC=1 SKIP_PYTHONDOC=1 SKIP_RDOC=1 bundle exec jekyll build
```

Closes apache#37082 from MaxGekk/fix-regexp-docs.

Authored-by: Max Gekk <[email protected]>
Signed-off-by: Max Gekk <[email protected]>
MaxGekk committed Jul 5, 2022
1 parent 161c596 commit 4e42f8b
Showing 1 changed file with 16 additions and 30 deletions.
@@ -84,16 +84,12 @@ abstract class StringRegexExpression extends BinaryExpression
 Arguments:
 * str - a string expression
 * pattern - a string expression. The pattern is a string which is matched literally, with
-exception to the following special symbols:
-_ matches any one character in the input (similar to . in posix regular expressions)
+exception to the following special symbols:<br><br>
+_ matches any one character in the input (similar to . in posix regular expressions)\
 % matches zero or more characters in the input (similar to .* in posix regular
-expressions)
+expressions)<br><br>
 Since Spark 2.0, string literals are unescaped in our SQL parser. For example, in order
-to match "\abc", the pattern should be "\\abc".
+to match "\abc", the pattern should be "\\abc".<br><br>
 When SQL config 'spark.sql.parser.escapedStringLiterals' is enabled, it falls back
 to Spark 1.6 behavior regarding string literal parsing. For example, if the config is
 enabled, the pattern to match "\abc" should be "\abc".
@@ -189,7 +185,7 @@ case class Like(left: Expression, right: Expression, escapeChar: Char)
 copy(left = newLeft, right = newRight)
 }

-// scalastyle:off line.contains.tab
+// scalastyle:off line.contains.tab line.size.limit
 /**
 * Simple RegEx case-insensitive pattern matching function
 */
@@ -200,16 +196,12 @@ case class Like(left: Expression, right: Expression, escapeChar: Char)
 Arguments:
 * str - a string expression
 * pattern - a string expression. The pattern is a string which is matched literally and
-case-insensitively, with exception to the following special symbols:
-_ matches any one character in the input (similar to . in posix regular expressions)
+case-insensitively, with exception to the following special symbols:<br><br>
+_ matches any one character in the input (similar to . in posix regular expressions)<br><br>
 % matches zero or more characters in the input (similar to .* in posix regular
-expressions)
+expressions)<br><br>
 Since Spark 2.0, string literals are unescaped in our SQL parser. For example, in order
-to match "\abc", the pattern should be "\\abc".
+to match "\abc", the pattern should be "\\abc".<br><br>
 When SQL config 'spark.sql.parser.escapedStringLiterals' is enabled, it falls back
 to Spark 1.6 behavior regarding string literal parsing. For example, if the config is
 enabled, the pattern to match "\abc" should be "\abc".
@@ -237,7 +229,7 @@ case class Like(left: Expression, right: Expression, escapeChar: Char)
 """,
 since = "3.3.0",
 group = "predicate_funcs")
-// scalastyle:on line.contains.tab
+// scalastyle:on line.contains.tab line.size.limit
 case class ILike(
 left: Expression,
 right: Expression,
@@ -574,12 +566,10 @@ case class StringSplit(str: Expression, regex: Expression, limit: Expression)
 Arguments:
 * str - a string expression to search for a regular expression pattern match.
 * regexp - a string representing a regular expression. The regex string should be a
-Java regular expression.
+Java regular expression.<br><br>
 Since Spark 2.0, string literals (including regex patterns) are unescaped in our SQL
 parser. For example, to match "\abc", a regular expression for `regexp` can be
-"^\\abc$".
+"^\\abc$".<br><br>
 There is a SQL config 'spark.sql.parser.escapedStringLiterals' that can be used to
 fallback to the Spark 1.6 behavior regarding string literal parsing. For example,
 if the config is enabled, the `regexp` that can match "\abc" is "^\abc$".
@@ -783,12 +773,10 @@ abstract class RegExpExtractBase
 Arguments:
 * str - a string expression.
 * regexp - a string representing a regular expression. The regex string should be a
-Java regular expression.
+Java regular expression.<br><br>
 Since Spark 2.0, string literals (including regex patterns) are unescaped in our SQL
 parser. For example, to match "\abc", a regular expression for `regexp` can be
-"^\\abc$".
+"^\\abc$".<br><br>
 There is a SQL config 'spark.sql.parser.escapedStringLiterals' that can be used to
 fallback to the Spark 1.6 behavior regarding string literal parsing. For example,
 if the config is enabled, the `regexp` that can match "\abc" is "^\abc$".
@@ -888,12 +876,10 @@ case class RegExpExtract(subject: Expression, regexp: Expression, idx: Expressio
 Arguments:
 * str - a string expression.
 * regexp - a string representing a regular expression. The regex string should be a
-Java regular expression.
+Java regular expression.<br><br>
 Since Spark 2.0, string literals (including regex patterns) are unescaped in our SQL
 parser. For example, to match "\abc", a regular expression for `regexp` can be
-"^\\abc$".
+"^\\abc$".<br><br>
 There is a SQL config 'spark.sql.parser.escapedStringLiterals' that can be used to
 fallback to the Spark 1.6 behavior regarding string literal parsing. For example,
 if the config is enabled, the `regexp` that can match "\abc" is "^\abc$".
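
For readers of the reformatted `Like` and `ILike` argument docs, the wildcard rules they describe (`_` for exactly one character, `%` for zero or more characters, everything else matched literally) can be sketched in a few lines of Python. This is only an illustration of those rules, not Spark's implementation: the helper names `like_to_regex` and `like` are hypothetical, Python's `re` module stands in for Java regular expressions, and escape-character handling is deliberately left out.

```python
import re


def like_to_regex(pattern: str) -> str:
    """Translate a LIKE-style pattern into a regular expression (illustrative only)."""
    parts = []
    for ch in pattern:
        if ch == "_":
            parts.append(".")    # _ matches any one character
        elif ch == "%":
            parts.append(".*")   # % matches zero or more characters
        else:
            parts.append(re.escape(ch))  # every other character is matched literally
    return "".join(parts)


def like(value: str, pattern: str, ignore_case: bool = False) -> bool:
    """Return True if the whole of `value` matches `pattern`.

    ignore_case=True mimics the case-insensitive matching described for ILike.
    """
    flags = re.DOTALL | (re.IGNORECASE if ignore_case else 0)
    return re.fullmatch(like_to_regex(pattern), value, flags) is not None


if __name__ == "__main__":
    print(like("Spark", "_park"))                   # True
    print(like("Spark", "S%"))                      # True
    print(like("abc", "a.c"))                       # False: "." is literal here
    print(like("SPARK", "sp%", ignore_case=True))   # True
```

The whole input has to match, which is why the sketch uses `re.fullmatch` rather than a substring search.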