Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Better float/double cases for casting tests #1781

Merged
merged 8 commits into from
Feb 23, 2021

Conversation

sperlingxx
Copy link
Collaborator

@sperlingxx sperlingxx commented Feb 21, 2021

This pull request is to address issue #1764. And current PR also includes some code clean work .

@sperlingxx
Copy link
Collaborator Author

build

@sperlingxx sperlingxx requested a review from revans2 February 21, 2021 10:38
@sperlingxx
Copy link
Collaborator Author

build

@sameerz sameerz added the test Only impacts tests label Feb 22, 2021
@@ -425,6 +383,14 @@ class CastOpSuite extends GpuExpressionTestSuite {
col("c1").cast(FloatType))
}

ignore("overflow double strings") {
testSparkResultsAreEqual("overflow double strings", badDoubleStringsDf,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

badDoubleStringsDf is not as comprehensive as I would like it to be. The values in it are big enough to force an overflow, but there are still a number of corner cases where according to rapidsai/cudf#5225 are not being covered. The patch to fix it rapidsai/cudf#7410 is not in and not working 100% either in these same overflow cases. I am happy to have these go in as it, but I would like to see a follow on issue to cover the corner cases that are missing, and have a comment in badDoubleStringsDf pointing to it. Would be good to look at the float one too and make sure it is also not suffering from the same issue.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @revans2, I tried with many other similar (slightly overflow) cases. And I found GPU can produce correct results for almost all of them, except some special cases included in badDoubleStringsDf.
For float type, I failed to find any mismatching case.
For double type, I only found numbers within range [1.79769313486231581E308, 1.797693134862316E308) produce inconsistent results with spark.

@sperlingxx
Copy link
Collaborator Author

build

@sperlingxx
Copy link
Collaborator Author

build

@revans2
Copy link
Collaborator

revans2 commented Feb 23, 2021

@razajafri you filed the original bug about casting from string to double. Could you take a look at this PR too?

@@ -1131,14 +1131,16 @@ trait SparkQueryCompareTestSuite extends FunSuite with Arm {
"9.8e5").toDF("c0")
}

def badFloatStringsDf(session: SparkSession): DataFrame = {
def invalidFloatStringsDf(session: SparkSession): DataFrame = {
import session.sqlContext.implicits._
Seq(("A", "null"), ("1.3", "43.54")).toDF("c0", "c1")
}

def badDoubleStringsDf(session: SparkSession): DataFrame = {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: for consistency may be this should be renamed to invalidDoubleStringsDf

@razajafri
Copy link
Collaborator

@razajafri you filed the original bug about casting from string to double. Could you take a look at this PR too?

Just one nit otherwise LGTM

@razajafri razajafri merged commit 52d95b1 into NVIDIA:branch-0.4 Feb 23, 2021
nartal1 pushed a commit to nartal1/spark-rapids that referenced this pull request Jun 9, 2021
* enhance float/double cases for casting tests

Signed-off-by: sperlingxx <[email protected]>

* continue

* code clean

* code clean

* fix typo

* fix typo

* some updates

* fix typo
nartal1 pushed a commit to nartal1/spark-rapids that referenced this pull request Jun 9, 2021
* enhance float/double cases for casting tests

Signed-off-by: sperlingxx <[email protected]>

* continue

* code clean

* code clean

* fix typo

* fix typo

* some updates

* fix typo
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
test Only impacts tests
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants