[BUG] regexp_test is failing in nightly tests #6028
Comments
Wondering if this is related to #5776? cc: @NVnavkumar
Weird, it's supposed to skip those tests actually. Let me take a look.
So there are 2 sets of integration tests. In the first:
if locale.nl_langinfo(locale.CODESET) != 'UTF-8':
    pytestmark = [pytest.mark.regexp, pytest.mark.skip(reason=str("Current locale doesn't support UTF-8, regexp support is disabled"))]
else:
    pytestmark = pytest.mark.regexp
In
I would assume that
However, for some reason, in the JVM used by the underlying Spark, the locale isn't using UTF-8, so regular expression support is disabled, hence the "part of the plan is not columnar" exceptions:
So the Python interpreter is inconsistent with the JVM on the driver in this case.
Also filed #6032, which looks related to the recent regex change.
@NVnavkumar do we care about the Python process settings? nl_langinfo's doc does not sound encouraging
Should we just retrieve the default charset of the underlying JVM using something like:
I think that requires access to the Spark context, and I don't know whether it will have been created by the time we decide whether to skip a test.
Yes, we have access to it at that point.
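For illustration only, here is a minimal sketch of the JVM-side check discussed above. It is not the snippet from the thread or the project's actual fix. The helper names (`is_utf8`, `python_codeset`, `jvm_default_charset`, `regexp_skip_reason`) are hypothetical, and `spark` is assumed to be an already-created SparkSession whose driver JVM is reachable through the py4j gateway.

```python
import locale


def is_utf8(charset_name):
    """Return True for 'UTF-8', 'utf8', 'UTF_8' and similar spellings."""
    return charset_name.replace("-", "").replace("_", "").upper() == "UTF8"


def python_codeset():
    # What the current Python-side check inspects (available on Unix only).
    return locale.nl_langinfo(locale.CODESET)


def jvm_default_charset(spark):
    # Assumption: `spark` is an active SparkSession. This asks the driver JVM
    # for its default charset through the py4j gateway.
    jvm = spark.sparkContext._jvm
    return jvm.java.nio.charset.Charset.defaultCharset().name()


def regexp_skip_reason(jvm_charset):
    """Return a skip reason when the JVM charset is not UTF-8, else None."""
    if is_utf8(jvm_charset):
        return None
    return ("JVM default charset is %s, not UTF-8; "
            "regexp support is disabled" % jvm_charset)
```

With something like this, the module-level marker could be driven by `regexp_skip_reason(jvm_default_charset(spark))` rather than by the Python process locale, so the skip decision would follow the same JVM that decides whether regexp support is enabled.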
Latest nightly run had 120 test failures in regexp_test. Sampling a few, they were all of the "plan is not columnar" variety.