GH-38750: [Python] Adding 3 operators to DNF filter expression (#38774)
Adding is_nan, is_null and is_valid operators to DNF filter expression.
JacekPliszka committed Nov 18, 2023
1 parent 8301eb5 commit 1ddd24f
Showing 1 changed file with 12 additions and 3 deletions.
15 changes: 12 additions & 3 deletions python/pyarrow/parquet/core.py
@@ -133,9 +133,12 @@ def _check_filters(filters, check_null_strings=True):
 Each tuple has format: (``key``, ``op``, ``value``) and compares the
 ``key`` with the ``value``.
 The supported ``op`` are: ``=`` or ``==``, ``!=``, ``<``, ``>``, ``<=``,
-``>=``, ``in`` and ``not in``. If the ``op`` is ``in`` or ``not in``, the
-``value`` must be a collection such as a ``list``, a ``set`` or a
-``tuple``.
+``>=``, ``in``, ``not in``, ``is_nan``, ``is_null`` and ``is_valid``.
+If the ``op`` is ``in`` or ``not in``, the ``value`` must be a collection such as
+a ``list``, a ``set`` or a ``tuple``.
+If the ``op`` is ``is_nan`` or ``is_valid`` - ``value`` is ignored.
+If the ``op`` is ``is_null`` - ``value`` is nan_is_null parameter as described
+in ``pyarrow.dataset.Expression``.
 Examples:
@@ -208,6 +211,12 @@ def convert_single_predicate(col, op, val):
     return field.isin(val)
 elif op == 'not in':
     return ~field.isin(val)
+elif op == 'is_nan':
+    return field.is_nan()
+elif op == 'is_null':
+    return field.is_null(nan_is_null=val)
+elif op == 'is_valid':
+    return field.is_valid()
 else:
     raise ValueError(
         '"{0}" is not a valid operator in predicates.'.format(
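
To illustrate what the three new operators select, here is a minimal pure-Python sketch of their row-level semantics. It is a model written for this note, not the pyarrow implementation: `matches`, `apply_dnf`, `rows` and `selected_ids` are hypothetical names, `None` stands in for an Arrow null, and treating `is_nan` on a non-float cell as "no match" is a simplifying assumption.

```python
import math


def _is_nan(cell):
    # Simplification: only Python floats can be NaN in this model.
    return isinstance(cell, float) and math.isnan(cell)


def matches(row, predicate):
    """Evaluate one (key, op, value) tuple against a dict row."""
    key, op, value = predicate
    cell = row[key]
    if op == "is_valid":
        # value is ignored; NaN is not null, so it counts as valid
        return cell is not None
    if op == "is_nan":
        # value is ignored; a null cell is never NaN
        return _is_nan(cell)
    if op == "is_null":
        # value plays the role of the nan_is_null flag
        return cell is None or (bool(value) and _is_nan(cell))
    raise ValueError(f"{op!r} is not handled by this sketch.")


def apply_dnf(rows, filters):
    """filters is a list of conjunctions: OR over the outer list, AND inside."""
    return [
        row
        for row in rows
        if any(all(matches(row, p) for p in conj) for conj in filters)
    ]


rows = [
    {"id": 0, "x": 1.0},
    {"id": 1, "x": float("nan")},
    {"id": 2, "x": None},
]


def selected_ids(filters):
    return [row["id"] for row in apply_dnf(rows, filters)]
```

In this model, `("x", "is_valid", None)` keeps the ordinary value and the NaN, `("x", "is_null", False)` keeps only the null row, and `("x", "is_null", True)` keeps both the null and the NaN row. With a pyarrow build that includes this commit, tuples of the same shape would presumably be passed through the `filters` argument of `pyarrow.parquet.read_table`.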