[feature] Add 'expand_table' feature #1475

Merged
merged 6 commits into from
Sep 25, 2024
68 changes: 67 additions & 1 deletion docs/user-guide.rst
@@ -350,7 +350,73 @@ Check out the api docs for `DataValidationRule`_ and `CondtionType`_ for more de

.. _CondtionType: https://developers.google.com/sheets/api/reference/rest/v4/spreadsheets/other#ConditionType

.. _DataValidationRule: https://developers.google.com/sheets/api/reference/rest/v4/spreadsheets/cells#DataValidationRule

Extract table
~~~~~~~~~~~~~

Gspread provides a function to extract a data table.
A data table is defined as a rectangular table that stops either on the **first empty** cell or
the **edge of the sheet**.

You can extract a table from any address by providing the top left corner of the desired table.

Gspread provides 3 directions for searching the end of the table:

* :attr:`~gspread.utils.TableDirection.right`: extract a single row by searching to the right of the starting cell
* :attr:`~gspread.utils.TableDirection.down`: extract a single column by searching below the starting cell
* :attr:`~gspread.utils.TableDirection.table`: extract a rectangular table by first searching right from the starting cell,
  then searching down from the starting cell.

.. note::

   Gspread will not look for empty cells inside the table. It only looks at the top row and the first column.

Example extracting a table from the below sample sheet:

.. list-table:: Find table
   :header-rows: 1

   * - ID
     - Name
     - Universe
     - Super power
   * - 1
     - Batman
     - DC
     - Very rich
   * - 2
     - DeadPool
     - Marvel
     - self healing
   * - 3
     - Superman
     - DC
     - super human
   * -
     - \-
     - \-
     - \-
   * - 5
     - Lavigne958
     -
     - maintains Gspread
   * - 6
     - Alifee
     -
     - maintains Gspread

Using the code below returns rows 2 to 4:

.. code:: python

    worksheet.expand("A2")

    [
        ["1", "Batman", "DC", "Very rich"],
        ["2", "DeadPool", "Marvel", "self healing"],
        ["3", "Superman", "DC", "super human"],
    ]
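
To make the search rules concrete, here is a minimal, self-contained sketch that models the sample sheet as a padded list of lists and walks it the way this section describes. The names ``SHEET`` and ``sketch_expand`` are hypothetical and written only for this illustration; they are not part of gspread and only approximate the ``table`` direction. The sketch uses zero-based indexes instead of A1 notation.

```python
from typing import List

# The sample sheet above, padded so that every row has four cells.
SHEET: List[List[str]] = [
    ["ID", "Name", "Universe", "Super power"],
    ["1", "Batman", "DC", "Very rich"],
    ["2", "DeadPool", "Marvel", "self healing"],
    ["3", "Superman", "DC", "super human"],
    ["", "-", "-", "-"],
    ["5", "Lavigne958", "", "maintains Gspread"],
    ["6", "Alifee", "", "maintains Gspread"],
]


def sketch_expand(values: List[List[str]], row: int, col: int) -> List[List[str]]:
    """Hypothetical stand-in for the 'table' direction (zero-based indexes)."""
    # Scan right along the starting row only.
    right = col
    while right < len(values[row]) and values[row][right] != "":
        right += 1
    # Scan down along the starting column only.
    bottom = row
    while bottom < len(values) and values[bottom][col] != "":
        bottom += 1
    # Both bounds are exclusive; cells inside the rectangle are never inspected.
    return [r[col:right] for r in values[row:bottom]]


from_a2 = sketch_expand(SHEET, 1, 0)  # start at A2
from_a6 = sketch_expand(SHEET, 5, 0)  # start at A6
```

Here ``from_a2`` covers sheet rows 2 to 4, because the downward scan stops at the empty ID cell in row 5. ``from_a6`` stops at the empty Universe cell, so only the first two columns of rows 6 and 7 come back, even though the "Super power" column has values further right.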



129 changes: 129 additions & 0 deletions gspread/utils.py
@@ -168,6 +168,12 @@ class ValidationConditionType(StrEnum):
    filter_expression = "FILTER_EXPRESSION"


class TableDirection(StrEnum):
    table = "TABLE"
    down = "DOWN"
    right = "RIGHT"


def convert_credentials(credentials: Credentials) -> Credentials:
    module = credentials.__module__
    cls = credentials.__class__.__name__
@@ -979,6 +985,129 @@ def to_records(
    return [dict(zip(headers, row)) for row in values]


def _expand_right(values: List[List[str]], start: int, end: int, row: int) -> int:
    """This is a private function, returning the column index of the last non empty cell
    on the given row.

    Search starts from ``start`` index column.
    Search ends on ``end`` index column.
    Searches only in the row pointed by ``row``.
    """
    try:
        return values[row].index("", start, end) - 1
    except ValueError:
        # no empty cell found: ``end`` is one past the last searched column,
        # the caller's slice tolerates the overshoot when rows are padded
        return end


def _expand_bottom(values: List[List[str]], start: int, end: int, col: int) -> int:
    """This is a private function, returning the row index of the last non empty cell
    on the given column.

    Search starts from ``start`` index row.
    Search ends on ``end`` index row.
    Searches only in the column pointed by ``col``.
    """
    for rows in range(start, end):
        # in case we try to look further than last row
        if rows >= len(values):
            return len(values) - 1

        # check if cell is empty (or the row => empty cell)
        if col >= len(values[rows]) or values[rows][col] == "":
            return rows - 1

    return end - 1
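
Both helpers rest on two small scanning primitives. The sketch below uses only the standard library and made-up sample data (the variable names are illustrative, not taken from gspread) to show how ``list.index`` with ``start`` and ``end`` bounds behaves, and how a row that is too short to contain the column is treated as an empty cell.

```python
row = ["", "B2", "C2", "", "E2"]

# list.index(value, start, end) searches the half-open range [start, end).
first_empty = row.index("", 1, len(row))
# The last filled column sits just before the first empty one.
last_filled = first_empty - 1

# When the searched range holds no empty cell, ValueError is raised.
try:
    ["B2", "C2"].index("", 0, 2)
    hit_edge = False
except ValueError:
    hit_edge = True

# Column scan over ragged rows: a row too short for ``col`` counts as empty.
ragged = [["A1", "B1"], ["A2"], ["A3", "B3"]]
col = 1
stops_at = next(i for i, r in enumerate(ragged) if col >= len(r) or r[col] == "")
```

In this data the row scan stops at column index 3, and the column scan stops at row index 1 because that row has no second cell at all.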


def find_table(
    values: List[List[str]],
    start_range: str,
    direction: TableDirection = TableDirection.table,
) -> List[List[str]]:
    """Expands a list of values based on non-null adjacent cells.

    Expand can be done in 3 directions defined in :class:`~gspread.utils.TableDirection`

        * ``TableDirection.right``: expands right until the first empty cell
        * ``TableDirection.down``: expands down until the first empty cell
        * ``TableDirection.table``: expands right until the first empty cell and down until the first empty cell

    If nothing is found, an empty list is returned.

    When the given ``start_range`` is outside the given matrix of values the exception
    :class:`~gspread.exceptions.InvalidInputValue` is raised.

    Example::

        values = [
            ['', '',   '',   '', ''  ],
            ['', 'B2', 'C2', '', 'E2'],
            ['', 'B3', 'C3', '', 'E3'],
            ['', ''  , ''  , '', 'E4'],
        ]
        >>> utils.find_table(values, 'B2', TableDirection.table)
        [
            ['B2', 'C2'],
            ['B3', 'C3'],
        ]


    .. note::

        the ``TableDirection.table`` will look right from the starting cell then look down from the starting cell.
        It will not check cells located inside the table. This could lead to
        potential empty values located in the middle of the table.

    .. warning::

        Given values must be padded with ``''`` empty values.

    :param list[list] values: values where to find the table.
    :param str start_range: the starting cell range.
    :param gspread.utils.TableDirection direction: the expand direction.
    :returns: the resulting matrix
    :rtype: list(list)
    """
    row, col = a1_to_rowcol(start_range)

    # a1_to_rowcol returns coordinates starting from 1
    row -= 1
    col -= 1

    if row >= len(values):
        raise InvalidInputValue(
            "given row for start_range is outside given values: start range row ({}) >= rows in values {}".format(
                row, len(values)
            )
        )

    if col >= len(values[row]):
        raise InvalidInputValue(
            "given column for start_range is outside given values: start range column ({}) >= columns in values {}".format(
                col, len(values[row])
            )
        )

    if direction == TableDirection.down:
        rightMost = col
        bottomMost = _expand_bottom(values, row, len(values), col)

    if direction == TableDirection.right:
        bottomMost = row
        rightMost = _expand_right(values, col, len(values[row]), row)

    if direction == TableDirection.table:
        rightMost = _expand_right(values, col, len(values[row]), row)
        bottomMost = _expand_bottom(values, row, len(values), col)

    result = []

    # build resulting array
    for rows in values[row : bottomMost + 1]:
        result.append(rows[col : rightMost + 1])

    return result
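
The warning in the ``find_table`` docstring asks for padded values. As a rough illustration of what that means, the sketch below right-pads ragged rows to a common width. ``pad_rows`` is a hypothetical helper written for this note; the import list in ``worksheet.py`` shows gspread has its own ``fill_gaps`` utility, but its signature does not appear in this diff, so it is not used here.

```python
from typing import List


def pad_rows(values: List[List[str]], filler: str = "") -> List[List[str]]:
    """Hypothetical helper: right-pad every row to the width of the longest row."""
    width = max((len(r) for r in values), default=0)
    # Build new rows so the input lists are left untouched.
    return [r + [filler] * (width - len(r)) for r in values]


ragged = [
    ["ID", "Name", "Universe"],
    ["1", "Batman"],  # trailing empty cell missing from the data
    ["2", "DeadPool", "Marvel"],
]
padded = pad_rows(ragged)
```

With equal-length rows, the column bound computed from the starting row applies cleanly to every other row when the result is sliced.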


# SHOULD NOT BE NEEDED UNTIL NEXT MAJOR VERSION
# DEPRECATION_WARNING_TEMPLATE = (
# "[Deprecated][in version {v_deprecated}]: {msg_deprecated}"
55 changes: 55 additions & 0 deletions gspread/worksheet.py
@@ -41,6 +41,7 @@
    PasteOrientation,
    PasteType,
    T,
    TableDirection,
    ValidationConditionType,
    ValueInputOption,
    ValueRenderOption,
@@ -53,6 +54,7 @@
    convert_colors_to_hex_value,
    convert_hex_to_colors_dict,
    fill_gaps,
    find_table,
    finditem,
    get_a1_from_absolute_range,
    is_full_a1_notation,
@@ -3336,3 +3338,56 @@ def add_validation(
        }

        return self.client.batch_update(self.spreadsheet_id, body)

    def expand(
        self,
        top_left_range_name: str = "A1",
        direction: TableDirection = TableDirection.table,
    ) -> List[List[str]]:
        """Expands a cell range based on non-null adjacent cells.

        Expand can be done in 3 directions defined in :class:`~gspread.utils.TableDirection`

            * ``TableDirection.right``: expands right until the first empty cell
            * ``TableDirection.down``: expands down until the first empty cell
            * ``TableDirection.table``: expands right until the first empty cell and down until the first empty cell

        If nothing is found, an empty list is returned.

        When the given ``top_left_range_name`` is outside the values of the worksheet the exception
        :class:`~gspread.exceptions.InvalidInputValue` is raised.

        Example::

            # values held by the worksheet
            values = [
                ['', '',   '',   '', ''  ],
                ['', 'B2', 'C2', '', 'E2'],
                ['', 'B3', 'C3', '', 'E3'],
                ['', ''  , ''  , '', 'E4'],
            ]
            >>> worksheet.expand('B2', TableDirection.table)
            [
                ['B2', 'C2'],
                ['B3', 'C3'],
            ]


        .. note::

            the ``TableDirection.table`` will look right from the starting cell then look down from the starting cell.
            It will not check cells located inside the table. This could lead to
            potential empty values located in the middle of the table.

        .. note::

            when it is necessary to use non-default options for :meth:`~gspread.worksheet.Worksheet.get`,
            please get the data first using the desired options, then use the function
            :func:`gspread.utils.find_table` to extract the desired table.

        :param str top_left_range_name: the top left corner of the table to expand.
        :param gspread.utils.TableDirection direction: the expand direction
        :returns: the resulting matrix
        :rtype: list(list)
        """

        values = self.get(pad_values=True)
        return find_table(values, top_left_range_name, direction)