TableShark is a Python library for working with tables. It provides functionality for parsing, storing, and manipulating tables in various formats. It also provides a convenient way to handle table headers and cells.
You can install TableShark using pip:
pip install tableshark
TableShark is a lightweight Python library that simplifies working with tabular data. It provides classes and functions for creating, manipulating, and analyzing tables. With TableShark, you can import tables from data sources such as lists, pandas DataFrames, Excel files, CSV files, JSON files, and SQL databases. You can perform common table operations like adding rows and columns, merging tables, filtering data, and sorting. The library also converts between tables and popular formats like pandas DataFrames, Excel files, and CSV files.
TableError is a custom exception class defined in this Python library. It is derived from the built-in Exception class, allowing you to raise and handle specific errors related to table operations.
You can use TableError to handle exceptional scenarios that may occur during table processing or manipulation. It provides a straightforward way to identify and handle errors specific to table operations within your codebase.
To raise a TableError, you can use the raise statement and pass any relevant error message or information as an argument to the exception. Here's an example:
from tableshark import TableError
try:
    # Perform table operation
    # ...
    if True:
        raise TableError("An error occurred while performing the table operation.")
    # ...
except TableError as e:
    # Handle the error
    # ...
    pass
Since TableError is derived from the Exception class, it inherits the basic functionality of exception handling in Python. This includes capturing traceback information, handling exceptions using try-except blocks, and propagating exceptions up the call stack if not explicitly caught.
You can further customize TableError by adding additional attributes or methods as per your requirements. For example, you can define custom properties to provide more detailed information about the error or include methods for additional error handling functionality.
from tableshark import TableError
class TableErrorCustom(TableError):
    def log_error(self):
        # Custom error logging logic
        # ...
        pass
Feel free to extend the TableError class according to your specific needs.
TableError is a simple yet powerful custom exception class provided by this library. It enables you to handle and communicate errors specifically related to table operations in a structured and controlled manner.
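The idea of attaching extra detail to an error can be sketched as follows. This is a minimal illustration, not part of the library: the local TableError is a stand-in so the snippet runs without TableShark installed, and ColumnNotFoundError and find_column are hypothetical names introduced only for this example.

```python
class TableError(Exception):
    """Local stand-in for tableshark.TableError so the sketch runs on its own."""


class ColumnNotFoundError(TableError):
    """Hypothetical subclass that records which column lookup failed."""

    def __init__(self, column, available):
        self.column = column
        self.available = list(available)
        super().__init__(f"Column {column!r} not found; available: {self.available}")


def find_column(columns, name):
    """Return the index of `name` in `columns`, or raise ColumnNotFoundError."""
    if name not in columns:
        raise ColumnNotFoundError(name, columns)
    return columns.index(name)


try:
    find_column(["id", "price"], "qty")
except TableError as exc:  # the subclass is caught through its base class
    failed_column = exc.column
    message = str(exc)
```

Catching the base class keeps calling code simple, while the extra attributes stay available to handlers that want them.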
The Header class represents the header section of a table in this Python library. It provides functionality for parsing, storing, and manipulating the columns of the header.
To use the Header class, you can create an instance by providing the raw header string and the structure type of the table. The raw_header parameter is an optional argument and defaults to an empty string. The structure parameter specifies the structure type of the table and accepts either 'xml' or 'txt'. Here's an example of creating a Header object:
from tableshark import Header
header = Header()
header.columns = ['Column 1', 'Column 2', 'Column 3']
Once you have a Header object, you can perform various operations on it, such as retrieving columns, setting columns, iterating over columns, and more.
raw_header (str): The raw header string provided during initialization.
_columns (list[str]): The parsed columns of the header.
parse_header(raw_header: str, structure: Literal['xml', 'txt']) -> list[str]: Parses the raw header based on the specified structure type and returns the parsed columns.
columns() -> list: Returns the columns of the header.
columns(columns: list): Sets the columns of the header.
rows(): Alias for the columns() method.
__len__() -> int: Returns the number of columns in the header.
__iter__() -> Iterator[str]: Returns an iterator over the columns of the header.
__getitem__(item: Union[str, int, Tuple[int], Tuple[int, int]]) -> Union[str, List[str]]: Retrieves a column based on its name or index. It can also retrieve a column based on 2-dimensional coordinates or a slice object.
__setitem__(key: Union[str, int], value: str): Sets the value of a column based on its name or index.
__repr__() -> str: Returns a string representation of the header.
__add__(other: Union[str, List[str], Tuple[str], Header]) -> Header: Concatenates the header with another header or a column. Returns a new Header object with the concatenated columns.
__iadd__(other: Union[str, List[str], Tuple[str], Header]): Concatenates the header with another header or a column in-place.
from tableshark import Header
raw_header =\
'''
+-----------------+-----------------+-----------------+
| Column 1 | Column 2 | Column 3 |
+-----------------+-----------------+-----------------+
'''
# Create a Header object
header = Header(raw_header, structure='txt')
# Retrieve columns
columns = header.columns
# Set columns
new_columns = ['Column A', 'Column B', 'Column C']
header.columns = new_columns
# Iterate over columns
for column in header:
    print(column)
# Get the number of columns
num_columns = len(header)
# Get a column by name or index
column_1 = header['Column A']
column_2 = header[1]
# Set the value of a column
header['Column B'] = 'New Value'
# Concatenate headers
other_header = Header('Another Header')
concatenated_header = header + other_header
# In-place concatenation
header += 'Extra Column'
The Header class provides a convenient way to handle the header section of a table. You can parse, store, and manipulate columns with ease using the methods provided by this class. Incorporate the Header class into your table processing workflows to efficiently work with table headers.
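To make the 'txt' structure more concrete, here is a rough sketch of how column names could be pulled out of an ASCII-art header. It is an assumption about the general approach, not the library's parse_header implementation; parse_txt_header is a hypothetical helper written only for this illustration.

```python
def parse_txt_header(raw_header: str) -> list[str]:
    """Extract column names from an ASCII-art header (hypothetical helper)."""
    columns = []
    for line in raw_header.splitlines():
        line = line.strip()
        # Skip blank lines and border lines such as +-----+-----+
        if not line or line.startswith("+"):
            continue
        # Drop the outer pipes, then split the remaining text on inner pipes
        columns.extend(part.strip() for part in line.strip("|").split("|"))
    return columns


raw_header = """
+----------+----------+----------+
| Column 1 | Column 2 | Column 3 |
+----------+----------+----------+
"""
columns = parse_txt_header(raw_header)
```

With the sample input, columns holds the three names with surrounding whitespace removed.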
The Cell class represents a cell in a table. It encapsulates a value and provides various methods and properties for working with the cell.
To create a Cell object, you can initialize it with a value. The value can be of any type. Here's an example of creating a Cell object:
from tableshark import Cell
cell = Cell('Cell Value')
Once you have a Cell object, you can perform various operations on it, such as retrieving the value, setting the value, getting and setting the name, changing the type, and more.
_value (Any): The value of the cell.
_name (str): The name of the cell.
_type (str): The type of the cell.
name() -> str: Returns the name of the cell.
name(new_name: str): Sets the name of the cell.
value() -> Any: Returns the value stored in the cell.
value(new_value): Sets the value of the cell.
type() -> type: Returns the type of the value.
new_type(new: Callable): Changes the type of the value by applying a callable function.
__str__() -> str: Returns a string representation of the cell.
__repr__() -> str: Returns a string representation of the cell that can be used to recreate the cell object.
__len__() -> int: Returns the length of the value stored in the cell.
__eq__(other) -> bool: Checks if the value stored in the cell is equal to the provided value.
__setattr__(key, value): Overrides the default setattr method to allow setting attributes.
from tableshark import Cell
# Create a Cell object
cell = Cell('value')
# Get the name of the cell (name is read and assigned like an attribute)
name = cell.name
# Set the name of the cell
cell.name = 'Cell 1'
# Get the value of the cell
value = cell.value
# Set the value of the cell
cell.value = 'new_value'
# Get the type of the value
cell_type = cell.type()
# Change the type of the value
cell.new_type(int)
# Get a string representation of the cell
cell_str = str(cell)
# Get a representation of the cell that can be used to recreate the object
cell_repr = repr(cell)
# Get the length of the value stored in the cell
value_length = len(cell)
# Check if the value stored in the cell is equal to another value
is_equal = cell == 'other_value'
# Set an attribute of the cell
cell.attribute = 'attribute_value'
The Cell class provides a flexible way to work with individual cells in a table. You can access and modify the value, name, and type of a cell using the methods and properties offered by this class. Incorporate the Cell class into your table processing workflows to effectively handle table cells.
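The getter/setter pairs and the type change described above can be pictured with a small stand-in class. This is a sketch under assumptions: CellSketch is a hypothetical name, and the real Cell class may be implemented differently.

```python
from typing import Any, Callable


class CellSketch:
    """Hypothetical stand-in for a cell; not the tableshark implementation."""

    def __init__(self, value: Any, name: str = ""):
        self._value = value
        self._name = name

    @property
    def value(self) -> Any:
        return self._value

    @value.setter
    def value(self, new_value: Any) -> None:
        self._value = new_value

    def new_type(self, new: Callable[[Any], Any]) -> None:
        # Apply the callable to the stored value; a failed conversion raises
        self._value = new(self._value)

    def __eq__(self, other: object) -> bool:
        # Compare against another cell's value or against a bare value
        if isinstance(other, CellSketch):
            return self._value == other._value
        return self._value == other


cell = CellSketch("42", name="answer")
cell.new_type(int)  # the stored value is now the integer 42
```

Note that a conversion such as int only succeeds when the stored value is convertible, so a cell holding arbitrary text would raise ValueError here.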
The Vector, Row, and Column classes represent a vector, row, and column in a table, respectively. They encapsulate a list of cells and provide various methods and properties for working with the vector, row, or column.
The Vector class is an abstract base class (ABC) that provides a common interface and functionality for vectors. It contains a collection of Cell objects and defines various methods and properties to work with these cells.
Key methods and properties:
__init__(*cells: Cell | Any): Initializes a Vector object with a variable number of Cell objects or values. If a value is provided instead of a Cell, it will be automatically converted into a Cell object.
__getitem__(key: str | int) -> Any: Retrieves the value of a cell in the vector by cell name or index.
__setitem__(key, value): Sets the value of a cell in the vector by index.
cell(i) -> Cell: Returns the Cell object at the specified index.
cells() -> list[Cell]: Returns a list of Cell objects in the vector.
to_dict() -> dict: Converts the vector into a dictionary, where keys are cell names (or indices) and values are cell values.
__eq__(other) -> bool: Checks if two vectors are equal by comparing their cells.
__iter__() -> Iterator[Cell]: Allows iteration over the cells in the vector.
df() -> pd.DataFrame (abstract): Converts the vector into a pandas DataFrame.
__iadd__(other): Adds a Cell or Vector object to the vector in-place.
__add__(other: Cell | Vector) -> Vector: Adds a Cell or Vector object to the vector and returns a new Vector object.
from_raw(value: Iterable[Any]) -> Vector: Initializes the vector from an iterable of values.
apply(func: Callable[[Cell, Any], Cell], *args: Any) -> Vector: Applies a function to each cell in the vector.
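The automatic wrapping of bare values and the name-or-index keys of to_dict() might look roughly like this. The classes below are simplified stand-ins written for illustration (VectorSketch is a hypothetical name), so treat the details as assumptions rather than the library's code.

```python
from typing import Any, Optional


class Cell:
    """Minimal stand-in: a value with an optional name."""

    def __init__(self, value: Any, name: Optional[str] = None):
        self.value = value
        self.name = name


class VectorSketch:
    """Hypothetical vector; not the tableshark implementation."""

    def __init__(self, *cells: Any):
        # Wrap bare values so the vector always holds Cell objects
        self._cells = [c if isinstance(c, Cell) else Cell(c) for c in cells]

    def __getitem__(self, key):
        # Integer keys are positions; any other key is treated as a cell name
        if isinstance(key, int):
            return self._cells[key].value
        for cell in self._cells:
            if cell.name == key:
                return cell.value
        raise KeyError(key)

    def to_dict(self) -> dict:
        # Fall back to the positional index when a cell has no name
        return {
            (cell.name if cell.name is not None else i): cell.value
            for i, cell in enumerate(self._cells)
        }


vector = VectorSketch(Cell(1, name="id"), "widget", 9.5)
as_dict = vector.to_dict()
```

Here the first cell keeps its name as the dictionary key, while the two unnamed cells are keyed by their positions.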
The Row class is a subclass of Vector and represents a row of cells in a table-like structure.
To create a Row object, you can pass a variable number of arguments representing the cells of the row. Each argument can be an instance of the Cell class or any other value. If a non-Cell value is provided, it will be automatically converted into a Cell object. Here's an example of creating a Row object:
from tableshark import Row, Cell
row = Row(Cell(1), Cell(2), Cell(3))
# or alternatively
row_1 = Row(1, 2, 3) # 1, 2, and 3 will be converted into Cell objects
_str() -> str: Returns a string representation of the row.
df() -> pd.DataFrame: Converts the row into a pandas DataFrame.
_T() -> Column: Transposes the row into a Column object.
The Column class is a subclass of Vector and represents a column of cells in a table-like structure.
To create a Column object, you can pass a variable number of arguments representing the cells of the column. Each argument can be an instance of the Cell class or any other value. If a non-Cell value is provided, it will be automatically converted into a Cell object.
from tableshark import Column, Cell
column = Column(Cell(1), Cell(2), Cell(3))
# or alternatively
column_1 = Column(1, 2, 3) # 1, 2, and 3 will be converted into Cell objects
_str() -> str: Returns a string representation of the column.
df() -> pd.DataFrame: Converts the column into a pandas DataFrame.
_T() -> Row: Transposes the column into a Row object.
These classes provide a convenient and flexible way to work with vectors, rows, and columns of cells in tabular data structures. They offer methods for accessing, manipulating, and converting the data represented by the vectors.
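Transposition between rows and columns can be illustrated with two small stand-in classes. Everything here is hypothetical: RowSketch and ColumnSketch are not library classes, and exposing the transpose as a property named T is an assumption made for this sketch.

```python
class VectorSketch:
    """Shared storage for the two hypothetical vector kinds below."""

    def __init__(self, *values):
        self.values = list(values)


class RowSketch(VectorSketch):
    @property
    def T(self):
        # A transposed row holds the same values, laid out vertically
        return ColumnSketch(*self.values)

    def __str__(self):
        return " | ".join(str(v) for v in self.values)


class ColumnSketch(VectorSketch):
    @property
    def T(self):
        # Transposing a column gives back a horizontal row
        return RowSketch(*self.values)

    def __str__(self):
        return "\n".join(str(v) for v in self.values)


row = RowSketch(1, 2, 3)
column = row.T
round_trip = column.T
```

Transposing twice returns an object of the original kind with the same values, which is the property a real implementation would also be expected to keep.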
The Body class represents the body of a table, consisting of rows and columns.
To create an instance of the Body class, you can provide the raw body content of the table as a string together with the structure type ('xml' or 'txt'), or an iterable of Row objects. Here's an example of creating a Body object:
from tableshark import Body
raw_body = \
"""
+---+---+---+
| 1 | 2 | 3 |
+---+---+---+
| 4 | 5 | 6 |
+---+---+---+
"""
body = Body(raw_body, structure='txt')
Or alternatively, you can create a Body object from an iterable of Row objects:
from tableshark import Body, Row
rows = [Row(1, 2, 3), Row(4, 5, 6)]
body = Body(rows=rows)
You can access the attributes and methods of the Body object to work with the body content of the table.
The Body class has the following attributes:
raw_body (str): The raw body content of the table.
structure (Literal['xml', 'txt']): The structure type of the body content.
rows (list[Row]): The list of rows in the body.
number_of_columns (int): The number of columns in the body.
The Body class provides the following methods:
parse_body(raw_body: str = '', structure: Literal['xml', 'txt'] = 'txt') -> list[Row]
: Parse the raw body content and return a list of Row objects.
Here's an example of how to use the Body class:
from tableshark import Body
raw_body = \
"""
+---+---+---+
| 1 | 2 | 3 |
+---+---+---+
| 4 | 5 | 6 |
+---+---+---+
"""
body = Body(raw_body, structure='txt')
# Example: Accessing the rows
rows = body.rows
# Example: Accessing the number of columns
number_of_columns = body.number_of_columns
# Example: Iterating over the rows
for row in body:
    print(row)
# Output:
# Row(Cell(1), Cell(2), Cell(3))
# Row(Cell(4), Cell(5), Cell(6))
The Body class provides a convenient way to represent and work with the body content of a table. It allows you to parse the raw body content, access rows and columns, and iterate over the rows. Incorporate the Body class into your table-related projects to handle table bodies with ease.
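As a rough picture of what parsing a 'txt' body involves, the sketch below splits an ASCII-art body into rows and checks that every row has the same width. parse_txt_body is a hypothetical helper, and the width check is an added assumption; the library's parse_body may behave differently.

```python
def parse_txt_body(raw_body: str) -> list[list[str]]:
    """Split an ASCII-art body into rows of cell strings (hypothetical helper)."""
    rows = []
    for line in raw_body.splitlines():
        line = line.strip()
        if not line.startswith("|"):
            continue  # skip blank lines and +---+ border lines
        rows.append([part.strip() for part in line.strip("|").split("|")])
    # A well-formed body has the same number of cells in every row
    widths = {len(row) for row in rows}
    if len(widths) > 1:
        raise ValueError(f"rows have inconsistent widths: {sorted(widths)}")
    return rows


raw_body = """
+---+---+---+
| 1 | 2 | 3 |
+---+---+---+
| 4 | 5 | 6 |
+---+---+---+
"""
rows = parse_txt_body(raw_body)
number_of_columns = len(rows[0]) if rows else 0
```

The sketch keeps cell contents as strings; converting them to numbers would be a separate step.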
The Table class represents a table structure.
You can create an instance of the Table class by providing the table structure as a string, or by specifying the header and body separately.
from tableshark import Table
table_structure = \
"""
+---+---+---+
| a | b | c |
+---+---+---+
| 4 | 5 | 6 |
+---+---+---+
"""
table = Table(stream=table_structure, structure='txt')
or
from tableshark import Table, Header, Body, Row
header = Header()
header.columns = ['Column 1', 'Column 2', 'Column 3']
body = Body(rows=[Row('row1', 'row2', 'row3')])
table = Table(header=header, body=body)
Access the attributes, properties, and methods of the Table object to work with the table structure.
from tableshark import Table
table = Table() # imagine that we have already created a Table object
# Example: Accessing the header and body
header = table.header
body = table.body
# Example: Accessing a specific column
column = table.column(2)
# Example: Accessing a specific row
row = table.row(1)
# Example: Accessing a specific cell
cell = table.cells(2, 3)
The Table class has the following attributes:
header (Header): The header of the table.
body (Body): The body of the table.
The Table class provides the following properties:
header (Header): The getter and setter for the table's header.
body (Body): The getter and setter for the table's body.
The Table class provides the following methods:
column(col: int | str) -> Column: Retrieve a full column of the table.
row(row: int) -> Row: Retrieve a specific row of the table.
cells(row: int, column: int) -> Cell: Retrieve a specific cell of the table.
The Table class implements the following magic methods:
__getitem__(item: int | str) -> Row | Column: Retrieve a specific row or column of the table.
__iter__() -> Iterator[Row]: Iterate over the rows of the table.
__len__() -> int: Return the number of rows in the table.
__str__() -> str: Return a string representation of the table.
__repr__() -> str: Return a string representation of the table.
Because of the magic methods, you can use the Table object as if it were a list of rows. For example, you can iterate over the rows of the table, access a specific row, or get the number of rows in the table.
from tableshark import Table, Body, Row
table = Table()
table.header = ['Column 1', 'Column 2', 'Column 3']
table.body = Body(rows=[Row(1, 2, 3), Row(4, 5, 6)])
# Example: Iterating over the rows
for row in table:
    print(row)
# Output:
# Row(Cell(1), Cell(2), Cell(3))
# Row(Cell(4), Cell(5), Cell(6))
# Example: Accessing a specific row
row = table[1]
# or alternatively
row = table.row(1)
print(row)
# Output:
# Row(Cell(4), Cell(5), Cell(6))
# Example: Getting the number of rows
number_of_rows = len(table)
# Example: Getting a specific cell
cell = table[1][2]
# or alternatively
cell = table.cells(1, 2)
print(cell)
# Output:
# Cell(6)
# Example: Getting a specific column
column = table['Column 2']
# or alternatively
column = table.column(1)  # zero-based index of 'Column 2', matching the row and cell examples
print(column)
# Output:
# Column(Cell(2), Cell(5))
# Also you can set the value of a specific cell
table[1][2] = 7
The Table class provides a convenient way to represent and manipulate table structures. It allows you to access the header, body, rows, columns, and individual cells of the table. Incorporate the Table class into your projects to handle table-related operations with ease.
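The way a single __getitem__ can return either a row or a column, depending on the key type, is sketched below with plain lists. TableSketch is a hypothetical stand-in and makes no claim about how the library's Table is implemented.

```python
class TableSketch:
    """Hypothetical table backed by plain lists; not the tableshark Table."""

    def __init__(self, header, rows):
        self.header = list(header)
        self.rows = [list(row) for row in rows]

    def __getitem__(self, item):
        if isinstance(item, int):
            # Integer key: return the row at that zero-based position
            return self.rows[item]
        if isinstance(item, str):
            # String key: look up the column position, then collect its values
            index = self.header.index(item)  # raises ValueError if unknown
            return [row[index] for row in self.rows]
        raise TypeError(f"unsupported key type: {type(item).__name__}")

    def __iter__(self):
        return iter(self.rows)

    def __len__(self):
        return len(self.rows)


table = TableSketch(
    ["Column 1", "Column 2", "Column 3"],
    [[1, 2, 3], [4, 5, 6]],
)
second_column = table["Column 2"]
table[1][2] = 7  # works because table[1] returns the stored row itself
```

Returning the stored row object rather than a copy is what makes chained assignment such as table[1][2] = 7 take effect.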
The Schema class represents a collection of tables.
To use the Schema class, follow these steps:
from tableshark import Schema, Table
# 1. Create an instance of the Schema class by providing the tables as arguments.
schema = Schema(Table(), Table())
# 2. Access the tables within the schema using iteration or indexing.
for table in schema:
    print(table)
# or
table = schema[0]
# 3. Add tables to the schema using the addition operator.
new_table = Table()
schema = schema + new_table
The Schema class provides the following methods:
from_excel(file_path, sheet_names=None) -> Schema: Create a Schema instance from an Excel file.
from_csv(file_path, delimiter=',') -> Schema: Create a Schema instance from a CSV file.
from_sql(executor, table_names=None) -> Schema: Create a Schema instance from a SQL database.
from_json(file_path) -> Schema: Create a Schema instance from a JSON file.
from_io(stream, type_='excel', **kwargs) -> Schema: Create a Schema instance from a stream (e.g., BytesIO) of a specific type, such as Excel.
The Schema class implements the following magic methods:
__iter__() -> Iterator[Table]: Iterate over the tables in the schema.
__add__(other: Table) -> Schema: Add a table to the schema.
__getitem__(item) -> Table: Get a specific table from the schema using indexing.
__len__() -> int: Get the number of tables in the schema.
The Schema class provides a string representation of the schema using the __str__() and __repr__() methods.
__str__(): Returns a string representation of the schema.
__repr__(): Returns a string representation that can be used to recreate the schema.
The Schema class allows you to organize and work with a collection of tables. It provides methods to create a schema from various sources such as Excel files, CSV files, SQL databases, and JSON files. You can iterate over the tables, add new tables, and access specific tables within the schema. Use the Schema class to manage and manipulate multiple tables efficiently in your projects.
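To illustrate the general shape of loading a CSV source into a collection that supports iteration, indexing, and addition, here is a self-contained sketch using only the standard library. SchemaSketch and table_from_csv are hypothetical names, and tables are represented as simple (header, rows) tuples; none of this reflects the library's internals.

```python
import csv
import io


def table_from_csv(stream, delimiter=","):
    """Read a header row and data rows from a text stream (hypothetical helper)."""
    reader = csv.reader(stream, delimiter=delimiter)
    header = next(reader, [])
    rows = [row for row in reader if row]  # ignore completely empty lines
    return header, rows


class SchemaSketch:
    """Hypothetical collection of tables; not the tableshark Schema."""

    def __init__(self, *tables):
        self.tables = list(tables)

    def __add__(self, other):
        # Return a new collection and leave the original unchanged
        return SchemaSketch(*self.tables, other)

    def __iter__(self):
        return iter(self.tables)

    def __getitem__(self, item):
        return self.tables[item]

    def __len__(self):
        return len(self.tables)


csv_text = "id,name\n1,widget\n2,gadget\n"
header, rows = table_from_csv(io.StringIO(csv_text))
schema = SchemaSketch((header, rows))
bigger = schema + (["x"], [["1"]])
```

Because addition builds a new object, schema still contains one table after the last line, while bigger contains two.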