Releases: snowflakedb/snowpark-python
Release
1.9.0 (2023-10-13)
New Features
- Added support for the Python 3.11 runtime environment.
Dependency updates
- Added back the dependency of typing-extensions.
Bug Fixes
- Fixed a bug where imports from permanent stage locations were ignored for temporary stored procedures, UDTFs, UDFs, and UDAFs.
- Reverted to using a CTAS (create table as select) statement for Dataframe.writer.save_as_table, which does not need insert permission for writing tables.
New Features
- Support PythonObjJSONEncoder json-serializable objects for ARRAY and OBJECT literals.
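As a rough illustration of the idea behind this entry, the sketch below uses only the standard-library json module to serialize Python values that json cannot handle natively. SketchEncoder and to_json_literal are hypothetical names, and the conversions chosen here are assumptions; this is not Snowpark's PythonObjJSONEncoder implementation.

```python
import json
from datetime import date
from decimal import Decimal


class SketchEncoder(json.JSONEncoder):
    """Hypothetical encoder: converts a few non-JSON types to JSON-friendly values."""

    def default(self, o):
        # json calls default() only for objects it cannot serialize itself.
        if isinstance(o, Decimal):
            return float(o)
        if isinstance(o, date):  # also covers datetime, a subclass of date
            return o.isoformat()
        if isinstance(o, (bytes, bytearray)):
            return o.hex()
        return super().default(o)  # raises TypeError for unsupported types


def to_json_literal(value):
    """Serialize a Python list (ARRAY-like) or dict (OBJECT-like) to JSON text."""
    return json.dumps(value, cls=SketchEncoder)


array_text = to_json_literal([1, Decimal("2.5"), date(2023, 10, 13)])
object_text = to_json_literal({"payload": b"\x01\xff"})
```

A list plays the role of an ARRAY literal and a dict the role of an OBJECT literal; anything the encoder does not recognize still raises TypeError, as with plain json.dumps.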
Release
1.8.0 (2023-09-14)
New Features
- Added support for VOLATILE/IMMUTABLE keyword when registering UDFs.
- Added support for specifying clustering keys when saving dataframes using DataFrame.save_as_table.
- Accept Iterable objects input for schema when creating dataframes using Session.create_dataframe.
- Added the property DataFrame.session to return a Session object.
- Added the property Session.session_id to return an integer that represents session ID.
- Added the property Session.connection to return a SnowflakeConnection object.
- Added support for creating a Snowpark session from a configuration file or environment variables.
Dependency updates
- Updated snowflake-connector-python to 3.2.0.
Bug Fixes
- Fixed a bug where automatic package upload would raise ValueError even when compatible package versions were added in session.add_packages.
- Fixed a bug where table stored procedures were not registered correctly when using register_from_file.
- Fixed a bug where dataframe joins failed with invalid_identifier error.
- Fixed a bug where DataFrame.copy disables SQL simplifier for the returned copy.
- Fixed a bug where session.sql().select() would fail if any parameters are specified to session.sql().
Release
1.7.0 (2023-08-28)
New Features
- Added parameters external_access_integrations and secrets when creating a UDF, UDTF or Stored Procedure from Snowpark Python to allow integration with external access.
- Added support for these new functions in snowflake.snowpark.functions:
  - array_flatten
  - flatten
- Added support for apply_in_pandas in snowflake.snowpark.relational_grouped_dataframe.
- Added support for replicating your local Python environment on Snowflake via Session.replicate_local_environment.
Bug Fixes
- Fixed a bug where session.create_dataframe fails to properly set nullable columns where nullability was affected by order or data was given.
- Fixed a bug where DataFrame.select could not identify and alias columns in presence of table functions when output columns of table function overlapped with columns in dataframe.
Behavior Changes
- Creating stored procedures, UDFs, UDTFs, or UDAFs with is_permanent=False now creates temporary objects even when stage_name is provided. Because is_permanent defaults to False, users who do not explicitly set it to True for permanent objects will notice a change in behavior.
- types.StructField now enquotes column identifier by default.
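The is_permanent rule above can be restated as a tiny model. This is only a sketch of the behavior as described in this entry, not Snowpark code; is_temporary is a hypothetical helper, and "@my_stage" is a made-up stage name.

```python
from typing import Optional


def is_temporary(is_permanent: bool = False, stage_name: Optional[str] = None) -> bool:
    """Model of the rule described above: only is_permanent decides the outcome."""
    # stage_name is accepted but deliberately ignored: providing a stage
    # no longer makes the object permanent on its own.
    return not is_permanent


# Relying on the default (is_permanent=False) yields a temporary object,
# even though a stage is supplied.
default_case = is_temporary(stage_name="@my_stage")

# Permanent objects now require opting in explicitly.
explicit_case = is_temporary(is_permanent=True, stage_name="@my_stage")
```

Code that passed only a stage and relied on that to get a permanent object is the case most likely to notice the change.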
Release
1.6.1 (2023-08-02)
New Features
- Added support for these new functions in snowflake.snowpark.functions:
  - array_sort
  - sort_array
  - array_min
  - array_max
  - explode_outer
- Added support for pure Python packages specified via Session.add_requirements or Session.add_packages. They are now usable in stored procedures and UDFs even if packages are not present on the Snowflake Anaconda channel.
  - Added Session parameter custom_packages_upload_enabled and custom_packages_force_upload_enabled to enable the support for pure Python packages feature mentioned above. Both parameters default to False.
- Added support for specifying package requirements by passing a Conda environment yaml file to Session.add_requirements.
- Added support for asynchronous execution of multi-query dataframes that contain binding variables.
- Added support for renaming multiple columns in DataFrame.rename.
- Added support for Geometry datatypes.
- Added support for params in session.sql() in stored procedures.
- Added support for user-defined aggregate functions (UDAFs). This feature is currently in private preview.
- Added support for vectorized UDTFs (user-defined table functions). This feature is currently in public preview.
- Added support for Snowflake Timestamp variants (i.e., TIMESTAMP_NTZ, TIMESTAMP_LTZ, TIMESTAMP_TZ)
  - Added TimestampTimezone as an argument in TimestampType constructor.
  - Added type hints NTZ, LTZ, TZ and Timestamp to annotate functions when registering UDFs.
Improvements
- Removed redundant dependency typing-extensions.
- DataFrame.cache_result now creates temp table fully qualified names under current database and current schema.
Bug Fixes
- Fixed a bug where type check happens on pandas before it is imported.
- Fixed a bug when creating a UDF from numpy.ufunc.
- Fixed a bug where DataFrame.union was not generating the correct Selectable.schema_query when SQL simplifier is enabled.
Behavior Changes
- DataFrameWriter.save_as_table now respects the nullable field of the schema provided by the user or the inferred schema based on data from user input.
Dependency updates
- Updated snowflake-connector-python to 3.0.4.
Release
1.5.1 (2023-06-20)
New Features
- Added support for the Python 3.10 runtime environment.
Release
1.5.0 (2023-06-09)
Behavior Changes
- Aggregation results, from functions such as DataFrame.agg and DataFrame.describe, no longer strip away non-printing characters from column names.
New Features
- Added support for the Python 3.9 runtime environment.
- Added support for new functions in snowflake.snowpark.functions:
  - array_generate_range
  - array_unique_agg
  - collect_set
  - sequence
- Added support for registering and calling stored procedures with TABLE return type.
- Added support for parameter length in StringType() to specify the maximum number of characters that can be stored by the column.
- Added the alias functions.element_at() for functions.get().
- Added the alias Column.contains for functions.contains.
- Added experimental feature DataFrame.alias.
- Added support for querying metadata columns from stage when creating DataFrame using DataFrameReader.
- Added support for StructType.add to append more fields to existing StructType objects.
- Added support for parameter execute_as in StoredProcedureRegistration.register_from_file() to specify stored procedure caller rights.
Bug Fixes
- Fixed a bug where the Dataframe.join_table_function did not run all of the necessary queries to set up the join table function when SQL simplifier was enabled.
- Fixed type hint declaration for custom types - ColumnOrName, ColumnOrLiteralStr, ColumnOrSqlExpr, LiteralType and ColumnOrLiteral that were breaking mypy checks.
- Fixed a bug where DataFrameWriter.save_as_table and DataFrame.copy_into_table failed to parse fully qualified table names.
Release
1.4.0 (2023-04-24)
New Features
- Added support for session.getOrCreate.
- Added support for alias Column.getField.
- Added support for new functions in snowflake.snowpark.functions:
  - date_add and date_sub to make add and subtract operations easier.
  - daydiff
  - explode
  - array_distinct
  - regexp_extract
  - struct
  - format_number
  - bround
  - substring_index
- Added parameter skip_upload_on_content_match when creating UDFs, UDTFs and stored procedures using register_from_file to skip uploading files to a stage if the same version of the files are already on the stage.
- Added support for DataFrame.save_as_table method to take table names that contain dots.
- Flattened generated SQL when DataFrame.filter() or DataFrame.order_by() is followed by a projection statement (e.g. DataFrame.select(), DataFrame.with_column()).
- Added support for creating dynamic tables (in private preview) using Dataframe.create_or_replace_dynamic_table.
- Added an optional argument params in session.sql() to support binding variables. Note that this is not supported in stored procedures yet.
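To show what binding variables buy you, here is a minimal sketch that uses the standard-library sqlite3 module as a stand-in, since it also accepts "?" placeholders. The orders table and the orders_above helper are hypothetical; in Snowpark the corresponding call would pass the values through the params argument of session.sql(), as noted in the trailing comment.

```python
import sqlite3

# In-memory database used purely as a stand-in for a real warehouse.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [(1, 10.0), (2, 25.5), (3, 40.0)],
)


def orders_above(connection, threshold):
    """Return ids of orders whose amount exceeds threshold, using a bound variable."""
    # The value is sent separately from the SQL text, so it is never
    # spliced into the query string.
    cursor = connection.execute(
        "SELECT id FROM orders WHERE amount > ? ORDER BY id",
        (threshold,),
    )
    return [row[0] for row in cursor.fetchall()]


# Rough Snowpark counterpart (requires a live session, so it is not run here):
# session.sql("SELECT id FROM orders WHERE amount > ?", params=[20]).collect()
```

Keeping values out of the SQL text avoids manual quoting and string formatting of user-supplied input.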
Bug Fixes
- Fixed a bug in strtok_to_array where an exception was thrown when a delimiter was passed in.
- Fixed a bug in session.add_import where the module had the same namespace as other dependencies.
Release
1.3.0 (2023-03-28)
New Features
- Added support for delimiters parameter in functions.initcap().
- Added support for functions.hash() to accept a variable number of input expressions.
- Added API Session.conf for getting, setting or checking the mutability of any runtime configuration.
- Added support for managing case sensitivity in Row results from DataFrame.collect using case_sensitive parameter.
- Added indexer support for snowflake.snowpark.types.StructType.
- Added a keyword argument log_on_exception to Dataframe.collect and Dataframe.collect_no_wait to optionally disable error logging for SQL exceptions.
Bug Fixes
- Fixed a bug where a DataFrame set operation (DataFrame.subtract, DataFrame.union, etc.) being called after another DataFrame set operation and DataFrame.select or DataFrame.with_column throws an exception.
- Fixed a bug where chained sort statements are overwritten by the SQL simplifier.
Improvements
- Simplified JOIN queries to use constant subquery aliases (SNOWPARK_LEFT, SNOWPARK_RIGHT) by default. Users can disable this at runtime with session.conf.set('use_constant_subquery_alias', False) to use randomly generated alias names instead.
- Allowed specifying statement parameters in session.call().
- Enabled the uploading of large pandas DataFrames in stored procedures by defaulting to a chunk size of 100,000 rows.
Release
1.2.0 (2023-03-02)
New Features
- Added support for displaying source code as comments in the generated scripts when registering stored procedures. This is enabled by default; turn it off by specifying source_code_display=False at registration.
- Added a parameter if_not_exists when creating a UDF, UDTF or Stored Procedure from Snowpark Python to ignore creating the specified function or procedure if it already exists.
- Accept integers when calling snowflake.snowpark.functions.get to extract value from array.
- Added functions.reverse in functions to open access to Snowflake built-in function reverse.
- Added parameter require_scoped_url in snowflake.snowpark.files.SnowflakeFile.open() (in Private Preview) to replace is_owner_file, which is marked for deprecation.
Bug Fixes
- Fixed a bug that overwrote paramstyle to qmark when creating a Snowpark session.
- Fixed a bug where df.join(..., how="cross") fails with SnowparkJoinException: (1112): Unsupported using join type 'Cross'.
- Fixed a bug where querying a DataFrame column created from chained function calls used a wrong column name.
1.1.0
1.1.0 (2023-01-26)
New Features:
- Added asc, asc_nulls_first, asc_nulls_last, desc, desc_nulls_first, desc_nulls_last, date_part and unix_timestamp in functions.
- Added the property DataFrame.dtypes to return a list of column name and data type pairs.
- Added the following aliases:
  - functions.expr() for functions.sql_expr().
  - functions.date_format() for functions.to_date().
  - functions.monotonically_increasing_id() for functions.seq8()
  - functions.from_unixtime() for functions.to_timestamp()
Bug Fixes:
- Fixed a bug in SQL simplifier that didn’t handle Column alias and join well in some cases. See #658 for details.
- Fixed a bug in SQL simplifier that generated wrong column names for function calls, NaN and INF.
Improvements
- The session parameter PYTHON_SNOWPARK_USE_SQL_SIMPLIFIER is True after Snowflake 7.3 was released. In snowpark-python, session.sql_simplifier_enabled reads the value of PYTHON_SNOWPARK_USE_SQL_SIMPLIFIER by default, meaning that the SQL simplifier is enabled by default after the Snowflake 7.3 release. To turn this off, set PYTHON_SNOWPARK_USE_SQL_SIMPLIFIER in Snowflake to False or run session.sql_simplifier_enabled = False from Snowpark. It is recommended to use the SQL simplifier because it helps to generate more concise SQL.
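If the simplifier only needs to be off for a single query, one option is to toggle session.sql_simplifier_enabled temporarily and restore it afterwards. The context manager below is a sketch under that assumption; sql_simplifier is a hypothetical helper, and a SimpleNamespace stands in for a real Session so the example runs without a Snowflake connection.

```python
from contextlib import contextmanager
from types import SimpleNamespace


@contextmanager
def sql_simplifier(session, enabled):
    """Temporarily set session.sql_simplifier_enabled, restoring it on exit."""
    previous = session.sql_simplifier_enabled
    session.sql_simplifier_enabled = enabled
    try:
        yield session
    finally:
        # Restore the earlier value even if the body raised.
        session.sql_simplifier_enabled = previous


# Stand-in object with the one attribute this helper touches.
fake_session = SimpleNamespace(sql_simplifier_enabled=True)

with sql_simplifier(fake_session, False) as s:
    inside_value = s.sql_simplifier_enabled

after_value = fake_session.sql_simplifier_enabled
```

With a real session, the body of the with block would build and run the DataFrame whose generated SQL you want to inspect without simplification.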