Releases: snowflakedb/snowpark-python

Release

16 Oct 18:51
5a01033

1.9.0 (2023-10-13)

New Features

  • Added support for the Python 3.11 runtime environment.

Dependency updates

  • Added back the dependency of typing-extensions.

Bug Fixes

  • Fixed a bug where imports from permanent stage locations were ignored for temporary stored procedures, UDTFs, UDFs, and UDAFs.
  • Reverted to using a CTAS (CREATE TABLE AS SELECT) statement for DataFrameWriter.save_as_table, which does not need insert permission for writing tables.

New Features

  • Added support for PythonObjJSONEncoder JSON-serializable objects in ARRAY and OBJECT literals.

Release

15 Sep 20:47
100071a

1.8.0 (2023-09-14)

New Features

  • Added support for the VOLATILE/IMMUTABLE keywords when registering UDFs.
  • Added support for specifying clustering keys when saving dataframes using DataFrame.save_as_table.
  • Session.create_dataframe now accepts Iterable objects as input for schema when creating dataframes.
  • Added the property DataFrame.session to return a Session object.
  • Added the property Session.session_id to return an integer that represents session ID.
  • Added the property Session.connection to return a SnowflakeConnection object.
  • Added support for creating a Snowpark session from a configuration file or environment variables.
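
The three new properties can be read together to trace a dataframe back to its session. The following is a minimal sketch, assuming df is an existing Snowpark DataFrame; the helper name describe_session is hypothetical and not part of Snowpark.

```python
from typing import Any, Dict


def describe_session(df: Any) -> Dict[str, Any]:
    """Summarize the session behind a Snowpark DataFrame.

    `df` is assumed to be a snowflake.snowpark.DataFrame. This helper is
    illustrative only and is not part of the library.
    """
    session = df.session  # DataFrame.session, added in 1.8.0
    return {
        # Session.session_id returns an integer session ID.
        "session_id": session.session_id,
        # Session.connection returns the underlying SnowflakeConnection.
        "connection_type": type(session.connection).__name__,
    }
```

Passing the dataframe rather than the session keeps the helper usable in code that only receives dataframes.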

Dependency updates

  • Updated snowflake-connector-python to 3.2.0.

Bug Fixes

  • Fixed a bug where automatic package upload would raise ValueError even when compatible package versions were added in session.add_packages.
  • Fixed a bug where table stored procedures were not registered correctly when using register_from_file.
  • Fixed a bug where dataframe joins failed with invalid_identifier error.
  • Fixed a bug where DataFrame.copy disabled the SQL simplifier for the returned copy.
  • Fixed a bug where session.sql().select() would fail if any parameters were specified to session.sql().

Release

28 Aug 20:43
513fad6

1.7.0 (2023-08-28)

New Features

  • Added parameters external_access_integrations and secrets when creating a UDF, UDTF or Stored Procedure from Snowpark Python to allow integration with external access.
  • Added support for these new functions in snowflake.snowpark.functions:
    • array_flatten
    • flatten
  • Added support for apply_in_pandas in snowflake.snowpark.relational_grouped_dataframe.
  • Added support for replicating your local Python environment on Snowflake via Session.replicate_local_environment.

Bug Fixes

  • Fixed a bug where session.create_dataframe failed to properly set nullable columns when nullability was affected by the order in which data was given.
  • Fixed a bug where DataFrame.select could not identify and alias columns in the presence of table functions when the output columns of a table function overlapped with columns in the dataframe.

Behavior Changes

  • Creating stored procedures, UDFs, UDTFs, or UDAFs with parameter is_permanent=False now creates temporary objects even when stage_name is provided. Because is_permanent defaults to False, users who do not explicitly set it to True for permanent objects will notice a change in behavior.
  • types.StructField now quotes the column identifier by default.

Release

03 Aug 01:28
243bf10

1.6.1 (2023-08-02)

New Features

  • Added support for these new functions in snowflake.snowpark.functions:
    • array_sort
    • sort_array
    • array_min
    • array_max
    • explode_outer
  • Added support for pure Python packages specified via Session.add_requirements or Session.add_packages. They are now usable in stored procedures and UDFs even if packages are not present on the Snowflake Anaconda channel.
    • Added Session parameters custom_packages_upload_enabled and custom_packages_force_upload_enabled to enable the pure Python packages support mentioned above. Both parameters default to False.
  • Added support for specifying package requirements by passing a Conda environment yaml file to Session.add_requirements.
  • Added support for asynchronous execution of multi-query dataframes that contain binding variables.
  • Added support for renaming multiple columns in DataFrame.rename.
  • Added support for Geometry datatypes.
  • Added support for params in session.sql() in stored procedures.
  • Added support for user-defined aggregate functions (UDAFs). This feature is currently in private preview.
  • Added support for vectorized UDTFs (user-defined table functions). This feature is currently in public preview.
  • Added support for Snowflake Timestamp variants (i.e., TIMESTAMP_NTZ, TIMESTAMP_LTZ, TIMESTAMP_TZ).
    • Added TimestampTimezone as an argument in TimestampType constructor.
    • Added type hints NTZ, LTZ, TZ and Timestamp to annotate functions when registering UDFs.

Improvements

  • Removed redundant dependency typing-extensions.
  • DataFrame.cache_result now creates temp tables with fully qualified names under the current database and current schema.

Bug Fixes

  • Fixed a bug where a type check on pandas happened before pandas was imported.
  • Fixed a bug when creating a UDF from numpy.ufunc.
  • Fixed a bug where DataFrame.union was not generating the correct Selectable.schema_query when SQL simplifier is enabled.

Behavior Changes

  • DataFrameWriter.save_as_table now respects the nullable field of the schema provided by the user or the inferred schema based on data from user input.

Dependency updates

  • Updated snowflake-connector-python to 3.0.4.

Release

21 Jun 18:13
c6e9b56

1.5.1 (2023-06-20)

New Features

  • Added support for the Python 3.10 runtime environment.

Release

13 Jun 23:21
14456f6

1.5.0 (2023-06-09)

Behavior Changes

  • Aggregation results, from functions such as DataFrame.agg and DataFrame.describe, no longer strip away non-printing characters from column names.

New Features

  • Added support for the Python 3.9 runtime environment.
  • Added support for new functions in snowflake.snowpark.functions:
    • array_generate_range
    • array_unique_agg
    • collect_set
    • sequence
  • Added support for registering and calling stored procedures with TABLE return type.
  • Added support for parameter length in StringType() to specify the maximum number of characters that can be stored by the column.
  • Added the alias functions.element_at() for functions.get().
  • Added the alias Column.contains for functions.contains.
  • Added experimental feature DataFrame.alias.
  • Added support for querying metadata columns from stage when creating DataFrame using DataFrameReader.
  • Added support for StructType.add to append more fields to existing StructType objects.
  • Added support for parameter execute_as in StoredProcedureRegistration.register_from_file() to specify stored procedure caller rights.

Bug Fixes

  • Fixed a bug where DataFrame.join_table_function did not run all of the necessary queries to set up the join table function when SQL simplifier was enabled.
  • Fixed type hint declaration for custom types - ColumnOrName, ColumnOrLiteralStr, ColumnOrSqlExpr, LiteralType and ColumnOrLiteral that were breaking mypy checks.
  • Fixed a bug where DataFrameWriter.save_as_table and DataFrame.copy_into_table failed to parse fully qualified table names.

Release

24 Apr 22:21
1d6973e

1.4.0 (2023-04-24)

New Features

  • Added support for session.getOrCreate.
  • Added support for alias Column.getField.
  • Added support for new functions in snowflake.snowpark.functions:
    • date_add and date_sub to make add and subtract operations easier.
    • daydiff
    • explode
    • array_distinct
    • regexp_extract
    • struct
    • format_number
    • bround
    • substring_index
  • Added parameter skip_upload_on_content_match when creating UDFs, UDTFs and stored procedures using register_from_file to skip uploading files to a stage if the same version of the files is already on the stage.
  • Added support for DataFrame.save_as_table method to take table names that contain dots.
  • Flattened generated SQL when DataFrame.filter() or DataFrame.order_by() is followed by a projection statement (e.g. DataFrame.select(), DataFrame.with_column()).
  • Added support for creating dynamic tables (in private preview) using DataFrame.create_or_replace_dynamic_table.
  • Added an optional argument params in session.sql() to support binding variables. Note that this is not supported in stored procedures yet.
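
The params argument mentioned above lets values be bound to a query instead of being formatted into the SQL string. The following is a minimal sketch, assuming session is an existing Snowpark Session and that placeholders use the qmark (?) style; the helper name, table name, and column names are hypothetical.

```python
from typing import Any, List


def rows_above_threshold(session: Any, threshold: int) -> List[Any]:
    """Run a query with a bound variable instead of string formatting.

    `session` is assumed to be a snowflake.snowpark.Session. The table
    `orders` and its columns are made up for illustration.
    """
    query = "SELECT id, amount FROM orders WHERE amount > ?"
    # Each ? placeholder is filled, in order, from the `params` list.
    return session.sql(query, params=[threshold]).collect()
```

Binding keeps user-supplied values out of the SQL text, which avoids quoting mistakes.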

Bug Fixes

  • Fixed a bug in strtok_to_array where an exception was thrown when a delimiter was passed in.
  • Fixed a bug in session.add_import where the module had the same namespace as other dependencies.

Release

29 Mar 00:59
667ea4e

1.3.0 (2023-03-28)

New Features

  • Added support for delimiters parameter in functions.initcap().
  • Added support for functions.hash() to accept a variable number of input expressions.
  • Added API Session.conf for getting, setting or checking the mutability of any runtime configuration.
  • Added support for managing case sensitivity in Row results from DataFrame.collect using case_sensitive parameter.
  • Added indexer support for snowflake.snowpark.types.StructType.
  • Added a keyword argument log_on_exception to DataFrame.collect and DataFrame.collect_no_wait to optionally disable error logging for SQL exceptions.
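
The two DataFrame.collect keyword arguments listed above can be passed together. The following is a minimal sketch, assuming df is an existing Snowpark DataFrame; the helper name collect_quietly is hypothetical.

```python
from typing import Any, List


def collect_quietly(df: Any) -> List[Any]:
    """Collect rows with the two keyword arguments added in 1.3.0.

    `df` is assumed to be a snowflake.snowpark.DataFrame.
    """
    return df.collect(
        case_sensitive=False,    # manage case sensitivity of the Row results
        log_on_exception=False,  # do not log SQL exceptions raised by the query
    )
```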

Bug Fixes

  • Fixed a bug where a DataFrame set operation (DataFrame.subtract, DataFrame.union, etc.) called after another DataFrame set operation and DataFrame.select or DataFrame.with_column threw an exception.
  • Fixed a bug where chained sort statements were overwritten by the SQL simplifier.

Improvements

  • Simplified JOIN queries to use constant subquery aliases (SNOWPARK_LEFT, SNOWPARK_RIGHT) by default. Users can disable this at runtime with session.conf.set('use_constant_subquery_alias', False) to use randomly generated alias names instead.
  • Allowed specifying statement parameters in session.call().
  • Enabled the uploading of large pandas DataFrames in stored procedures by defaulting to a chunk size of 100,000 rows.
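
The runtime switch for constant subquery aliases goes through Session.conf. The following is a minimal sketch, assuming session is an existing Snowpark Session; the helper name use_random_join_aliases is hypothetical, while the configuration key is the one named in the note above.

```python
from typing import Any


def use_random_join_aliases(session: Any) -> Any:
    """Switch JOIN queries back to randomly generated subquery aliases.

    `session` is assumed to be a snowflake.snowpark.Session.
    Returns the value read back from the runtime configuration.
    """
    session.conf.set("use_constant_subquery_alias", False)
    # Read the setting back to confirm what the session now holds.
    return session.conf.get("use_constant_subquery_alias")
```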

Release

03 Mar 01:37
04ce69d

1.2.0 (2023-03-02)

New Features

  • Added support for displaying source code as comments in the generated scripts when registering stored procedures. This is enabled by default; turn it off by specifying source_code_display=False at registration.
  • Added a parameter if_not_exists when creating a UDF, UDTF or Stored Procedure from Snowpark Python to ignore creating the specified function or procedure if it already exists.
  • Accept integers when calling snowflake.snowpark.functions.get to extract value from array.
  • Added functions.reverse to expose the Snowflake built-in function reverse.
  • Added parameter require_scoped_url in snowflake.snowpark.files.SnowflakeFile.open() (in private preview) to replace is_owner_file, which is marked for deprecation.
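
The two registration parameters above, if_not_exists and source_code_display, can be combined in one call. The following is a minimal sketch, assuming session is an existing Snowpark Session and func is a handler whose type hints let Snowpark infer its signature; the helper name register_if_missing is hypothetical, and a real registration may need further arguments such as packages.

```python
from typing import Any, Callable


def register_if_missing(session: Any, func: Callable, name: str) -> Any:
    """Register a stored procedure only if it does not already exist.

    `session` is assumed to be a snowflake.snowpark.Session and `func`
    a type-annotated handler function.
    """
    return session.sproc.register(
        func,
        name=name,
        if_not_exists=True,         # skip creation when the procedure exists
        source_code_display=False,  # omit source code comments in the script
    )
```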

Bug Fixes

  • Fixed a bug that overwrote paramstyle to qmark when creating a Snowpark session.
  • Fixed a bug where df.join(..., how="cross") fails with SnowparkJoinException: (1112): Unsupported using join type 'Cross'.
  • Fixed a bug where querying a DataFrame column created from chained function calls used a wrong column name.

1.1.0

27 Jan 05:44
dc1e0c8

1.1.0 (2023-01-26)

New Features

  • Added asc, asc_nulls_first, asc_nulls_last, desc, desc_nulls_first, desc_nulls_last, date_part and unix_timestamp in functions.
  • Added the property DataFrame.dtypes to return a list of column name and data type pairs.
  • Added the following aliases:
    • functions.expr() for functions.sql_expr().
    • functions.date_format() for functions.to_date().
    • functions.monotonically_increasing_id() for functions.seq8().
    • functions.from_unixtime() for functions.to_timestamp().

Bug Fixes

  • Fixed a bug in SQL simplifier that didn’t handle Column alias and join well in some cases. See #658 for details.
  • Fixed a bug in SQL simplifier that generated wrong column names for function calls, NaN and INF.

Improvements

  • The session parameter PYTHON_SNOWPARK_USE_SQL_SIMPLIFIER has been True since the Snowflake 7.3 release. In snowpark-python, session.sql_simplifier_enabled reads the value of PYTHON_SNOWPARK_USE_SQL_SIMPLIFIER by default, so the SQL simplifier is enabled by default from Snowflake 7.3 onward. To turn it off, set PYTHON_SNOWPARK_USE_SQL_SIMPLIFIER to False in Snowflake or run session.sql_simplifier_enabled = False from Snowpark. Using the SQL simplifier is recommended because it helps generate more concise SQL.
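
When the simplifier needs to be off only for part of a program, the session attribute can be toggled and then restored. The following is a minimal sketch, assuming session is an existing Snowpark Session; the context manager sql_simplifier_disabled is a hypothetical helper, not part of Snowpark.

```python
from contextlib import contextmanager
from typing import Any, Iterator


@contextmanager
def sql_simplifier_disabled(session: Any) -> Iterator[Any]:
    """Temporarily turn off the SQL simplifier for one block of code.

    `session` is assumed to be a snowflake.snowpark.Session.
    """
    previous = session.sql_simplifier_enabled
    session.sql_simplifier_enabled = False
    try:
        yield session
    finally:
        # Restore the earlier setting even if the block raised.
        session.sql_simplifier_enabled = previous
```

Used as `with sql_simplifier_disabled(session): ...`, the previous setting comes back when the block exits.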