refactor!: rename python package to google-spark-connect (#25)
Release-As: 0.4.0
isha97 authored Jan 29, 2025
1 parent beeaa98 commit 357d1fe
Showing 4 changed files with 27 additions and 12 deletions.
18 changes: 9 additions & 9 deletions README.md
@@ -1,4 +1,4 @@
# Dataproc Spark Connect Client
# Google Spark Connect Client

A wrapper of the Apache [Spark Connect](https://spark.apache.org/spark-connect/) client with
additional functionalities that allow applications to communicate with a remote Dataproc
@@ -8,13 +8,13 @@ Spark cluster using the Spark Connect protocol without requiring additional step

.. code-block:: console

pip install dataproc_spark_connect
pip install google_spark_connect

## Uninstall

.. code-block:: console

pip uninstall dataproc_spark_connect
pip uninstall google_spark_connect


## Setup
@@ -28,12 +28,12 @@ If you are running the client outside of Google Cloud, you must set following en

## Usage

1. Install the latest version of Dataproc Python client and Dataproc Spark Connect modules:
1. Install the latest version of Dataproc Python client and Google Spark Connect modules:

.. code-block:: console

pip install google_cloud_dataproc --force-reinstall
pip install dataproc_spark_connect --force-reinstall
pip install google_spark_connect --force-reinstall

2. Add the required import into your PySpark application or notebook:

@@ -85,14 +85,14 @@ This will happen even if you are running the client from a non-GCE instance.

.. code-block:: console

VERSION=<version> gsutil cp dist/dataproc_spark_connect-${VERSION}-py2.py3-none-any.whl gs://<your_bucket_name>
VERSION=<version> gsutil cp dist/google_spark_connect-${VERSION}-py2.py3-none-any.whl gs://<your_bucket_name>

4. Download the new SDK on Vertex, then uninstall the old version and install the new one.

.. code-block:: console

%%bash
export VERSION=<version>
gsutil cp gs://<your_bucket_name>/dataproc_spark_connect-${VERSION}-py2.py3-none-any.whl .
yes | pip uninstall dataproc_spark_connect
pip install dataproc_spark_connect-${VERSION}-py2.py3-none-any.whl
gsutil cp gs://<your_bucket_name>/google_spark_connect-${VERSION}-py2.py3-none-any.whl .
yes | pip uninstall google_spark_connect
pip install google_spark_connect-${VERSION}-py2.py3-none-any.whl
15 changes: 15 additions & 0 deletions google/cloud/spark_connect/__init__.py
@@ -11,4 +11,19 @@
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
import importlib.metadata
import warnings

from .session import GoogleSparkSession

old_package_name = "dataproc-spark-connect"
current_package_name = "google-spark-connect"
try:
    importlib.metadata.distribution(old_package_name)
    warnings.warn(
        f"Package '{old_package_name}' is already installed in your environment. "
        f"This might cause conflicts with '{current_package_name}'. "
        f"Consider uninstalling '{old_package_name}' and only install '{current_package_name}'."
    )
except:
    pass
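
Review note: the check above swallows every exception with a bare `except:`. A minimal sketch of a narrower variant follows, assuming the only failure worth ignoring is "the old distribution is not installed". The helper name `warn_if_installed` and the `stacklevel` choice are illustrative and not part of this commit.

```python
import importlib.metadata
import warnings


def warn_if_installed(old_name: str, new_name: str) -> bool:
    """Warn if the distribution `old_name` is present; return True when it was found.

    Hypothetical helper for illustration; the commit performs this check inline.
    """
    try:
        importlib.metadata.distribution(old_name)
    except importlib.metadata.PackageNotFoundError:
        # The old distribution is absent, so there is nothing to warn about.
        return False
    warnings.warn(
        f"Package '{old_name}' is already installed in your environment. "
        f"This might cause conflicts with '{new_name}'. "
        f"Consider uninstalling '{old_name}' and only install '{new_name}'.",
        stacklevel=2,  # attribute the warning to the caller of this helper
    )
    return True
```

Catching only `PackageNotFoundError` keeps unrelated errors visible, such as a corrupted metadata directory, while still staying silent in the common case where the old package was never installed.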
2 changes: 1 addition & 1 deletion google/cloud/spark_connect/session.py
@@ -57,7 +57,7 @@ class GoogleSparkSession(SparkSession):
Examples
--------
Create a Spark session with Dataproc Spark Connect.
Create a Spark session with Google Spark Connect.
>>> spark = (
... GoogleSparkSession.builder
4 changes: 2 additions & 2 deletions setup.py
@@ -19,9 +19,9 @@


setup(
    name="dataproc-spark-connect",
    name="google-spark-connect",
    version="0.2.0",
    description="Dataproc client library for Spark Connect",
    description="Google client library for Spark Connect",
    long_description=long_description,
    author="Google LLC",
    url="https://github.com/GoogleCloudDataproc/dataproc-spark-connect-python",
