Added interactive installation wizard #184

Merged
merged 2 commits into from
Sep 11, 2023
5 changes: 4 additions & 1 deletion .gitignore
@@ -145,4 +145,7 @@ cython_debug/
# dev files and scratches
dev/cleanup.py

.databricks
.vscode
26 changes: 25 additions & 1 deletion README.md
@@ -1,6 +1,30 @@
# UCX - Unity Catalog Migration Toolkit

This repo contains various functions and utilities for UC Upgrade.
Your best companion for enabling the Unity Catalog.

## Installation

The `./install.sh` script will guide you through the installation process. Make sure you have Python 3.10 (or greater)
installed on your workstation, and you've configured authentication for
the [Databricks Workspace](https://databricks-sdk-py.readthedocs.io/en/latest/authentication.html#default-authentication-flow).

![install wizard](./examples/ucx-install.gif)

The easiest way to install and authenticate is through a [Databricks configuration profile](https://docs.databricks.com/en/dev-tools/auth.html#databricks-client-unified-authentication):

```shell
export DATABRICKS_CONFIG_PROFILE=ABC
./install.sh
```

You can also specify environment variables directly, as in this example for installing
on an Azure Databricks Workspace using Azure CLI authentication:

```shell
az login
export DATABRICKS_HOST=https://adb-123....azuredatabricks.net/
./install.sh
```
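
As a rough pre-flight check before running `./install.sh`, the two prerequisites above (Python 3.10 or greater, and an authentication hint in the environment) can be sketched in Python. The function name `preflight` is hypothetical and not part of UCX. It only looks at the two environment variables shown above, while the Databricks SDK supports other authentication methods that this sketch ignores.

```python
import os
import sys

MIN_PYTHON = (3, 10)  # minimum version stated in the README


def preflight(version_info=None, environ=None):
    """Return a list of human-readable warnings; an empty list means OK.

    Hypothetical helper for illustration, not part of UCX.
    """
    version_info = sys.version_info if version_info is None else version_info
    environ = os.environ if environ is None else environ
    warnings = []
    if tuple(version_info[:2]) < MIN_PYTHON:
        found = ".".join(str(part) for part in version_info[:2])
        warnings.append(f"Python {found} found, 3.10 or greater required")
    # Only the two variables used in the README examples are checked here.
    if not (environ.get("DATABRICKS_CONFIG_PROFILE") or environ.get("DATABRICKS_HOST")):
        warnings.append("neither DATABRICKS_CONFIG_PROFILE nor DATABRICKS_HOST is set")
    return warnings


if __name__ == "__main__":
    for warning in preflight():
        print(f"[!] {warning}")
```

A missing variable is reported as a warning rather than an error, because authentication may still be configured some other way.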

## Latest working version and how-to

119 changes: 0 additions & 119 deletions bin/install.py

This file was deleted.

Binary file added examples/ucx-install.gif
64 changes: 64 additions & 0 deletions install.sh
@@ -0,0 +1,64 @@
#!/bin/bash

# This script will eventually be replaced with `databricks labs install ucx` command.

# Initialize an empty array to store Python 3 binary paths
python3_binaries=()

# Split the $PATH variable into an array using ':' as the delimiter
IFS=':' read -ra path_dirs <<< "$PATH"

# Iterate over each directory in the $PATH
for dir in "${path_dirs[@]}"; do
    # Construct the full path to the python3 binary in the current directory
    python3_path="${dir}/python3"

    # Check if the python3 binary exists and is executable
    if [ -x "$python3_path" ]; then
        python3_binaries+=("$python3_path")
    fi
done

if [ -z "${python3_binaries[*]}" ]; then
    echo "[!] No Python binaries detected"
    exit 1
fi

# Check versions for all Python binaries found
python_versions=()
for python_binary in "${python3_binaries[@]}"; do
    python_version=$("$python_binary" --version | awk '{print $2}')
    python_versions+=("$python_version -> $(realpath "$python_binary")")
done

IFS=$'\n' python_versions=($(printf "%s\n" "${python_versions[@]}" | sort -V))

py="/dev/null"
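for version_and_binary in "${python_versions[@]}"; do
    echo "[i] found Python $version_and_binary"
    # Keep everything after the " -> " separator. Splitting with IFS=" -> " treats
    # ' ', '-' and '>' as separate delimiters and breaks on paths that contain them.
    py="${version_and_binary#* -> }"
done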

echo "[i] latest python is $py"

tmp_dir=$(mktemp -d)

# Create isolated Virtualenv with the latest Python version
# in the ephemeral temporary directory
$py -m venv "$tmp_dir"

. "$tmp_dir/bin/activate"

# Use the Python from Virtualenv
py="$tmp_dir/bin/python"

echo "[+] installing dependencies within ephemeral Virtualenv: $tmp_dir"
# Install all project dependencies, so that installer can proceed
$py -m pip install --quiet -e .

# Invoke python module of the install app directly,
# without console_scripts entrypoint
$py -m databricks.labs.ucx.cli.app install

rm -r "$tmp_dir"
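
The interpreter-selection logic of `install.sh` can be sketched in Python for readers who want to follow it outside of bash. This is an illustrative approximation, not code from this PR: `find_python3_binaries`, `version_key`, and `pick_latest` are hypothetical names, and version strings are assumed to start with `major.minor[.patch]`.

```python
import os
import re


def find_python3_binaries(path_value):
    """Return every executable `python3` found in a PATH-style string."""
    found = []
    for directory in path_value.split(os.pathsep):
        if not directory:
            continue  # skip empty PATH entries
        candidate = os.path.join(directory, "python3")
        if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            found.append(candidate)
    return found


def version_key(version):
    """Turn '3.10.4' into (3, 10, 4) so versions compare numerically."""
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", version)
    if match is None:
        return (0, 0, 0)
    return tuple(int(part or 0) for part in match.groups())


def pick_latest(entries):
    """Pick the path with the highest version from 'version -> path' strings."""
    best_path = None
    best_key = None
    for entry in entries:
        # Split on the full separator so '-' or spaces in the path are preserved.
        version, _, path = entry.partition(" -> ")
        key = version_key(version)
        if best_key is None or key >= best_key:
            best_key, best_path = key, path
    return best_path
```

Comparing versions as integer tuples plays the role of `sort -V` in the script: a plain string comparison would rank `3.9` above `3.11`.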
17 changes: 17 additions & 0 deletions src/databricks/labs/ucx/cli/app.py
@@ -1,13 +1,30 @@
import logging
import os
from pathlib import Path
from typing import Annotated

import typer
from databricks.sdk import WorkspaceClient
from typer import Typer

from databricks.labs.ucx.__about__ import __version__
from databricks.labs.ucx.logger import _install

_install()
logging.root.setLevel("INFO")
logger = logging.getLogger(__name__)

app = Typer(name="UC Migration Toolkit", pretty_exceptions_show_locals=True)


@app.command()
def install():
    from databricks.labs.ucx.install import main

    ws = WorkspaceClient(product="ucx", product_version=__version__)
    main(ws, verbose=False)


@app.command()
def migrate_groups(config_file: Annotated[Path, typer.Argument(help="Path to config file")] = "migration_config.yml"):
    from databricks.labs.ucx.config import MigrationConfig
15 changes: 14 additions & 1 deletion src/databricks/labs/ucx/config.py
@@ -82,6 +82,10 @@ def from_dict(cls, raw: dict):
         return cls(**raw)


+# Used to set the right expectation about configuration file schema
+_CONFIG_VERSION = 1
+
+
 @dataclass
 class MigrationConfig:
     inventory_database: str
@@ -112,10 +116,19 @@ def inner(x):
                 return dict(result)
             return x

-        return inner(self)
+        serialized = inner(self)
+        serialized["version"] = _CONFIG_VERSION
+        return serialized

     @classmethod
     def from_dict(cls, raw: dict) -> "MigrationConfig":
+        stored_version = raw.get("version", None)
+        if stored_version != _CONFIG_VERSION:
+            msg = (
+                f"Unsupported config version: {stored_version}. "
+                f"UCX v{__version__} expects config version to be {_CONFIG_VERSION}"
+            )
+            raise ValueError(msg)
         return cls(
             inventory_database=raw.get("inventory_database"),
             tacl=TaclConfig.from_dict(raw.get("tacl", {})),
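
The version guard added to `MigrationConfig` can be illustrated with a standalone round trip. This is a minimal sketch, assuming a single-field stand-in class: `ExampleConfig` is hypothetical, and the real `MigrationConfig` has more fields and a different `as_dict` implementation.

```python
from dataclasses import asdict, dataclass

_CONFIG_VERSION = 1  # same idea as the constant added in config.py


@dataclass
class ExampleConfig:
    """Hypothetical stand-in for MigrationConfig with a single field."""

    inventory_database: str

    def as_dict(self) -> dict:
        serialized = asdict(self)
        # Stamp the schema version so a later reader can detect mismatches.
        serialized["version"] = _CONFIG_VERSION
        return serialized

    @classmethod
    def from_dict(cls, raw: dict) -> "ExampleConfig":
        stored_version = raw.get("version")
        if stored_version != _CONFIG_VERSION:
            msg = f"Unsupported config version: {stored_version}, expected {_CONFIG_VERSION}"
            raise ValueError(msg)
        return cls(inventory_database=raw["inventory_database"])
```

A dictionary written before the `version` key existed has no version, so `raw.get("version")` returns `None` and the config is rejected rather than silently loaded.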