Merge branch 'main' into anya/helpPage
atbida authored Dec 13, 2024
2 parents a1a96b0 + a6b78a9 commit 3f6f548
Showing 5 changed files with 258 additions and 14 deletions.
15 changes: 8 additions & 7 deletions docs/architecture/self-hosted/download-logs.md
@@ -8,12 +8,15 @@ tags:
- support logs
---

As an admin user, you can download your environment logs from Prophecy without needing access to your Prophecy cluster or the assistance of Prophecy Support. This reduces the delay in debugging any issues with your Prophecy services.
:::info
This feature requires Prophecy 3.4.1.0 or later.
:::

Prophecy lets you download logs to help debug issues with your Prophecy services. As an admin, you can download environment logs directly from Prophecy without needing access to your Prophecy cluster or assistance from Prophecy Support.

## Use case
![Download logs](../../release_notes/2024/img/nov-download-logs.png)

- The overarching objective is to debug what is going on with your Prophecy services.
- To achieve this, we've enabled admins to be able to download Prophecy logs and environment information so that they can upload them to Zendesk.
This streamlines the process and reduces delays in troubleshooting by enabling you to quickly share logs and environment details with Zendesk for further analysis.

### Download logs features

@@ -24,9 +27,7 @@ You can use the Download Logs feature to capture logs using the Services and tim

The captured logs include all relevant Prophecy configurations, such as the following items:

- Kubernetes cluster configuration
- Resource quotas
- Node configuration
- Kubernetes cluster configuration, including resource quotas and node configuration
- Cluster custom resources
- Config maps and files
- Resource consumption logs
4 changes: 4 additions & 0 deletions docs/architecture/self-hosted/self-hosted.md
@@ -21,6 +21,10 @@ There are a couple of ways to install the self-managed version of Prophecy:

Once Prophecy is installed, you'll have to manually perform upgrades, backups, restores, etc.

:::info
For personalized help, contact [support](https://prophecy.zendesk.com/). If applicable, send along your [logs](download-logs.md) to help us troubleshoot.
:::

## Spark

After installation, Prophecy requires the following for interactive development:
14 changes: 7 additions & 7 deletions docs/deployment/prophecy-build-tool/pbt-github-actions.md
@@ -17,15 +17,15 @@ tags:

## [Example GitHub Repo](https://github.com/prophecy-samples/external-cicd-template)

## Integrating with GitHub Actions
## Integrate with GitHub Actions

PBT can be integrated with your own CI/CD solution to build, test and deploy Prophecy code. The steps for setting up PBT with GitHub Actions on your repository containing a Prophecy project are mentioned below.
PBT can be integrated with your own CI/CD solution to build, test, and deploy Prophecy code. The steps for setting up PBT with GitHub Actions on your repository containing a Prophecy project are mentioned below.

### Pre-requisite

- A Prophecy project that is currently hosted in a GitHub repository

### Setting up environment variables and secrets
### Set up environment variables and secrets

PBT requires environment variables **DATABRICKS_URL** and **DATABRICKS_TOKEN** to be set for complete functionality.

@@ -39,7 +39,7 @@ Steps:
Screenshot after setting DATABRICKS_TOKEN secret:
![GitHub Actions Secret addition](img/pbt-github-secret.png)

The environment variables can now be all set within the GitHub actions YML file as follows:
The environment variables can also be set within the GitHub actions YML file as follows:

```yaml
env:
@@ -49,18 +49,18 @@ DATABRICKS_TOKEN: ${{ secrets.DATABRICKS_TOKEN }}
The complete YML file definition is discussed in the next section.
### Setting up a GitHub Actions Workflow on every push to prod branch
### Set up a GitHub Actions Workflow on every push to prod branch
We’re now ready to set up CI/CD on the Prophecy project.
To set up a workflow that builds, runs all unit tests, and then deploys the built jar (Scala) or whl (Python) on Databricks automatically on every push to the `prod` branch:

- Create a .YML file in the project repository at the below location (relative to root)
- Create a .YML file in the project repository at the below location (relative to root):

```
.github/workflows/exampleWorkflow.yml
```

- Add the below contents to **exampleWorkflow.yml**
- Add the below contents to `exampleWorkflow.yml`:

```yaml
name: Example CI/CD with GitHub actions
239 changes: 239 additions & 0 deletions docs/metadata/lineage/lineage-extractor.md
@@ -0,0 +1,239 @@
---
title: Lineage extractor
id: lineage-extractor
description: Prophecy lineage extractor on GitHub Actions
sidebar_position: 4
tags:
- metadata
- lineage
- extractor
- github actions
---

The Prophecy lineage extractor tool extracts lineage information from Prophecy projects and Pipelines. It allows you to specify a project, Pipeline, and branch, and outputs the extracted lineage to a specified directory. You can also optionally set up email notifications.

## Python command

```
python -m prophecy_lineage_extractor --project-id <PROJECT_ID> --pipeline-id <PIPELINE_ID> --output-dir <OUTPUT_DIRECTORY> [--send-email] [--branch <BRANCH_NAME>]
```
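
The command above can also be assembled from a script. The following is a minimal sketch, assuming you want to build the argument list in Python before handing it to a process runner. The helper name `build_extractor_command` is hypothetical and not part of the extractor package. Only the flags shown in the command above are used.

```python
from typing import List, Optional


def build_extractor_command(
    project_id: str,
    pipeline_id: str,
    output_dir: str,
    branch: Optional[str] = None,
    send_email: bool = False,
) -> List[str]:
    """Build the argv list for `python -m prophecy_lineage_extractor`.

    Hypothetical helper for illustration; it only mirrors the documented flags.
    """
    cmd = [
        "python", "-m", "prophecy_lineage_extractor",
        "--project-id", project_id,
        "--pipeline-id", pipeline_id,
        "--output-dir", output_dir,
    ]
    # Optional flags are appended only when requested.
    if send_email:
        cmd.append("--send-email")
    if branch:
        cmd += ["--branch", branch]
    return cmd
```

The returned list can be passed to `subprocess.run(cmd, check=True)`. Passing a list rather than a single shell string avoids quoting problems with pipeline IDs that contain slashes.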

### Arguments

| Argument | Type | Description | Required |
| :-------------- | :--- | :------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | :------- |
| `--project-id` | str | Prophecy Project ID | True |
| `--pipeline-id` | str | Prophecy Pipeline ID | True |
| `--output-dir` | str | Output directory inside the project where lineage files will be stored | True |
| `--send-email` | flag | If specified, sends an email with the generated lineage report to the address in the `RECEIVER_EMAIL` environment variable. If you pass this option, you must set the following environment variables: <br /><br /> <ul><li>`SMTP_HOST`</li><li>`SMTP_PORT`</li><li>`SMTP_USERNAME`</li><li>`SMTP_PASSWORD`</li><li>`RECEIVER_EMAIL`</li></ul> | False |
| `--branch` | str | Branch to run the lineage extractor on. <br /> The default branch in Prophecy is generally 'main'. | False |

## Integrate with GitHub Actions

The lineage extractor can be integrated with your GitHub Actions. The steps for setting up the lineage extractor with GitHub Actions on your repository containing a Prophecy project are mentioned below.

### Prerequisite

- A Prophecy project that is currently hosted in a GitHub repository

### Set up environment variables and secrets

The lineage extractor requires environment variables `PROPHECY_URL` and `PROPHECY_PAT` to be set for complete functionality.

Optionally, if you choose to set up email notifications, you must also set secrets for your `SMTP_USERNAME` and `SMTP_PASSWORD`.

These environment variables can be set as secrets inside the GitHub repository of the project. For more information, see [Set up environment variables and secrets](../../deployment/prophecy-build-tool/pbt-github-actions.md#set-up-environment-variables-and-secrets).
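
Before running the extractor, it can help to confirm that the expected variables are actually present. The following is a minimal sketch, assuming a pre-flight check written in Python. The function name `missing_env_vars` is hypothetical; the variable names come from this page.

```python
import os
from typing import List, Mapping

# Always required, per this page.
REQUIRED_VARS = ("PROPHECY_URL", "PROPHECY_PAT")
# Required only when --send-email is passed.
EMAIL_VARS = ("SMTP_HOST", "SMTP_PORT", "SMTP_USERNAME", "SMTP_PASSWORD", "RECEIVER_EMAIL")


def missing_env_vars(
    send_email: bool = False,
    environ: Mapping[str, str] = os.environ,
) -> List[str]:
    """Return the names of required variables that are unset or empty."""
    needed = REQUIRED_VARS + (EMAIL_VARS if send_email else ())
    return [name for name in needed if not environ.get(name)]
```

An empty list means every required variable is set. In a CI job, a non-empty result is a reasonable point to stop with an error that names the missing variables.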

The environment variables can also be set within the GitHub actions YML file as follows:

```yaml
env:
  PROPHECY_PAT: ${{ secrets.PROPHECY_PAT }}
  SMTP_USERNAME: ${{ secrets.SMTP_USERNAME }}
  SMTP_PASSWORD: ${{ secrets.SMTP_PASSWORD }}
```

The complete YML file definition is discussed in the next section.

### Run the lineage extractor

We’re now ready to run the lineage extractor on the Prophecy project.

To run the extractor, use the following example with your own environment variables:

:::note
You only need to provide SMTP credentials if you plan to pass the `--send-email` argument.

:::

```
export PROPHECY_URL=https://app.prophecy.io
export PROPHECY_PAT=${{ secrets.PROPHECY_PAT }}
export SMTP_HOST=smtp.gmail.com
export SMTP_PORT=587
export SMTP_USERNAME=${{ secrets.SMTP_USERNAME }}
export SMTP_PASSWORD=${{ secrets.SMTP_PASSWORD }}
export RECEIVER_EMAIL=[email protected]
python -m prophecy_lineage_extractor --project-id 36587 --pipeline-id 36587/pipelines/customer_orders_demo --output-dir output --send-email --branch dev
```

- Create a .YML file in the project repository at the below location (relative to root):

```
.github/workflows/prophecy_lineage_extractor.yml
```

- Add the below contents with your own environment variables to `prophecy_lineage_extractor.yml`:

<details>
<summary>On the default branch</summary>

```
name: Run Prophecy Lineage extractor on main
on:
  push:
    branches:
      - main # Trigger on merge to the main branch
    paths:
      - 'datasets/**'
      - 'pipelines/**'
      - 'pbt_project.yml'
      - '.github/workflows/prophecy_lineage_extractor.yml'
permissions:
  contents: write
jobs:
  extract-and-mail-prophecy-lineage:
    runs-on: ubuntu-latest
    env:
      OUTPUT_DIR: "output"
    steps:
      - uses: actions/checkout@v3
      - name: Set up Python
        uses: actions/setup-python@v4
        with:
          python-version: '3.9' # Adjust Python version as needed
      - name: Install Package from PyPI
        run: |
          pip install --no-cache-dir prophecy-lineage-extractor
      - name: Extract and Send Prophecy Lineage
        env:
          PROPHECY_URL: "https://app.prophecy.io"
          MONITOR_TIME_ENV: ${{ vars.MONITOR_TIME_ENV }}
          PROPHECY_PAT: ${{ secrets.PROPHECY_PAT }}
          SMTP_HOST: "smtp.gmail.com"
          SMTP_PORT: "587"
          SMTP_USERNAME: ${{ secrets.SMTP_USERNAME }}
          SMTP_PASSWORD: ${{ secrets.SMTP_PASSWORD }}
          RECEIVER_EMAIL: "[email protected]"
        run: |
          python -m prophecy_lineage_extractor --project-id 36587 --pipeline-id 36587/pipelines/customer_orders_demo --send-email --output-dir $OUTPUT_DIR
      - name: Commit file to output directory
        env:
          GIT_COMMIT: ${{ vars.GIT_COMMIT }} # whether to commit output file to github
        run: |
          # Set the GIT_COMMIT repository variable to "1" to enable git commits
          echo "Output Directory: '$OUTPUT_DIR'"
          if [[ $GIT_COMMIT == "1" ]]; then
            git config --global user.name 'pateash'
            git config --global user.email '[email protected]'
            echo "Committing enabled, adding output file"
            git add $OUTPUT_DIR/*
            echo "========================================"
            git commit -m "[Github Action: main]: Adding excel lineage report"
            echo "========================================"
            echo "Pushing Changes to git"
            git push
          else
            # simple version are created manually from code edits.
            echo "Committing to git is not enabled"
          fi
```

</details>

<details>
<summary> On a custom branch</summary>

```
name: Run Prophecy Lineage extractor on dev
on:
  push:
    branches:
      - dev # Trigger on merge to the dev branch
    paths:
      - 'datasets/**'
      - 'pipelines/**'
      - 'pbt_project.yml'
      - '.github/workflows/prophecy_lineage_extractor_dev.yml'
permissions:
  contents: write
jobs:
  extract-and-mail-prophecy-lineage:
    runs-on: ubuntu-latest
    env:
      OUTPUT_DIR: "output_dev"
    steps:
      - uses: actions/checkout@v3
      - name: Set up Python
        uses: actions/setup-python@v4
        with:
          python-version: '3.9' # Adjust Python version as needed
      - name: Install Package from PyPI
        run: |
          pip install --no-cache-dir prophecy-lineage-extractor
      - name: Extract and Send Prophecy Lineage
        env:
          PROPHECY_URL: "https://app.prophecy.io"
          MONITOR_TIME_ENV: ${{ vars.MONITOR_TIME_ENV }}
          PROPHECY_PAT: ${{ secrets.PROPHECY_PAT }}
          SMTP_HOST: "smtp.gmail.com"
          SMTP_PORT: "587"
          SMTP_USERNAME: ${{ secrets.SMTP_USERNAME }}
          SMTP_PASSWORD: ${{ secrets.SMTP_PASSWORD }}
          RECEIVER_EMAIL: "[email protected]"
        run: |
          python -m prophecy_lineage_extractor --project-id 36587 --pipeline-id 36587/pipelines/customer_orders_demo --send-email --output-dir $OUTPUT_DIR --branch dev
      - name: Commit file to output directory
        env:
          GIT_COMMIT: ${{ vars.GIT_COMMIT }} # Reference the GitHub variable here
        run: |
          # Set the GIT_COMMIT repository variable to "1" to enable git commits
          echo "output dir '$OUTPUT_DIR'"
          if [[ $GIT_COMMIT == "1" ]]; then
            git config --global user.name 'pateash'
            git config --global user.email '[email protected]'
            echo "Committing enabled, adding output file"
            git add $OUTPUT_DIR/*
            echo "========================================"
            git commit -m "[Github Action: dev]: Adding excel lineage report"
            echo "========================================"
            echo "Pushing Changes to git"
            git push
          else
            # simple version are created manually from code edits.
            echo "Committing to git is not enabled"
          fi
```

</details>

## Output example

The lineage extractor output is in the form of an XLSX file.

![Lineage extractor output](./img/prophecy-lineage-report-for-pipeline.png)
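
Because an XLSX file is a ZIP archive, the Python standard library is enough to confirm that a run produced a readable report. The following is a minimal sketch, assuming the report is written somewhere under the directory passed to `--output-dir`. The exact file name is not documented here, so the sketch searches for any `.xlsx` file. The function name `find_lineage_reports` is hypothetical.

```python
import zipfile
from pathlib import Path
from typing import List


def find_lineage_reports(output_dir: str) -> List[Path]:
    """Return the .xlsx files under output_dir that are valid ZIP archives."""
    root = Path(output_dir)
    return sorted(
        path
        for path in root.rglob("*.xlsx")
        # A truncated or corrupt workbook fails the ZIP check and is skipped.
        if path.is_file() and zipfile.is_zipfile(path)
    )
```

This check only verifies the container format. To inspect sheet contents, you would need a spreadsheet library, which is outside the scope of this sketch.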
