Simplify publishing of documentation (#12892)
Close: #11423
Close: #11152
mik-laj authored Dec 9, 2020
1 parent 73843d0 commit e595d35
Showing 8 changed files with 468 additions and 179 deletions.
40 changes: 39 additions & 1 deletion dev/README_RELEASE_AIRFLOW.md
@@ -35,6 +35,7 @@
- [Publish release to SVN](#publish-release-to-svn)
- [Prepare PyPI "release" packages](#prepare-pypi-release-packages)
- [Update CHANGELOG.md](#update-changelogmd)
- [Publish documentation](#publish-documentation)
- [Notify developers of release](#notify-developers-of-release)
- [Update Announcements page](#update-announcements-page)

@@ -551,6 +552,43 @@ At this point we release an official package:
- Update CHANGELOG.md with the details, and commit it.
## Publish documentation
Documentation is an essential part of the product and should be made available to users.
In our case, documentation for released versions is published in a separate repository, [`apache/airflow-site`](https://github.com/apache/airflow-site), while the documentation source code and build tools live in the `apache/airflow` repository, so you have to coordinate between the two repositories to build and publish the documentation.
Documentation for Apache Airflow itself can be found in the ``/docs/apache-airflow`` directory.
- First, clone the airflow-site repository and set the environment variable ``AIRFLOW_SITE_DIRECTORY``.
```shell script
git clone https://github.com/apache/airflow-site.git airflow-site
cd airflow-site
export AIRFLOW_SITE_DIRECTORY="$(pwd)"
```
- Then go to the Airflow repository root and build the necessary documentation packages.
```shell script
cd "${AIRFLOW_REPO_ROOT}"
./breeze build-docs -- --package apache-airflow --for-production
```
- Now you can preview the documentation.
```shell script
./docs/start_doc_server.sh
```
- Copy the documentation to the ``airflow-site`` repository, create a commit, and push the changes.
```shell script
./docs/publish_docs.py --package apache-airflow
cd "${AIRFLOW_SITE_DIRECTORY}"
git add .
git commit -m "Add documentation for Apache Airflow ${VERSION}"
git push
```
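
The publish step above depends on `AIRFLOW_SITE_DIRECTORY` pointing at a clone of `apache/airflow-site`. As a minimal sketch of a pre-flight check, the following could be run before publishing. The helper name `resolve_site_directory` and the `.git` check are assumptions made for illustration; they are not part of the Airflow tooling.

```python
import os
from pathlib import Path
from typing import Mapping, Optional


def resolve_site_directory(env: Optional[Mapping[str, str]] = None) -> Path:
    """Return the airflow-site checkout named by AIRFLOW_SITE_DIRECTORY.

    Hypothetical helper: it only verifies that the variable is set and that
    the directory looks like a git clone (it contains a ``.git`` entry).
    """
    env = os.environ if env is None else env
    value = env.get("AIRFLOW_SITE_DIRECTORY")
    if not value:
        raise SystemExit("AIRFLOW_SITE_DIRECTORY is not set; clone apache/airflow-site first.")
    site_dir = Path(value)
    if not (site_dir / ".git").exists():
        raise SystemExit(f"{site_dir} does not look like a git clone.")
    return site_dir
```

Failing early with `SystemExit` mirrors how the build script itself reports misuse, and avoids copying documentation into an unrelated directory.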
## Notify developers of release
- Notify [email protected] (cc'ing [email protected] and [email protected]) that
@@ -583,7 +621,7 @@ https://pypi.python.org/pypi/apache-airflow
The documentation is available on:
https://airflow.apache.org/
-https://airflow.apache.org/docs/${VERSION}/
+https://airflow.apache.org/docs/apache-airflow/${VERSION}/
Find the CHANGELOG here for more details:
53 changes: 53 additions & 0 deletions dev/README_RELEASE_PROVIDER_PACKAGES.md
@@ -41,6 +41,7 @@
- [Build and sign the source and convenience packages](#build-and-sign-the-source-and-convenience-packages-1)
- [Commit the source packages to Apache SVN repo](#commit-the-source-packages-to-apache-svn-repo-1)
- [Publish the Regular convenience package to PyPI](#publish-the-regular-convenience-package-to-pypi)
- [Publish documentation](#publish-documentation)
- [Notify developers of release](#notify-developers-of-release)

<!-- END doctoc generated TOC please keep comment here to allow auto update -->
@@ -884,6 +885,58 @@ twine upload -r pypi dist/*

* Again, confirm that the packages are available under the links printed.

## Publish documentation

Documentation is an essential part of the product and should be made available to users.
In our case, documentation for released versions is published in a separate repository, [`apache/airflow-site`](https://github.com/apache/airflow-site), while the documentation source code and build tools live in the `apache/airflow` repository, so you have to coordinate between the two repositories to build and publish the documentation.

Documentation for providers can be found in the `/docs/apache-airflow-providers` directory and in the `/docs/apache-airflow-providers-*/` directories. The first directory contains the package contents lists and should be updated every time a new version of provider packages is released.

- First, clone the airflow-site repository and set the environment variable ``AIRFLOW_SITE_DIRECTORY``.

```shell script
git clone https://github.com/apache/airflow-site.git airflow-site
cd airflow-site
export AIRFLOW_SITE_DIRECTORY="$(pwd)"
```

- Then go to the Airflow repository root and build the necessary documentation packages.

```shell script
cd "${AIRFLOW_REPO_ROOT}"
./breeze build-docs -- \
  --package apache-airflow-providers \
  --package apache-airflow-providers-apache-airflow \
  --package apache-airflow-providers-telegram \
  --for-production
```

- Now you can preview the documentation.

```shell script
./docs/start_doc_server.sh
```

- Copy the documentation to the ``airflow-site`` repository

```shell script
./docs/publish_docs.py \
  --package apache-airflow-providers \
  --package apache-airflow-providers-apache-airflow \
  --package apache-airflow-providers-telegram
cd "${AIRFLOW_SITE_DIRECTORY}"
```

- If you publish a new package, you must add it to [the docs index](https://github.com/apache/airflow-site/blob/master/landing-pages/site/content/en/docs/_index.md).

- Create a commit and push the changes.

```shell script
git add .
git commit -m "Add documentation for backport packages - $(date "+%Y-%m-%d%n")"
git push
```
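
When several provider packages are published at once, the repeated `--package` flags are easy to get wrong, for example by leaving a trailing backslash on the last one. The sketch below assembles the argument list programmatically. The helper name `build_publish_command` is hypothetical; only the script path and the `--package` flag are taken from the steps above.

```python
import shlex
from typing import List, Sequence


def build_publish_command(packages: Sequence[str]) -> List[str]:
    """Build the argv for ./docs/publish_docs.py with one --package per name."""
    cmd = ["./docs/publish_docs.py"]
    for name in packages:
        cmd += ["--package", name]
    return cmd


if __name__ == "__main__":
    argv = build_publish_command(
        ["apache-airflow-providers", "apache-airflow-providers-telegram"]
    )
    # Print a copy-pasteable single-line command.
    print(" ".join(shlex.quote(part) for part in argv))
```

Passing the list to `subprocess.run` directly, rather than joining it into a string, avoids shell quoting issues altogether.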

## Notify developers of release

- Notify [email protected] (cc'ing [email protected] and [email protected]) that
182 changes: 19 additions & 163 deletions docs/build_docs.py
@@ -17,44 +17,34 @@
# under the License.
import argparse
import fnmatch
import os
import re
import shlex
import shutil
import sys
from collections import defaultdict
from contextlib import contextmanager
from glob import glob
from subprocess import run
from tempfile import NamedTemporaryFile, TemporaryDirectory
from typing import Dict, List, Optional, Tuple

from tabulate import tabulate

from docs.exts.docs_build import dev_index_generator, lint_checks # pylint: disable=no-name-in-module
from docs.exts.docs_build.docs_builder import (  # pylint: disable=no-name-in-module
    DOCS_DIR,
    AirflowDocsBuilder,
    get_available_packages,
)
from docs.exts.docs_build.errors import (  # pylint: disable=no-name-in-module
    DocBuildError,
    display_errors_summary,
    parse_sphinx_warnings,
)
from docs.exts.docs_build.github_action_utils import with_group  # pylint: disable=no-name-in-module
from docs.exts.docs_build.spelling_checks import (  # pylint: disable=no-name-in-module
    SpellingError,
    display_spelling_error_summary,
    parse_spelling_warnings,
)
from docs.exts.provider_yaml_utils import load_package_data # pylint: disable=no-name-in-module

if __name__ != "__main__":
-    raise Exception(
+    raise SystemExit(
        "This file is intended to be executed as an executable program. You cannot use it as a module. "
        "To run this script, run the ./build_docs.py command"
    )

ROOT_PROJECT_DIR = os.path.abspath(os.path.join(os.path.dirname(os.path.realpath(__file__)), os.pardir))
ROOT_PACKAGE_DIR = os.path.join(ROOT_PROJECT_DIR, "airflow")
DOCS_DIR = os.path.join(ROOT_PROJECT_DIR, "docs")
ALL_PROVIDER_YAMLS = load_package_data()

CHANNEL_INVITATION = """\
If you need help, write to #documentation channel on Airflow's Slack.
Channel link: https://apache-airflow.slack.com/archives/CJ1LVREHX
@@ -68,150 +58,6 @@
]


@contextmanager
def with_group(title):
    """
    If used in Github Action, creates an expandable group in the Github Action log.
    Otherwise, display simple text groups.

    For more information, see:
    https://docs.github.com/en/free-pro-team@latest/actions/reference/workflow-commands-for-github-actions#grouping-log-lines
    """
    if os.environ.get('GITHUB_ACTIONS', 'false') != "true":
        print("#" * 20, title, "#" * 20)
        yield
        return
    print(f"::group::{title}")
    yield
    print("\033[0m")
    print("::endgroup::")
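
To illustrate how the helper above is meant to be used, here is a self-contained sketch. It repeats the function body from the diff so that it runs on its own; the `demo` function and its message are illustrative only.

```python
import os
from contextlib import contextmanager


@contextmanager
def with_group(title):
    """Same logic as the helper in the diff, repeated so the sketch runs alone."""
    if os.environ.get('GITHUB_ACTIONS', 'false') != "true":
        # Outside GitHub Actions: print a plain text banner.
        print("#" * 20, title, "#" * 20)
        yield
        return
    # Inside GitHub Actions: open a collapsible log group.
    print(f"::group::{title}")
    yield
    print("\033[0m")  # reset colors so they do not leak out of the group
    print("::endgroup::")


def demo():
    """Illustrative caller: everything printed inside the block is grouped."""
    with with_group("Building docs: apache-airflow"):
        print("sphinx-build output would appear here")


if __name__ == "__main__":
    demo()
```

Note that the closing marker is not wrapped in `try`/`finally`, so an exception raised inside the block leaves the group open; whether that matters depends on how the caller handles failures.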


class AirflowDocsBuilder:
    """Documentation builder for Airflow."""

    def __init__(self, package_name: str):
        self.package_name = package_name

    @property
    def _doctree_dir(self) -> str:
        return f"{DOCS_DIR}/_doctrees/docs/{self.package_name}"

    @property
    def _out_dir(self) -> str:
        if self.package_name == 'apache-airflow-providers':
            # Disable versioning. This documentation does not apply to any issued product and we can update
            # it as needed, i.e. with each new package of providers.
            return f"{DOCS_DIR}/_build/docs/{self.package_name}"
        else:
            return f"{DOCS_DIR}/_build/docs/{self.package_name}/latest"

    @property
    def _src_dir(self) -> str:
        return f"{DOCS_DIR}/{self.package_name}"

    def clean_files(self) -> None:
        """Cleanup all artifacts generated by previous builds."""
        api_dir = os.path.join(self._src_dir, "_api")

        shutil.rmtree(api_dir, ignore_errors=True)
        shutil.rmtree(self._out_dir, ignore_errors=True)
        os.makedirs(api_dir, exist_ok=True)
        os.makedirs(self._out_dir, exist_ok=True)

        print(f"Recreated content of the {shlex.quote(self._out_dir)} and {shlex.quote(api_dir)} folders")

    def check_spelling(self):
        """Checks spelling."""
        spelling_errors = []
        with TemporaryDirectory() as tmp_dir, with_group(f"Check spelling: {self.package_name}"):
            build_cmd = [
                "sphinx-build",
                "-W",  # turn warnings into errors
                "-T",  # show full traceback on exception
                "-b",  # builder to use
                "spelling",
                "-c",
                DOCS_DIR,
                "-d",  # path for the cached environment and doctree files
                self._doctree_dir,
                self._src_dir,  # path to documentation source files
                tmp_dir,
            ]
            print("Executing cmd: ", " ".join([shlex.quote(c) for c in build_cmd]))
            env = os.environ.copy()
            env['AIRFLOW_PACKAGE_NAME'] = self.package_name
            completed_proc = run(  # pylint: disable=subprocess-run-check
                build_cmd, cwd=self._src_dir, env=env
            )
            if completed_proc.returncode != 0:
                spelling_errors.append(
                    SpellingError(
                        file_path=None,
                        line_no=None,
                        spelling=None,
                        suggestion=None,
                        context_line=None,
                        message=(
                            f"Sphinx spellcheck returned non-zero exit status: {completed_proc.returncode}."
                        ),
                    )
                )
                warning_text = ""
                for filepath in glob(f"{tmp_dir}/**/*.spelling", recursive=True):
                    with open(filepath) as spelling_file:
                        warning_text += spelling_file.read()

                spelling_errors.extend(parse_spelling_warnings(warning_text, self._src_dir))
        return spelling_errors

    def build_sphinx_docs(self) -> List[DocBuildError]:
        """Build Sphinx documentation"""
        build_errors = []
        with NamedTemporaryFile() as tmp_file, with_group(f"Building docs: {self.package_name}"):
            build_cmd = [
                "sphinx-build",
                "-T",  # show full traceback on exception
                "--color",  # do emit colored output
                "-b",  # builder to use
                "html",
                "-d",  # path for the cached environment and doctree files
                self._doctree_dir,
                "-c",
                DOCS_DIR,
                "-w",  # write warnings (and errors) to given file
                tmp_file.name,
                self._src_dir,  # path to documentation source files
                self._out_dir,  # path to output directory
            ]
            print("Executing cmd: ", " ".join([shlex.quote(c) for c in build_cmd]))
            env = os.environ.copy()
            env['AIRFLOW_PACKAGE_NAME'] = self.package_name
            completed_proc = run(  # pylint: disable=subprocess-run-check
                build_cmd, cwd=self._src_dir, env=env
            )
            if completed_proc.returncode != 0:
                build_errors.append(
                    DocBuildError(
                        file_path=None,
                        line_no=None,
                        message=f"Sphinx returned non-zero exit status: {completed_proc.returncode}.",
                    )
                )
            tmp_file.seek(0)
            warning_text = tmp_file.read().decode()
            # Remove 7-bit C1 ANSI escape sequences
            warning_text = re.sub(r"\x1B[@-_][0-?]*[ -/]*[@-~]", "", warning_text)
            build_errors.extend(parse_sphinx_warnings(warning_text, self._src_dir))
        return build_errors
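
Because the build runs with `--color`, the warnings file contains ANSI escape sequences that have to be stripped before parsing. The following standalone sketch applies the same regular expression to a sample string. The `strip_ansi` name and the sample warning text are made up for illustration.

```python
import re

# Same pattern as in build_sphinx_docs: ESC, one byte in the range @-_,
# then optional parameter bytes, optional intermediate bytes, and a final byte.
ANSI_ESCAPE = re.compile(r"\x1B[@-_][0-?]*[ -/]*[@-~]")


def strip_ansi(text: str) -> str:
    """Remove ANSI escape sequences such as color codes from the text."""
    return ANSI_ESCAPE.sub("", text)


colored = "\x1b[31mWARNING\x1b[0m: document isn't included in any toctree"
print(strip_ansi(colored))  # WARNING: document isn't included in any toctree
```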


def get_available_packages():
    """Get list of all available packages to build."""
    provider_package_names = [provider['package-name'] for provider in ALL_PROVIDER_YAMLS]
    return ["apache-airflow", *provider_package_names, "apache-airflow-providers"]
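
As a small sketch of the ordering this function produces, the version below takes the provider data as a parameter instead of reading the module-level `ALL_PROVIDER_YAMLS`. The sample entries are placeholders standing in for whatever `load_package_data()` returns, and the parameterized signature is an assumption made so the example runs on its own.

```python
from typing import Dict, List

# Illustrative stand-in for the data loaded from the provider.yaml files.
SAMPLE_PROVIDER_YAMLS: List[Dict[str, str]] = [
    {"package-name": "apache-airflow-providers-telegram"},
    {"package-name": "apache-airflow-providers-google"},
]


def get_available_packages(provider_yamls: List[Dict[str, str]]) -> List[str]:
    """Core package first, then each provider, then the providers index."""
    provider_package_names = [provider['package-name'] for provider in provider_yamls]
    return ["apache-airflow", *provider_package_names, "apache-airflow-providers"]


print(get_available_packages(SAMPLE_PROVIDER_YAMLS))
```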


def _get_parser():
    available_packages_list = " * " + "\n * ".join(get_available_packages())
    parser = argparse.ArgumentParser(
@@ -233,18 +79,25 @@ def _get_parser():
    parser.add_argument(
        '--spellcheck-only', dest='spellcheck_only', action='store_true', help='Only perform spellchecking'
    )
    parser.add_argument(
        '--for-production',
        dest='for_production',
        action='store_true',
        help=('Builds documentation for official release i.e. all links point to stable version'),
    )

    return parser
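
The new flag is a plain `store_true` switch, so it defaults to `False` when omitted. A reduced sketch with only the two flags visible in this hunk shows the behavior; the `get_parser` name and the description string are illustrative, and the real parser defines further options that are not shown here.

```python
import argparse


def get_parser() -> argparse.ArgumentParser:
    """Reduced parser with only the two boolean flags shown in the diff."""
    parser = argparse.ArgumentParser(description="Sketch of the build_docs.py flags.")
    parser.add_argument(
        '--spellcheck-only', dest='spellcheck_only', action='store_true', help='Only perform spellchecking'
    )
    parser.add_argument(
        '--for-production',
        dest='for_production',
        action='store_true',
        help='Builds documentation for official release i.e. all links point to stable version',
    )
    return parser


args = get_parser().parse_args(["--for-production"])
print(args.for_production, args.spellcheck_only)  # True False
```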


def build_docs_for_packages(
-    current_packages: List[str], docs_only: bool, spellcheck_only: bool
+    current_packages: List[str], docs_only: bool, spellcheck_only: bool, for_production: bool
) -> Tuple[Dict[str, List[DocBuildError]], Dict[str, List[SpellingError]]]:
    """Builds documentation for single package and returns errors"""
    all_build_errors: Dict[str, List[DocBuildError]] = defaultdict(list)
    all_spelling_errors: Dict[str, List[SpellingError]] = defaultdict(list)
    for package_name in current_packages:
        print("#" * 20, package_name, "#" * 20)
-        builder = AirflowDocsBuilder(package_name=package_name)
+        builder = AirflowDocsBuilder(package_name=package_name, for_production=for_production)
        builder.clean_files()
        if not docs_only:
            spelling_errors = builder.check_spelling()
@@ -309,6 +162,7 @@ def main():
    spellcheck_only = args.spellcheck_only
    disable_checks = args.disable_checks
    package_filters = args.package_filter
    for_production = args.for_production

    print("Current package filters: ", package_filters)
    current_packages = (
@@ -326,6 +180,7 @@ def main():
current_packages=current_packages,
docs_only=docs_only,
spellcheck_only=spellcheck_only,
for_production=for_production,
)
if package_build_errors:
all_build_errors.update(package_build_errors)
@@ -347,6 +202,7 @@ def main():
current_packages=to_retry_packages,
docs_only=docs_only,
spellcheck_only=spellcheck_only,
for_production=for_production,
)
if package_build_errors:
all_build_errors.update(package_build_errors)