Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

chore: bake synthtool and owl-bot into library_generation docker image #2615

Merged
merged 41 commits into from
May 1, 2024
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
Show all changes
41 commits
Select commit Hold shift + click to select a range
0b65f15
chore: bake synthtool into library_generation docker image
diegomarquezp Apr 2, 2024
a6bb064
remove synthtool usage
diegomarquezp Apr 2, 2024
d4f27e2
do not use config file
diegomarquezp Apr 2, 2024
19a0b04
bake synthtool in dockerfile
diegomarquezp Apr 2, 2024
bc693b5
bake copy-code into the image
diegomarquezp Apr 2, 2024
f335cbb
remove unnecessary SHELL statement
diegomarquezp Apr 2, 2024
aab50ef
add info for installing synthtool
diegomarquezp Apr 2, 2024
f97cdca
modify owl-bot usage in postprocess_library
diegomarquezp Apr 2, 2024
d68c14a
Merge remote-tracking branch 'origin/main' into transfer-synthtool-java
diegomarquezp Apr 10, 2024
eca0d92
fix synthtool installation
diegomarquezp Apr 10, 2024
6aa8533
change permissions to script folder
diegomarquezp Apr 10, 2024
53c5dea
assume location of generation_config.yaml
diegomarquezp Apr 10, 2024
cc4309b
fix permissions
diegomarquezp Apr 10, 2024
7a4b6b8
fix path to config yaml
diegomarquezp Apr 10, 2024
4838d5b
use entrypoint for docker image
diegomarquezp Apr 11, 2024
1b01713
format
diegomarquezp Apr 11, 2024
c149461
finish development guide
diegomarquezp Apr 11, 2024
4ee1692
remove olwbot_cli usage
diegomarquezp Apr 11, 2024
e77bacc
add entrypoint for docker image
diegomarquezp Apr 17, 2024
9120c94
manually set HOME var in owlbot
diegomarquezp Apr 18, 2024
3d18426
Merge remote-tracking branch 'origin/main' into transfer-synthtool-java
diegomarquezp Apr 18, 2024
427c564
post-merge cleanup
diegomarquezp Apr 18, 2024
7b79763
fix parameters
diegomarquezp Apr 18, 2024
240742a
fix synthtool sha
diegomarquezp Apr 18, 2024
8498f2c
fix entrypoint
diegomarquezp Apr 19, 2024
f893e5c
Merge remote-tracking branch 'origin/main' into transfer-synthtool-java
diegomarquezp Apr 19, 2024
2faf45d
lint
diegomarquezp Apr 19, 2024
7ce9548
remove unused entrypoint file
diegomarquezp Apr 19, 2024
8474452
remove owlbot CLI from readmne
diegomarquezp Apr 19, 2024
b58e990
restore xtrace
diegomarquezp Apr 19, 2024
00cb2a7
correct comment
diegomarquezp Apr 19, 2024
b27a57e
remove test helper
diegomarquezp Apr 19, 2024
6c6432a
checkout owlbot cli committish
diegomarquezp Apr 19, 2024
eca8c62
remove owlbot config
diegomarquezp Apr 19, 2024
0041c6b
fix golden file
diegomarquezp Apr 19, 2024
7ec0f4c
Update library_generation/DEVELOPMENT.md
diegomarquezp Apr 29, 2024
4399bfe
docker instructions, pwd blurb, unit tests instructions
diegomarquezp Apr 29, 2024
ef105e7
Merge remote-tracking branch 'origin/main' into transfer-synthtool-java
diegomarquezp Apr 29, 2024
b1d3fc3
Merge remote-tracking branch 'origin/transfer-synthtool-java' into tr…
diegomarquezp Apr 29, 2024
1df1aa1
restore generate repo
diegomarquezp Apr 30, 2024
bf95938
move text editor installation as a readme note
diegomarquezp Apr 30, 2024
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
66 changes: 54 additions & 12 deletions .cloudbuild/library_generation/library_generation.Dockerfile
Original file line number Diff line number Diff line change
Expand Up @@ -15,26 +15,68 @@
# build from the root of this repo:
FROM gcr.io/cloud-devrel-public-resources/python

# install tools
ARG SYNTHTOOL_COMMITTISH=63cc541da2c45fcfca2136c43e638da1fbae174d
ARG OWLBOT_CLI_COMMITTISH=ac84fa5c423a0069bbce3d2d869c9730c8fdf550
ENV HOME=/home

# install OS tools
RUN apt-get update && apt-get install -y \
unzip openjdk-17-jdk rsync maven jq \
&& apt-get clean

COPY library_generation /src

# use python 3.11 (the base image has several python versions; here we define the default one)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because the base image cloud-devrel-public-resources still uses 3.11 so we have to use 3.11? Who maintains cloud-devrel-public-resources?

Copy link
Contributor Author

@diegomarquezp diegomarquezp May 1, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so we have to use 3.11?

We could use a higher version too, or even a lower one if it's still compatible with the dependencies

who maintains cloud-devrel-public-resources?

Judging from a documentation entry in the cloud client libraries g3doc, I think it's the Cloud Client Libraries team
Maybe we can extract the base Dockerfile and prepend it to ours to eliminate this dependency.

RUN rm $(which python3)
RUN ln -s $(which python3.11) /usr/local/bin/python
RUN ln -s $(which python3.11) /usr/local/bin/python3
RUN python -m pip install --upgrade pip
RUN cd /src && python -m pip install -r requirements.txt
RUN cd /src && python -m pip install .

# set dummy git credentials for empty commit used in postprocessing
RUN git config --global user.email "[email protected]"
RUN git config --global user.name "Cloud Java Bot"
# copy source code
COPY library_generation /src

WORKDIR /workspace
RUN chmod 750 /workspace
RUN chmod 750 /src/generate_repo.py
# install scripts as a python package
WORKDIR /src
RUN python -m pip install -r requirements.txt
RUN python -m pip install .

# install synthtool
WORKDIR /tools
RUN git clone https://github.com/googleapis/synthtool
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the only reason we need synthtool is java.py and its dependencies?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, mainly the common_templates function. We could in theory move java.py and that function to sdk-platform

WORKDIR /tools/synthtool
RUN git checkout "${SYNTHTOOL_COMMITTISH}"
RUN python3 -m pip install --no-deps -e .
RUN python3 -m pip install -r requirements.in

CMD [ "/src/generate_repo.py" ]
# Install nvm with node and npm
ENV NODE_VERSION 20.12.0
WORKDIR /home
RUN curl -o- https://raw.githubusercontent.com/nvm-sh/nvm/v0.35.3/install.sh | bash
RUN chmod o+rx /home/.nvm
ENV NODE_PATH=/home/.nvm/versions/node/v${NODE_VERSION}/bin
ENV PATH=${PATH}:${NODE_PATH}
RUN node --version
RUN npm --version

# install the owl-bot CLI
WORKDIR /tools
RUN git clone https://github.com/googleapis/repo-automation-bots
WORKDIR /tools/repo-automation-bots/packages/owl-bot
RUN git checkout "${OWLBOT_CLI_COMMITTISH}"
RUN npm i && npm run compile && npm link
RUN owl-bot copy-code --version
RUN chmod -R o+rx ${NODE_PATH}
RUN ln -sf ${NODE_PATH}/* /usr/local/bin

# allow users to access the script folders
RUN chmod -R o+rx /src

# set dummy git credentials for the empty commit used in postprocessing
# we use system so all users using the container will use this configuration
RUN git config --system user.email "[email protected]"
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the only reason we need git config is from the post processing here? If it is, I think we should look into not initialize a Git repo, as a follow up enhancement though.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, but the root cause is that the owl-bot CLI assumes the source folder (the one we generate and arrange) is a git repository. Extra points to process owlbot yamls using a custom script.

RUN git config --system user.name "Cloud Java Bot"

# allow read-write for /home and execution for binaries in /home/.nvm
RUN chmod -R a+rw /home
RUN chmod -R a+rx /home/.nvm

WORKDIR /workspace
ENTRYPOINT [ "python", "/src/cli/entry_point.py", "generate" ]
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can the user run the image with different scripts/methods?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, I declared this ENTRYPOINT since it's the "official" one. If we want to run a different script, we can pass the --entrypoint /src/my_script.py flag

171 changes: 171 additions & 0 deletions library_generation/DEVELOPMENT.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,171 @@
> [!IMPORTANT]
> All examples assume you are inside the `library_generation` folder

# Linting

When contributing, ensure your changes to python code have a valid
format.

```
python -m pip install black
black .
```

# Running the integration tests
diegomarquezp marked this conversation as resolved.
Show resolved Hide resolved

The integration tests build the docker image declared in
`.cloudbuild/library_generation/library_generation.Dockerfile`, pull GAPIC
repositories, generate the libraries and compares the results with the source
code declared in a "golden branch" of the repo.

It requires docker and python 3.x to be installed.

```
python -m pip install .
python -m pip install -r requirements.txt
diegomarquezp marked this conversation as resolved.
Show resolved Hide resolved
python -m unittest test/integration_tests.py
```

# Running the unit tests

The unit tests of the hermetic build scripts are contained in several scripts,
corresponding to a specific component. Every unit test script ends with
`unit_tests.py`. To avoid them specifying them
individually, we can use the following command:

```bash
python -m unittest discover -s test/ -p "*unit_tests.py"
```

> [!NOTE]
> The output of this command may look erratic during the first 30 seconds.
> This is normal. After the tests are done, an "OK" message should be shown.
# Running the scripts in your local environment
diegomarquezp marked this conversation as resolved.
Show resolved Hide resolved

Although the scripts are designed to be run in a Docker container, you can also
run them directly. This section explains how to run the entrypoint script
(`library_generation/cli/entry_point.py`).

## Installing prerequisites

In order to run the generation scripts directly, there are a few tools we
need to install beforehand.

### Install synthtool

It requires python 3.x to be installed.
You will need to specify a committish of the synthtool repo in order to have
your generation results matching exactly what the docker image would produce.
You can achieve this by inspecting `SYNTHTOOL_COMMITISH` in
`.cloudbuild/library_generation/library_generation.Dockerfile`.

```bash
# obtained from .cloudbuild/library_generation/library_generation.Dockerfile
export SYNTHTOOL_COMMITTISH=6612ab8f3afcd5e292aecd647f0fa68812c9f5b5
```

```bash
git clone https://github.com/googleapis/synthtool
cd synthtool
git checkout "${SYNTHTOOL_COMMITTISH}"
python -m pip install --require-hashes -r requirements.txt
python -m pip install --no-deps -e .
python -m synthtool --version
```

### Install the owl-bot CLI

Requires node.js to be installed.
Check this [installation guide](https://github.com/nvm-sh/nvm?tab=readme-ov-file#install--update-script)
for NVM, Node.js's version manager.

After you install it, you can install the owl-bot CLI with the following
commands:
```bash
git clone https://github.com/googleapis/repo-automation-bots
cd repo-automation-bots/packages/owl-bot
npm i && npm run compile && npm link
owl-bot copy-code --version
```

The key step is `npm link`, which will make the command available in you current
shell session.

## Running the script
The entrypoint script (`library_generation/cli/entry_point.py`) allows you to
update the target repository with the latest changes starting from the
googleapis committish declared in `generation_config.yaml`.

### Download the repo
For example, google-cloud-java
```
git clone https://github.com/googleapis/google-cloud-java
export path_to_repo="$(pwd)/google-cloud-java"
```

### Install the scripts
```
python -m pip install .
```

### Run the script
```
python cli/entry_point.py generate --repository-path="${path_to_repo}"
```


# Running the scripts using the docker container image
This is convenient in order to avoid installing the dependencies manually.

> [!IMPORTANT]
> From now, the examples assume you are in the root of your sdk-platform-java
> folder
## Build the docker image
```bash
docker build --file .cloudbuild/library_generation/library_generation.Dockerfile --iidfile image-id .
```

This will create an `image-id` file at the root of the repo with the hash ID of
the image.

## Run the docker image
The docker image will perform changes on its internal `/workspace` folder, to which you
need to map a folder on your host machine (i.e. map your downloaded repo to this
folder).

To run the docker container on the google-cloud-java repo, you must run:
```bash
docker run -u "$(id -u)":"$(id -g)" -v/path/to/google-cloud-java:/workspace $(cat image-id)
```

* `-u "$(id -u)":"$(id -g)"` makes docker run the container impersonating
yourself. This avoids folder ownership changes since it runs as root by
default.
* `-v/path/to/google-cloud-java:/workspace` maps the host machine's
google-cloud-java folder to the /workspace folder. The image is configured to
perform changes in this directory
* `$(cat image-id)` obtains the image ID created in the build step

## Debug the created containers
If you are working on changing the way the containers are created, you may want
to inspect the containers to check the setup. It would be convenient in such
case to have a text editor/viewer available. You can achieve this by modifying
the Dockerfile as follows:

```docker
# install OS tools
RUN apt-get update && apt-get install -y \
unzip openjdk-17-jdk rsync maven jq less vim \
&& apt-get clean
```

We add `less` and `vim` as text tools for further inspection.

You can also run a shell in a new container by running:

```bash
docker run --rm -it -u=$(id -u):$(id -g) -v/path/to/google-cloud-java:/workspace --entrypoint="bash" $(cat image-id)
```
4 changes: 0 additions & 4 deletions library_generation/README.md
Original file line number Diff line number Diff line change
Expand Up @@ -98,8 +98,6 @@ They are shared by library level parameters.
| grpc_version | No | inferred from the generator if not specified |
| googleapis-commitish | Yes | |
| libraries_bom_version | Yes | |
| owlbot-cli-image | Yes | |
| synthtool-commitish | Yes | |
| template_excludes | Yes | |

### Library level parameters
Expand Down Expand Up @@ -149,8 +147,6 @@ gapic_generator_version: 2.34.0
protoc_version: 25.2
googleapis_commitish: 1a45bf7393b52407188c82e63101db7dc9c72026
libraries_bom_version: 26.37.0
owlbot_cli_image: sha256:623647ee79ac605858d09e60c1382a716c125fb776f69301b72de1cd35d49409
synthtool_commitish: 6612ab8f3afcd5e292aecd647f0fa68812c9f5b5
destination_path: google-cloud-java
template_excludes:
- ".github/*"
Expand Down
2 changes: 0 additions & 2 deletions library_generation/generate_composed_library.py
Original file line number Diff line number Diff line change
Expand Up @@ -112,8 +112,6 @@ def generate_composed_library(
"",
versions_file,
owlbot_cli_source_folder,
config.owlbot_cli_image,
config.synthtool_commitish,
str(config.is_monorepo()).lower(),
config_path,
],
Expand Down
8 changes: 0 additions & 8 deletions library_generation/model/generation_config.py
Original file line number Diff line number Diff line change
Expand Up @@ -32,8 +32,6 @@ def __init__(
gapic_generator_version: str,
googleapis_commitish: str,
libraries_bom_version: str,
owlbot_cli_image: str,
synthtool_commitish: str,
template_excludes: list[str],
libraries: list[LibraryConfig],
grpc_version: Optional[str] = None,
Expand All @@ -42,8 +40,6 @@ def __init__(
self.gapic_generator_version = gapic_generator_version
self.googleapis_commitish = googleapis_commitish
self.libraris_bom_version = libraries_bom_version
self.owlbot_cli_image = owlbot_cli_image
self.synthtool_commitish = synthtool_commitish
self.template_excludes = template_excludes
self.libraries = libraries
self.grpc_version = grpc_version
Expand Down Expand Up @@ -145,10 +141,6 @@ def from_yaml(path_to_yaml: str) -> GenerationConfig:
libraries_bom_version=__required(
config, "libraries_bom_version", REPO_LEVEL_PARAMETER
),
owlbot_cli_image=__required(config, "owlbot_cli_image", REPO_LEVEL_PARAMETER),
synthtool_commitish=__required(
config, "synthtool_commitish", REPO_LEVEL_PARAMETER
),
template_excludes=__required(config, "template_excludes", REPO_LEVEL_PARAMETER),
libraries=parsed_libraries,
)
Expand Down
9 changes: 7 additions & 2 deletions library_generation/owlbot/bin/entrypoint.sh
Original file line number Diff line number Diff line change
Expand Up @@ -70,7 +70,12 @@ echo "Fixing missing license headers..."
python3 "${scripts_root}/owlbot/src/fix-license-headers.py"
echo "...done"

# ensure formatting on all .java files in the repository
# Ensure formatting on all .java files in the repository.
# Here we manually set the user.home system variable. Unfortunately, Maven
# user.home inference involves the /etc/passwd file (confirmed empirically),
# instead of the presumable $HOME env var, which may not work properly
# when `docker run`ning with the -u flag because we may incur in users
# not registered in the container's /etc/passwd file
echo "Reformatting source..."
mvn fmt:format -V --batch-mode --no-transfer-progress
mvn fmt:format -Duser.home="${HOME}" -V --batch-mode --no-transfer-progress
Copy link
Collaborator

@JoeWang1127 JoeWang1127 Apr 22, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So the permission issue (the generated files can't be removed by shutil.rmtree) is caused by mvn, not docker command?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Docker does indeed take ownership of the volumes (i.e. host machine folders) you pass, which makes the -u flag necessary. This is a workaround of a new problem that occurs with maven when running the container with an arbitrary user.

The user.home workaround is necessary because if you run the docker container with a user not found in the container's /etc/passwd file (which is the case when you run -u $(id -u):$(id -g)), then the value of user.home will be '?', so here we explicitly set it.

echo "...done"
Loading
Loading