[BEAM-5436] Add doc page on Go cross compilation. (#17256)
lostluck authored Apr 4, 2022
1 parent 384e381 commit 70ff734
Showing 3 changed files with 83 additions and 1 deletion.
@@ -0,0 +1,76 @@
---
type: languages
title: "Go SDK Cross Compilation"
---
<!--
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
-->

# Overview

This page contains technical details for users starting Go SDK pipelines on machines that are not running a `linux` operating system on an `amd64` architecture.

Go is a statically compiled language.
To execute a Go binary on a machine, it must be compiled for the matching operating system and processor architecture.
This has implications for how Go SDK pipelines execute on [workers](/documentation/glossary/#worker).
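
As a small illustration of the point above, the target platform is fixed into a Go binary at compile time and can be read back through the standard `runtime` package. This is only a sketch for checking what a given build targets; the `platform` helper is a hypothetical name and not part of the Beam SDK.

```go
package main

import (
	"fmt"
	"runtime"
)

// platform returns the operating system and architecture this binary
// was compiled for, in the "os-arch" form used on this page.
// (Hypothetical helper for illustration; not part of the Beam SDK.)
func platform() string {
	return runtime.GOOS + "-" + runtime.GOARCH
}

func main() {
	fmt.Println("compiled for:", platform())
}
```

Run with `go run`, this reports the platform of your development machine. The same source built with `GOOS=linux GOARCH=amd64` reports `linux-amd64` when executed on such a machine.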

# Development: Using `go run`

When starting your in-development pipeline against a remote runner, you can use `go run` from your development environment.
The Go SDK will cross-compile your pipeline for `linux-amd64`, and use that as the pipeline's worker binary.

Alternatively, some local runners support Loopback execution.
Setting the flag `--environment_type=LOOPBACK` can cause the runner to connect back to the local binary to serve as a worker.
This can simplify development and debugging, because log output is not hidden inside a container.

# Production: Overriding the Worker Binary

Go SDK pipeline binaries have a `--worker_binary` flag to set the path to the desired worker binary.
This section will teach you how to use this flag for robust Go pipelines.

In production settings, it's common to only have access to compiled artifacts.
For Go SDK pipelines, you may need to have two: one for the launching platform, and one for the worker platform.

To run a Go program on a specific platform, that program must be built targeting that platform's operating system and architecture.
The Go compiler can cross-compile for a target platform when you set the [`$GOOS` and `$GOARCH` environment variables](https://go.dev/doc/install/source#environment) for your build.

For example, you may be launching a pipeline from an M1 MacBook, but running the jobs on a Flink cluster executing on Linux VMs with amd64 processors.
In this situation, you would need to compile your pipeline binary twice: once for `darwin-arm64`, the launching platform, and once for `linux-amd64`, the worker platform.

```
# Build binary for the launching platform.
# This uses the defaults for your machine, so no new environment variables are needed.
$ go build -o output/launcher path/to/my/pipeline
# Build binary for the worker platform, linux-amd64
$ GOOS=linux GOARCH=amd64 go build -o output/worker path/to/my/pipeline
```
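
If you would rather drive the two builds from a small Go program than from a shell, the same idea can be sketched with the standard `os/exec` package. This is an assumption-laden sketch, not Beam tooling: `crossBuildCmd` is a hypothetical helper, and the package and output paths are supplied by the caller.

```go
package main

import (
	"fmt"
	"os"
	"os/exec"
)

// crossBuildCmd prepares (but does not run) a `go build` command that
// targets the given operating system and architecture.
// (Hypothetical helper for illustration; not part of the Beam SDK.)
func crossBuildCmd(goos, goarch, output, pkg string) *exec.Cmd {
	// Build flags such as -o must come before the package argument.
	cmd := exec.Command("go", "build", "-o", output, pkg)
	// Inherit the current environment, then override the target platform.
	// When a key is duplicated, os/exec uses the last value in the slice.
	cmd.Env = append(os.Environ(), "GOOS="+goos, "GOARCH="+goarch)
	cmd.Stdout = os.Stdout
	cmd.Stderr = os.Stderr
	return cmd
}

func main() {
	if len(os.Args) != 3 {
		fmt.Fprintln(os.Stderr, "usage: crossbuild <package> <output>")
		return
	}
	// Build the worker binary for linux-amd64.
	cmd := crossBuildCmd("linux", "amd64", os.Args[2], os.Args[1])
	if err := cmd.Run(); err != nil {
		fmt.Fprintln(os.Stderr, "build failed:", err)
		os.Exit(1)
	}
}
```

Keeping command construction separate from execution makes it easy to reuse the helper for other targets.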

Execute the pipeline with the `--worker_binary` flag set to the desired binary.

```
# Launch the pipeline specifying the worker binary.
$ ./output/launcher --worker_binary=output/worker --runner=flink --endpoint=... <...other flags...>
```
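
Before launching, it can be useful to confirm that the file passed to `--worker_binary` really was built for the worker platform. The sketch below uses the standard `debug/elf` package to check that a file is an ELF executable for x86-64. It is a hypothetical pre-flight check, not something the Beam SDK provides, and it inspects only the file format and CPU architecture, not the operating system.

```go
package main

import (
	"debug/elf"
	"fmt"
	"os"
)

// checkELFAMD64 returns an error unless the file at path is an ELF binary
// whose machine type is x86-64, as a linux-amd64 Go build would be.
// (Hypothetical helper for illustration; not part of the Beam SDK.)
func checkELFAMD64(path string) error {
	f, err := elf.Open(path)
	if err != nil {
		return fmt.Errorf("%s is not a readable ELF binary: %w", path, err)
	}
	defer f.Close()
	if f.Machine != elf.EM_X86_64 {
		return fmt.Errorf("%s targets %v, want %v", path, f.Machine, elf.EM_X86_64)
	}
	return nil
}

func main() {
	if len(os.Args) != 2 {
		fmt.Fprintln(os.Stderr, "usage: checkworker <worker-binary>")
		return
	}
	if err := checkELFAMD64(os.Args[1]); err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Println("worker binary is an x86-64 ELF executable")
}
```

A `darwin-arm64` launcher binary is a Mach-O file, so passing it by mistake fails at the `elf.Open` step.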

# SDK Containers

Apache Beam releases [SDK-specific containers](/documentation/runtime/environments/) for runners to use to launch workers.
These containers provision and initialize the worker binary as appropriate for the SDK.

At present, Go SDK worker containers are only built for the `linux-amd64` platform.
See [BEAM-11704](https://issues.apache.org/jira/browse/BEAM-11704) for the current state of ARM64 container support.

Because Go is statically compiled, there are no runtime dependencies on a specific Go version for a container.
The Go release used to compile your binary will be what your workers execute.
Be sure to update to a recent [Go release](https://go.dev/doc/devel/release) for best performance.
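
To see which Go release a compiled worker binary was built with, the standard `debug/buildinfo` package (available since Go 1.18) can read the version embedded in the file, even when the binary targets a different platform from the machine doing the check. This is a minimal sketch; `goVersionOf` is a hypothetical helper name.

```go
package main

import (
	"debug/buildinfo"
	"fmt"
	"os"
)

// goVersionOf returns the Go release recorded in the binary at path.
// (Hypothetical helper for illustration; not part of the Beam SDK.)
func goVersionOf(path string) (string, error) {
	info, err := buildinfo.ReadFile(path)
	if err != nil {
		return "", fmt.Errorf("reading build info from %s: %w", path, err)
	}
	return info.GoVersion, nil
}

func main() {
	if len(os.Args) != 2 {
		fmt.Fprintln(os.Stderr, "usage: goversion <binary>")
		return
	}
	version, err := goVersionOf(os.Args[1])
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Println(version)
}
```

The `go version <file>` command reports the same information from the command line.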
6 changes: 5 additions & 1 deletion website/www/site/content/en/documentation/sdks/go.md
@@ -1,6 +1,7 @@
---
type: languages
title: "Beam Go SDK"
aliases: /learn/sdks/go/
---
<!--
Licensed under the Apache License, Version 2.0 (the "License");
@@ -20,6 +21,9 @@ limitations under the License.
The Go SDK for Apache Beam provides a simple, powerful API for building both batch and streaming parallel data processing pipelines.
It is based on the following [design](https://s.apache.org/beam-go-sdk-design-rfc).

Unlike Java and Python, Go is a statically compiled language.
This means worker binaries may need to be [cross-compiled](/documentation/sdks/go-cross-compilation/) to execute on distributed runners.

## Get Started with the Go SDK

Get started with the [Beam Go SDK quickstart](/get-started/quickstart-go) to set up your development environment and run an example pipeline. Then, read through the [Beam programming guide](/documentation/programming-guide) to learn the basic concepts that apply to all SDKs in Beam.
@@ -29,7 +33,7 @@ See the [godoc](https://pkg.go.dev/github.com/apache/beam/sdks/v2/go/pkg/beam) f
## Status

Version 2.32.0 is the last experimental release of the Go SDK. The Go SDK supports most batch-oriented features and cross-language transforms.
It's possible to write many kinds of transforms, but specific built in transforms may still be missing.
It's possible to write many kinds of transforms, but specific built in transforms may still be missing, or incomplete.

Requests for specific transforms may be filed to the [`sdk-go` component in JIRA](https://issues.apache.org/jira/issues/?jql=project%20%3D%20BEAM%20AND%20status%20in%20(Open%2C%20%22In%20Progress%22%2C%20Idea%2C%20%22Under%20Discussion%22%2C%20%22In%20Implementation%22%2C%20%22Triage%20Needed%22)%20AND%20component%20%3D%20sdk-go%20ORDER%20BY%20updated%20DESC).
Contributions are welcome.
2 changes: 2 additions & 0 deletions website/www/site/layouts/partials/section-menu/en/sdks.html
@@ -51,6 +51,8 @@
<li><a href="https://pkg.go.dev/github.com/apache/beam/sdks/v2/go/pkg/beam" target="_blank">Go SDK API reference <img src="/images/external-link-icon.png"
width="14" height="14"
alt="External link."></a>

<li><a href="/documentation/sdks/go-cross-compilation/">Cross compilation</a></li>
</li>
</ul>
</li>