Publish a container image for Apple M1 machines #798

Closed
hazcod opened this issue May 17, 2021 · 12 comments · Fixed by #1193
Labels
priority: p1 Important issue which blocks shipping the next release. Will be fixed prior to next release. type: feature request ‘Nice-to-have’ improvement, new feature or different behavior or design.

Comments

@hazcod

hazcod commented May 17, 2021

Bug Description

Crash on Docker for Mac Kubernetes, M1 chip.

Stacktrace

runtime: unexpected return pc for runtime.asyncPreempt called from 0xc0003d2000
stack: frame={sp:0xc000309cf0, fp:0xc000309cf8} stack=[0xc000308000,0xc00030a000)
000000c000309bf0:  000000c0000a2180  000000c0002aa160 
000000c000309c00:  000000c00010c238  000000c000000180 
000000c000309c10:  000000c000000208  000000c0000a21d0 
000000c000309c20:  000000c0000002b8  000000000044338e <runtime.newproc+110> 
000000c000309c30:  000000c000309c38  000000c000309c68 
000000c000309c40:  00000000004081eb <runtime.chanrecv2+43>  000000c00010c1e0 
000000c000309c50:  000000c0003f5cc0  000000c000000101 
000000c000309c60:  000000000095b0c5 <github.com/GoogleCloudPlatform/cloudsql-proxy/proxy/proxy.(*Client).Run+165>  000000c000309ce0 
000000c000309c70:  000000000095b0e5 <github.com/GoogleCloudPlatform/cloudsql-proxy/proxy/proxy.(*Client).Run+197>  000000c00010c1e0 
000000c000309c80:  000000c0003f5cc0  000000c0003d2000 
000000c000309c90:  0000004000800b8e  0000000000000033 
000000c000309ca0:  0000000000b8e130  000000c00045e0d0 
000000c000309cb0:  0000000000466d80 <runtime.newproc.func1+0>  000000c000717cf8 
000000c000309cc0:  0000000000000000  0000000000000000 
000000c000309cd0:  0000000000000000  0000000000000000 
000000c000309ce0:  000000c000309f70  000000000046f940 <runtime.asyncPreempt+0> 
000000c000309cf0: <000000c0003d2000 >000000c00010c1e0 
000000c000309d00:  000000c00010c360  000000c0003d2000 
000000c000309d10:  0000000000000000  000000c00051a180 
000000c000309d20:  000000c00018b740  000000c00010c1e0 
000000c000309d30:  0000000000000000  0000000000000000 
000000c000309d40:  0000000000b7e900  000000c00002a920 
000000c000309d50:  000000c0001c8000  0000000000000000 
000000c000309d60:  0000000000000000  000001c00018b0b0 
000000c000309d70:  00000006fc23ac00  000000c0002dfda0 
000000c000309d80:  0000000000000000  0000000000000000 
000000c000309d90:  0000000000ac5b47  0000000000000002 
000000c000309da0:  0000000000000002  0000000000000002 
000000c000309db0:  0000000000000002  0000000000000002 
000000c000309dc0:  0000000000000002  0000000000b88a90 
000000c000309dd0:  0000000000000002  0000000000000002 
000000c000309de0:  000000c00011d838  0000000000000000 
000000c000309df0:  000000000000007e 
fatal error: unknown caller pc

runtime stack:
runtime.throw(0xacf03d, 0x11)
        /usr/local/go/src/runtime/panic.go:1117 +0x72
runtime.gentraceback(0xffffffffffffffff, 0xffffffffffffffff, 0x0, 0xc000000180, 0x0, 0x0, 0x7fffffff, 0xaf54d0, 0x40306a59a0, 0x0, ...)
        /usr/local/go/src/runtime/traceback.go:261 +0x1a56
runtime.copystack(0xc000000180, 0x2000)
        /usr/local/go/src/runtime/stack.go:908 +0x2f1
runtime.shrinkstack(0xc000000180)
        /usr/local/go/src/runtime/stack.go:1180 +0x13d
runtime.scanstack(0xc000000180, 0xc000034698)
        /usr/local/go/src/runtime/mgcmark.go:720 +0x58e
runtime.markroot.func1()
        /usr/local/go/src/runtime/mgcmark.go:233 +0xc6
runtime.markroot(0xc000034698, 0x14)
        /usr/local/go/src/runtime/mgcmark.go:206 +0x33e
runtime.gcDrain(0xc000034698, 0x7)
        /usr/local/go/src/runtime/mgcmark.go:1014 +0x118
runtime.gcBgMarkWorker.func2()
        /usr/local/go/src/runtime/mgc.go:2003 +0x17e
runtime.systemstack(0xc000000d80)
        /usr/local/go/src/runtime/asm_amd64.s:379 +0x66
runtime.mstart()
        /usr/local/go/src/runtime/proc.go:1246

goroutine 67 [GC worker (idle), 2 minutes]:
runtime.systemstack_switch()
        /usr/local/go/src/runtime/asm_amd64.s:339 fp=0xc00003f760 sp=0xc00003f758 pc=0x46c680
runtime.gcBgMarkWorker()
        /usr/local/go/src/runtime/mgc.go:1967 +0x1c7 fp=0xc00003f7e0 sp=0xc00003f760 pc=0x41f567
runtime.goexit()
        /usr/local/go/src/runtime/asm_amd64.s:1371 +0x1 fp=0xc00003f7e8 sp=0xc00003f7e0 pc=0x46e4c1
created by runtime.gcBgMarkStartWorkers
        /usr/local/go/src/runtime/mgc.go:1835 +0x37

goroutine 1 [chan receive (scan), 2 minutes]:
github.com/GoogleCloudPlatform/cloudsql-proxy/proxy/proxy.(*Client).Run(0xc0003d2000, 0xc00010c1e0)
        /go/src/cloudsql-proxy/proxy/proxy/client.go:137 +0xc5
**
ERROR:/qemu/accel/tcg/cpu-exec.c:697:cpu_exec: assertion failed: (cpu == current_cpu)

Environment

  1. OS type and version: macOS 11.3.1, Docker for Mac 3.3.3.
  2. Cloud SQL Proxy version (./cloud_sql_proxy -version): 1.22.0
@hazcod hazcod added the type: bug Error or flaw in code with unintended results or allowing sub-optimal usage patterns. label May 17, 2021
@kurtisvg
Contributor

kurtisvg commented May 17, 2021

The M1 is an ARM chip and the stack trace says asm_amd64, which makes me think it's not running an ARM binary. Which docker image are you using (your own or one of ours)?
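
The reasoning above can be sketched in Go: the runtime's assembly file names in a traceback (asm_amd64.s, asm_arm64.s) reveal which architecture the binary was compiled for. This is only an illustrative sketch; the helper archFromTrace is a hypothetical name and is not part of the Cloud SQL proxy.

```go
package main

import (
	"fmt"
	"regexp"
	"runtime"
)

// asmFile matches the Go runtime's per-architecture assembly files,
// e.g. "asm_amd64.s" or "asm_arm64.s", as they appear in a traceback.
var asmFile = regexp.MustCompile(`asm_([a-z0-9]+)\.s`)

// archFromTrace (hypothetical helper) returns the architecture named by
// the first runtime assembly file found in trace, or "" if none is present.
func archFromTrace(trace string) string {
	m := asmFile.FindStringSubmatch(trace)
	if m == nil {
		return ""
	}
	return m[1]
}

func main() {
	// A frame copied from the traceback in this issue.
	frame := "/usr/local/go/src/runtime/asm_amd64.s:379 +0x66"
	fmt.Println("binary in trace built for:", archFromTrace(frame))

	// runtime.GOARCH is fixed at compile time, so an amd64 binary running
	// under emulation on an ARM host still reports "amd64" here.
	fmt.Println("this binary built for:", runtime.GOOS+"/"+runtime.GOARCH)
}
```

Because runtime.GOARCH reflects the build target rather than the host CPU, it is a quick way to tell a native ARM64 binary from an emulated Intel one.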

@enocom enocom added needs more info and removed type: bug Error or flaw in code with unintended results or allowing sub-optimal usage patterns. labels May 17, 2021
@kurtisvg kurtisvg added priority: p2 Moderately-important priority. Fix may not be included in next release. type: question Request for information or clarification. labels May 17, 2021
@enocom
Member

enocom commented May 19, 2021

@hazcod Can you confirm which docker image you're using?

@hazcod
Author

hazcod commented May 19, 2021

This is the official one, but I guess Docker for Mac might be running it as an x86 image under QEMU emulation?

@enocom enocom added type: bug Error or flaw in code with unintended results or allowing sub-optimal usage patterns. and removed needs more info type: question Request for information or clarification. labels May 19, 2021
@enocom enocom changed the title cpu_exec: assertion failed gcr.io/cloudsql-docker/gce-proxy default image crashes on Apple M1 machines May 19, 2021
@enocom
Member

enocom commented May 19, 2021

I assume you're using the default image and not alpine or buster.

In any case, this should just work, but perhaps there's still an issue here with Docker? We'll investigate.

@enocom
Member

enocom commented May 19, 2021

How are you starting your container? I see there are some known issues with Docker on the M1 chip:

Not all images are available for ARM64 architecture. You can add --platform linux/amd64 to run an Intel image under emulation. In particular, the mysql image is not available for ARM64. You can work around this issue by using a mariadb image.

https://docs.docker.com/docker-for-mac/apple-silicon/#known-issues

@enocom enocom added priority: p3 Desirable enhancement or fix. May not be included in next release. and removed priority: p2 Moderately-important priority. Fix may not be included in next release. labels May 19, 2021
@enocom
Member

enocom commented May 19, 2021

Note: the link above recommends running ARM64 containers on Apple Silicon. Recently distroless added support for ARM64.

However, attempts to run Intel-based containers on Apple Silicon machines can crash as QEMU sometimes fails to run the container. Filesystem change notification APIs (e.g. inotify) do not work under QEMU emulation, see docker/for-mac#5321. Therefore, we recommend that you run ARM64 containers on Apple Silicon machines. These containers are also faster and use less memory than Intel-based containers.

@enocom enocom changed the title gcr.io/cloudsql-docker/gce-proxy default image crashes on Apple M1 machines Default container image crashes on Apple M1 machines May 19, 2021
@kurtisvg
Contributor

We probably need to do something special to support ARM64 in the default container. Until then, I think using the emulation flag as Eno described above is a suitable workaround.

@hazcod
Author

hazcod commented May 19, 2021

For reference, here is how I run it via Kubernetes on Docker for Mac (via a Helm template):

apiVersion: apps/v1
kind: Deployment

metadata:
  namespace: "{{ .Values.namespace }}"
  name: "{{ .Values.name }}-api"

  labels:
    app: "{{ .Values.name }}"
    owner: "{{ .Values.owner }}"

spec:
  replicas: {{ .Values.replicaCount }}

  selector:
    matchLabels:
      app: "{{ .Values.name }}"
      component: "api"

  strategy:
    type: Recreate
  
  template:

    metadata:

      labels:
        app: "{{ .Values.name }}"
        owner: "{{ .Values.owner }}"
        component: "api"

      annotations:
        seccomp.security.alpha.kubernetes.io/pod: runtime/default
        {{ if not .Values.developmentMode }}
        # do not enable on Docker for Mac, since it doesn't support AppArmor
        container.apparmor.security.beta.kubernetes.io/api: runtime/default
        {{ end }}

    spec:

      restartPolicy: Always
      
      {{ if not .Values.developmentMode }}
      serviceAccountName: "{{ .Values.serviceAccount }}"
      
      initContainers:
      - image:  gcr.io/google.com/cloudsdktool/cloud-sdk:slim
        name: workload-identity-initcontainer
        command:
        - '/bin/bash'
        - '-c'
        - |
          curl -s -H 'Metadata-Flavor: Google' 'http://169.254.169.254/computeMetadata/v1/instance/service-accounts/default/token' --retry 30 --retry-connrefused --retry-max-time 30 > /dev/null || exit 1
      {{ end }}

      containers:
      -
        imagePullPolicy: Always
        image: eu.gcr.io/cloudsql-docker/gce-proxy:1.22.0
        name: cloudsql-proxy

        command:
        - '/cloud_sql_proxy'
        - '-enable_iam_login'
        {{ if not .Values.developmentMode }}
        - '-ip_address_types=PRIVATE'
        {{ end }}
        - '-instances={{ .Values.app.db.requests.instanceUrl }}=tcp:3126,{{ .Values.app.db.meta.instanceUrl }}=tcp:3127'

        {{ if .Values.developmentMode }}
        env:
        -
          name: GOOGLE_APPLICATION_CREDENTIALS
          value: /sa.json

        volumeMounts:
        -
          mountPath: /sa.json
          name: {{.Values.name}}-api-sajson
          readOnly: true
        {{ end }}

        resources:
          requests:
            cpu: 0.5
            memory: 0.5Gi
          limits:
            cpu: 0.5
            memory: 0.5Gi

        securityContext:
          readOnlyRootFilesystem: true
          privileged: false
          runAsNonRoot: true
          allowPrivilegeEscalation: false
          capabilities:
            drop: [all]
          seccompProfile:
            type: RuntimeDefault

      
      volumes:
      {{ if .Values.developmentMode }}
      -
        hostPath:
          path: {{ .Values.devServiceAccountFile}}
        name: "{{.Values.name}}-api-sajson"
      {{ end }}

@enocom
Member

enocom commented May 21, 2021

@hazcod I don't have access to an Apple Silicon machine presently. However, if there's interest you could build your own container by making two changes to the Dockerfile at the root of this project:

  1. Change Line 22 to RUN CGO_ENABLED=0 GOARCH=arm64 go build -ldflags "-X main.metadataString=container" -o cloud_sql_proxy ./cmd/cloud_sql_proxy
  2. Change Line 25 to FROM gcr.io/distroless/static:nonroot-arm64.

After those changes, you'll have an M1-friendly version of the Cloud SQL Auth proxy. But again, I don't have an Apple Silicon machine at the moment and can't verify this myself.
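
The build step in change 1 can also be sketched as a small Go program that prepares the same cross-compiling `go build` invocation. This is a hedged sketch, not the project's build tooling: buildCmd is a hypothetical helper, and the flags and package path are copied from the RUN line quoted above. It only constructs the command; actually running it assumes the Go toolchain is installed and the working directory is the repository root. Outside the Linux build stage of the Dockerfile (for example, directly on macOS), GOOS=linux would also need to be set.

```go
package main

import (
	"fmt"
	"os"
	"os/exec"
	"strings"
)

// buildCmd (hypothetical helper) prepares, but does not run, a
// cross-compiling `go build` for the given target architecture,
// mirroring the Dockerfile RUN line suggested in this thread.
func buildCmd(goarch string) *exec.Cmd {
	cmd := exec.Command("go", "build",
		"-ldflags", "-X main.metadataString=container",
		"-o", "cloud_sql_proxy",
		"./cmd/cloud_sql_proxy",
	)
	// Start from the current environment and append the overrides.
	// When keys are duplicated, os/exec uses the last value in the slice.
	cmd.Env = append(os.Environ(), "CGO_ENABLED=0", "GOARCH="+goarch)
	return cmd
}

func main() {
	cmd := buildCmd("arm64")
	// Print the command line instead of executing it.
	fmt.Println(strings.Join(cmd.Args, " "))
}
```

To perform the build, one would attach cmd.Stdout and cmd.Stderr and call cmd.Run() from the repository root.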

@enocom enocom added type: feature request ‘Nice-to-have’ improvement, new feature or different behavior or design. and removed type: bug Error or flaw in code with unintended results or allowing sub-optimal usage patterns. labels Jun 1, 2021
@enocom enocom changed the title Default container image crashes on Apple M1 machines Publish a container image for Apple M1 machines Jun 1, 2021
@enocom enocom removed their assignment Jul 13, 2021
@enocom enocom added priority: p2 Moderately-important priority. Fix may not be included in next release. priority: p1 Important issue which blocks shipping the next release. Will be fixed prior to next release. and removed priority: p3 Desirable enhancement or fix. May not be included in next release. priority: p2 Moderately-important priority. Fix may not be included in next release. labels Dec 2, 2021
@hazcod
Author

hazcod commented Mar 1, 2022

Peeking in here again for official arm64 images. :-)

@enocom
Member

enocom commented Mar 1, 2022

We have this prioritized internally for the first half of this year. Right now, there's other work in flight that takes priority (e.g., v2 proxy).

@enocom
Member

enocom commented May 9, 2022

Fixed by #1193. We'll have an M1-friendly container in the next release.
