Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Apache Airflow DAGs are not synchronised when using git-sync:v4.3.0 #928

Closed
conchaox opened this issue Dec 4, 2024 · 7 comments · Fixed by #931
Closed

Apache Airflow DAGs are not synchronised when using git-sync:v4.3.0 #928

conchaox opened this issue Dec 4, 2024 · 7 comments · Fixed by #931

Comments

@conchaox
Copy link

conchaox commented Dec 4, 2024

Hello,

We've recently encountered a security issue with some of the Airflow and Git-sync versions, you can check the details here. Our remediation process is:

For nghttp2/libnghttp2-14: 1.43.0-1 upgrade Debian:11 nghttp2 to version 1.43.0-1+deb11u1 or higher.

However, it seems that we need to update the version of the git-sync sidecar container as well seems the current default version doesn't provide the updated package. I've been trying to update the version to v4.3.0 and it seems to be working as expected, but when I push a new DAG to the branch, the DAGs seems not to be reflected in the Airflow pods (i.e. is not visible in the Web UI) when I check the logs for the dags-git-sync sidecar container I always see something like this:

INFO: detected pid 1, running init handler
env $GIT_SYNC_REPO has been deprecated, use $GITSYNC_REPO instead
env $GIT_SYNC_DEPTH has been deprecated, use $GITSYNC_DEPTH instead
env $GIT_SYNC_SUBMODULES has been deprecated, use $GITSYNC_SUBMODULES instead
env $GIT_SYNC_ADD_USER has been deprecated, use $GITSYNC_ADD_USER instead
env $GIT_SYNC_USERNAME has been deprecated, use $GITSYNC_USERNAME instead
env $GIT_SYNC_PASSWORD has been deprecated, use $GITSYNC_PASSWORD instead
env $GIT_KNOWN_HOSTS has been deprecated, use $GITSYNC_SSH_KNOWN_HOSTS instead
{"logger":"","ts":"2024-11-25 15:57:59.273993","caller":{"file":"main.go","line":424},"level":0,"msg":"setting --ref from deprecated --branch"}
{"logger":"","ts":"2024-11-25 15:57:59.274431","caller":{"file":"main.go","line":456},"level":0,"msg":"setting --link from deprecated --dest"}
{"logger":"","ts":"2024-11-25 15:57:59.274560","caller":{"file":"main.go","line":466},"level":0,"msg":"setting --period from deprecated --wait"}
{"logger":"","ts":"2024-11-25 15:57:59.274710","caller":{"file":"main.go","line":497},"level":0,"msg":"setting --sync-timeout from deprecated --timeout"}
{"logger":"","ts":"2024-11-25 15:57:59.274947","caller":{"file":"main.go","line":626},"level":0,"msg":"starting up","version":"v4.3.0","pid":11,"uid":65533,"gid":65533,"home":"/tmp","flags":[]}
{"logger":"","ts":"2024-11-25 15:57:59.280165","caller":{"file":"main.go","line":731},"level":0,"msg":"git version","version":"git version 2.39.5"}
{"logger":"","ts":"2024-11-25 15:57:59.362345","caller":{"file":"main.go","line":1237},"level":0,"msg":"repo directory was empty or failed checks","path":"/git"}
{"logger":"","ts":"2024-11-25 15:57:59.362845","caller":{"file":"main.go","line":1247},"level":0,"msg":"initializing repo directory","path":"/git"}
{"logger":"","ts":"2024-11-25 15:58:02.337018","caller":{"file":"main.go","line":1775},"level":0,"msg":"update required","ref":"airflow-dags","local":"","remote":"70706c8d51088703e0b7b7341854bf6bd7368d85","syncCount":0}
{"logger":"","ts":"2024-11-25 15:58:04.023592","caller":{"file":"main.go","line":1821},"level":0,"msg":"updated successfully","ref":"airflow-dags","remote":"70706c8d51088703e0b7b7341854bf6bd7368d85","syncCount":1}
{"logger":"","ts":"2024-11-25 18:16:34.813457","caller":{"file":"main.go","line":1775},"level":0,"msg":"update required","ref":"airflow-dags","local":"70706c8d51088703e0b7b7341854bf6bd7368d85","remote":"5c0485e7f889344a8825fd545d3e678abb9eb31f","syncCount":1}
{"logger":"","ts":"2024-11-25 18:16:34.969499","caller":{"file":"main.go","line":1821},"level":0,"msg":"updated successfully","ref":"airflow-dags","remote":"5c0485e7f889344a8825fd545d3e678abb9eb31f","syncCount":2}

In this case, the name of my branch is airflow-dags

I understand that there has been some mayor changes from v3 to v4 for example some of the used flags have change, but more importantly it fundamentally changes the way the internal sync-loop works (reference here), so I was wondering if there's some recommendation on how to migrate from v3 to v4 without breaking functionality.

Regards,
Alejandro.

@thockin
Copy link
Member

thockin commented Dec 4, 2024

The logs you posted indicate success. Can you help me understand what the failure looks like t a lower level?

To start, you could run git-sync with -v 6 and post full logs. That will show every command that gets executed and the results of them.

If you kubectl exec into the git-sync container, what does ls -l /git look like?

The inner-loop change should not change the results of running git-sync, and especially not make it claim success but actually fail. There are a number of flags that changed in v4 and it's possible that one of those is tripping you up, maybe?

@conchaox
Copy link
Author

Hello @thockin,

Thanks for your reply. I'm not sure how to modify the git-sync command that is executed since I'm not running Git-sync manually but using this Airflow helm chart, I'll try to figure it out. Regarding the other question, this is the output of ls -l /git:

$ ls -l /git
total 0
lrwxrwxrwx 1 git-sync git-sync 51 Dec 12 04:40 repo -> .worktrees/5c0485e7f889344a8825fd545d3e678abb9eb31f

It appears that the content of my repo, with the subpath airflow/dags is being copied correcttly:

/git/repo/airflow/dags$ ls
__pycache__	 custom_operators      example-dag.py			my-awesome-dag.py      s3_upload_copy_delete.py  tutorial_taskflow_api_etl.py
airflow_test.py  databricks_sample.py  example_external_task_sensor.py	s2_conn_test.py        s3_upload_dag.py		 vault-test.py
confs		 dyn_dags.py	       kube_pod.py			s3_upload_copy_dag.py  test_operator.py

But when I tried to run any of those DAGs from the UI, there's and error saying that the DAG doesn't exist, and if I add a new DAG to the branch, it doesn't get reflected in the UI.

Something that I noticed from the logs that I previously shared with you:

{"logger":"","ts":"2024-11-25 15:57:59.362345","caller":{"file":"main.go","line":1237},"level":0,"msg":"repo directory was empty or failed checks","path":"/git"}
{"logger":"","ts":"2024-11-25 15:57:59.362845","caller":{"file":"main.go","line":1247},"level":0,"msg":"initializing repo directory","path":"/git"}

And, when I check one of our instances that is working using the old GitSync version, I see something like this:

INFO: detected pid 1, running init handler
I1211 16:40:43.340714      11 main.go:401] "level"=0 "msg"="starting up" "pid"=11 "args"=["/git-sync"]
I1211 16:40:43.360160      11 main.go:952] "level"=0 "msg"="cloning repo" "origin"="https://github.com/LiveRamp/identity-ds-orchestration.git" "path"="/dags"
I1211 16:40:44.130982      11 main.go:762] "level"=0 "msg"="syncing git" "rev"="HEAD" "hash"="f1a6bdfd4aa1f8288bb81a6e88bad8659dea0c4b"
I1211 16:40:44.152263      11 main.go:802] "level"=0 "msg"="adding worktree" "path"="/dags/f1a6bdfd4aa1f8288bb81a6e88bad8659dea0c4b" "branch"="origin/dev"
I1211 16:40:44.199252      11 main.go:862] "level"=0 "msg"="reset worktree to hash" "path"="/dags/f1a6bdfd4aa1f8288bb81a6e88bad8659dea0c4b" "hash"="f1a6bdfd4aa1f8288bb81a6e88bad8659dea0c4b"
I1211 16:40:44.199276      11 main.go:867] "level"=0 "msg"="updating submodules"

It seems that the path for the old version is /dags which is mounted in a EmptyDir volume in the airflow container.

I guess I should try to mount that EmptyDir in /git as explained here, so this will work? but we prefer to keep the same config since this is how the Chart works.

Regards,
Alejandro

@thockin
Copy link
Member

thockin commented Dec 12, 2024

It looks like you are not setting the --root flag, which is ironically not something that changed in v4. I don't know how it would have worked before either.

@thockin
Copy link
Member

thockin commented Dec 12, 2024

Helm has a "render" function which will show you the full result without sending it to the kube API

@conchaox
Copy link
Author

Hello @thockin

I believed the --root flag has been placed using the GIT_SYNC_ROOT env var:
image
It seems that the default value for that is /dags. this is a mount point configured by default by the Chart and is mounted on a EmptyDir.

is the newer version expecting the mount point to be /git ? I guess I can try changing to that value but I would prefer to have the /dags mount point as currently is.

@thockin
Copy link
Member

thockin commented Dec 12, 2024

Look at your original logs:

INFO: detected pid 1, running init handler
env $GIT_SYNC_REPO has been deprecated, use $GITSYNC_REPO instead
env $GIT_SYNC_DEPTH has been deprecated, use $GITSYNC_DEPTH instead
env $GIT_SYNC_SUBMODULES has been deprecated, use $GITSYNC_SUBMODULES instead
env $GIT_SYNC_ADD_USER has been deprecated, use $GITSYNC_ADD_USER instead
env $GIT_SYNC_USERNAME has been deprecated, use $GITSYNC_USERNAME instead
env $GIT_SYNC_PASSWORD has been deprecated, use $GITSYNC_PASSWORD instead
env $GIT_KNOWN_HOSTS has been deprecated, use $GITSYNC_SSH_KNOWN_HOSTS instead

Note that GIT_SYNC_ROOT is not in that list. The fact that it is using "/git" corroborates -- that is the default value.

"/dags" is a perfectly fine value for --root (or $GIT_SYNC_ROOT or better $GITSYNC_ROOT), but it's not being passed in (or not being recognized).

Digging into it, I think I have a problem with my Dockerfile, which sets GITSYNC_ROOT which takes preference over GIT_SYNC_ROOT, which you are setting.

Once again, a reason flags are better than env :)

Change all your envs to "GITSYNC_*" instead of "GIT_SYNC_*" and I bet it works.

@thockin
Copy link
Member

thockin commented Dec 12, 2024

I will look at fixing this properly. Nice bug.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants