Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Docs for triggered_dataset_event #34410

Merged
merged 3 commits into from
Sep 16, 2023

Conversation

fritz-astronomer
Copy link
Contributor

@fritz-astronomer fritz-astronomer commented Sep 15, 2023

  • add templates.rst reference for triggering_dataset_events
  • adds a note to check the templates page on the datasets page

Validated with breeze locally :) Still can't get stuff like :class:`~airflow.models.dataset.DatasetEvent` to link correctly, but that's outside the scope of this

…heck the templates page on the datasets page
@fritz-astronomer fritz-astronomer marked this pull request as draft September 15, 2023 21:26
@fritz-astronomer fritz-astronomer marked this pull request as ready for review September 15, 2023 21:48
@fritz-astronomer
Copy link
Contributor Author

Good to go 👍

Comment on lines +225 to +226
WHERE "updated_at" >= '{{ (triggering_dataset_events.values() | first | first).source_dag_run.data_interval_start }}'
AND "updated_at" < '{{ (triggering_dataset_events.values() | first | first).source_dag_run.data_interval_end }}';
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

to use your words this is ugly... maybe it's weird to use a user-defined macro in the example dag, but it would make this easier to follow i think, and it's probably a better thing to do

do these sort properly? are we actually getting the first dataset event by time (for lower bound), and the last one by time for upper bound?

Copy link
Contributor Author

@fritz-astronomer fritz-astronomer Sep 15, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

😂 hoisted by my own petard. I feel like, maybe future work could be an Airflow-wide macro to make this less bend-over-backward?

I'm hesitant to add a DAG-specific macro to the example DAG as that might be "a step too far" if people are unfamiliar with multiple concepts being presented at once, and then don't know why things are breaking if they just copypaste the jinja template, but don't know they also need the special macro.

I think the | first also make the structure of it obvious. So, though ugly, I vaguely prefer being explicit with the ugliness.

re: sorting - I have no idea how the list sorts. Luckily this is an example for a single-dataset schedule, so it doesn't matter 😄

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

as you wish, as you wish

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm hesitant to add a DAG-specific macro to the example DAG as that might be "a step too far" if people are unfamiliar with multiple concepts being presented at once

I agree, it's better to use a standard/official thing, even if it's ugly

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've bet that someone might be confused what actually first, not users all familiar with Jinja filters and even don't know which are builtin, for example it might be a good idea to add this link in code example comments https://jinja.palletsprojects.com/en/3.1.x/templates/#jinja-filters.first

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will just quietly sneak in a change to macro later ;)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added the link @Taragolis - that also gives an opportunity to explain what the heck is happening there 😅

Comment on lines +225 to +226
WHERE "updated_at" >= '{{ (triggering_dataset_events.values() | first | first).source_dag_run.data_interval_start }}'
AND "updated_at" < '{{ (triggering_dataset_events.values() | first | first).source_dag_run.data_interval_end }}';
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm hesitant to add a DAG-specific macro to the example DAG as that might be "a step too far" if people are unfamiliar with multiple concepts being presented at once

I agree, it's better to use a standard/official thing, even if it's ugly

@Taragolis Taragolis merged commit 21610c1 into apache:main Sep 16, 2023
@ephraimbuddy ephraimbuddy added the type:doc-only Changelog: Doc Only label Oct 3, 2023
@ephraimbuddy ephraimbuddy added this to the Airflow 2.7.2 milestone Oct 3, 2023
ephraimbuddy pushed a commit that referenced this pull request Oct 5, 2023
* add templates reference for triggering_dataset_events and a note to check the templates page on the datasets page

* add working example, correct type of triggering_dataset_events

* explain | first | first

(cherry picked from commit 21610c1)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants