Rename dataset related python variable names to asset #41348

Merged
uranusjr merged 138 commits into apache:main from the rename-dataset-as-asset branch
Sep 30, 2024

Conversation


@Lee-W Lee-W commented Aug 9, 2024

Why

as part of AIP-74

part of #42307

What

Rename the dataset-related Python variable names in Airflow to asset. DB, API, and UI changes will be done in follow-up PRs.

  • Rename module airflow.api_connexion.schemas.dataset_schema as airflow.api_connexion.schemas.asset_schema

    • Rename variable create_dataset_event_schema as create_asset_event_schema
    • Rename variable dataset_collection_schema as asset_collection_schema
    • Rename variable dataset_event_collection_schema as asset_event_collection_schema
    • Rename variable dataset_event_schema as asset_event_schema
    • Rename variable dataset_schema as asset_schema
    • Rename class TaskOutletDatasetReferenceSchema as TaskOutletAssetReferenceSchema
    • Rename class DagScheduleDatasetReferenceSchema as DagScheduleAssetReferenceSchema
    • Rename class DatasetAliasSchema as AssetAliasSchema
    • Rename class DatasetSchema as AssetSchema
    • Rename class DatasetCollection as AssetCollection
    • Rename class DatasetEventSchema as AssetEventSchema
    • Rename class DatasetEventCollection as AssetEventCollection
    • Rename class DatasetEventCollectionSchema as AssetEventCollectionSchema
    • Rename class CreateDatasetEventSchema as CreateAssetEventSchema
  • Rename module airflow.datasets as airflow.assets

    • Rename class DatasetAlias as AssetAlias

    • Rename class DatasetAll as AssetAll

    • Rename class DatasetAny as AssetAny

    • Rename function expand_alias_to_datasets as expand_alias_to_assets

    • Rename class DatasetAliasEvent as AssetAliasEvent

      • Rename method dest_dataset_uri as dest_asset_uri
    • Rename class BaseDataset as BaseAsset

      • Rename method iter_datasets as iter_assets
      • Rename method iter_dataset_aliases as iter_asset_aliases
    • Rename class Dataset as Asset

      • Rename method iter_datasets as iter_assets
      • Rename method iter_dataset_aliases as iter_asset_aliases
    • Rename class _DatasetBooleanCondition as _AssetBooleanCondition

      • Rename method iter_datasets as iter_assets
      • Rename method iter_dataset_aliases as iter_asset_aliases
  • Rename module airflow.datasets.manager as airflow.assets.manager

    • Rename variable dataset_manager as asset_manager

    • Rename function resolve_dataset_manager as resolve_asset_manager

    • Rename class DatasetManager as AssetManager

      • Rename method register_dataset_change as register_asset_change
      • Rename method create_datasets as create_assets
      • Rename method notify_dataset_created as notify_asset_created
      • Rename method notify_dataset_changed as notify_asset_changed
  • Rename module airflow.models.dataset as airflow.models.asset

    • Rename class DatasetDagRunQueue as AssetDagRunQueue
    • Rename class DatasetEvent as AssetEvent
    • Rename class DatasetModel as AssetModel
    • Rename class DatasetAliasModel as AssetAliasModel
    • Rename class DagScheduleDatasetReference as DagScheduleAssetReference
    • Rename class TaskOutletDatasetReference as TaskOutletAssetReference
    • Rename class DagScheduleDatasetAliasReference as DagScheduleAssetAliasReference
  • Rename module airflow.api_ui.views.datasets as airflow.api_ui.views.assets

    • Rename variable dataset_router as asset_router
  • Rename module airflow.listeners.spec.dataset as airflow.listeners.spec.asset

    • Rename function on_dataset_created as on_asset_created
    • Rename function on_dataset_changed as on_asset_changed
  • Rename module airflow.timetables.datasets as airflow.timetables.assets

    • Rename class DatasetOrTimeSchedule as AssetOrTimeSchedule
  • Rename module airflow.serialization.pydantic.dataset as airflow.serialization.pydantic.asset

    • Rename class DagScheduleDatasetReferencePydantic as DagScheduleAssetReferencePydantic
    • Rename class TaskOutletDatasetReferencePydantic as TaskOutletAssetReferencePydantic
    • Rename class DatasetPydantic as AssetPydantic
    • Rename class DatasetEventPydantic as AssetEventPydantic
  • Rename module airflow.datasets.metadata as airflow.assets.metadata

  • In module airflow.jobs.scheduler_job_runner

    • and its class SchedulerJobRunner

      • Rename method _create_dag_runs_dataset_triggered as _create_dag_runs_asset_triggered
      • Rename method _orphan_unreferenced_datasets as _orphan_unreferenced_assets
  • In module airflow.api_connexion.security

    • Rename decorator requires_access_dataset as requires_access_asset
  • In module airflow.auth.managers.models.resource_details

    • Rename class DatasetDetails as AssetDetails
  • In module airflow.auth.managers.base_auth_manager

    • Rename function is_authorized_dataset as is_authorized_asset
  • In module airflow.timetables.simple

    • Rename class DatasetTriggeredTimetable as AssetTriggeredTimetable
  • In module airflow.lineage.hook

    • Rename class DatasetLineageInfo as AssetLineageInfo

      • Rename attribute dataset as asset
    • In its class HookLineageCollector

      • Rename method create_dataset as create_asset
      • Rename method add_input_dataset as add_input_asset
      • Rename method add_output_dataset as add_output_asset
      • Rename method collected_datasets as collected_assets
  • In module airflow.models.dag

    • Rename function get_dataset_triggered_next_run_info as get_asset_triggered_next_run_info

    • In its class DagModel

      • Rename method get_dataset_triggered_next_run_info as get_asset_triggered_next_run_info
  • In module airflow.models.taskinstance

    • and its class TaskInstance

      • Rename method _register_dataset_changes as _register_asset_changes
  • In module airflow.providers_manager

    • and its class ProvidersManager

      • Rename method initialize_providers_dataset_uri_resources as initialize_providers_asset_uri_resources
      • Rename attribute _discover_dataset_uri_resources as _discover_asset_uri_resources
      • Rename property dataset_factories as asset_factories
      • Rename property dataset_uri_handlers as asset_uri_handlers
      • Rename property dataset_to_openlineage_converters as asset_to_openlineage_converters
  • In module airflow.security.permissions

    • Rename constant RESOURCE_DATASET as RESOURCE_ASSET
  • In module airflow.serialization.enums

    • and its class DagAttributeTypes

      • Rename attribute DATASET_EVENT_ACCESSORS as ASSET_EVENT_ACCESSORS
      • Rename attribute DATASET_EVENT_ACCESSOR as ASSET_EVENT_ACCESSOR
      • Rename attribute DATASET as ASSET
      • Rename attribute DATASET_ALIAS as ASSET_ALIAS
      • Rename attribute DATASET_ANY as ASSET_ANY
      • Rename attribute DATASET_ALL as ASSET_ALL
  • In module airflow.serialization.pydantic.taskinstance

    • and its class TaskInstancePydantic

      • Rename method _register_dataset_changes as _register_asset_changes
  • In module airflow.serialization.serialized_objects

    • Rename function encode_dataset_condition as encode_asset_condition
    • Rename function decode_dataset_condition as decode_asset_condition
  • In module airflow.timetables.base

    • Rename class _NullDataset as _NullAsset

      • Rename method iter_datasets as iter_assets
  • In module airflow.utils.context

    • Rename class LazyDatasetEventSelectSequence as LazyAssetEventSelectSequence
  • In module airflow.www.auth

    • Rename function has_access_dataset as has_access_asset
  • Rename configuration core.strict_dataset_uri_validation as core.strict_asset_uri_validation, core.dataset_manager_class as core.asset_manager_class and core.dataset_manager_kwargs as core.asset_manager_kwargs

  • Rename example dags example_dataset_alias.py, example_dataset_alias_with_no_taskflow.py, example_datasets.py as example_asset_alias.py, example_asset_alias_with_no_taskflow.py, example_assets.py

  • Rename DagDependency name dataset-alias, dataset as asset-alias, asset

  • Rename context key triggering_dataset_events as triggering_asset_events

  • Rename resource key dataset-uris as asset-uris for providers amazon, common.io, mysql, fab, postgres, trino

  • In provider airflow.providers.amazon.aws

    • Rename package datasets as assets

      • In its module s3

        • Rename method create_dataset as create_asset
        • Rename method convert_dataset_to_openlineage as convert_asset_to_openlineage
    • and its module auth_manager.avp.entities

      • Rename attribute AvpEntities.DATASET as AvpEntities.ASSET
    • and its module auth_manager.auth_manager.aws_auth_manager

      • Rename function is_authorized_dataset as is_authorized_asset
  • In provider airflow.providers.common.io

    • Rename package datasets as assets

      • in its module file

        • Rename method create_dataset as create_asset
        • Rename method convert_dataset_to_openlineage as convert_asset_to_openlineage
  • In provider airflow.providers.fab

    • in its module auth_manager.fab_auth_manager

      • Rename function is_authorized_dataset as is_authorized_asset
  • In provider airflow.providers.openlineage

    • in its module utils.utils

      • Rename class DatasetInfo as AssetInfo
      • Rename function translate_airflow_dataset as translate_airflow_asset
  • Rename package airflow.providers.postgres.datasets as airflow.providers.postgres.assets

  • Rename package airflow.providers.mysql.datasets as airflow.providers.mysql.assets

  • Rename package airflow.providers.trino.datasets as airflow.providers.trino.assets

  • Add module airflow.providers.common.compat.assets

  • Add module airflow.providers.common.compat.openlineage.utils.utils

  • Add module airflow.providers.common.compat.security.permissions.RESOURCE_ASSET
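
The airflow.providers.common.compat modules added at the end of the list suggest that provider code needs to import from whichever name exists in the installed Airflow version. One common way to write such a shim is sketched below. This is an assumption about the general pattern, not a reproduction of what those modules contain, and import_first_available is a hypothetical helper name.

```python
from importlib import import_module
from typing import Any


def import_first_available(*candidates: tuple[str, str]) -> Any:
    """Return the first attribute importable from the (module, attribute) pairs.

    Hypothetical helper: tries each pair in order and skips those whose
    module or attribute does not exist in the current environment.
    """
    errors = []
    for module_path, attr_name in candidates:
        try:
            return getattr(import_module(module_path), attr_name)
        except (ImportError, AttributeError) as exc:
            errors.append(f"{module_path}.{attr_name}: {exc}")
    raise ImportError("none of the candidates could be imported: " + "; ".join(errors))
```

With Airflow installed, a caller could prefer the new name and fall back to the old one, for example import_first_available(("airflow.assets", "Asset"), ("airflow.datasets", "Dataset")).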

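
For code that still imports the old module paths, the module renames in the list can be applied mechanically. The following is a minimal sketch and not part of this PR: MODULE_RENAMES holds a few of the module pairs from the list, and rewrite_imports is a hypothetical helper that rewrites dotted module paths in a source string. It only touches module paths, so class and method names such as Dataset still need to be renamed separately.

```python
import re

# A few old -> new module paths copied from the rename list.
MODULE_RENAMES = {
    "airflow.datasets.manager": "airflow.assets.manager",
    "airflow.datasets.metadata": "airflow.assets.metadata",
    "airflow.datasets": "airflow.assets",
    "airflow.models.dataset": "airflow.models.asset",
    "airflow.timetables.datasets": "airflow.timetables.assets",
    "airflow.listeners.spec.dataset": "airflow.listeners.spec.asset",
}

# Longest paths first so "airflow.datasets.manager" is tried before
# "airflow.datasets". The lookarounds avoid matching inside longer names.
_PATTERN = re.compile(
    r"(?<![\w.])("
    + "|".join(re.escape(old) for old in sorted(MODULE_RENAMES, key=len, reverse=True))
    + r")(?!\w)"
)


def rewrite_imports(source: str) -> str:
    """Replace old module paths with their renamed equivalents (hypothetical helper)."""
    return _PATTERN.sub(lambda match: MODULE_RENAMES[match.group(1)], source)
```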

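
The configuration renames also affect deployments that set these options through environment variables, which Airflow reads using the AIRFLOW__{SECTION}__{KEY} naming scheme. A minimal sketch of translating the old variable names follows. CONFIG_RENAMES, env_var_name, and rename_env_vars are hypothetical names, and only two of the renamed options are included.

```python
# (section, old key) -> new key, taken from the configuration renames in the list.
CONFIG_RENAMES = {
    ("core", "strict_dataset_uri_validation"): "strict_asset_uri_validation",
    ("core", "dataset_manager_class"): "asset_manager_class",
}


def env_var_name(section: str, key: str) -> str:
    """Build the AIRFLOW__{SECTION}__{KEY} environment variable name."""
    return f"AIRFLOW__{section.upper()}__{key.upper()}"


def rename_env_vars(env: dict[str, str]) -> dict[str, str]:
    """Return a copy of env with old dataset option names replaced by asset ones."""
    renamed = dict(env)
    for (section, old_key), new_key in CONFIG_RENAMES.items():
        old_name = env_var_name(section, old_key)
        if old_name in renamed:
            # If the new-style variable is already set, its value wins.
            renamed.setdefault(env_var_name(section, new_key), renamed[old_name])
            del renamed[old_name]
    return renamed
```

The function returns a new dictionary and leaves its input untouched, so it can be tried against a copy of os.environ before anything is changed.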

@uranusjr
Member

I have created #41424 to add the name attribute. These two will conflict, so remember to rebase.

@Lee-W
Member Author

Lee-W commented Aug 13, 2024

sure, will do

@Lee-W Lee-W force-pushed the rename-dataset-as-asset branch 4 times, most recently from c6538fa to 875d372 Compare August 13, 2024 13:35
@Lee-W Lee-W force-pushed the rename-dataset-as-asset branch 13 times, most recently from 35d6d6e to 870ba33 Compare August 19, 2024 01:27
@Lee-W Lee-W force-pushed the rename-dataset-as-asset branch from ee2b1b8 to 1c137e5 Compare September 30, 2024 00:23
@Lee-W
Member Author

Lee-W commented Sep 30, 2024

Has the UI been tested to confirm everything still works? Some of the UI changes are very probably not covered by automated tests. I did not test this myself by opening the UI. It might be a good point to double-check that no link/HTML is broken.

I've done some testing, and it worked fine. I tried to keep UI changes unrelated to the backend change in a separate PR (not yet there). As there are many conflicts today (sometimes I miss something during rebasing 🤦‍♂️), I'll try to resolve them and test again. Thanks!

Just a quick update. I've tested on my local end and it works fine. And the CI is finally green again! 🎉

Member

@uranusjr uranusjr left a comment


Admittedly I did not read every line, but I think I read enough to be confident this is good…

@uranusjr uranusjr merged commit ede7cb2 into apache:main Sep 30, 2024
109 checks passed
@uranusjr uranusjr deleted the rename-dataset-as-asset branch September 30, 2024 05:30
kaxil added a commit to astronomer/airflow that referenced this pull request Oct 9, 2024
This PR reverts Core Extension doc change (https://airflow.apache.org/docs/apache-airflow-providers/core-extensions/index.html) from apache#41348

Since those docs only work on Airflow stable, we can only change this after 3.0
kaxil added a commit that referenced this pull request Oct 9, 2024
This PR reverts Core Extension doc change (https://airflow.apache.org/docs/apache-airflow-providers/core-extensions/index.html) from #41348

Since those docs only work on Airflow stable, we can only change this after 3.0
@Lee-W Lee-W self-assigned this Oct 28, 2024
@Lee-W Lee-W mentioned this pull request Nov 18, 2024
96 tasks
Labels
  • AIP-74 (Dataset -> Asset)
  • airflow3.0:breaking (Candidates for Airflow 3.0 that contain breaking changes)
  • area:API (Airflow's REST/HTTP API)
  • area:lineage
  • area:providers
  • area:serialization
  • area:webserver (Webserver related Issues)
  • provider:amazon-aws (AWS/Amazon - related issues)
  • provider:common-io
  • provider:openlineage (AIP-53)

4 participants