Run CKAN upload weekly on airflow #283

Closed
machow opened this issue Aug 16, 2021 · 4 comments · Fixed by #1710
Labels
infrastructure Maintenance of infrastructure supporting the data warehouse. Product owner is @evansiroky

Comments

@machow
Contributor

machow commented Aug 16, 2021

TODO:

  • Run the script in services/gtfs-ckan-uploader on the Kubernetes cluster we use for RT download (or one with enough RAM to get this thing airborne).
  • Use or modify our pod_operator to trigger running it
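
For illustration only, here is a rough sketch of the kind of pod spec such a run might need, written as a plain Python dict so it does not depend on any particular operator. This is not our pod_operator and does not show its arguments; the function name `build_uploader_pod`, the image reference, and the `8Gi` default are all hypothetical placeholders. The one point it tries to make is that the memory request is set explicitly, so the scheduler places the pod on a node with enough RAM.

```python
import json


def build_uploader_pod(image: str, memory: str = "8Gi",
                       name: str = "gtfs-ckan-uploader") -> dict:
    """Return a Kubernetes Pod manifest (as a plain dict) for a one-off upload run.

    The image, memory size, and pod name are placeholders, not values
    taken from the actual deployment.
    """
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name},
        "spec": {
            # A one-off batch run should exit, not be restarted in a loop.
            "restartPolicy": "Never",
            "containers": [
                {
                    "name": name,
                    "image": image,
                    "resources": {
                        # The request drives node placement; the limit caps usage.
                        "requests": {"memory": memory},
                        "limits": {"memory": memory},
                    },
                }
            ],
        },
    }


if __name__ == "__main__":
    # Hypothetical image reference, used only to render an example manifest.
    pod = build_uploader_pod("example.invalid/gtfs-ckan-uploader:latest")
    print(json.dumps(pod, indent=2))
```

Whatever operator ends up triggering the run would need to carry an equivalent memory request through to the pod it creates.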
@machow machow added this to the GTFS CKAN Completion milestone Aug 16, 2021
@hunterowens
Member

This is blocked by the CKAN API limit on multi-GB uploads.

@holly-g
Contributor

holly-g commented Jan 3, 2022

Should be resolved; new version of CKAN was deployed.

@holly-g holly-g added the good first issue Good for newcomers label Jan 3, 2022
@holly-g holly-g added infrastructure Maintenance of infrastructure supporting the data warehouse. Product owner is @evansiroky and removed good first issue Good for newcomers labels Jan 24, 2022
@themightychris
Contributor

Let's make sure catchup is disabled on this Airflow DAG if we're just updating a single dataset with the latest data.
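
To make the concern concrete, here is a simplified model of what catchup changes. It is not Airflow's scheduler code, and `scheduled_intervals` is a hypothetical helper; it only assumes that a weekly run becomes eligible once its interval has fully elapsed. With catchup on, every missed week since the start date gets its own run; with it off, only the most recent completed week does.

```python
from datetime import datetime, timedelta


def scheduled_intervals(start, now, interval=timedelta(weeks=1), catchup=True):
    """Return the start times of the completed intervals that would be run.

    Simplified model: an interval is runnable only after it has fully elapsed.
    """
    starts = []
    current = start
    while current + interval <= now:
        starts.append(current)
        current += interval
    if catchup:
        # Backfill every completed interval since the start date.
        return starts
    # Only the most recent completed interval (empty list if none yet).
    return starts[-1:]


if __name__ == "__main__":
    start = datetime(2022, 1, 3)
    now = datetime(2022, 1, 31)
    print(len(scheduled_intervals(start, now, catchup=True)))
    print(len(scheduled_intervals(start, now, catchup=False)))
```

Since each run would upload the same latest-only dataset, the backfilled runs would just repeat the same upload, which is why turning catchup off fits this job.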

@lauriemerrell
Contributor

Just to note -- it looks like the CKAN reports are driven off the latest-only data in gtfs_schedule:

image

I am refactoring this data in #1157. We may want to touch base on whether the report content needs to be reevaluated at all. (My refactor does not change the substance of that data, but as part of my refactor I have noticed that most places that were consuming this data were possibly doing so incorrectly and there may be other datasets that better suit the needs. So I wanted to flag that for CKAN as well.)

7 participants