Data Pipeline Not Pulling in New Service Request Data for 2022 #1165

Closed
1 task
EchoProject opened this issue Jan 14, 2022 · 8 comments
Labels
Bug Something isn't working Feature: Data Quality Role: Backend Related to API or other server-side work Size: Missing

Comments

@EchoProject
Contributor

EchoProject commented Jan 14, 2022

Overview

We need to fix the data pipeline that loads 311 data because our users are not seeing any 2022 data.

After looking through the API, I found that the live site is not pulling in new data for 2022. The tests below show a GET request for January 1 to 14 in 2021 (test 1) and in 2022 (test 2). We need to resolve this issue immediately so that the NCs have a working tool.

Action Items

Resources/Instructions

https://dev-api.311-data.org/docs#/default/get_all_service_requests_requests_get

Snip20220113_5

  • 2021-01-01 to 2021-01-14

Snip20220113_4

  • 2022-01-01 to 2022-01-14

Snip20220113_3

@EchoProject EchoProject added this to the 311 Data - Go Public milestone Jan 14, 2022
@EchoProject EchoProject added Bug Something isn't working Feature: Data Quality Role: Backend Related to API or other server-side work Size: Missing labels Jan 14, 2022
@EchoProject
Contributor Author

@adamkendis @roz0n
We were made aware of the data problem by a PM on another project who wants to use our tool for workshops. Could we make this a priority fix? We may lose EmpowerLA web placement if our tool is not working.

@ryanmswan
Contributor

ryanmswan commented Jan 14, 2022

@EchoProject @adamkendis @roz0n

I can't test the backend integration and I don't want to break anything, but in config.toml I think you need to add:

2022 = "i5ke-k6by" at line 31

and

2022 at line 43

ADDITIONAL INFO EDIT: This is to include the Socrata dataset from here.

@EchoProject
Contributor Author

Message sent to Pras:

"Hi Pras, the description of how 311 data loading works is here https://github.com/hackforla/311-data/blob/dev/docs/data_loading.md and the code is here https://github.com/hackforla/311-data/tree/dev/server/prefect. LA creates a new data source for every calendar year.

If 2022 is ready then it needs to be added to the prefect config and a new release needs to go out. The config file is here: https://github.com/hackforla/311-data/blob/dev/server/prefect/config.toml "
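Since LA publishes a new dataset each calendar year, a small guard could flag a missing year before users notice. The sketch below is only an illustration: `SOCRATA_DATASETS` and `missing_years` are hypothetical names, the mapping is not the real layout of config.toml, and the only dataset ID shown is the 2022 one mentioned in this thread.

```python
from datetime import date

# Hypothetical year -> Socrata dataset ID mapping (assumption: this is NOT
# the actual structure of config.toml). Only the 2022 ID comes from this thread.
SOCRATA_DATASETS = {
    "2022": "i5ke-k6by",
}


def missing_years(datasets, start_year, today):
    """Return years from start_year through today's year that have no dataset ID."""
    return [
        str(year)
        for year in range(start_year, today.year + 1)
        if str(year) not in datasets
    ]


if __name__ == "__main__":
    gaps = missing_years(SOCRATA_DATASETS, 2022, date.today())
    if gaps:
        print("No dataset configured for:", ", ".join(gaps))
```

Run on a schedule or in CI, a check like this would report each January that the new year's dataset still needs to be added.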

@EchoProject
Contributor Author

Bonnie will bring this to Data Science team to see if we can make a quick fix.

KarinaLopez19 pushed a commit that referenced this issue Feb 6, 2022
@ExperimentsInHonesty
Member

@KarinaLopez19 @salice
It looks like data is still missing from Jan 5 to Jan 20.

@pras says you are now working out how to backfill the data for the period when the tool was not updating. We look forward to an update when you are ready for us to look at it again. Please tag @snooravi when you are.

@EchoProject
Contributor Author

Pras Feb 8th at 9:41 AM
"Hi Everyone, Happy to inform that @Matt Webster & @jake mensch worked together for the last week & have fixed the data pipeline issue for 2022. You should be able to see data flowing from 2/7 in our production site. We are planning how to do the backfill for the period when sync was off.
Thanks again Matt & Jake. Appreciate that very much!"

The correction can be found here.

@nichhk
Member

nichhk commented May 4, 2022

If I understand correctly, we are getting data from 2022 now. I manually verified this on the production site by selecting the "last week" time range.

We are, however, missing data from 1/5-1/20 (manually verified on prod site). We might be missing data from other date ranges as well, but I'm not sure how to check this right now.
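One possible way to check for other missing ranges: collect the set of dates that have at least one service request, then scan for runs of days with none. This is a rough sketch; `find_missing_ranges` is a hypothetical helper, and how the list of dates is obtained (API call, database query) is left open.

```python
from datetime import date, timedelta


def find_missing_ranges(dates_with_data, start, end):
    """Return inclusive (first, last) ranges of days in [start, end] with no data."""
    present = set(dates_with_data)
    gaps = []
    gap_start = None
    day = start
    while day <= end:
        if day not in present:
            # Open a new gap if we are not already inside one.
            if gap_start is None:
                gap_start = day
        elif gap_start is not None:
            # Data resumed: the gap ended on the previous day.
            gaps.append((gap_start, day - timedelta(days=1)))
            gap_start = None
        day += timedelta(days=1)
    if gap_start is not None:
        # The gap runs through the end of the checked window.
        gaps.append((gap_start, end))
    return gaps
```

If the only days with data in January 2022 were Jan 1 to 4 and Jan 21 to 31, this would return a single range from Jan 5 to Jan 20.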

It seems like this bug has technically been fixed, but we should probably open a new feature request for backfilling data from Socrata given a custom date range. This would be generally useful in case of future breakages. How does that sound?
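To make the backfill idea concrete, here is a minimal sketch of building a Socrata (SODA) query URL for a custom date range. The `$where`, `$order`, `$limit`, and `$offset` parameters are standard SoQL; the host `data.lacity.org`, the `createddate` field name, and the `backfill_url` helper are assumptions for illustration, not taken from the project's code.

```python
from datetime import date, timedelta
from urllib.parse import urlencode

BASE_URL = "https://data.lacity.org/resource"  # assumption: LA open data host
DATE_FIELD = "createddate"  # assumption: name of the request-created column


def backfill_url(dataset_id, start, end, limit=1000, offset=0):
    """Build a SODA URL for rows created between start and end (both inclusive)."""
    # Use a half-open interval so every timestamp on the end date is included.
    end_exclusive = end + timedelta(days=1)
    where = (
        f"{DATE_FIELD} >= '{start.isoformat()}T00:00:00' "
        f"AND {DATE_FIELD} < '{end_exclusive.isoformat()}T00:00:00'"
    )
    params = {
        "$where": where,
        "$order": ":id",  # stable ordering so paging does not skip or repeat rows
        "$limit": limit,
        "$offset": offset,
    }
    return f"{BASE_URL}/{dataset_id}.json?{urlencode(params)}"


if __name__ == "__main__":
    # The 1/5-1/20 gap mentioned above, using the 2022 dataset ID from this thread.
    print(backfill_url("i5ke-k6by", date(2022, 1, 5), date(2022, 1, 20)))
```

A backfill job would request pages with increasing `offset` until a page comes back with fewer than `limit` rows, then load the results the same way the regular sync does.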

@EchoProject
Contributor Author

Closing this issue since the bulk of the data pipeline issue has been fixed.

Projects
Status: Done (without merge)
Development

No branches or pull requests

4 participants