Decide how to handle null coordinate point entries in the Soft-Story properties dataset #124

Open
agennadi opened this issue Dec 19, 2024 · 2 comments

@agennadi
Collaborator

Context

The Soft-Story properties dataset currently contains 4941 rows, approximately 10 of which have null values in the point field. Valid point values are essential for accurately displaying soft-story properties on the map, so these missing points need to be resolved during the dataset's transformation step.

One potential approach is to load the Addresses dataset and look up the missing points by property address. However, loading 400,000 addresses to resolve 10 missing points may be excessive.

Alternatively, the missing points could be added manually. While this might work for now, it is not a scalable solution for future cases.

Research is needed to identify the optimal method for handling null values in the dataset.
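
As a starting point for that research, here is a minimal sketch of the address-lookup approach that avoids holding all 400,000 addresses in memory: it streams the address rows and keeps only the ones that match a property with a null point. The column names (`address`, `point`), the `normalize` helper, and the sample rows are all assumptions made up for illustration; the real datasets may use different fields and formats.

```python
import csv
import io


def normalize(address):
    """Uppercase and collapse whitespace so minor formatting differences still match."""
    return " ".join(address.upper().split())


def lookup_missing_points(properties, address_rows):
    """Return {normalized address: point} for properties whose point is null.

    `address_rows` can be any iterable (for example a csv.DictReader), so the
    rows are scanned one at a time and only the matches are kept.
    """
    wanted = {normalize(p["address"]) for p in properties if not p["point"]}
    if not wanted:
        return {}
    found = {}
    for row in address_rows:
        key = normalize(row["address"])
        if key in wanted:
            found[key] = row["point"]
            if len(found) == len(wanted):
                break  # every missing point is resolved, so stop scanning
    return found


# Made-up sample data standing in for the two datasets.
properties = [
    {"address": "123 Main St", "point": "POINT (-122.41 37.77)"},
    {"address": "456  Oak st", "point": None},
]
addresses_csv = io.StringIO(
    "address,point\n"
    "123 MAIN ST,POINT (-122.41 37.77)\n"
    "456 OAK ST,POINT (-122.43 37.78)\n"
    "789 PINE ST,POINT (-122.40 37.79)\n"
)

resolved = lookup_missing_points(properties, csv.DictReader(addresses_csv))
print(resolved)
```

Exact string matching after normalization is a simplification; real addresses may need unit numbers or street-suffix variants handled before they match.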

Definition of Done

  • The optimal solution for handling null values is identified.
  • The solution is shared with the backend team.

Technical Details

Check out the Soft-Story properties dataset.
Relevant ETL code can be found in backend/etl/soft_story_properties_data_handler.py.

@leela-solomon
Collaborator

  1. Add a source column to track which null values we filled in.
  2. Fill in the null coordinates manually or with easy automation.
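
A rough sketch of how these two steps might look inside the ETL transformation, assuming rows are plain dicts. The `point_source` column name, the `"original"` and `"manual"` labels, and the `fill_missing_points` helper are hypothetical and not taken from the existing handler.

```python
def fill_missing_points(rows, manual_points, source_label="manual"):
    """Fill null points from `manual_points` and record where each point came from."""
    filled = []
    for row in rows:
        row = dict(row)  # copy so the input rows are left untouched
        if row["point"] is not None:
            row["point_source"] = "original"
        elif row["address"] in manual_points:
            row["point"] = manual_points[row["address"]]
            row["point_source"] = source_label
        else:
            row["point_source"] = None  # still unresolved; easy to find later
        filled.append(row)
    return filled


# Made-up rows for illustration only.
rows = [
    {"address": "123 MAIN ST", "point": "POINT (-122.41 37.77)"},
    {"address": "456 OAK ST", "point": None},
    {"address": "789 PINE ST", "point": None},
]
manual_points = {"456 OAK ST": "POINT (-122.43 37.78)"}

result = fill_missing_points(rows, manual_points)
```

Rows that stay unresolved keep a null `point_source`, so a later run can tell them apart from points that came with the dataset.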

@elucherini
Collaborator

We can write manual UPDATE statements to fill in the nulls, making sure we also update the source column. Alternatively, we could keep a separate table and left-join it to the original, but then we end up with two tables to maintain. I think option 1 is good enough for 10 missing points.
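
To make the trade-off concrete, here is a small sketch comparing the two approaches using an in-memory SQLite database from Python. The table and column names (`soft_story_properties`, `point_overrides`, `point_source`) and the sample rows are assumptions for illustration; the production schema and database engine may differ.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE soft_story_properties (
        id INTEGER PRIMARY KEY,
        address TEXT NOT NULL,
        point TEXT,
        point_source TEXT
    );
    INSERT INTO soft_story_properties (id, address, point, point_source) VALUES
        (1, '123 MAIN ST', 'POINT (-122.41 37.77)', 'original'),
        (2, '456 OAK ST', NULL, NULL);

    -- Needed only for the left-join approach: a second table to maintain.
    CREATE TABLE point_overrides (
        property_id INTEGER PRIMARY KEY,
        point TEXT NOT NULL,
        point_source TEXT NOT NULL
    );
    INSERT INTO point_overrides (property_id, point, point_source) VALUES
        (2, 'POINT (-122.43 37.78)', 'mapbox');
""")

# Separate-table approach: the original table is untouched and the
# filled-in values are merged at query time.
joined = conn.execute("""
    SELECT p.id,
           COALESCE(p.point, o.point),
           COALESCE(p.point_source, o.point_source)
    FROM soft_story_properties AS p
    LEFT JOIN point_overrides AS o ON o.property_id = p.id
    ORDER BY p.id
""").fetchall()

# In-place approach: one UPDATE per missing point, setting the source column
# too. The `point IS NULL` guard keeps it from overwriting existing values.
with conn:
    conn.execute(
        "UPDATE soft_story_properties "
        "SET point = ?, point_source = ? "
        "WHERE id = ? AND point IS NULL",
        ("POINT (-122.43 37.78)", "mapbox", 2),
    )

updated = conn.execute(
    "SELECT id, point, point_source FROM soft_story_properties ORDER BY id"
).fetchall()
```

Both queries end up returning the same rows; the difference is whether the fix lives in the main table or in a second table that every reader has to remember to join.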

The data source can be Mapbox. The simplest solution is to use the geocoding playground at https://docs.mapbox.com/playground/geocoding/, or to have ChatGPT write a simple script using our API key.
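
For reference, a simple script along those lines could look roughly like the sketch below. It assumes the Mapbox Geocoding API v5 forward-geocoding endpoint (`mapbox.places`) and that each returned feature carries a `center` of `[longitude, latitude]`; check the current Mapbox docs before relying on it. The function names are hypothetical, and the access token would come from our own configuration.

```python
import json
import urllib.parse
import urllib.request

# Assumed endpoint: Mapbox Geocoding API v5, forward geocoding.
MAPBOX_URL = "https://api.mapbox.com/geocoding/v5/mapbox.places/{query}.json"


def build_geocoding_url(address, access_token):
    """Build the request URL for a single address, asking for the best match only."""
    query = urllib.parse.quote(address, safe="")
    params = urllib.parse.urlencode({"access_token": access_token, "limit": 1})
    return MAPBOX_URL.format(query=query) + "?" + params


def parse_point(payload):
    """Return (longitude, latitude) of the first feature, or None if nothing matched."""
    features = payload.get("features") or []
    if not features:
        return None
    lon, lat = features[0]["center"]
    return lon, lat


def geocode(address, access_token):
    """Call the API over the network and return (longitude, latitude) or None."""
    url = build_geocoding_url(address, access_token)
    with urllib.request.urlopen(url, timeout=10) as resp:
        return parse_point(json.load(resp))
```

Keeping URL building and response parsing separate from the network call means the ten or so results can be reviewed, or the parsing tested, without spending API requests.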

@leela-solomon changed the title from "Decide how to handle empty polygons in the Soft-Story properties dataset" to "Decide how to handle null coordinate point entries in the Soft-Story properties dataset" on Jan 3, 2025
@agennadi removed their assignment on Jan 6, 2025
@agennadi moved this from Ready for Requirements to In Progress in QuakeSafe Project on Jan 6, 2025
Projects
Status: In Progress

6 participants