[Lake][Schema] Clean (predvalue, truevalue) columns - subgraph, prediction.py & all tables #650

Closed
2 tasks done
idiom-bytes opened this issue Feb 20, 2024 · 1 comment · Fixed by #664
Labels
Type: Bug Something isn't working

Comments

Member

idiom-bytes commented Feb 20, 2024

Motivation

This was discussed a while ago: this data from the subgraph was renamed.

Please clean it up so that inside subgraph/prediction.py and related pdr_tables + bronze...

Outline:

  1. All prediction + trueval parameters should be renamed so they are more objective and cleaner.
  2. predvalue and truevalue are the right names to use: predvalue is the value the user predicted, and truevalue is the true value accepted for the prediction.

DoD:

  • All subgraph fetch code has been updated to use the new names.
  • All tables and columns that describe predvalue and truevalue have been updated and are named correctly.
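
To make the rename in the outline concrete, here is a minimal sketch, assuming rows are plain dicts. The old names (prediction, trueval) and new names (predvalue, truevalue) come from this issue; `RENAMES`, `rename_columns`, and the sample row are hypothetical and are not taken from subgraph/prediction.py.

```python
# Old -> new column names, as described in this issue.
RENAMES = {"prediction": "predvalue", "trueval": "truevalue"}


def rename_columns(row: dict) -> dict:
    """Return a copy of `row` with the old keys replaced by the new names.

    Hypothetical helper for illustration; keys not in RENAMES pass through.
    """
    return {RENAMES.get(key, key): value for key, value in row.items()}


# Illustrative row only; the real subgraph payload may have other fields.
old_row = {"ID": "0x1-5", "prediction": True, "trueval": False}
new_row = rename_columns(old_row)
```

The same mapping could be applied to each table's column list, so the fetch code and the tables stay in sync.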
@idiom-bytes idiom-bytes added the Type: Bug Something isn't working label Feb 20, 2024
@idiom-bytes idiom-bytes changed the title from "[Lake][Schema] Rename subgraph/prediction.py & all tables (prediction, trueval) -> (predvalue, truevalue)" to "[Lake][Schema] Clean (predvalue, truevalue) columns - subgraph, prediction.py & all tables" Feb 20, 2024
@kdetry kdetry self-assigned this Feb 21, 2024
idiom-bytes added a commit that referenced this issue Mar 5, 2024
* issue650 renaming

* issue650 - test fixes

* issue650 black format

* issue650: fixes after merges

* black fix

* take-back the gql_data_factory from the main branch

* Removed print statements

---------

Co-authored-by: idiom-bytes <[email protected]>
Member Author

This is going to change the schema of the raw tables, which will force people to re-fetch a lot of data.

Rather than push this now, we merged it into the DuckDB Integration PR (#734). Once the lake/ETL update is completed, we can push this out with it, so we reduce the number of alterations.
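
As a rough illustration of why the schema change forces a re-fetch, the sketch below flags a stored table that still carries the old column names. `OLD_COLUMNS` and `needs_refetch` are hypothetical names used only for this example; the actual lake/ETL code may detect this differently.

```python
# Column names that were replaced by predvalue / truevalue in this issue.
OLD_COLUMNS = {"prediction", "trueval"}


def needs_refetch(columns) -> bool:
    """Return True if any old column name is still present in `columns`.

    Hypothetical check: a table written with the old schema would have to be
    re-fetched (or migrated) before it matches the renamed columns.
    """
    return bool(OLD_COLUMNS & set(columns))


stale = needs_refetch(["ID", "prediction", "trueval"])
fresh = needs_refetch(["ID", "predvalue", "truevalue"])
```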

kdetry added a commit that referenced this issue Mar 6, 2024
* issue650 renaming

* issue650 - test fixes

* issue650 black format

* issue650: fixes after merges

* black fix

* take-back the gql_data_factory from the main branch

* Removed print statements

---------

Co-authored-by: idiom-bytes <[email protected]>
kdetry added a commit that referenced this issue Mar 7, 2024
* last_record_logic is added

* #650 - Clean (predvalue, truevalue) columns (#664)

* issue650 renaming

* issue650 - test fixes

* issue650 black format

* issue650: fixes after merges

* black fix

* take-back the gql_data_factory from the main branch

* Removed print statements

---------

Co-authored-by: idiom-bytes <[email protected]>

* Fixing test

* issue682

* remove unnecessary files

* black

* issue682 removed unnecessary methods

* pylint fix

* pylint fix

---------

Co-authored-by: idiom-bytes <[email protected]>
2 participants