32 - [feat] Continue Connecting NYISO and NYSERDA, and add Key Development Milestones #39

Merged
merged 24 commits into main from 32-continue-connecting-nyiso-and-nyserda
Oct 28, 2024

Conversation

deenasun
Contributor

@deenasun deenasun commented Oct 26, 2024

What's new in this PR

Description

Updated api/webscraper/database.py:

  • The project update/insert workflow now also updates the key development milestones
  • If a matching project already exists in Supabase, fetch its existing key development milestones and update them with the new info
  • Otherwise, start from a blank default key development milestone dict
  • Use the create_update_object helper function so that an update only fills fields that are currently empty instead of overwriting every field
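
For reviewers, the "only fill empty fields" behaviour described above can be sketched roughly as follows. This is a minimal illustration, not the code in database.py: the signature, the choice of None as the only "empty" value, and the example field values are assumptions.

```python
def create_update_object(existing: dict, new: dict) -> dict:
    """Return only the fields from `new` that are currently empty in `existing`.

    Assumption: a field counts as empty when it is missing or None.
    """
    update = {}
    for key, value in new.items():
        if value is None:
            continue  # the scraper has nothing to contribute for this field
        if existing.get(key) is None:
            update[key] = value  # fill the gap, never overwrite stored data
    return update


# Hypothetical rows: one already stored, one freshly scraped.
existing_project = {"project_name": "Example Solar", "developer": None, "proposed_cod": "2026-01-01"}
scraped_project = {"project_name": "Example Solar LLC", "developer": "Example Dev", "proposed_cod": None}

update = create_update_object(existing_project, scraped_project)
print(update)  # {'developer': 'Example Dev'}
```

Returning a separate update object, rather than mutating the existing row, keeps the write to the database limited to the fields that actually changed.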

Note: nyiso_to_database() is for all the projects in the "Interconnection Queue" and "Cluster Projects" sheets of the NYISO xlsx. nyiso_in_service_to_database() is for the projects in the "In Service" sheet, which has a different flow for correctly updating the "Start of operations" milestone.

Updated api/webscraper/nyiso_scraper.py:

  • query_nyiso_excel gets the entire xlsx spreadsheet from NYISO
  • Then, the functions filter_nyiso_iq_sheet, filter_nyiso_cluster_sheet, and filter_nyiso_in_service_sheet are used to filter the individual sheets from the xlsx file
  • Each function cleans the data (removes any irrelevant rows and fills NaN/other invalid cells with None)
  • Also filter for Date of IR and IA Tender Date to update the key development milestones later
  • Only keep projects with size greater than or equal to 2,000 kW and convert size from kW to MW
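
A rough sketch of the cleaning, size filter, and unit conversion steps listed above. The real filter functions operate on pandas DataFrames; this version uses plain dicts so it runs without dependencies, and the column names (size_kw, size_mw, ia_tender_date) are hypothetical.

```python
import math

MIN_SIZE_KW = 2_000  # threshold stated in the PR description


def clean_cell(value):
    """Replace NaN floats with None; leave every other value unchanged."""
    if isinstance(value, float) and math.isnan(value):
        return None
    return value


def filter_and_convert(rows):
    """Keep rows of at least MIN_SIZE_KW and add the size in MW."""
    kept = []
    for row in rows:
        cleaned = {key: clean_cell(value) for key, value in row.items()}
        size_kw = cleaned.get("size_kw")
        if size_kw is None or size_kw < MIN_SIZE_KW:
            continue  # drop rows with no size or below the threshold
        cleaned["size_mw"] = size_kw / 1000  # 1 MW = 1,000 kW
        kept.append(cleaned)
    return kept


# Hypothetical rows standing in for one sheet of the NYISO spreadsheet.
rows = [
    {"project_name": "A", "size_kw": 20_000.0, "ia_tender_date": float("nan")},
    {"project_name": "B", "size_kw": 1_500.0, "ia_tender_date": None},
    {"project_name": "C", "size_kw": float("nan"), "ia_tender_date": None},
]
result = filter_and_convert(rows)
print(result)
```

Only project A survives: B is below the threshold and C has no usable size after cleaning.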

Updated api/webscraper/nyserda_scraper.py:

  • Include field permit_process
  • Include field year_of_delivery_start_date for updating the key development milestones later

Note: NYSERDA's API returns at most 50,000 rows per request, so we have to make repeated API calls to get all the rows from the small solar projects database.
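
The repeated-call pattern in the note above could look something like this. It is a sketch only: fetch_page is a hypothetical callable taking (limit, offset), and a fake in-memory dataset stands in for the real HTTP request so the example runs offline.

```python
from typing import Callable

PAGE_SIZE = 50_000  # per-request row limit mentioned in the note above


def fetch_all_rows(fetch_page: Callable[[int, int], list], page_size: int = PAGE_SIZE) -> list:
    """Call fetch_page(limit, offset) until a short or empty page comes back."""
    rows = []
    offset = 0
    while True:
        page = fetch_page(page_size, offset)
        rows.extend(page)
        if len(page) < page_size:
            break  # a short page means there is nothing left to fetch
        offset += page_size
    return rows


# Stand-in for a real API call (assumption: the API supports limit/offset paging).
FAKE_DATASET = [{"id": i} for i in range(125)]


def fake_fetch_page(limit: int, offset: int) -> list:
    return FAKE_DATASET[offset:offset + limit]


all_rows = fetch_all_rows(fake_fetch_page, page_size=50)
print(len(all_rows))  # 125
```

Stopping on the first short page means the total row count never has to be known in advance, which sidesteps the open question in "Next steps" about how many fetches are needed.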

Updated api/utils/scraper_utils.py:

  • update_kdm is a helper function for updating key_development_milestone elements inside the key_development_milestone array
  • clean_df_data removes/replaces any invalid cells from the NYISO spreadsheet dataframe objects
  • standardize_label fixes renewable energy technology labels that have hyphens between words (should fix the "Land-based Wind" discrepancy)
  • turn_timestamp_to_string turns pd.Timestamp objects into a string representation for serialization (Timestamp objects are not JSON serializable)
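
To illustrate the last two helpers, here is one possible shape for them. The bodies are assumptions rather than the code in scraper_utils.py: the label helper is assumed to replace hyphens with spaces and capitalize each word, and the timestamp helper relies on pd.Timestamp being a subclass of datetime.datetime so it can be shown without importing pandas. The "date" key is hypothetical.

```python
import json
from datetime import datetime


def standardize_label(label: str) -> str:
    """Replace hyphens with spaces and capitalize each word (assumed behaviour)."""
    words = label.replace("-", " ").split()
    return " ".join(word.capitalize() for word in words)


def turn_timestamp_to_string(value):
    """Return an ISO 8601 string for datetime-like values, else the value itself."""
    if isinstance(value, datetime):  # also true for pd.Timestamp instances
        return value.isoformat()
    return value


print(standardize_label("Land-based Wind"))  # Land Based Wind

# json.dumps would raise TypeError on a raw datetime, so convert first.
milestone = {"date": turn_timestamp_to_string(datetime(2024, 10, 26))}
print(json.dumps(milestone))  # {"date": "2024-10-26T00:00:00"}
```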

Created api/webscraper/database_constants.py:

  • Contains helpful constants to import into other files
  • renewable_energy_set contains the set of all renewable energy types we're collecting
  • renewable_energy_map is for mapping abbreviated NYISO renewable energy technologies to readable strings
  • initial_kdm_dict is used as the default key development milestone json
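
As a sketch of how the set and map might be used together: the set below is copied from the snippet quoted in the review, while the map is deliberately partial. Only the 'H' entry appears in the quoted diff, and the 'W' entry follows the reviewer's request, written here without the hyphen so it matches the set (an assumption). readable_technology is a hypothetical helper, not a function from this PR.

```python
renewable_energy_set = {
    "Hydroelectric", "Land Based Wind", "Offshore Wind", "Solar",
    "Geothermal", "Energy Storage", "Pumped Storage",
}

# Partial map for illustration; the real file contains more entries.
renewable_energy_map = {
    "H": "Hydroelectric",
    "W": "Land Based Wind",  # assumption: spelled to match renewable_energy_set
}


def readable_technology(code):
    """Map an abbreviated NYISO type code to a readable label, or None if untracked."""
    label = renewable_energy_map.get(code)
    return label if label in renewable_energy_set else None


print(readable_technology("H"))   # Hydroelectric
print(readable_technology("NG"))  # None
```

Checking the mapped label against the set catches a spelling mismatch between the two constants, which is the kind of discrepancy the "Land-based Wind" fix addresses.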

Screenshots

How to review

git fetch origin
git checkout 32-continue-connecting-nyiso-and-nyserda

In the root directory, run this command to download all the necessary python dependencies:

pip install -r requirements.txt

Next steps

  • Parse HTML of NYISO to get the most up-to-date xlsx download link
  • Function for updating existing projects based on last updated date/differences in key/values
  • Parse HTML of NYSERDA small-scale solar project database to find how many repeated fetches (batches) we need to get all the rows?
  • Find way to reverse geocode NYISO project locations???

Relevant links

Online sources

Related PRs

Also included fixes from this issue:

CC: @itsliterallymonique

…le geocoding api to retrieve latitude and longitude
…-feat-connecting-nyiso-and-nyserda-to-database
…ditional filtered fields, created database.py
16-feat-connecting-nyiso-and-nyserda-to-database
…s, standardize renewable technology labels, map abbreviated energy tech from NYISO to readable string
@deenasun deenasun marked this pull request as ready for review October 26, 2024 23:15
Collaborator

@itsliterallymonique itsliterallymonique left a comment

Looking good Deena. Just make the 2 changes I commented on! thanks

renewable_energy_set = {'Hydroelectric', 'Land Based Wind', 'Offshore Wind', 'Solar', 'Geothermal', 'Energy Storage', 'Pumped Storage'}

renewable_energy_map = {
'H': 'Hydroelectric',
Collaborator

Add 'W': 'Land-Based Wind'

"renewable_technology", None
),
"developer": item.get("developer_name", None),
"proposed_cod": item.get("year_of_delivery_start_date", None),
Collaborator

don't get proposed COD from year_of_delivery_start_date. Only NYISO has the proposed COD

Contributor Author

@itsliterallymonique would you like me to remove the inclusion of the proposed_cod date from the NYSERDA small-scale solar projects as well? currently, we're using the "interconnection_date" field for the small-scale solar projects as an estimate for the proposed COD

Collaborator

@deenasun hmmm. Let's keep that for now in the small-scale solar database. I will check with our POC if we can use interconnection_date for proposed_cod. I am not too sure if they are the same.

Contributor Author

@deenasun deenasun Oct 28, 2024

Done! (Pushed the changes to the branch 26-feat-parse-nyiso-link-from-html.)

@deenasun deenasun merged commit 6c021be into main Oct 28, 2024
2 checks passed
@deenasun deenasun deleted the 32-continue-connecting-nyiso-and-nyserda branch November 8, 2024 04:41
Successfully merging this pull request may close these issues.

[feat] Continue Connecting NYISO and NYSERDA, and add Key Development Milestones
2 participants