Renewable Energy Analysis - Representing and Manipulating Data - University of Stirling Course Project
- Install the dependencies from the project root folder:
pip install -r requirements.txt
- Add the project folder to the PYTHONPATH.
- To check that you added it correctly, run pytest from the project folder.
Useful Links:
To access the datasets, import the renewable_energy_analysis package and use its path constants like so:
import file_names as fn
# access original files
# always open the original files in read mode ('r')
gdp_file = open(fn.OriginalPaths.GDP,'r')
or
import renewable_energy_analysis as fn
# access cleaned files
# open in read ('r') or write ('w') mode, depending on the use
gdp_file = open(fn.CleanedPaths.GDP, 'r')
- Remove the unused columns
- Set the columns as: Nation, Year1, Year2, ..., YearX
- If the dataset only has the Nation code, join it with another dataset that also has the Nation name
- Keep only the common Nations by joining with the nation list (datasets/cleaned/common_countries).
- Keep only the common Years by joining with the year list (datasets/cleaned/common_years).
- Set the row index on Nation, and set a multilevel index on Year1, Year2, ..., YearX named after the DataFrame
- Save the dataset in the cleaned folder
Also:
- Remember to keep only the columns listed in the second step (Nation, Year1, Year2, ..., YearX), because joining adds columns to the DataFrame.
- The script must be named datasetname_cleaning.py (e.g. Latitude_cleaning.py) and placed in the renewable_energy_analysis/cleaning folder.
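
As a rough illustration, the cleaning steps above could be sketched with pandas as follows. This is only a minimal sketch under several assumptions, not the project's actual code: the function name `clean_dataset`, the column names `Nation` and `Year`, and the sample data are all hypothetical. The multilevel index is read here as a column index whose top level is the dataset name. The Nation code join is skipped and nothing is saved to disk.

```python
import pandas as pd


def clean_dataset(raw, countries, years, name):
    """Hypothetical helper that follows the cleaning steps on an in-memory DataFrame."""
    # Drop unused columns: keep Nation plus the columns that look like years.
    year_cols = [c for c in raw.columns if c != "Nation" and str(c).isdigit()]
    df = raw[["Nation"] + year_cols]

    # Keep only the common nations (inner join with the nation list).
    df = df.merge(countries[["Nation"]], on="Nation", how="inner")

    # Keep only the common years. In this wide layout that is a column
    # selection rather than a join (an assumption about the data shape).
    common_years = set(years["Year"].astype(str))
    kept = [c for c in year_cols if c in common_years]
    df = df[["Nation"] + kept]

    # Row index on Nation, two-level column index: (dataset name, year).
    df = df.set_index("Nation")
    df.columns = pd.MultiIndex.from_product([[name], kept])
    return df


# Made-up sample data, for illustration only.
raw = pd.DataFrame({
    "Nation": ["Italy", "Spain", "Atlantis"],
    "Code": ["ITA", "ESP", "ATL"],
    "2000": [1.0, 2.0, 3.0],
    "2001": [1.5, 2.5, 3.5],
    "1990": [0.5, 0.6, 0.7],
})
countries = pd.DataFrame({"Nation": ["Italy", "Spain"]})
years = pd.DataFrame({"Year": [2000, 2001]})

cleaned = clean_dataset(raw, countries, years, "GDP")
print(cleaned)
```

In a real cleaning script the result would then be written to the cleaned folder, for example with `cleaned.to_csv(...)` and the matching cleaned path.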