Problem type: Time Series Regression
Includes my solution to the Gdz Elektrik Datathon 2023. I entered the competition solo and ranked 61st (top 27%) out of 342 competitors and 234 teams.
- Exploratory analysis and visualization of the time series
- Testing, checking and visualizing the components of time series
- Checking the correlation of lag values with ACF-PACF and also lag plots
- Feature engineering: extracting calendar features, detecting and extracting the most correlated lag features (based on a threshold), adding and cleaning some external data
- Feature selection with Sequential Feature Selection, RFECV, LOFO (not included in this repo)
- Model selection: CatBoostRegressor, LGBMRegressor, XGBRegressor. Continued with a CatBoost-LGBM ensemble
- Hyperparameter tuning with Optuna
- Modelling and making predictions using both direct and recursive methods (I expected the recursive method to improve my scores, but it didn't help)
- I also added the function I wrote to predict recursively with a chosen step size, instead of directly predicting the whole set. This is also known as the walk-forward method. More detail can be found in the relevant notebook.
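
The recursive (walk-forward) idea from the last item can be sketched roughly as follows. This is not the function from the notebook, only a minimal illustration under some assumptions: the names `make_lag_features`, `LeastSquaresModel` and `recursive_forecast` are hypothetical, a plain least-squares model stands in for the CatBoost-LGBM ensemble, only lag features are used, and every lag is assumed to be at least as large as the step size so that each chunk can be built from values that are already known or already predicted.

```python
import numpy as np
import pandas as pd


def make_lag_features(series, lags):
    """Build one column per lag of the target (hypothetical helper)."""
    return pd.DataFrame({f"lag_{k}": series.shift(k) for k in lags})


class LeastSquaresModel:
    """Stand-in regressor; any model with fit/predict could be used instead."""

    def fit(self, X, y):
        A = np.column_stack([np.ones(len(X)), np.asarray(X, dtype=float)])
        self.coef_, *_ = np.linalg.lstsq(A, np.asarray(y, dtype=float), rcond=None)
        return self

    def predict(self, X):
        A = np.column_stack([np.ones(len(X)), np.asarray(X, dtype=float)])
        return A @ self.coef_


def recursive_forecast(model, history, horizon, lags, step_size):
    """Forecast `horizon` steps in chunks of `step_size`.

    After each chunk the predictions are appended to the history, so the
    lag features of the next chunk are built partly from earlier predictions.
    """
    if min(lags) < step_size:
        raise ValueError("every lag must be >= step_size")
    values = np.asarray(history, dtype=float).tolist()
    if len(values) < max(lags):
        raise ValueError("history must be at least as long as the largest lag")
    n_hist = len(values)
    while len(values) < n_hist + horizon:
        start = len(values)
        n_steps = min(step_size, n_hist + horizon - start)
        # The lag-k feature of position t is the (known or predicted) value at t - k.
        X = np.array([[values[t - k] for k in lags]
                      for t in range(start, start + n_steps)])
        values.extend(model.predict(X).tolist())
    return np.array(values[n_hist:])


# Toy hourly-like series with a 24-step and a 48-step cycle (synthetic data).
t = np.arange(240)
series = pd.Series(np.sin(2 * np.pi * t / 24) + np.sin(2 * np.pi * t / 48))
train, future = series[:168], series[168:]

lags = [24, 48]
X_train = make_lag_features(train, lags).dropna()
model = LeastSquaresModel().fit(X_train, train.loc[X_train.index])

# Predict the 72 held-out steps, 24 steps at a time.
forecast = recursive_forecast(model, train, horizon=len(future),
                              lags=lags, step_size=24)
```

With `step_size` equal to the full horizon this reduces to predicting everything in one pass, which is only possible when all lags are at least that large. Smaller step sizes allow shorter lags at the cost of feeding predictions back in as features.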
- Weather data for both cities: İzmir, Manisa
- Definitions of station pressure and sea level pressure that I used to choose the right one for my problem: 1, 2
- Turkey COVID timeline
- Introduction to Time Series Forecasting With Python by Jason Brownlee
- Kishan Manani's presentation about time series and recursive forecasting
- Srivatsan Srinivasan's Time Series Modelling and Analysis playlist and repo