
An extensive analysis of past team performances, individual player performances, and the prediction of the winner for each T20 match.


ramlanjekar/Amex-campus-challenge-2024


Amex Campus Challenge 2024: Predictive Modeling for Data Analysis

Project Overview

The Amex Campus Challenge 2024 involved building a predictive model on a dataset of 1 lakh (100,000) data points derived from 3 different sources. The primary goal was to predict outcomes accurately using feature engineering, data optimization, and machine learning. My solution combined engineered features, log-transformed inputs, neural network embeddings, and an ensemble of models.

Key Accomplishments

1. Feature Engineering:

  • Extracted and engineered 185 custom features, enhancing the dataset's utility for predictive modeling.
  • Employed KL Divergence to analyze the relationships and dependencies between features, identifying crucial patterns within the data.
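The write-up does not say exactly how KL divergence was applied. As a minimal sketch, assuming the goal is to compare one feature's distribution across two groups (for example, the two outcome classes), the divergence can be computed from smoothed histograms. The helper names `histogram` and `kl_divergence` and the sample values are hypothetical.

```python
import math


def histogram(values, bins, lo, hi):
    """Bin values into equal-width buckets and return smoothed probabilities."""
    width = (hi - lo) / bins
    counts = [0] * bins
    for v in values:
        # Clamp so values at or beyond the edges fall into the end buckets.
        idx = max(0, min(int((v - lo) / width), bins - 1))
        counts[idx] += 1
    # Add-one (Laplace) smoothing keeps every probability strictly positive,
    # so the log ratio in the KL divergence is always defined.
    total = len(values) + bins
    return [(c + 1) / total for c in counts]


def kl_divergence(p, q):
    """KL(P || Q) in nats for two discrete distributions on the same bins."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


# Hypothetical feature values split by outcome class (illustrative only).
feature_class_0 = [0.1, 0.2, 0.2, 0.3, 0.4, 0.4, 0.5]
feature_class_1 = [0.5, 0.6, 0.7, 0.7, 0.8, 0.9, 0.9]

p = histogram(feature_class_0, bins=5, lo=0.0, hi=1.0)
q = histogram(feature_class_1, bins=5, lo=0.0, hi=1.0)
print(kl_divergence(p, q))
```

A larger value suggests the feature is distributed differently in the two groups, which is one possible signal that it is worth keeping. KL divergence is asymmetric, so `kl_divergence(p, q)` and `kl_divergence(q, p)` generally differ.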

2. Data Optimization:

  • Applied log transformations to key features to reduce skew and bring their distributions closer to normal, leading to better model training and performance.
  • Optimized dataset preparation techniques for improved memory usage and faster computation.
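The effect of a log transformation on a skewed feature can be illustrated with a short sketch. It assumes a `log(1 + x)` transform on non-negative values, which may differ from the transform actually used. The helper names and sample values are hypothetical.

```python
import math


def log_transform(values):
    """Apply log(1 + x) to each value; defined for non-negative inputs."""
    if any(v < 0 for v in values):
        raise ValueError("log1p transform here assumes non-negative values")
    return [math.log1p(v) for v in values]


def skewness(values):
    """Population skewness: third central moment over variance ** 1.5."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


# Hypothetical right-skewed feature (illustrative only).
raw = [1, 2, 3, 5, 8, 13, 100, 1000]
logged = log_transform(raw)

print(skewness(raw), skewness(logged))
```

Comparing skewness before and after the transform is a quick check on whether it is worth applying to a given feature.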

3. Advanced Representation:

  • Implemented neural network embeddings to capture complex patterns in the dataset, enabling more accurate model predictions.

4. Ensemble Learning:

  • Utilized an ensemble blending classifier to combine multiple machine learning models, resulting in improved performance with an accuracy of 58.75%.
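The base models and the blending method are not described here. As a simplified stand-in, the sketch below blends models by taking a weighted average of their predicted probabilities, one common form of blending. It is not the classifier used in this project; the function names, weights, and probabilities are all hypothetical.

```python
def blend_probabilities(model_probs, weights=None):
    """Weighted average of per-model P(class = 1), one value per sample.

    model_probs: list of lists, one inner list of probabilities per model.
    weights: optional per-model weights; defaults to equal weighting.
    """
    n_models = len(model_probs)
    if weights is None:
        weights = [1.0] * n_models
    total = sum(weights)
    n_samples = len(model_probs[0])
    return [
        sum(w * probs[i] for w, probs in zip(weights, model_probs)) / total
        for i in range(n_samples)
    ]


def accuracy(probs, labels, threshold=0.5):
    """Fraction of samples whose thresholded prediction matches the label."""
    preds = [1 if p >= threshold else 0 for p in probs]
    return sum(int(p == y) for p, y in zip(preds, labels)) / len(labels)


# Hypothetical holdout predictions from three models (illustrative only).
holdout_probs = [
    [0.2, 0.7, 0.6, 0.4],
    [0.3, 0.6, 0.4, 0.2],
    [0.1, 0.9, 0.7, 0.6],
]
holdout_labels = [0, 1, 1, 0]

blended = blend_probabilities(holdout_probs, weights=[1.0, 1.0, 2.0])
print(accuracy(blended, holdout_labels))
```

In practice the weights would be tuned on a holdout set that none of the base models saw during training, or replaced by a meta-model trained on the holdout predictions.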

Potential Improvements

XGBoost with Learning Rate Decay:

  • The model's performance might have improved by training XGBoost with an exponentially decaying learning rate, which appeared well suited to the dataset's characteristics.
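An exponential decay schedule can be sketched as a function of the boosting round. This was not part of the submitted solution, and the parameter names and values below are hypothetical.

```python
def exponential_decay(initial_lr, decay_rate, decay_steps):
    """Return a schedule mapping a boosting round to a learning rate.

    The rate is multiplied by `decay_rate` once every `decay_steps` rounds:
    lr(round) = initial_lr * decay_rate ** (round / decay_steps)
    """
    def schedule(boosting_round):
        return initial_lr * decay_rate ** (boosting_round / decay_steps)

    return schedule


# Hypothetical settings: start at 0.1 and halve every 100 rounds.
schedule = exponential_decay(initial_lr=0.1, decay_rate=0.5, decay_steps=100)

for boosting_round in (0, 100, 200, 300):
    print(boosting_round, schedule(boosting_round))
```

XGBoost's `xgboost.callback.LearningRateScheduler` accepts a callable of this form (round index in, learning rate out). Check the documentation for the installed version before relying on it.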

Results

  • Achieved 58.75% accuracy using the ensemble classifier technique.
  • Advanced to Round 2 of the competition, demonstrating the effectiveness of the applied techniques.

Technologies Used

  • Python: For data preprocessing, feature engineering, and model building.
  • Pandas: For data manipulation and feature extraction.
  • Scikit-learn: For implementing machine learning models and ensemble techniques.
  • Keras/TensorFlow: For implementing neural network embeddings.
  • Matplotlib/Seaborn: For data visualization and analysis.
