Our top 2 solution to RuCode 2022 AI Payroll task
Public leaderboard | Private Leaderboard |
---|---|
Our team:
- Our task was to predict the salary for given vacanvies from trudvsem.ru
- The data was given as a .csv file with approx. 1kk samples in train.csv (71 columns) and 40k samples in test.csv (68 columns) (public - 40%, private - 60%)
- The training data included categorical, binary, continous and text features
- The target variable was
mean_salary
and additional targets werebase_salary_max
andbase_salary_mean
- Many of the values were missed, e.g. in one of the features 95% were nan values
- Fill in the misssing values
- Generate new features
- Select features
- Target Transformation
- Cross validate model
- Stabilize the model