In this third practical application assignment, your goal is to compare the performance of the classifiers (k-nearest neighbors, logistic regression, decision trees, and support vector machines) you encountered in this section of the program. You will use a dataset related to the marketing of bank products over the telephone.
The dataset you will use comes from the UCI Machine Learning repository. The data is from a Portuguese banking institution and is a collection of the results of multiple marketing campaigns. You can make use of the article accompanying the dataset (in the .zip file) for more information on the data and features.
After understanding, preparing, and modeling your data, build a Jupyter Notebook that includes a clear statement demonstrating your understanding of the business problem, a correct and concise interpretation of descriptive and inferential statistics, your findings (including actionable insights), and next steps and recommendations.
- Histogram Plots of numerical attributes
- Age Count Distribution by number of clients subscribed
- Age Distribution and Occurrence plots
- Job Distribution by number of clients subscribed
- Marital Distribution by number of clients subscribed
- Education Count Distribution by number of clients subscribed
- Accuracy of models before optimizing parameters using Grid Search CV
- Accuracy of models after optimizing parameters using Grid Search CV
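
The last two items, accuracy before and after Grid Search CV, can be sketched roughly as follows. This is a minimal illustration rather than the notebook's actual code: it assumes scikit-learn is installed, uses a synthetic dataset from `make_classification` as a stand-in for the bank marketing data, and the parameter grids, split sizes, and the `models` and `results` names are hypothetical choices made only for the example.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Assumption: synthetic, imbalanced binary data standing in for the real dataset.
X, y = make_classification(
    n_samples=600, n_features=10, weights=[0.85, 0.15], random_state=42
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=42
)

# Each entry pairs a classifier with a small, illustrative parameter grid.
# Grid keys use the step names that make_pipeline derives from the class names.
models = {
    "knn": (KNeighborsClassifier(),
            {"kneighborsclassifier__n_neighbors": [3, 5, 11]}),
    "logreg": (LogisticRegression(max_iter=1000),
               {"logisticregression__C": [0.1, 1.0, 10.0]}),
    "tree": (DecisionTreeClassifier(random_state=42),
             {"decisiontreeclassifier__max_depth": [3, 5, None]}),
    "svm": (SVC(),
            {"svc__C": [0.1, 1.0, 10.0]}),
}

results = {}
for name, (estimator, grid) in models.items():
    # Scale features inside the pipeline so cross-validation folds do not leak.
    pipe = make_pipeline(StandardScaler(), estimator)

    # Baseline: default hyperparameters, scored on the held-out test set.
    baseline = pipe.fit(X_train, y_train).score(X_test, y_test)

    # Tuned: 5-fold grid search on the training set, then the same test set.
    search = GridSearchCV(pipe, grid, cv=5, scoring="accuracy")
    search.fit(X_train, y_train)
    tuned = search.score(X_test, y_test)

    results[name] = {
        "baseline": baseline,
        "tuned": tuned,
        "best_params": search.best_params_,
    }

for name, row in results.items():
    print(f"{name:7s} baseline={row['baseline']:.3f} tuned={row['tuned']:.3f} "
          f"{row['best_params']}")
```

Tuning on the training folds and scoring both variants on the same held-out split keeps the before and after numbers comparable. If the subscribed class turns out to be a small minority, accuracy alone can look flattering, so it may be worth reporting a second metric such as recall or F1 alongside it.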