Customer Segmentation and Acquisition - Bertelsmann Arvato

Machine Learning Engineer Udacity Nanodegree - Capstone Project

This repository contains code and report for "Capstone Project - Arvato Customer Segmentation" done as part of Udacity Machine Learning Engineer Nanodegree program.

Project Overview

In this project, the demographic data of German population and the customer data have been analysed in order to perform Customer Segmentation and Customer Acquisition. Arvato Financial Solutions is a services company that provides financial services, Information Technology (IT) services and Supply Chain Management (SCM) solutions for business customers on a global scale.

This project is to help a Mail-Order company to acquire new customers to sell its organic products. The goal of this project is to understand the customer demographics as compared to general population in order to decide whether to approach a person for future products.

This project is divided into two steps:

Customer Segmentation using Unsupervised Learning, in this part a thorough data analysis and feature engineering steps are performed to prepare the data for further steps. A Principal Component Analysis (PCA) is performed for dimensionality reduction. Then K-Means Clustering is performed on the PCA components to cluster the general population and the customer population into different segments. These clusters are studied to determine what features make a customer with the help of cluster weights and component weights.
Customer Acquisition using Supervised Learning, in this part of the project the customers data with defined targets indicating the past responses of the customers has been used to train Supervised Learning algorithms. Then the trained model is used to make predictions on unseen test data to determine whether a person could be a possible customer.

Data Description

The data has been provided by Udacity and Arvato Financial Solutions. The dataset contains 4 data files and 2 description files. The description files have information about the features and their explanation. The 4 data files include:

Customer Segmentation
- General Population demographics
- Customer demographics
Customer Acquisition
- Training data
- Test data

Technical Overview

The project has been divided into various steps which include:

Data Exploration and Cleaning
Feature Engineering
Dimensionality Reduction
Clustering
Supervised Learning
Model Evaluation
Predictions on Test data
Submission to Kaggle and Scoring

An explanation about each step and choice of algorithms, metrics has been given in the Report.pdf.

Requirements

All of the requirements are given in requirements.txt. To install Run: pip install -r requirements.txt

Results

The results have been clearly documented in the Jupyter Notebook. Please refer Arvato Project Workbook.ipynb.

Name		Name	Last commit message	Last commit date
Latest commit History 29 Commits
imgs		imgs
terms_and_conditions		terms_and_conditions
.gitignore		.gitignore
Arvato Project Workbook.ipynb		Arvato Project Workbook.ipynb
README.md		README.md
Report.pdf		Report.pdf
proposal.pdf		proposal.pdf
requirements.txt		requirements.txt

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Customer Segmentation and Acquisition - Bertelsmann Arvato

Machine Learning Engineer Udacity Nanodegree - Capstone Project

Table of Contents

Project Overview

Data Description

Technical Overview

Requirements

Results

About

Releases

Packages

Languages

wpangxin/Bertelsmann-Arvato-customer-segmentation

Folders and files

Latest commit

History

Repository files navigation

Customer Segmentation and Acquisition - Bertelsmann Arvato

Machine Learning Engineer Udacity Nanodegree - Capstone Project

Table of Contents

Project Overview

Data Description

Technical Overview

Requirements

Results

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages