
rjshanahan/INFS5098_KaggleTitanic


INFS5098_KaggleTitanic

Repository for UniSA INFS5098 Kaggle Titanic Machine Learning Challenge

This WORKING repository contains my contributions to the Kaggle Titanic Machine Learning competition:

https://www.kaggle.com/c/titanic-gettingStarted

This repository uses R code to audit, cleanse and rework the Titanic passenger data and to model passenger survival. It contains three files:

  • '01_Titanic_Audit.rmd': R Markdown file that uses the "knitr" package to explore and describe the Titanic dataset [in progress]
  • '02_Titanic_FeatureEngine.R': R script to audit, cleanse and create new variables [in progress]
  • '03_Titanic_Model.R': R script with various models to predict survivors of the Titanic disaster [in progress]
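
As a rough illustration of what an audit step can look like, the sketch below checks for missing values and tabulates survival by sex. It is not taken from the scripts in this repository: the data frame name `sample_rows` is hypothetical, and its ten rows are decoded from the first values shown in the structure listing further down, so the figures describe that small sample only.

```r
# Hypothetical sample: first ten values of Survived, Sex and Pclass,
# decoded from the str() listing in this README (not the full dataset).
sample_rows <- data.frame(
  Survived = factor(c(0, 1, 1, 1, 0, 0, 0, 0, 1, 1), levels = c(0, 1)),
  Sex      = factor(c("male", "female", "female", "female", "male",
                      "male", "male", "male", "female", "female")),
  Pclass   = factor(c(3, 1, 3, 1, 3, 3, 1, 3, 3, 2), levels = 1:3)
)

# Count missing values per column (a typical first audit check).
missing_counts <- colSums(is.na(sample_rows))

# Cross-tabulate sex against survival.
survival_by_sex <- table(sample_rows$Sex, sample_rows$Survived)

# Share of survivors per sex within this ten-row sample.
survival_rate <- tapply(sample_rows$Survived == "1", sample_rows$Sex, mean)

print(missing_counts)
print(survival_by_sex)
print(survival_rate)
```

With only ten rows the split by sex is far more extreme than in the full data, so the sample is useful for checking the code path rather than for drawing conclusions.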

Submissions using this code form part of the Kaggle team below:

https://www.kaggle.com/t/90000/unisa-masters

Note on the variables created for this exercise:

  • Below is the structure of the data frame after the '02_Titanic_FeatureEngine.R' script has been run.
  • Newly created variables are in lowercase.
  • Refer to the R script for the objective of each variable.
'data.frame':	891 obs. of  27 variables:
 $ PassengerId   : int  1 2 3 4 5 6 7 8 9 10 ...
 $ Survived      : Factor w/ 3 levels "0","1","model": 1 2 2 2 1 1 1 1 2 2 ...
 $ Pclass        : Factor w/ 3 levels "1","2","3": 3 1 3 1 3 3 1 3 3 2 ...
 $ Name          : chr  "Braund, Mr. Owen Harris" "Cumings, Mrs. John Bradley (Florence Briggs Thayer)" "Heikkinen, Miss. Laina" "Futrelle, Mrs. Jacques Heath (Lily May Peel)" ...
 $ Sex           : Factor w/ 2 levels "female","male": 2 1 1 1 2 2 2 2 1 1 ...
 $ Age           : num  22 38 26 35 35 ...
 $ SibSp         : int  1 1 0 1 0 0 0 3 0 1 ...
 $ Parch         : int  0 0 0 0 0 0 0 1 2 0 ...
 $ Ticket        : chr  "A/5 21171" "PC 17599" "STON/O2. 3101282" "113803" ...
 $ Fare          : num  7.25 71.28 7.92 53.1 8.05 ...
 $ Cabin         : Factor w/ 187 levels "A10","A11","A14",..: 187 107 187 71 187 187 164 187 187 187 ...
 $ Embarked      : Factor w/ 3 levels "C","Q","S": 3 1 3 3 3 2 3 3 3 1 ...
 $ title         : Factor w/ 18 levels " Capt"," Col",..: 13 14 10 14 13 13 13 9 14 14 ...
 $ titlegroup    : Factor w/ 6 levels "Master","Miss",..: 4 4 6 4 4 4 4 6 4 4 ...
 $ lastname      : chr  "Braund" "Cumings" "Heikkinen" "Futrelle" ...
 $ nickname      : int  0 0 0 0 0 0 0 0 0 0 ...
 $ altname       : int  0 1 0 1 0 0 0 0 1 1 ...
 $ iceberg       : int  0 0 0 0 0 0 0 0 0 0 ...
 $ deck          : Factor w/ 9 levels "A","B","C","D",..: 9 3 9 3 9 9 5 9 9 9 ...
 $ subclass      : Factor w/ 15 levels "1A","1B","1C",..: 15 3 15 3 15 15 5 15 15 11 ...
 $ side          : Factor w/ 3 levels "port","starboard",..: 3 2 3 2 3 3 1 3 3 3 ...
 $ familysize    : Factor w/ 9 levels "1","2","3","4",..: 2 2 1 2 1 1 1 5 3 2 ...
 $ farepp        : num  3.62 35.64 7.92 26.55 8.05 ...
 $ marriagelength: Factor w/ 3 levels "Long","Short",..: 2 1 3 1 3 3 3 3 3 2 ...
 $ childage      : Factor w/ 3 levels "Old","Unknown",..: 1 1 2 1 2 2 2 3 1 3 ...
 $ faregroup     : Factor w/ 4 levels "<10","10-20",..: 1 4 1 4 1 1 4 3 2 4 ...
 $ classregion   : Factor w/ 9 levels "C1","C2","C3",..: 9 1 9 7 9 6 7 9 9 2 ...
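
A few of the lowercase variables can be reproduced from the original columns. The sketch below is a minimal guess at how `title`, `lastname`, `familysize` and `farepp` might be derived, and is not the code in '02_Titanic_FeatureEngine.R'. The data frame name `passengers` is hypothetical, the four rows are copied from the listing above with `Fare` rounded as displayed, and the formulas (`familysize = SibSp + Parch + 1`, `farepp = Fare / familysize`) are assumptions that happen to agree with the values shown. The listing stores `title` with a leading space and `familysize` as a factor; this sketch trims the space and keeps the count as an integer.

```r
# Hypothetical sample: first four passengers from the str() listing above.
# Fare is entered as displayed there (rounded), not as in the raw file.
passengers <- data.frame(
  Name  = c("Braund, Mr. Owen Harris",
            "Cumings, Mrs. John Bradley (Florence Briggs Thayer)",
            "Heikkinen, Miss. Laina",
            "Futrelle, Mrs. Jacques Heath (Lily May Peel)"),
  SibSp = c(1L, 1L, 0L, 1L),
  Parch = c(0L, 0L, 0L, 0L),
  Fare  = c(7.25, 71.28, 7.92, 53.1),
  stringsAsFactors = FALSE
)

# Surname: everything before the first comma.
passengers$lastname <- sub(",.*$", "", passengers$Name)

# Title: the text between the comma and the first full stop.
passengers$title <- sub("^[^,]*, *([^.]*)\\..*$", "\\1", passengers$Name)

# Assumed definition: the passenger plus siblings/spouses plus parents/children.
passengers$familysize <- passengers$SibSp + passengers$Parch + 1L

# Assumed definition: fare split evenly across the family group.
passengers$farepp <- passengers$Fare / passengers$familysize

print(passengers[, c("lastname", "title", "familysize", "farepp")])
```

Keeping `familysize` numeric until after `farepp` is computed avoids dividing by a factor; it can be converted with `factor()` afterwards if a categorical version is wanted for modelling.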

References:

https://github.com/wehrley/wehrley.github.io/blob/master/SOUPTONUTS.md
http://trevorstephens.com/post/72916401642/titanic-getting-started-with-r
http://www.slideshare.net/michellebanzondarling/final-pink-panthers0331
https://www.kaggle.com/c/titanic/prospector
