Repository for UniSA INFS5098 Kaggle Titanic Machine Learning Challenge


This working repository contains my contributions to the Kaggle Titanic Machine Learning competition.

This repository uses R code to audit, cleanse, rework and model the Titanic passenger data in order to predict survivors. It contains three files, one R Markdown document and two R scripts:

  • '01_Titanic_Audit.rmd': R Markdown file that uses the "knitr" package to explore and describe the Titanic dataset [in progress]
  • '02_Titanic_FeatureEngine.R': R script that audits and cleanses the data and creates new variables [in progress]
  • '03_Titanic_Model.R': R script with various models to predict survivors of the Titanic disaster [in progress]

Submissions using this code form part of the Kaggle team below:

Note on the variables created for this exercise:

  • below is the structure of the data frame after the '02_Titanic_FeatureEngine.R' script has been run.
  • newly created variables are in lowercase
  • refer to the feature engineering R script for the objective of each variable
'data.frame':	891 obs. of  27 variables:
 $ PassengerId   : int  1 2 3 4 5 6 7 8 9 10 ...
 $ Survived      : Factor w/ 3 levels "0","1","model": 1 2 2 2 1 1 1 1 2 2 ...
 $ Pclass        : Factor w/ 3 levels "1","2","3": 3 1 3 1 3 3 1 3 3 2 ...
 $ Name          : chr  "Braund, Mr. Owen Harris" "Cumings, Mrs. John Bradley (Florence Briggs Thayer)" "Heikkinen, Miss. Laina" "Futrelle, Mrs. Jacques Heath (Lily May Peel)" ...
 $ Sex           : Factor w/ 2 levels "female","male": 2 1 1 1 2 2 2 2 1 1 ...
 $ Age           : num  22 38 26 35 35 ...
 $ SibSp         : int  1 1 0 1 0 0 0 3 0 1 ...
 $ Parch         : int  0 0 0 0 0 0 0 1 2 0 ...
 $ Ticket        : chr  "A/5 21171" "PC 17599" "STON/O2. 3101282" "113803" ...
 $ Fare          : num  7.25 71.28 7.92 53.1 8.05 ...
 $ Cabin         : Factor w/ 187 levels "A10","A11","A14",..: 187 107 187 71 187 187 164 187 187 187 ...
 $ Embarked      : Factor w/ 3 levels "C","Q","S": 3 1 3 3 3 2 3 3 3 1 ...
 $ title         : Factor w/ 18 levels " Capt"," Col",..: 13 14 10 14 13 13 13 9 14 14 ...
 $ titlegroup    : Factor w/ 6 levels "Master","Miss",..: 4 4 6 4 4 4 4 6 4 4 ...
 $ lastname      : chr  "Braund" "Cumings" "Heikkinen" "Futrelle" ...
 $ nickname      : int  0 0 0 0 0 0 0 0 0 0 ...
 $ altname       : int  0 1 0 1 0 0 0 0 1 1 ...
 $ iceberg       : int  0 0 0 0 0 0 0 0 0 0 ...
 $ deck          : Factor w/ 9 levels "A","B","C","D",..: 9 3 9 3 9 9 5 9 9 9 ...
 $ subclass      : Factor w/ 15 levels "1A","1B","1C",..: 15 3 15 3 15 15 5 15 15 11 ...
 $ side          : Factor w/ 3 levels "port","starboard",..: 3 2 3 2 3 3 1 3 3 3 ...
 $ familysize    : Factor w/ 9 levels "1","2","3","4",..: 2 2 1 2 1 1 1 5 3 2 ...
 $ farepp        : num  3.62 35.64 7.92 26.55 8.05 ...
 $ marriagelength: Factor w/ 3 levels "Long","Short",..: 2 1 3 1 3 3 3 3 3 2 ...
 $ childage      : Factor w/ 3 levels "Old","Unknown",..: 1 1 2 1 2 2 2 3 1 3 ...
 $ faregroup     : Factor w/ 4 levels "<10","10-20",..: 1 4 1 4 1 1 4 3 2 4 ...
 $ classregion   : Factor w/ 9 levels "C1","C2","C3",..: 9 1 9 7 9 6 7 9 9 2 ...
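
As a rough illustration of how a few of the lowercase variables in the listing above could be derived, here is a minimal sketch. It is not the code from '02_Titanic_FeatureEngine.R'. The rules are inferred from the sample values in the listing: 'familysize' looks like SibSp + Parch + 1, 'farepp' looks like Fare divided by 'familysize', and 'altname' looks like a flag for a name given in parentheses. The 'Cabin' strings are taken from the public Kaggle training file, since the listing shows only factor codes. The "U" label for an unknown deck and the trimming of whitespace from 'title' are assumptions (the listing shows title levels with a leading space).

```r
# Four sample passengers; Name, SibSp, Parch and Fare are copied from the
# listing. Cabin strings come from the public Kaggle file (assumption: NA
# marks a missing cabin here).
titanic <- data.frame(
  Name  = c("Braund, Mr. Owen Harris",
            "Cumings, Mrs. John Bradley (Florence Briggs Thayer)",
            "Heikkinen, Miss. Laina",
            "Futrelle, Mrs. Jacques Heath (Lily May Peel)"),
  SibSp = c(1L, 1L, 0L, 1L),
  Parch = c(0L, 0L, 0L, 0L),
  Fare  = c(7.25, 71.28, 7.92, 53.1),
  Cabin = c(NA, "C85", NA, "C123"),
  stringsAsFactors = FALSE
)

# Surname: everything before the first comma.
titanic$lastname <- sub(",.*$", "", titanic$Name)

# Title: the text between the comma and the next period, trimmed.
titanic$title <- trimws(sub("^[^,]*,([^.]*)\\..*$", "\\1", titanic$Name))

# Flag names that carry an alternative name in parentheses.
titanic$altname <- as.integer(grepl("(", titanic$Name, fixed = TRUE))

# Family size: siblings/spouses plus parents/children plus the passenger.
titanic$familysize <- titanic$SibSp + titanic$Parch + 1L

# Fare per person: ticket fare shared across the family.
titanic$farepp <- titanic$Fare / titanic$familysize

# Deck: first letter of the cabin, with a hypothetical "U" for unknown.
titanic$deck <- ifelse(is.na(titanic$Cabin) | titanic$Cabin == "",
                       "U", substr(titanic$Cabin, 1, 1))

# Newly created variables can be picked out by their lowercase names.
new_vars <- names(titanic)[names(titanic) == tolower(names(titanic))]
print(titanic[, new_vars])
```

In the listing, 'familysize', 'title' and 'deck' are stored as factors; wrapping the results in factor() would match that, and is left out here to keep the sketch short.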



