housing_data

Step 1 - Accessing the Data Set:

The script starts by loading the housing dataset, casting variables to appropriate types, and taking an initial look at the data.

Step 2 - Exploratory Data Analysis and Data Visualization:

This step computes summary statistics and correlations, and draws histograms and box plots to gain insight into the dataset.
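
As a rough illustration of this kind of exploration, the sketch below uses base R only. The data frame and its column names (`median_income`, `house_age`, `house_value`) are hypothetical stand-ins, since the actual columns are not listed in this README.

```r
# Hypothetical toy data; the real dataset and its column names may differ.
set.seed(42)
housing <- data.frame(
  median_income = runif(100, 1, 10),
  house_age     = sample(1:50, 100, replace = TRUE)
)
housing$house_value <- 50000 + 30000 * housing$median_income +
  rnorm(100, sd = 20000)

# Per-column summary statistics (min, quartiles, mean, max).
summary(housing)

# Pairwise Pearson correlations between the numeric columns.
cor_matrix <- cor(housing)
round(cor_matrix, 2)

# Distribution of one variable, and a box plot to spot outliers.
hist(housing$house_value, main = "House value", xlab = "Value")
boxplot(housing$median_income, main = "Median income")
```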

Step 3 - Data Transformation:

Here, missing values are imputed, and new variables are created to enhance the dataset. Feature scaling is also performed.
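
A minimal sketch of these three operations in base R follows. It assumes median imputation and z-score standardization, which are common choices but not necessarily the ones the script uses. The columns and the derived `rooms_per_household` variable are hypothetical.

```r
# Hypothetical toy data with a few missing values.
housing <- data.frame(
  total_rooms    = c(800, 1200, NA, 1500, 950),
  households     = c(200, 300, 250, 500, 190),
  total_bedrooms = c(150, NA, 210, 320, 180)
)

# Impute each missing value with its column median (an assumed strategy).
for (col in names(housing)) {
  missing <- is.na(housing[[col]])
  housing[[col]][missing] <- median(housing[[col]], na.rm = TRUE)
}

# Create a new variable from existing ones (illustrative name).
housing$rooms_per_household <- housing$total_rooms / housing$households

# Standardize every column to mean 0 and standard deviation 1.
housing_scaled <- as.data.frame(scale(housing))
```

Tree-based models such as random forests are generally insensitive to monotonic feature scaling, so this step matters mostly if other model types are compared on the same data.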

Step 4 - Creating Training and Test Sets:

The data is split into training and test sets for model development and evaluation.
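
One common way to do this in base R is a random row split. The 80/20 ratio, the seed, and the toy data frame below are assumptions for illustration; the README does not state the proportions the script uses.

```r
set.seed(123)  # make the random split reproducible

# Hypothetical data standing in for the prepared housing data frame.
housing <- data.frame(x = rnorm(100), y = rnorm(100))

# Sample 80% of the row indices for training; the rest form the test set.
train_idx <- sample(seq_len(nrow(housing)), size = floor(0.8 * nrow(housing)))
train <- housing[train_idx, ]
test  <- housing[-train_idx, ]
```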

Step 5 - Supervised Machine Learning - Regression:

Using the randomForest package, a random forest regression model is trained on the training data.
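
The sketch below shows what such a call can look like. The training data, the target name `house_value`, and the `ntree` value are hypothetical; only the use of `randomForest()` for regression comes from this README.

```r
library(randomForest)

set.seed(1)

# Hypothetical training data; the real predictors and target may differ.
train <- data.frame(
  median_income = runif(200, 1, 10),
  house_age     = runif(200, 1, 50)
)
train$house_value <- 50000 + 30000 * train$median_income -
  500 * train$house_age + rnorm(200, sd = 10000)

# A numeric response makes randomForest() fit a regression forest.
# importance = TRUE stores permutation-based importance measures.
rf_model <- randomForest(
  house_value ~ .,
  data       = train,
  ntree      = 200,
  importance = TRUE
)

print(rf_model)
```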

Step 6 - Evaluating Model Performance:

This step involves evaluating the model's performance using metrics like Root Mean Squared Error (RMSE) on both the training and test sets. Variable importance is also visualized to understand the key factors influencing the model's predictions.
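
RMSE is the square root of the mean squared difference between observed and predicted values. A small helper along these lines can be applied to both sets; the function name `rmse` and the example numbers are illustrative, not taken from the script.

```r
# Root Mean Squared Error between observed and predicted values.
rmse <- function(actual, predicted) {
  sqrt(mean((actual - predicted)^2))
}

# Made-up values, only to show the call.
actual    <- c(200000, 150000, 320000)
predicted <- c(210000, 140000, 300000)
rmse(actual, predicted)
```

A training RMSE far below the test RMSE is a typical sign of overfitting. For variable importance, the randomForest package provides `importance()` to extract the measures from a fitted model and `varImpPlot()` to plot them.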

Feel free to explore the code and analysis in this repository to learn more about data analysis and machine learning in R. If you have questions or suggestions, please reach out or contribute to the project.