The script starts by loading the housing dataset, casting variables to appropriate types, and taking an initial look at the data.
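As a rough illustration of the loading and casting step, the sketch below writes a tiny stand-in CSV to a temporary file and reads it back. The file contents and the column names (`median_income`, `total_bedrooms`, `ocean_proximity`, `median_house_value`) are assumptions made for this example, since the actual file and schema are not listed here.

```r
# Hypothetical stand-in for the housing file; the real path and columns may differ.
csv_path <- tempfile(fileext = ".csv")
writeLines(c(
  "median_income,total_bedrooms,ocean_proximity,median_house_value",
  "8.3,129,NEAR BAY,452600",
  "7.2,,NEAR BAY,358500",
  "3.1,280,INLAND,120000",
  "2.5,310,INLAND,98000"
), csv_path)

# Load the data; blank numeric fields are read as NA.
housing <- read.csv(csv_path, stringsAsFactors = FALSE)

# Cast variables: categorical text to factor, target to double.
housing$ocean_proximity <- as.factor(housing$ocean_proximity)
housing$median_house_value <- as.numeric(housing$median_house_value)

# Initial look at structure and the first rows.
str(housing)
head(housing)
```

Casting the categorical column to a factor up front matters later, because `randomForest` treats factors and character columns differently.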
The exploratory analysis covers summary statistics, correlation analysis, histograms, and box plots to gain insight into the dataset.
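A minimal sketch of that kind of exploration, using base R only, might look like the following. The data frame is synthetic and its column names are hypothetical; it only stands in for the real housing data.

```r
# Synthetic stand-in data (hypothetical columns, not the real dataset).
set.seed(42)
housing <- data.frame(
  median_income = runif(100, 1, 10),
  housing_median_age = sample(1:52, 100, replace = TRUE)
)
housing$median_house_value <-
  40000 * housing$median_income + rnorm(100, sd = 20000)

# Summary statistics for every column.
summary(housing)

# Pairwise Pearson correlations between the numeric columns.
cor_matrix <- cor(housing, use = "complete.obs")
print(round(cor_matrix, 2))

# Distribution of the target and spread of one predictor.
hist(housing$median_house_value,
     main = "Median house value", xlab = "Value")
boxplot(housing$median_income, main = "Median income")
```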
During preprocessing, missing values are imputed, new variables are created to enhance the dataset, and the features are scaled.
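The three preprocessing steps can be sketched as below. Median imputation, the derived `rooms_per_household` ratio, and standardization with `scale()` are assumptions chosen for illustration; the script may use a different imputation strategy, different derived variables, or a different scaling method.

```r
# Small hypothetical data frame with missing values.
housing <- data.frame(
  total_rooms    = c(880, 7099, 1467, 1274, 1627),
  total_bedrooms = c(129, NA, 190, 235, NA),
  households     = c(126, 1138, 177, 219, 259)
)

# Impute missing values with the column median (assumed strategy).
bedroom_median <- median(housing$total_bedrooms, na.rm = TRUE)
housing$total_bedrooms[is.na(housing$total_bedrooms)] <- bedroom_median

# Create a new variable from existing columns (hypothetical example).
housing$rooms_per_household <- housing$total_rooms / housing$households

# Standardize numeric features to mean 0 and standard deviation 1.
num_cols <- c("total_rooms", "total_bedrooms",
              "households", "rooms_per_household")
housing[num_cols] <- as.data.frame(scale(housing[num_cols]))

print(housing)
```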
The data is split into training and test sets for model development and evaluation.
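One common way to do this in base R is a random index split. The 80/20 ratio, the seed, and the synthetic data below are assumptions for the example rather than values taken from the script.

```r
# Synthetic stand-in data (hypothetical columns).
set.seed(123)
housing <- data.frame(median_income = runif(50, 1, 10))
housing$median_house_value <-
  40000 * housing$median_income + rnorm(50, sd = 20000)

# Draw 80% of the row indices at random for training (assumed ratio).
train_idx <- sample(seq_len(nrow(housing)),
                    size = floor(0.8 * nrow(housing)))

train <- housing[train_idx, ]
test  <- housing[-train_idx, ]   # the remaining rows

cat("Training rows:", nrow(train), " Test rows:", nrow(test), "\n")
```

Fixing the seed before sampling keeps the split reproducible between runs.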
Using the randomForest package, a random forest regression model is trained on the training data.
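A minimal training call could look like this sketch. The formula, the synthetic training data, and the choice of `ntree = 200` are illustrative assumptions; `importance = TRUE` is set so that permutation importance is available afterwards.

```r
library(randomForest)

# Synthetic training data (hypothetical columns, not the real dataset).
set.seed(123)
train <- data.frame(
  median_income      = runif(200, 1, 10),
  housing_median_age = runif(200, 1, 52)
)
train$median_house_value <- 40000 * train$median_income +
  500 * train$housing_median_age + rnorm(200, sd = 20000)

# A numeric response makes randomForest fit a regression forest.
rf_model <- randomForest(
  median_house_value ~ .,
  data = train,
  ntree = 200,        # number of trees (assumed value)
  importance = TRUE   # record permutation importance
)

print(rf_model)
```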
The model is evaluated with metrics such as Root Mean Squared Error (RMSE) on both the training and test sets. Variable importance is also plotted to show which features most influence the model's predictions.
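The evaluation can be sketched end to end as follows. The `rmse` helper, the synthetic data, and the 80/20 split are assumptions for this example. Note that `predict()` is called with `newdata` on the training set so that the training RMSE reflects in-sample predictions; without `newdata`, `predict()` on a `randomForest` object returns out-of-bag predictions instead.

```r
library(randomForest)

# Synthetic stand-in data and split (hypothetical columns and ratio).
set.seed(123)
n <- 300
housing <- data.frame(
  median_income      = runif(n, 1, 10),
  housing_median_age = runif(n, 1, 52)
)
housing$median_house_value <- 40000 * housing$median_income +
  500 * housing$housing_median_age + rnorm(n, sd = 20000)

train_idx <- sample(seq_len(n), size = floor(0.8 * n))
train <- housing[train_idx, ]
test  <- housing[-train_idx, ]

rf_model <- randomForest(median_house_value ~ ., data = train,
                         ntree = 200, importance = TRUE)

# Hypothetical helper: root of the mean squared prediction error.
rmse <- function(actual, predicted) sqrt(mean((actual - predicted)^2))

train_rmse <- rmse(train$median_house_value,
                   predict(rf_model, newdata = train))
test_rmse  <- rmse(test$median_house_value,
                   predict(rf_model, newdata = test))
cat("Train RMSE:", train_rmse, " Test RMSE:", test_rmse, "\n")

# Variable importance as a table and as a plot.
imp <- importance(rf_model)
print(imp)
varImpPlot(rf_model)
```

A test RMSE well above the training RMSE is expected for random forests and is a sign of how much the model overfits the training rows.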
Feel free to explore the code and analysis in this repository to learn more about data analysis and machine learning in R. If you have any questions or suggestions, please don't hesitate to reach out or contribute to this project!