
Property Rental Price Prediction

Determining property rental prices with machine learning: a case study of properties in San Francisco, CA

Overview

  • Build a tool that predicts rental prices for properties based on their features.
  • Use machine learning regression models to predict property rental rates.
  • Train on a data set of San Francisco, CA property rental prices.

Code and Resources Used

Python Version: 3.10.3
Packages: pandas, numpy, sklearn, math
Dataset source: DataCamp case study

Processing Data

After collecting the dataset, I checked and cleaned it to make sure it contained no missing or anomalous values before building a machine learning model. I made the following changes and created the following variables:

  • Import the dataset into a pandas DataFrame.
  • Fix the data type of every feature.
  • Check for and handle missing values.
  • Clean the object columns and label them.
  • Analyze and resolve anomalous values.
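The cleaning steps above might look roughly like the sketch below. The column names (`price`, `bedrooms`, `property_type`), the tiny inline sample, the median fill, and the IQR rule for anomalies are all illustrative assumptions, not details taken from the original notebook.

```python
import io

import pandas as pd

# Hypothetical sample standing in for the real dataset file.
RAW_CSV = """price,bedrooms,property_type
$120.00,1,Apartment
$95.00,1,apartment 
,2,House
$150.00,2,House
$9999.00,1,Apartment
$110.00,,Condo
"""


def clean_rentals(csv_text: str) -> pd.DataFrame:
    # Import the dataset into a DataFrame.
    df = pd.read_csv(io.StringIO(csv_text))

    # Fix data types: strip currency symbols and convert price to a number.
    df["price"] = pd.to_numeric(
        df["price"].str.replace(r"[$,]", "", regex=True), errors="coerce"
    )

    # Handle missing values: drop rows without a target,
    # fill missing bedroom counts with the median (an assumed strategy).
    df = df.dropna(subset=["price"]).copy()
    df["bedrooms"] = df["bedrooms"].fillna(df["bedrooms"].median())

    # Clean the object column and add an integer label for it.
    df["property_type"] = df["property_type"].str.strip().str.lower()
    df["property_type_code"] = df["property_type"].astype("category").cat.codes

    # Resolve anomalies: keep prices within 1.5 * IQR of the quartiles.
    q1, q3 = df["price"].quantile([0.25, 0.75])
    iqr = q3 - q1
    in_range = df["price"].between(q1 - 1.5 * iqr, q3 + 1.5 * iqr)
    return df[in_range].reset_index(drop=True)


cleaned = clean_rentals(RAW_CSV)
print(cleaned)
```

On this sample, the row with no price and the extreme 9999 listing are dropped, and the two spellings of "apartment" collapse into one label.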

Data Comparison

Before resolving anomalies: Image 1

After resolving anomalies: Image 2

Property Location

Color represents rental price.

Data Engineering

From each property's location we can calculate its distance to downtown, so I added a new feature, ‘distance’. Here is the heat map of linear correlations:

Heatmap correlations
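One way the ‘distance’ feature could be computed is with the haversine (great-circle) formula, using only the `math` module. This is a sketch under assumptions: the post does not say which formula or which downtown coordinates were used, so `DOWNTOWN_LAT`, `DOWNTOWN_LON`, and both function names are hypothetical.

```python
import math

# Assumed reference point for downtown San Francisco (not from the post).
DOWNTOWN_LAT, DOWNTOWN_LON = 37.7749, -122.4194
EARTH_RADIUS_KM = 6371.0


def haversine_km(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in kilometres between two (lat, lon) points in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (
        math.sin(dphi / 2) ** 2
        + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    )
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def distance_to_downtown(lat: float, lon: float) -> float:
    """Distance in kilometres from a property to the assumed downtown point."""
    return haversine_km(lat, lon, DOWNTOWN_LAT, DOWNTOWN_LON)


# Example: a hypothetical property a little north-west of the reference point.
print(round(distance_to_downtown(37.80, -122.45), 2))
```

With a pandas DataFrame holding latitude and longitude columns, the same function could be applied row by row to create the new column.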

Model and Result

Rank  Model                      Score  MAE    RMSE
1     RandomForestRegressor      0.575  72.24  132.62
2     GradientBoostingRegressor  0.570  73.16  133.41
3     LinearRegression           0.445  84.19  151.52
4     Ridge                      0.445  84.19  151.51
5     Lasso                      0.445  83.69  151.59
6     DecisionTreeRegressor      0.225  91.96  179.02
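A comparison like the one in this table can be produced with a short loop over scikit-learn regressors. The sketch below is only an illustration: it uses synthetic data from `make_regression` in place of the rental dataset, default hyperparameters, and assumes the "Score" column is the R² value returned by each regressor's `score` method.

```python
import math

from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor, RandomForestRegressor
from sklearn.linear_model import Lasso, LinearRegression, Ridge
from sklearn.metrics import mean_absolute_error, mean_squared_error
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeRegressor

# Synthetic stand-in for the cleaned rental features and prices.
X, y = make_regression(n_samples=300, n_features=5, noise=20.0, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

models = {
    "RandomForestRegressor": RandomForestRegressor(n_estimators=50, random_state=0),
    "GradientBoostingRegressor": GradientBoostingRegressor(random_state=0),
    "LinearRegression": LinearRegression(),
    "Ridge": Ridge(),
    "Lasso": Lasso(),
    "DecisionTreeRegressor": DecisionTreeRegressor(random_state=0),
}


def evaluate(models, X_train, X_test, y_train, y_test):
    """Fit each model and return metric rows sorted by score, best first."""
    rows = []
    for name, model in models.items():
        model.fit(X_train, y_train)
        pred = model.predict(X_test)
        rows.append(
            {
                "model": name,
                "score": model.score(X_test, y_test),  # R^2 for regressors
                "mae": mean_absolute_error(y_test, pred),
                "rmse": math.sqrt(mean_squared_error(y_test, pred)),
            }
        )
    return sorted(rows, key=lambda row: row["score"], reverse=True)


results = evaluate(models, X_train, X_test, y_train, y_test)
for rank, row in enumerate(results, start=1):
    print(rank, row["model"], round(row["score"], 3),
          round(row["mae"], 2), round(row["rmse"], 2))
```

The numbers printed here come from the synthetic data and will not match the table.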

Of the models tested, RandomForestRegressor obtained the highest score, 0.575. After optimization, it achieved the following results:

Model                  Score  MAE    RMSE
RandomForestRegressor  0.622  69.23  125.13
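The post does not say how the model was optimized. One common approach is a cross-validated grid search, sketched below with `GridSearchCV`. The parameter grid, the synthetic data, and the choice of grid search itself are assumptions made for illustration.

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import GridSearchCV, train_test_split

# Synthetic stand-in for the rental features and prices.
X, y = make_regression(n_samples=200, n_features=5, noise=20.0, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Hypothetical search space; the real one is not given in the post.
param_grid = {
    "n_estimators": [20, 50],
    "max_depth": [None, 5],
    "min_samples_leaf": [1, 3],
}

search = GridSearchCV(
    RandomForestRegressor(random_state=0),
    param_grid,
    cv=3,           # 3-fold cross-validation on the training split
    scoring="r2",   # same metric as the regressor's default score
)
search.fit(X_train, y_train)

best_params = search.best_params_
test_score = search.best_estimator_.score(X_test, y_test)
print(best_params, round(test_score, 3))
```

Scoring the best estimator on the held-out test split, rather than reporting the cross-validation score, keeps the tuned result comparable with the untuned one.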

EDA

Feature importance

Here we can see that the rental price is strongly influenced by the number of rooms and only slightly influenced by the property's location.

More rooms, higher price.

Distance? It doesn’t really matter.
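Feature importances like these can be read from a fitted random forest through its `feature_importances_` attribute. The sketch below uses synthetic data that is deliberately built so that price depends mostly on room count and only weakly on distance. The feature names and coefficients are invented for illustration and say nothing about the real dataset.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)
n_samples = 400

# Synthetic features: room count and distance to downtown (km).
rooms = rng.integers(1, 6, size=n_samples)
distance = rng.uniform(0, 15, size=n_samples)

# Synthetic price driven mainly by rooms, weakly by distance, plus noise.
price = 60 * rooms - 1.0 * distance + rng.normal(0, 10, size=n_samples)

X = np.column_stack([rooms, distance])
feature_names = ["rooms", "distance"]

model = RandomForestRegressor(n_estimators=100, random_state=0).fit(X, price)

# Impurity-based importances, normalized by scikit-learn to sum to 1.
importances = dict(zip(feature_names, model.feature_importances_))
ranked = sorted(importances.items(), key=lambda item: item[1], reverse=True)
for name, value in ranked:
    print(name, round(float(value), 3))
```

Impurity-based importances can favor features with many distinct values, so a check with `sklearn.inspection.permutation_importance` is a reasonable complement.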

Output Sample

Here is a sample distribution of predicted and actual rental prices.

The x-axis is simplified.