Quick rental market analysis with Python, Pandas


Real estate and rentals are always interesting, especially when you are in the market as either a tenant or a landlord.  Some data is available through APIs, but it is generally days or weeks old, or already summarized to the point where it is not much help. Getting the most granular data for analysis is even harder. Craigslist is one good alternative: its listings give you the most recent (near real-time) data.  It does, however, take quite a bit of work to pull, extract and clean that data before using it.

I use my own tool, written in Python, to help with the analysis.  It works in two steps: (1) pulling and extracting the data, then (2) cleaning it and saving it to files.  The tool can be configured with specific web links to pull the relevant data.

You can pull rental data at multiple levels, including zip code and city.  In the following example I used a city name.  Once the tool pulls the data, it looks for the rent (dollars), the size of the house (square feet) and the number of bedrooms, then saves these data elements in a CSV file.

sl,rent,bed_rooms,sq_ft,data_date
1,4300,4,2500,2016-01-07
2,3400,4,2100,2016-01-07
3,3095,4,1704,2016-01-07
4,3200,4,1700,2016-01-07
5,4995,5,4645,2016-01-07
...
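
The post does not show the extraction tool itself, so the following is only a minimal sketch of the step described above: finding the rent, bedroom count and square footage in a listing's text and writing rows in the CSV layout shown. The sample strings, the "$rent / Nbr - NNNNft2" pattern, and the names parse_listing and listings_to_csv are assumptions for illustration, not the author's actual code.

```python
import csv
import io
import re

# Hypothetical listing summaries; the real tool's input format is not shown in the post.
SAMPLE_LISTINGS = [
    "$4300 / 4br - 2500ft2 - Spacious family home",
    "$3400 / 4br - 2100ft2 - Updated house near park",
    "$2950 / 3br - Cozy cottage",  # no size given, so this one is skipped
]

RENT_RE = re.compile(r"\$\s*([\d,]+)")
BEDS_RE = re.compile(r"(\d+)\s*br\b", re.IGNORECASE)
SQFT_RE = re.compile(r"([\d,]+)\s*ft2\b", re.IGNORECASE)


def _to_int(match):
    """Convert the first captured group to an int, ignoring thousands separators."""
    return int(match.group(1).replace(",", ""))


def parse_listing(text):
    """Return (rent, bed_rooms, sq_ft) as ints, or None if any field is missing."""
    rent = RENT_RE.search(text)
    beds = BEDS_RE.search(text)
    sqft = SQFT_RE.search(text)
    if not (rent and beds and sqft):
        return None
    return _to_int(rent), _to_int(beds), _to_int(sqft)


def listings_to_csv(listings, data_date, out):
    """Write complete listings to `out` as CSV; return the number of data rows written."""
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["sl", "rent", "bed_rooms", "sq_ft", "data_date"])
    sl = 0
    for text in listings:
        parsed = parse_listing(text)
        if parsed is None:
            continue  # cleaning step: drop listings with missing fields
        sl += 1
        writer.writerow([sl, *parsed, data_date])
    return sl


buffer = io.StringIO()
rows_written = listings_to_csv(SAMPLE_LISTINGS, "2016-01-07", buffer)
print(buffer.getvalue())
```

Writing to an in-memory buffer keeps the sketch self-contained; passing an open file object instead would save the rows to disk.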

Then comes using Jupyter and pandas to build metrics like mean, median, rent/sq_ft, etc.


import pandas as pd
import numpy as np
import matplotlib
matplotlib.use('TkAgg')   # select the backend before pyplot is imported
import matplotlib.pyplot as plt
plt.style.use('ggplot')

rents = pd.read_csv('/Data/craig_eby_housing_sam.csv')
# drop the columns that are not needed for the metrics
rents = rents.drop(columns=['data_date', 'sl'])
# rents.groupby('bed_rooms').describe()

new_rents_df = rents.copy()   # make a copy
new_rents_df['rent_per_sq_ft'] = new_rents_df['rent']/new_rents_df['sq_ft']
new_rents_df.head(5)

# nested renaming dicts are not accepted by current pandas, so pass a list of statistics per column
stats = ['mean', 'max', 'min', 'median', 'std']   # add 'count' if needed
aggs2 = {
  'rent' : stats,
  'sq_ft' : stats,
  'rent_per_sq_ft' : stats,
}
metrics = new_rents_df.groupby('bed_rooms').agg(aggs2)
metrics

[Image: rental_analysis_craigslist_2016]


new_df = new_rents_df[['bed_rooms', 'rent', 'sq_ft']]
# boxplot creates its own figure, with one panel per column
new_df.boxplot(by='bed_rooms')
plt.show()

 

[Image: rental_analysis_boxplot_2016]
Note: the Y axis is in $ for the rent chart and ft² for the sq_ft chart.

This gives quick insight into outliers, quartiles, mean, median, etc. and helps in making a decision.  A lot more goes into renting before the final decision, but the above analysis could be the first step in the process.
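
To list the outliers as rows rather than reading them off the chart, one option is the 1.5 × IQR rule, which is the default whisker rule in matplotlib box plots. The sketch below applies it per bedroom count. The sample values are made up and the helper name flag_outliers is hypothetical; it is not part of the original analysis.

```python
import pandas as pd

# Small made-up sample with the same column names as the CSV above (illustrative values only).
rents = pd.DataFrame({
    "bed_rooms": [4, 4, 4, 4, 4, 4],
    "rent":      [3000, 3100, 3200, 3300, 3400, 9000],
})


def flag_outliers(df, value_col="rent", group_col="bed_rooms", k=1.5):
    """Return rows whose value lies outside [Q1 - k*IQR, Q3 + k*IQR] within their group."""
    grouped = df.groupby(group_col)[value_col]
    # transform broadcasts each group's quartile back onto that group's rows
    q1 = grouped.transform(lambda s: s.quantile(0.25))
    q3 = grouped.transform(lambda s: s.quantile(0.75))
    iqr = q3 - q1
    mask = (df[value_col] < q1 - k * iqr) | (df[value_col] > q3 + k * iqr)
    return df[mask]


outliers = flag_outliers(rents)
print(outliers)
```

Calling flag_outliers(new_rents_df, value_col='rent_per_sq_ft') would apply the same check to the per-square-foot figure, assuming that column has been computed as shown earlier.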

This can easily be extended to see how rent/sq_ft varies with the number of bedrooms, or to see the trend over time.  By keeping the data_date column shown above and storing all the scraped data in files or in the cloud, we can see how the rental market is trending in a specific location.
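
As a rough sketch of that extension, the snippet below keeps data_date, computes rent per square foot, and pivots the median by scrape date and bedroom count. The rows are invented for illustration, and using the median rather than the mean is an assumption here, not something the post specifies.

```python
import io
import pandas as pd

# Made-up rows for two scrape dates, in the same column layout as the CSV above.
csv_text = """sl,rent,bed_rooms,sq_ft,data_date
1,3000,3,1500,2016-01-07
2,4000,4,2000,2016-01-07
3,3300,3,1500,2016-02-07
4,4400,4,2000,2016-02-07
"""

history = pd.read_csv(io.StringIO(csv_text), parse_dates=["data_date"])
history["rent_per_sq_ft"] = history["rent"] / history["sq_ft"]

# One row per scrape date, one column per bedroom count
trend = (
    history.groupby(["data_date", "bed_rooms"])["rent_per_sq_ft"]
    .median()
    .unstack("bed_rooms")
)
print(trend)
```

With real data, the per-date CSV files could be combined with pd.concat before grouping, and trend.plot() would draw one line per bedroom count.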

One thought on “Quick rental market analysis with Python, Pandas”

  1. Shiva, good to see you have used python to pull panda out.
    See if you can use the tool to do similar analytics of Bengaluru JP Nagar near JP Nagar Metro station.
    Will be thankful to see the data.
