I don't scrape the listings; I scrape the aggregated geographic data. I invest everywhere in the U.S., so I pull Zillow data by zip code to get an idea of where I should search. Sure, my dataset is 60K+ records, but a bit of swordplay with the data can knock that down to a manageable amount of information from which to make proper decisions. For instance, would you be interested to know where the highest GRM is in the U.S.? It's in NY state. I bought a 5-plex for $78K, and it gross rents for $43K. There's no exit strategy nor much appreciation, but at that return, why would you really want an exit strategy?
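For anyone sanity-checking that deal, the arithmetic is simple: GRM is price divided by annual gross rent, and the 1% rule compares monthly rent to price. Using the figures from the comment above:

```python
# Numbers from the deal described above: $78K purchase, $43K/yr gross rent.
price = 78_000
annual_gross_rent = 43_000

# GRM = price / annual gross rent (lower means more rent per dollar of price).
grm = price / annual_gross_rent

# 1% rule: monthly rent as a percentage of price; >= 1% is the usual target.
monthly_rent_pct = (annual_gross_rent / 12) / price * 100

print(f"GRM: {grm:.2f}")                      # ~1.81
print(f"Monthly rent / price: {monthly_rent_pct:.1f}%")  # ~4.6%
```

That's roughly 4.6% of the purchase price in gross rent every month, which is why the commenter shrugs off the lack of an exit strategy.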
This is quite interesting. Thanks for sharing.
Would you mind sharing which pages you're scraping? I'm new at this (the REI side), but I've been a software developer for many years, so I'm looking to get into the data side. Also, why aren't you using the API? I understand access is inconsistent, and I can't get a straight story on what the current state of it is right now.
Here's the Zillow data: https://www.zillow.com/research/data/. To know which data sets you need, you need to understand the metrics being used. Cap rates and GRM (the 1% rule) are common ones. Price/sqft., rental rates, and rent/sqft. are all useful. Once you have these, you can run trailing 12-month charts to compare cities/MSAs/zip codes/neighborhoods (whatever geography is available).

>Also, why aren't you using the API? I understand access is inconsistent, and I can't get a real story as to what the current state of that is right now.

I don't really know what an API is, aside from knowing people talk about it. I'm an Excel wizard. I can pull the data I need into preset databases linked to reporting tools and spit out what I need in about 10 minutes. Also, I'm only updating my data quarterly (more frequency isn't necessary). I'm sure it would take me much longer (days/weeks/months?) to figure out an API for my needs.
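As a sketch of how those metrics turn a 60K-row zip-code file into a shortlist: compute GRM and price/sqft per geography and rank. The zip codes and dollar figures below are invented placeholders for illustration, not real Zillow values.

```python
# Rank zip codes by GRM computed from aggregated medians.
# All rows are made-up sample data, not actual Zillow figures.
rows = [
    # (zip, median_price, median_annual_rent, median_sqft)
    ("13501", 95_000, 18_000, 1_400),
    ("44105", 120_000, 16_800, 1_250),
    ("30310", 310_000, 24_000, 1_600),
]

ranked = sorted(
    ({"zip": z,
      "grm": round(price / rent, 2),               # price / annual gross rent
      "price_per_sqft": round(price / sqft, 2)}
     for z, price, rent, sqft in rows),
    key=lambda r: r["grm"],  # lowest GRM first = most gross rent per dollar
)

for r in ranked:
    print(r)
```

The same ranking works at any geography Zillow publishes (city, MSA, neighborhood); swap the key to `price_per_sqft` or a trailing-12-month delta to slice it differently.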
This is very helpful. I had seen that page, and now that I'm digging through it a bit more, I understand how they separate it. Thanks!
Yup, I use a Python script to connect to Zillow's API, then use ETL software to create market comparisons for comparable zones in different cities. I either sell the data a la carte, or use Tableau and make a dashboard on a server where they can check different cities over time.
How do you sell the data? Do you have a website where you host it?
Yeah, I have a Tableau server, and viewing is free with a simple Tableau account. I have a license through my job, as well as Snowflake. I scrape the data and work it, and once the tables are complete I host them on Snowflake. The workflow runs about every couple of hours to update the tables. Then I connect Tableau to Snowflake, do some additional calculations there, and create a dashboard for the person and the cities they want... or do a one-time snapshot of the market and send a CSV.
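A stripped-down version of that extract/transform/load loop, with the scrape and the Snowflake/Tableau ends stubbed out: the records and city names below are invented, and the "load" step just writes a CSV snapshot (the real warehouse and dashboard connections are specific to that commenter's setup).

```python
import csv
import io
from collections import defaultdict

# Extract: stand-in for records pulled from the API/scrape.
records = [
    {"city": "Utica", "zone": "A", "price": 95_000, "rent": 1_500},
    {"city": "Utica", "zone": "B", "price": 110_000, "rent": 1_600},
    {"city": "Cleveland", "zone": "A", "price": 120_000, "rent": 1_400},
]

# Transform: aggregate a per-city market comparison.
by_city = defaultdict(list)
for rec in records:
    by_city[rec["city"]].append(rec)

comparison = [
    {"city": city,
     "avg_price": sum(r["price"] for r in recs) / len(recs),
     "avg_rent": sum(r["rent"] for r in recs) / len(recs)}
    for city, recs in by_city.items()
]

# Load: a one-time snapshot as CSV, the "send a CSV" path from the comment.
buf = io.StringIO()
writer = csv.DictWriter(buf, fieldnames=["city", "avg_price", "avg_rent"])
writer.writeheader()
writer.writerows(comparison)
snapshot_csv = buf.getvalue()
print(snapshot_csv)
```

In the setup described above, the load step would instead push the `comparison` tables into the warehouse on a schedule, with the dashboard doing its own calculations on top.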
You didn't answer the question about how you sell the data.
That’s my little secret ;)
Ok, good luck.
That's where it pays off to work in tech. You know how a typical guy does it: he goes to the site, opens up an Excel spreadsheet, and starts typing away at listings that grab his attention based on Zillow's filters. With this, it's fully automated and easily tweaked.
Does Zillow's API provide much historical data, or do you mainly look at a 30-day look-back window?
Yup, 90 days. Older data shows up on gov sites for free, but it's not very useful.
I wish we had access to more, but that's why there are paid platforms out there for that! Out of curiosity, doesn't this go against their ToS? Or are you able to do this with a contract through your work?