In a Zillow Porchlight article, Zillow describes (barely) how they calculate the value of a home and how they are trying to become more accurate. According to the article, when they launched in 2006, their median error for these estimates was 14%: half of their estimates were within 14% of the actual selling price, and half were further off. They say their median error today is 4.3%, so they really are becoming more accurate.
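To make that metric concrete, here is a minimal sketch of how a median error like Zillow's could be computed: take the absolute percentage difference between each estimate and the actual selling price, then take the median. The prices below are made-up example data, not Zillow's.

```python
import statistics

# Hypothetical estimates and the prices those homes actually sold for
estimates = [310_000, 455_000, 198_000, 720_000, 265_000]
actual_prices = [300_000, 430_000, 210_000, 750_000, 260_000]

# Absolute percentage error for each estimate
pct_errors = [abs(est - actual) / actual
              for est, actual in zip(estimates, actual_prices)]

# Half the estimates are closer than this, half are further off
median_error = statistics.median(pct_errors)
print(f"Median error: {median_error:.1%}")  # prints "Median error: 4.0%"
```

With this definition, a single wildly wrong estimate barely moves the number, which is one reason the median (rather than the mean) is a natural choice for reporting accuracy.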

My favorite line in the article, though, is this: “A home’s value is ultimately determined by what someone else is willing to pay for it.” This is an obviously true statement, but it reminds us that the only thing that matters is willingness to pay. And although Zillow says they are calculating the value of homes, what they are really calculating is what buyers are willing to pay for a home.

That is exactly what you should be thinking about with your own data. Can you create algorithms to predict what your customers are willing to pay?

Notice that the big data Zillow uses is past transactions. They use the price a house sold for (how much a buyer was willing to pay) as the dependent variable and everything else they can learn about the house (location, square footage, school district rating, etc.) as the independent variables.

You can do the exact same thing. Assume that the price your current customers paid at the moment of decision was their willingness to pay. That’s your dependent variable. Then you can create algorithms and formulas that take everything else you know about your customers and try to “predict” those prices.
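As a sketch of that approach: treat the price each past customer actually paid as the dependent variable and what you know about them as the independent variables, fit a simple model, and check it with the same median-error metric Zillow reports. The customer features and prices below are entirely invented for illustration, and ordinary least squares is just one modeling choice among many.

```python
import numpy as np

# Each row describes one past customer: [company_size, annual_usage, support_tier]
# (hypothetical features -- use whatever you actually know about your customers)
X = np.array([
    [ 10,  120, 1],
    [ 50,  600, 2],
    [200, 1500, 3],
    [ 25,  300, 1],
    [100,  900, 2],
], dtype=float)

# The price each customer actually paid -- our proxy for willingness to pay
y = np.array([1_000, 4_800, 15_500, 2_200, 7_900], dtype=float)

# Ordinary least squares: add an intercept column and solve for coefficients
X1 = np.hstack([np.ones((len(X), 1)), X])
coef, *_ = np.linalg.lstsq(X1, y, rcond=None)

# "Predict" the prices we already know, then compute the median error
pred = X1 @ coef
median_error = np.median(np.abs(pred - y) / y)
print(f"Median error on past deals: {median_error:.1%}")

# Estimate willingness to pay for a new prospect (intercept term first)
new_customer = np.array([1, 60, 700, 2], dtype=float)
print(f"Predicted willingness to pay: ${new_customer @ coef:,.0f}")
```

In practice you would hold out some transactions to measure error on deals the model has never seen; scoring the same data you trained on, as this toy does, makes the model look better than it is.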

This is not easy. While Zillow has improved their median error from 14% to 4.3%, they are still trying to do better. The article mentioned above is about a contest offering a $1 million prize to outsiders who can improve the model further. They say that 15,500 people have downloaded the contest data set.

So what should you take away from this post? Here are a couple of thoughts: