The whole point of startup investing is to search for outliers. The way returns are distributed in the tech industry, it's not unusual for 1-2 companies to be responsible for 90%+ of a fund's financial returns.
Including Uber would probably have made most of the data meaningless - since First Round's conclusions are valuation-weighted, the data would show that the ideal startup founder is... Garrett Camp. But then, that's how the startup investing business actually works - your data is useless unless you find the one outlier that everyone else missed.
Edit: It occurs to me that this effect could be overcome by taking the log of valuation (or whatever metric is of interest) and then running your statistics over that. That's standard procedure when trying to do statistics over a Zipfian or other power-law distribution; it lets the outliers count, but prevents them from distorting the averages too much.
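To make the log idea concrete, here is a rough sketch in Python. The valuations are made-up numbers (in $M), with the last one standing in for an Uber-scale outlier; nothing here comes from First Round's actual data. It compares the plain average against the average taken in log space (i.e. the geometric mean), with and without the outlier.

```python
import math
import statistics

# Hypothetical valuations in $M; the last entry is an Uber-scale outlier.
valuations = [5, 8, 12, 20, 35, 60, 50_000]
without_outlier = valuations[:-1]


def log_space_mean(values):
    """Average the logs, then map back to the original scale (geometric mean)."""
    logs = [math.log(v) for v in values]
    return math.exp(statistics.mean(logs))


arithmetic_with = statistics.mean(valuations)
arithmetic_without = statistics.mean(without_outlier)
log_mean_with = log_space_mean(valuations)
log_mean_without = log_space_mean(without_outlier)

print(f"arithmetic mean: {arithmetic_without:.1f} -> {arithmetic_with:.1f}")
print(f"log-space mean:  {log_mean_without:.1f} -> {log_mean_with:.1f}")
```

With these numbers the outlier still pulls the log-space mean up by a few times, so it counts, but it no longer swamps everything else the way it does in the plain average.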
The mean (or average) is a good choice for data with a roughly normal distribution. However, if your data has extreme values, such as the gap between an Uber and everyone else, you should look at the median or 90th percentile, because those are much more representative of your sample.
Median and 90th percentile are still pretty meaningless for the question that First Round is asking, namely "If I want to maximize my financial returns, what qualities should I look for in founders?" Miss that one company at the 99th percentile, and your return could be 10x lower.
http://www.paulgraham.com/swan.html
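A toy illustration of that point, in Python. The return multiples below are invented for a hypothetical 100-company fund with equal check sizes; the "missed" portfolio is the same fund except that the one big winner is swapped for a company that merely returns the money.

```python
import statistics

# Hypothetical return multiples per company (equal check sizes assumed).
# 45 write-offs, 45 that return the money, 9 modest wins, 1 huge outlier.
with_outlier = [0] * 45 + [1] * 45 + [5] * 9 + [700]

# Same fund, but the outlier was passed on in favor of a 1x company.
missed_outlier = [0] * 45 + [1] * 46 + [5] * 9

median_with = statistics.median(with_outlier)
median_missed = statistics.median(missed_outlier)

# Fund-level multiple = total returned / total invested (1 unit per company).
fund_multiple_with = sum(with_outlier) / len(with_outlier)
fund_multiple_missed = sum(missed_outlier) / len(missed_outlier)

print("median company:", median_with, "vs", median_missed)
print("fund multiple: ", fund_multiple_with, "vs", fund_multiple_missed)
```

The median company looks identical in both portfolios, while the fund-level return differs by roughly an order of magnitude, which is why a median-based summary can't tell you much about what drives returns.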