Hacker News

> I must believe there's any easy way to eliminate some "outliers" using mathematics, but I can't recall the function(s) to do so.

Using the median is one good way, as you're already doing. You can also use the interquartile mean: http://en.wikipedia.org/wiki/Interquartile_mean
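For what it's worth, here is a rough Python sketch of the interquartile mean idea. The function name and the sample prices are made up for illustration. It is also a simplified version: it just drops the lowest and highest quarter of the sorted values (rounding the quarter down), whereas the full definition weights the edge observations fractionally when the count isn't divisible by 4.

```python
def interquartile_mean(values):
    """Mean of the middle half of the data (simplified trimming).

    Drops floor(n / 4) values from each end of the sorted list.
    The exact IQM uses fractional weights at the edges when n is
    not a multiple of 4; this sketch ignores that refinement.
    """
    data = sorted(values)
    n = len(data)
    if n == 0:
        raise ValueError("need at least one value")
    k = n // 4                  # how many values to drop from each end
    middle = data[k:n - k]      # for n < 4 this keeps everything
    return sum(middle) / len(middle)


# Hypothetical listing prices, including one absurd outlier.
prices = [180, 200, 210, 220, 230, 250, 260, 6000]
print(interquartile_mean(prices))
```

The $6000 listing lands in the top quarter and is dropped before averaging, so it has no effect on the result.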



At the moment I'm filtering out items more than 2 standard deviations from the median. It catches the ridiculous cases, e.g. when some fool tries to get away with selling an iPhone for $6000 (yes, I've seen this before).

Perhaps I need to tighten the filter to 1 or 1.5 standard deviations. Will experiment with this.
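Roughly, the filter described above could look like the sketch below. This is not the actual code in use. The function name, the sample prices, and the choice of population standard deviation (`statistics.pstdev`) are assumptions for illustration. The threshold `k` is a parameter so that 1, 1.5 and 2 are easy to compare.

```python
import statistics


def filter_around_median(prices, k=2.0):
    """Keep prices within k standard deviations of the median.

    Assumption: the population standard deviation is used; the
    sample version (statistics.stdev) would work the same way.
    """
    med = statistics.median(prices)
    sd = statistics.pstdev(prices)
    if sd == 0:
        # All prices are identical, so there is nothing to filter.
        return list(prices)
    return [p for p in prices if abs(p - med) <= k * sd]


# Hypothetical listing prices, including one absurd outlier.
prices = [180, 200, 210, 220, 230, 250, 260, 6000]
for k in (1.0, 1.5, 2.0):
    print(k, filter_around_median(prices, k))
```

One thing to keep in mind when experimenting: the standard deviation is itself inflated by the outliers, so a big outlier widens the band it is tested against.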

However, sometimes you can easily see there are two clusters of results. Not sure how to mathematically determine this. Any ideas?



