
Problem is, I doubt he would have blogged this if the better algorithm had beaten the larger dataset.

As it happens, I broadly agree with his conclusion (data trumps algorithms), but cherry-picking data points doesn't provide any evidence for it.



I'm not familiar with this guy's site, so I'm not sure if he means something more nuanced, but "more data beats better algorithms" is too vague a claim to test by picking any number of data points. 100 data points will of course beat 1,000 if the 100 are drawn by uniform random sampling across the entire population with no response bias and the 1,000 come from asking the first 1,000 people you happen to see.
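To make the sampling point concrete, here is a rough simulation sketch. Everything in it is made up for illustration: the two-group population, the group means, and the assumption that "the first 1,000 people you happen to see" all come from the minority group. It only shows that a small uniform sample can estimate a population mean better than a larger convenience sample under those assumptions.

```python
import random
import statistics


def compare_samples(seed=0, population_size=100_000):
    """Compare a 100-point uniform sample with a 1,000-point convenience sample.

    Hypothetical setup: 80% of the population is in group "A" (mean 40),
    20% is in group "B" (mean 70). The convenience sample only ever
    reaches group "B".
    """
    rng = random.Random(seed)

    # Build the synthetic population as (group, value) pairs.
    population = []
    for _ in range(population_size):
        if rng.random() < 0.8:
            population.append(("A", rng.gauss(40, 10)))
        else:
            population.append(("B", rng.gauss(70, 10)))

    true_mean = statistics.fmean(v for _, v in population)

    # 100 people drawn uniformly at random from everyone.
    random_100 = [v for _, v in rng.sample(population, 100)]

    # 1,000 people, but only the ones you "happen to see" (all in group B).
    group_b = [v for g, v in population if g == "B"]
    convenience_1000 = group_b[:1000]

    return {
        "true_mean": true_mean,
        "random_100_error": abs(statistics.fmean(random_100) - true_mean),
        "convenience_1000_error": abs(
            statistics.fmean(convenience_1000) - true_mean
        ),
    }


result = compare_samples()
print(result)
```

Under these assumptions the convenience sample's error is dominated by bias, which does not shrink as you add more points from the same skewed source, while the random sample's error is only noise.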


For that conclusion, we need more data.



