Only if you consider techniques like Word2Vec[1] a non-breakthrough.
No one knew you could do it, but yes, it builds upon previous work. I was working in the field before and after, and if anyone had asked me whether one could represent all human languages in only 300 dimensions, with vector composition being meaningful, I'd have laughed at them.
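To make that concrete, here's a rough sketch of the famous analogy trick using gensim and the pretrained 300-dimensional Google News vectors; treat it as illustrative, not canonical:

    # Sketch of word2vec vector composition with gensim.
    # Assumes the pretrained 300-d Google News vectors have been downloaded.
    from gensim.models import KeyedVectors

    vectors = KeyedVectors.load_word2vec_format(
        "GoogleNews-vectors-negative300.bin", binary=True
    )

    # "king" - "man" + "woman" lands near "queen": vector arithmetic
    # carries meaning, which is the part that surprised people.
    print(vectors.most_similar(positive=["king", "woman"],
                               negative=["man"], topn=3))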
Take using back-propagation to train deep neural networks. People had shown it worked in 1- or 2-layer networks, but despite years of work no one had been able to train anything deep enough to be useful. Then Krizhevsky, Sutskever, and Hinton won ImageNet[2], proved it was possible, and kicked off this whole ML craziness.
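If it helps, here's a toy numpy sketch of what back-propagating through a deeper-than-two-layer network actually involves. The sizes, data, and learning rate are all invented for illustration, and I'm using ReLU (one of the AlexNet-era ingredients) rather than the older saturating activations:

    # Toy back-propagation through a 4-weight-layer MLP in plain numpy.
    # Everything here (sizes, data, learning rate) is made up for illustration.
    import numpy as np

    rng = np.random.default_rng(0)
    sizes = [32, 64, 64, 64, 10]
    Ws = [rng.normal(0, np.sqrt(2 / m), (m, n))
          for m, n in zip(sizes, sizes[1:])]

    def relu(x):
        return np.maximum(x, 0.0)

    x = rng.normal(size=(16, sizes[0]))    # fake input batch
    y = rng.normal(size=(16, sizes[-1]))   # fake regression targets
    lr = 1e-2

    for step in range(200):
        # Forward pass: ReLU hidden layers, linear output; keep activations.
        acts = [x]
        for W in Ws[:-1]:
            acts.append(relu(acts[-1] @ W))
        acts.append(acts[-1] @ Ws[-1])

        # Backward pass: chain rule applied layer by layer, top to bottom.
        grad = 2 * (acts[-1] - y) / y.size       # d(MSE)/d(output)
        for i in reversed(range(len(Ws))):
            if i < len(Ws) - 1:
                grad = grad * (acts[i + 1] > 0)  # ReLU derivative
            grad_W = acts[i].T @ grad            # gradient for this layer
            grad = grad @ Ws[i].T                # pass gradient down a layer
            Ws[i] -= lr * grad_W                 # SGD step

The hard-won part wasn't this mechanism - it's decades old - but the bag of tricks (ReLUs, initialization, dropout, GPUs, lots of labeled data) that keeps the gradients usable through many layers.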
Neither of these happened simply because of more powerful computers, nor because of magical math breakthroughs. It was more a matter of lots of hard work by researchers trying many combinations of techniques until something worked.
These techniques, combined with huge volumes of data and - yes - more powerful computers, are what have made the difference.
[1] https://papers.nips.cc/paper/5021-distributed-representation...
[2] https://papers.nips.cc/paper/4824-imagenet-classification-wi...