I think the big achievement is how it outperformed previous models f...

gillesjacobs · on Jan 21, 2022

That's not true for NLU, at least. It is on par with 2019's RoBERTa on GLUE, and many larger, more advanced language models have come out since.

It is still great work though, a robust masking representation architecture that works across modalities.

kuu · on Jan 21, 2022

They mention that in the article:

> We apply data2vec separately to speech, images and text and it outperformed the previous best single-purpose algorithms for computer vision and speech and it is competitive on NLP tasks.