The post mentions not getting great results with the OpenAI Transformer. I haven't tried that, but using a similar framework, ULM-FiT, I narrowly beat the fastText benchmark on a 250-class dataset we use internally. I will follow up with how it does on this dataset.
ULM-FiT and OpenAI's Transformer* are quite different. Both are pretrained language models, but ULM-FiT is a standard stack of LSTMs with a particular recipe for fine-tuning (gradual unfreezing, discriminative per-layer learning rates), whereas OpenAI's Transformer uses the much newer Transformer architecture with no particularly fancy tricks in the fine-tuning itself. I suspect the difficulty is with the Transformer model itself - this is not the first time I've heard that it is difficult to train.
* = To be clear, this refers to OpenAI's pretrained Transformer model. The Transformer architecture was from work at Google.
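In case it helps to see what that recipe looks like in practice, here is a rough sketch using the fastai library. This assumes fastai v2 and a pandas DataFrame `df` with 'text' and 'label' columns - the column names, epoch counts, and learning rates are illustrative, not exactly what I ran:

    from fastai.text.all import *

    # Stage 1: fine-tune the pretrained AWD-LSTM language model on the target corpus.
    dls_lm = TextDataLoaders.from_df(df, text_col='text', is_lm=True)
    lm = language_model_learner(dls_lm, AWD_LSTM, drop_mult=0.3)
    lm.fit_one_cycle(1, 1e-2)      # train the new head first
    lm.unfreeze()
    lm.fit_one_cycle(4, 1e-3)      # then fine-tune the whole LM
    lm.save_encoder('ft_encoder')

    # Stage 2: train the classifier on top of the fine-tuned encoder,
    # gradually unfreezing the LSTM stack with discriminative learning rates.
    dls_clf = TextDataLoaders.from_df(df, text_col='text', label_col='label',
                                      text_vocab=dls_lm.vocab)
    clf = text_classifier_learner(dls_clf, AWD_LSTM, drop_mult=0.5, metrics=accuracy)
    clf.load_encoder('ft_encoder')
    clf.fit_one_cycle(1, 2e-2)                            # classifier head only
    clf.freeze_to(-2)
    clf.fit_one_cycle(1, slice(1e-2 / (2.6 ** 4), 1e-2))  # top two layer groups
    clf.unfreeze()
    clf.fit_one_cycle(2, slice(1e-3 / (2.6 ** 4), 1e-3))  # everything

The `2.6 ** 4` spread is the per-layer learning-rate decay suggested in the ULM-FiT paper; lower layers get smaller updates so the pretrained features aren't wiped out.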