“Some popular models like Prophet [Taylor and Letham, 2018] and ARIMA were excluded from the analysis due to their prohibitive computational requirements and extensive training times.”
Can anyone who works a lot in time series forecasting explain this in more detail?
I’ve def used ARIMA, but only for simple things. Not sure why this would be more expensive to train and run than a Transformer model, and even if true, ARIMA is so ubiquitous that comparing resources & time would be enlightening. Otherwise it just sounds like a sales pitch that throws in more obscure acronyms for a bit of “I’m the expert, abc xyz industry letters” marketing.
We love ARIMA models. That is why we put so much effort into creating fast and scalable ARIMA and AutoARIMA implementations in Python [1].
Regarding your valid concern: there are several reasons for the high computational costs. First, ARIMA and other "statistical" methods are local, so they must train a separate model for each time series. (ML and DL models are global, so you have 'one' model for all the series.) Second, a plain ARIMA model usually performs poorly on a diverse set of time series, like the one considered in our experiments. AutoARIMA is a better option, but its training time is considerably longer, given the number and length of the series. Also, AutoARIMA tends to be very slow for long series.
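To make the local vs. global distinction concrete, here is a minimal sketch using only the Python standard library. It assumes a toy AR(1) model fitted by least squares on synthetic data; the helper names (`simulate_ar1`, `fit_ar1`, `lag_pairs`) are made up for illustration, and this is not ARIMA, AutoARIMA, or the setup used in the paper.

```python
import random

def simulate_ar1(phi, n, rng):
    """Simulate a zero-mean AR(1) series: y_t = phi * y_{t-1} + noise."""
    y = [0.0]
    for _ in range(n - 1):
        y.append(phi * y[-1] + rng.gauss(0.0, 1.0))
    return y

def lag_pairs(y):
    """Return (previous, current) pairs used as regression data."""
    return list(zip(y[:-1], y[1:]))

def fit_ar1(pairs):
    """Least-squares AR(1) coefficient (no intercept) from lag pairs."""
    num = sum(prev * cur for prev, cur in pairs)
    den = sum(prev * prev for prev, _ in pairs)
    return num / den

rng = random.Random(0)
true_phis = [0.2, 0.5, 0.8]  # hypothetical, one coefficient per series
series = [simulate_ar1(phi, 2000, rng) for phi in true_phis]

# Local approach: one separate fit (and one parameter set) per series.
local_phis = [fit_ar1(lag_pairs(y)) for y in series]

# Global approach: a single fit on the pooled data from all series.
pooled = [pair for y in series for pair in lag_pairs(y)]
global_phi = fit_ar1(pooled)

print("local coefficients:", [round(p, 2) for p in local_phis])
print("global coefficient:", round(global_phi, 2))
```

The local approach performs one fit per series, so its cost grows with the number of series; the global approach fits once, but in this toy case it can only return a single compromise coefficient for series that actually differ.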
In short: for the 500k series we used for benchmarking, ARIMA would have taken literally weeks and would have been very expensive.
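A rough back-of-envelope sketch of how an estimate like that scales with per-series fit time. Only the 500k series count comes from this thread; the per-series fit times, the worker count, and the function names are hypothetical inputs for illustration, not measurements from the benchmark.

```python
SECONDS_PER_DAY = 24 * 60 * 60

def total_fit_days(n_series, seconds_per_series, n_workers=1):
    """Wall-clock days to fit one local model per series.

    Assumes perfect parallelism across n_workers (an idealization).
    """
    return n_series * seconds_per_series / n_workers / SECONDS_PER_DAY

def seconds_per_series_for(days, n_series, n_workers=1):
    """Average per-series fit time at which the whole run takes `days` days."""
    return days * SECONDS_PER_DAY * n_workers / n_series

n_series = 500_000  # series count quoted in the thread

# Hypothetical: 1 second per fit on a single worker.
days_at_1s = total_fit_days(n_series, seconds_per_series=1.0)

# Per-series time at which a single worker would need two weeks.
breakeven = seconds_per_series_for(days=14, n_series=n_series)

print(f"{days_at_1s:.1f} days at 1 s/series")
print(f"{breakeven:.2f} s/series for a 14-day run")
```

With one worker, the run reaches two weeks once each fit averages about 2.4 seconds; whether that matches real fit times depends on series length, the model search space, and how much parallel hardware is available.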
That is why we included many well-performing local "statistical" models, such as Theta and CES. We used the implementations from our open-source ecosystem for all the baselines, including StatsForecast, MLForecast, and NeuralForecast. We will release a reproducible set of experiments on smaller subsets soon!
I immediately tried to find a comparison with ARIMA as well and was disappointed. It's difficult to take this paper seriously when they dismiss a forecasting technique from the 70's because of "extensive training times".
Even then, 500 years of daily data is less than 200k observations, most of which are meaningless for predicting the future. That's less than 16B seconds of data. Plain regression might not handle that directly, but linear algebra tricks are still available.
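For anyone who wants to check those two figures, a quick sketch of the arithmetic, assuming an average year of 365.25 days:

```python
DAYS_PER_YEAR = 365.25           # average year length, leap years included
SECONDS_PER_DAY = 24 * 60 * 60

years = 500
daily_observations = years * DAYS_PER_YEAR
total_seconds = daily_observations * SECONDS_PER_DAY

print(f"{daily_observations:,.0f} daily observations")
print(f"{total_seconds / 1e9:.2f} billion seconds")
```

Both numbers come out under the bounds quoted above (200k observations and 16B seconds).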
While I could find some excuses to exclude ARIMA, notably that in practice you need to input some important priors about your time series (periodicity, refinements for turning points, etc) for it to work decently, "prohibitive compute and extensive training time" are just not applicable.
That part is a bit wonky, but the rest of the paper, notably the zero-shot capability, is very interesting if confirmed. I look forward to it being more accessible than a "contact us" API so I can compare it to ARIMA and others myself.
I have been doing time series forecasting professionally. ARIMA is computationally one of the cheapest (both training and inference) forecasting models out there. It suffers from many deficiencies and shortcomings, but computational inefficiency is not one of them.
> “Some popular models like Prophet [Taylor and Letham, 2018] and ARIMA were excluded from the analysis due to their prohibitive computational requirements and extensive training times.”
Yes, I've done some work in time series forecasting. The above sentence is the one that tipped me off to this paper being BS, so I stopped reading after that. :) I can't take any paper about timeseries forecasting seriously by an author who isn't familiar with the field.
Eh it's not as if you could just project down the 300k time series to something lower dimensional for forecasting. The TimeGPT would have to do something similar to avoid the same problem.
Though I can't quite figure out how the prediction works exactly. They have a lot of test series, but do they input all of them simultaneously?
If true, then beating it and looking good will be easy.
Having trained ARIMA models in my day, I will say that long training times and training cost -- compared to any deep learning model -- are not something that ever crossed my mind.
High training times could be cost prohibitive. Currently, it's over $100 million to train GPT-4 from scratch (which possibly includes other costs related to RLHF and data acquisition). Not sure how this model compares, but it's likely not cheap.