“Some popular models like Prophet [Taylor and Letham, 2018] and ARIMA were exclu...

LevoMX · on Oct 13, 2023

We love ARIMAs. That is why we put so much effort into creating fast and scalable ARIMA and AutoARIMA implementations in Python [1].

Regarding your valid concern: there are several reasons for the high computational costs. First, ARIMA and other "statistical" methods are local, so they must train a separate model for each time series. (ML and DL models are global, so you have 'one' model for all the series.) Second, the ARIMA model usually performs poorly for a diverse set of time series, like the one considered in our experiments. AutoARIMA is a better option, but its training time is considerably longer, given the number and length of the series. Also, AutoARIMA tends to be very slow for long series. In short: for the 500k series we used for benchmarking, ARIMA would have taken literally weeks and would have been very expensive. That is why we included many well-performing local "statistical" models, such as Theta and CES. We used the implementations in our open-source ecosystem for all the baselines, including StatsForecast, MLForecast, and NeuralForecast. We will release a reproducible set of experiments on smaller subsets soon!

[1] https://nixtla.github.io/statsforecast/docs/models/arima.htm...
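To make the "local" point concrete, here is a minimal sketch of what one model per series means. It is an illustration only: it fits a plain AR(1) by least squares in pure Python, not a full ARIMA and not the StatsForecast API, and the names `fit_ar1`, `simulate_ar1`, and the synthetic `series` dict are hypothetical.

```python
import random

def fit_ar1(y):
    """Least-squares fit of y[t] ~ c + phi * y[t-1]; returns (c, phi)."""
    x, z = y[:-1], y[1:]              # lagged values and their targets
    n = len(x)
    mean_x, mean_z = sum(x) / n, sum(z) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxz = sum((a - mean_x) * (b - mean_z) for a, b in zip(x, z))
    phi = sxz / sxx
    return mean_z - phi * mean_x, phi

def simulate_ar1(phi, n, rng):
    """Synthetic AR(1) series with unit-variance Gaussian noise (made-up data)."""
    y = [0.0]
    for _ in range(n - 1):
        y.append(phi * y[-1] + rng.gauss(0.0, 1.0))
    return y

rng = random.Random(0)
# Assumption: 20 synthetic series stand in for a real benchmark set.
series = {f"series_{i}": simulate_ar1(0.7, 500, rng) for i in range(20)}

# "Local" approach: one independent fit per series, so the number of
# fitted models (and the total fitting work) grows with the number of series.
local_models = {name: fit_ar1(y) for name, y in series.items()}
```

Each fit here is independent of the others, so the loop is trivially parallel; how expensive the total becomes depends on the per-series fitting cost, which is much higher for an automatic order search than for this toy AR(1).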

mvanaltvorst · on Oct 13, 2023

I immediately tried to find a comparison with ARIMA as well and was disappointed. It's difficult to take this paper seriously when they dismiss a forecasting technique from the 70's because of "extensive training times".

a5seo · on Oct 13, 2023

Maybe if your time interval is super short and you have hundreds of years of data? Otherwise, I’m not sure what they’re on about.

tomrod · on Oct 14, 2023

Even then, 500 years of daily data is less than 200k observations, most of which are meaningless for predicting the future. Less than 16B seconds of data. Regression might not handle it directly, but linear algebra tricks are still available.
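A quick check of that arithmetic, assuming an average year of 365.25 days (the helper name `observations` is just for illustration):

```python
DAYS_PER_YEAR = 365.25           # assumption: average year length incl. leap years
SECONDS_PER_DAY = 24 * 60 * 60   # 86,400

def observations(years, per_day=1):
    """Number of observations for `years` of data sampled `per_day` times a day."""
    return int(years * DAYS_PER_YEAR * per_day)

daily_obs = observations(500)                                 # one point per day
per_second_obs = observations(500, per_day=SECONDS_PER_DAY)   # one point per second

print(daily_obs)        # 182625, under 200k
print(per_second_obs)   # 15778800000, under 16B
```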

m3at · on Oct 13, 2023

I was surprised too!

While I could find some excuses to exclude ARIMA, notably that in practice you need to input some important priors about your time series (periodicity, refinements for turning points, etc) for it to work decently, "prohibitive compute and extensive training time" are just not applicable.

That part is a bit wonky, but the rest of the paper, notably the zero-shot capability, is very interesting if confirmed. I look forward to it being more accessible than a "contact us" API so I can compare it to ARIMA and others myself.

alfalfasprout · on Oct 13, 2023

Excluding prophet and ARIMA makes it hard to take this seriously... those are super widely used.

bllguo · on Oct 14, 2023

are you aware of the significant amounts of criticism for prophet?

arima, sure

lr1970 · on Oct 13, 2023

I have been doing time series forecasting professionally. ARIMA is computationally one of the cheapest (both training and inference) forecasting models out there. It suffers from many deficiencies and shortcomings, but computational efficiency is not one of them.

EDIT: typos

loxias · on Oct 14, 2023

> “Some popular models like Prophet [Taylor and Letham, 2018] and ARIMA were excluded from the analysis due to their prohibitive computational requirements and extensive training times.”

Yes, I've done some work in time series forecasting. The above sentence is the one that tipped me off to this paper being BS, so I stopped reading after that. :) I can't take any paper about timeseries forecasting seriously by an author who isn't familiar with the field.

nojito · on Oct 13, 2023

The test data is 300k different time series. There’s no way to do an arima in a reasonable amount of time and/or money on that volume of data

contravariant · on Oct 13, 2023

Eh it's not as if you could just project down the 300k time series to something lower dimensional for forecasting. The TimeGPT would have to do something similar to avoid the same problem.

Though I can't quite figure out how the predicting works exactly, they have a lot of test series but do they input all of them simultaneously?

agnosticmantis · on Oct 13, 2023

Really? And they could do LSTMs?

Even if true, they could take a random subset of size 100 out of the 300k and compare on those.
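A rough sketch of that subsampling, assuming the series can be addressed by integer index (the function name `sample_series_ids` and the fixed seed are hypothetical choices, not anything from the paper):

```python
import random

def sample_series_ids(n_series, k, seed=0):
    """Pick k distinct series indices uniformly at random, reproducibly."""
    rng = random.Random(seed)     # fixed seed so the subset can be re-created
    return sorted(rng.sample(range(n_series), k))

# 100 series out of 300k; every baseline would then be run on this same subset.
subset = sample_series_ids(300_000, 100)
```

Fixing the seed matters here: all models need to be scored on the identical subset for the comparison to mean anything.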

nojito · on Oct 13, 2023

ARIMA is very, very slow and computationally expensive.

>Even if true, they could take a random subset of size 100 out of the 300k and compare on those.

Sure...but there's a chance that ARIMA won't even finish training on that subset either.

wokwokwok · on Oct 13, 2023

It doesn’t matter.

If you write a paper and exclude comparisons to state of the art, this is what happens.

They could have done something, and didn’t.

“It’s hard so we didn’t” isn’t an excuse, it’s just a lack of rigor.

dist-epoch · on Oct 13, 2023

ARIMA is not very good for anything but highly predictable time series (of the summer-hot winter-cold kind).

carbocation · on Oct 13, 2023

If true, then beating it and looking good will be easy.

Having trained ARIMA models in my day, I will say that long training times and training cost -- compared to any deep learning model -- are not something that ever crossed my mind.

aerhardt · on Oct 13, 2023

Ok but the claim is about high training times… Wtf?

ugh123 · on Oct 13, 2023

High training times could be cost prohibitive. Currently, it's over $100mil to train GPT4 from scratch (which possibly includes other costs related to RLHF and data acquisition). Not sure how this model compares, but it's likely not cheap.

warkdarrior · on Oct 13, 2023

The claim in the paper was that ARIMA has high training times.

cozzyd · on Oct 13, 2023

Maybe they're using a pure Python implementation or something...