We're mostly doing ETL on large datasets, so the code needs to parallelize well, but beyond that performance isn't really a big concern. We use ML in research, but no models in production, because the added maintenance burden and loss of transparency generally outweigh the benefits in our use case.

In jobs that were heavy on ML, I would use high-performance tools for the models (imperative code, numeric computing packages, etc.) and functional code for the ETL, which worked pretty well. There's no need to be dogmatic about it: a 70% pure codebase is still generally easier to reason about than a 20% pure codebase.
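
To make the "functional code for the ETL" part concrete, here is a rough sketch in Python. Everything in it is made up for illustration (the `Order` record, the `parse` and `apply_discount` steps, the 10% discount), and it is not code from any of those jobs. The point is only that when each step is a pure function over immutable records, fanning the work out over a pool is a one-liner.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass


@dataclass(frozen=True)
class Order:
    # Hypothetical record type, for illustration only.
    customer: str
    amount_cents: int


def parse(line: str) -> Order:
    # Pure: the output depends only on the input line.
    customer, amount = line.split(",")
    return Order(customer.strip(), round(float(amount) * 100))


def apply_discount(order: Order) -> Order:
    # Pure: returns a new record instead of mutating the old one.
    # The 10% discount is an arbitrary example rule.
    return Order(order.customer, order.amount_cents * 90 // 100)


def transform(line: str) -> Order:
    # The whole per-record pipeline is a composition of pure steps.
    return apply_discount(parse(line))


def run_etl(lines, max_workers=4):
    # Because transform has no side effects, records can be handled by
    # any worker in any order; map() still yields results in input order.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(transform, lines))
```

A thread pool keeps the sketch short; for CPU-bound work you would presumably swap in a process pool or a cluster framework, and the pure functions would not need to change.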