Is it common in Julia to use multiple-dispatch on 3 or more arguments, or just double-dispatch?
Julia definitely made the right choice to implement operators in terms of double-dispatch - it's straightforward to know what happens when you write `a + b`. In Python, by contrast, the addition goes through a complex set of rules to decide whether to call `a.__add__(b)` or `b.__radd__(a)` - and it can still get it wrong in some fairly simple cases, e.g. when `type(a)` and `type(b)` are sibling classes.
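To make the Python side concrete, here is a small sketch of how the lookup plays out. The classes `Base`, `A`, `B` and `C` are made up for illustration. As far as I understand the rules, the reflected method only gets priority when the right operand's type is a subclass of the left operand's type, so siblings get no special treatment and the left operand simply goes first:

```python
class Base:
    pass


class A(Base):
    def __add__(self, other):
        # Tried first whenever an A instance is the left operand.
        if isinstance(other, Base):
            return "A.__add__"
        return NotImplemented


class B(Base):  # sibling of A
    def __radd__(self, other):
        # Only reached if the left operand's __add__ returns NotImplemented.
        if isinstance(other, Base):
            return "B.__radd__"
        return NotImplemented


class C(A):  # proper subclass of A
    def __radd__(self, other):
        # Tried before A.__add__, because C subclasses A and provides
        # a reflected method that A does not have.
        return "C.__radd__"


sibling_result = A() + B()   # A.__add__ handles it; B.__radd__ is never asked
subclass_result = A() + C()  # C.__radd__ gets priority over A.__add__
```

So even if `B` knows better how to combine itself with an `A`, it never gets the chance unless `A.__add__` politely returns `NotImplemented`.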
I wonder whether Python would have been better off implementing double-dispatch natively (especially for operators) - could it get most of the elegance of Julia without the complexity of full multiple-dispatch?
It's not uncommon to dispatch on 3 or more arguments. Linear algebra specializations are one case where I tend to do this a lot. For example, when specializing structured matrix types (block banded matrices) against non-standard vectors (GPU arrays), in many cases you also need to specialize on the output vector to make the method non-ambiguous.
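A rough sketch of why the output argument ends up in the signature, written in Python with a toy multimethod registry since Python has no built-in multiple dispatch. Everything here (`Multimethod`, `Matrix`, `BandedMatrix`, `Vector`, `GPUVector`, `mul`) is hypothetical and only loosely mimics Julia's rule that a method applies when every argument matches and wins when it is the most specific in every position:

```python
class Multimethod:
    """Toy multiple dispatch: picks the most specific registered signature."""

    def __init__(self, name):
        self.name = name
        self.methods = {}  # maps a tuple of types to an implementation

    def register(self, *types):
        def decorator(func):
            self.methods[types] = func
            return func
        return decorator

    def __call__(self, *args):
        arg_types = tuple(type(a) for a in args)
        # Signatures whose every position accepts the actual argument type.
        applicable = [
            sig for sig in self.methods
            if len(sig) == len(args)
            and all(issubclass(t, s) for t, s in zip(arg_types, sig))
        ]
        # A winner must be at least as specific as every applicable
        # signature in every argument position.
        best = [
            sig for sig in applicable
            if all(all(issubclass(s, o) for s, o in zip(sig, other))
                   for other in applicable)
        ]
        if len(best) != 1:
            raise TypeError(f"{self.name}: no unique method for {arg_types}")
        return self.methods[best[0]](*args)


# Hypothetical type hierarchy standing in for the Julia array types.
class Matrix: pass
class BandedMatrix(Matrix): pass
class Vector: pass
class GPUVector(Vector): pass


mul = Multimethod("mul")  # plays the role of an in-place mul(out, A, x)


@mul.register(Vector, BandedMatrix, Vector)
def mul_banded(out, A, x):
    return "banded kernel"


@mul.register(GPUVector, Matrix, GPUVector)
def mul_gpu(out, A, x):
    return "gpu kernel"


args = (GPUVector(), BandedMatrix(), GPUVector())
try:
    mul(*args)
    ambiguous = False
except TypeError:
    # Neither method is more specific in every position.
    ambiguous = True


# Specializing on all three arguments, output included, breaks the tie.
@mul.register(GPUVector, BandedMatrix, GPUVector)
def mul_banded_gpu(out, A, x):
    return "banded gpu kernel"


resolved = mul(*args)
```

Note that registering only `(Vector, BandedMatrix, GPUVector)` would not be enough in this sketch: it is less specific than `mul_gpu` in the output position, so the call would stay ambiguous.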
The paper "Julia: Dynamism and Performance Reconciled by Design" [1] (work largely by Jan Vitek's group at Northeastern, with collaboration from Julia co-creators, myself included) has a really interesting section on multiple dispatch, comparing how different languages that support it make use of it in practice. The takeaway is that Julia has a much higher "dispatch ratio" and "degree of dispatch" than other systems: it really does lean into multiple dispatch harder than any other language. As to why this is the case: in Julia, multiple dispatch is not opt-in, it's always-on, and it has no runtime cost, so there's no reason not to use it. Anecdotally, once you get used to using multiple dispatch everywhere, going back to a language without it feels like programming in a straitjacket.
Double dispatch feels like kind of a hack, tbh, but it is easier to implement and would certainly be an improvement over Python's awkward `__add__` and `__radd__` methods.