Secondly, I remember watching a video from Michael Penn a few months ago about something called Padé approximation: "Pade Approximation – unfortunately missed in most Calculus courses". It was a subject worth exploring.
The best explanation would seem to be a rebalancing to increase efficiency: the M3 Pro had basically the same performance as the M2 Pro but at much greater energy efficiency. Now the M4 Pro has increased the performance again (I assume at a similar power draw to the M3 Pro).
I don’t know who it was, but I heard some YouTuber suggest it might have been a commercial decision to further distinguish between the Pro and Max/Ultra versions of their chips. Which... sucks, but makes sense.
Another shout out to Emmy Noether's (First) theorem.
Informally stated, if a system has a continuous symmetry property, then there are corresponding quantities whose values are conserved.
As an illustration, if a physical system behaves the same regardless of how it is oriented in space, angular momentum of the system must be conserved, as a consequence of its laws of motion.
As another illustration, if a physical process has the same outcomes regardless of place or time, then linear momentum and energy, respectively, must be conserved.
The theorem is regarded as the foundation of particle physics.
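For anyone who wants to see the rotation/angular-momentum case numerically, here is a minimal sketch. It is a toy setup of my own, not anything from the theorem's statement: a unit-mass particle in a 2D harmonic potential V = (kx*x^2 + ky*y^2)/2, integrated with velocity Verlet. The function name `simulate`, the spring constants and the initial conditions are all assumptions chosen for illustration. With kx == ky the potential looks the same in every direction and the angular momentum stays put; with kx != ky the symmetry is broken and it drifts.

```python
def simulate(kx, ky, steps=10_000, dt=1e-3):
    """Integrate a unit-mass particle in V = (kx*x**2 + ky*y**2) / 2.

    Uses velocity Verlet and returns the largest deviation of the
    angular momentum L = x*vy - y*vx from its initial value.
    """
    # Hypothetical initial conditions, chosen only for illustration.
    x, y, vx, vy = 1.0, 0.0, 0.0, 0.5
    l0 = x * vy - y * vx
    max_dev = 0.0
    for _ in range(steps):
        # Half kick: the force is minus the gradient of V.
        vx -= 0.5 * dt * kx * x
        vy -= 0.5 * dt * ky * y
        # Drift with the updated velocity.
        x += dt * vx
        y += dt * vy
        # Second half kick at the new position.
        vx -= 0.5 * dt * kx * x
        vy -= 0.5 * dt * ky * y
        max_dev = max(max_dev, abs(x * vy - y * vx - l0))
    return max_dev


# Rotationally symmetric potential: angular momentum should be conserved.
drift_symmetric = simulate(kx=1.0, ky=1.0)
# Symmetry broken (stiffer along y): angular momentum should change.
drift_asymmetric = simulate(kx=1.0, ky=4.0)

print(f"symmetric potential,  max |L - L0| = {drift_symmetric:.3e}")
print(f"asymmetric potential, max |L - L0| = {drift_asymmetric:.3e}")
```

In the symmetric case the force always points along the position vector, so neither the kicks nor the drift change x*vy - y*vx, and the deviation should sit at floating-point noise. In the asymmetric case there is a net torque proportional to (kx - ky)*x*y, so the deviation is of order one.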
The big problem I've had historically with non-native CUDA wrappers is that they always seem to omit or break some feature that is critical for my application, and the debugging pain and the implementation or bugfix work needed to get around this exceed the effort "savings" of a high-level interface by an order of magnitude or three.