While you could run larger models with 128GB, I feel like 64GB is just about the right amount to run models at a reasonable speed with an M Max (I have an M3 Max 64GB).
That's kinda why I'd like a 64GiB M4 Air. 64 is the magic number for local LLMs of any reasonable capability. For example: deepseek-coder can write unit tests, gemma3 can summarize PDFs pretty well, etc. With only 32GiB you can't do much with LLMs, just run baby ones.
The base M4 is going to be too weak on the GPU side, and even an M4 Pro probably won't get you far. If you go with a base M4, 16GB or 32GB is fine. An M1/M2/M3/M4 Max is probably the minimum you need to run quantized 72B models, which is where 64GB becomes necessary.
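To make the 64GB claim concrete, here's a back-of-the-envelope calc. This is a rough sketch, not a benchmark: the 4-bit quantization and the ~20% overhead factor for KV cache/activations are my assumptions.

```python
# Rough memory estimate for a quantized LLM (all figures approximate).
def model_memory_gb(params_billion: float, bits_per_weight: float,
                    overhead: float = 1.2) -> float:
    """Weight footprint in GB, plus a fudge factor for KV cache, activations, etc."""
    weight_gb = params_billion * 1e9 * bits_per_weight / 8 / 1e9
    return weight_gb * overhead

# A 72B model at 4-bit: ~36 GB of weights, ~43 GB with overhead.
print(f"{model_memory_gb(72, 4):.0f} GB")  # -> 43 GB: fits in 64GB, not 32GB
# The same model at 8-bit: ~86 GB, which pushes you toward 128GB.
print(f"{model_memory_gb(72, 8):.0f} GB")  # -> 86 GB
```

Also worth remembering that macOS only exposes part of unified memory to the GPU by default, so the real headroom on a 64GB machine is a bit tighter than the raw numbers suggest.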