>That will still require the compiler to serialize the three registers to the stack, to be able to pass the pointer to the structure to the callee.
Why can't it simply pass a pointer to the struct (it's probably already on the stack) without rewriting the struct to the stack? Isn't that what a reference is?
But it often isn't on the stack. This is a vector type, so it is often modified and used in math operations. Each vector field is probably sitting in some register because it was just used to calculate something, so it has to be stored back to the stack before the reference points at valid data.
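To make that concrete, here is a rough sketch. The names `Vec3`, `length_sq_ref` and `scaled_length_sq_ref` are made up for illustration, and the `NOINLINE` macro assumes GCC or Clang. What the compiler actually emits depends on target, ABI and optimization level, so treat the comments as the typical case rather than a guarantee.

```cpp
struct Vec3 { float x, y, z; };

// Assumption: GCC/Clang attribute syntax; on other compilers this expands to nothing.
#if defined(__GNUC__) || defined(__clang__)
#define NOINLINE __attribute__((noinline))
#else
#define NOINLINE
#endif

// Out-of-line callee taking a reference. At the ABI level a reference is
// passed as an address, so the caller needs the struct to exist in memory.
NOINLINE float length_sq_ref(const Vec3& v) {
    return v.x * v.x + v.y * v.y + v.z * v.z;
}

float scaled_length_sq_ref(float a, float b, float c, float s) {
    // The three products are computed in registers and would normally stay there.
    Vec3 v{a * s, b * s, c * s};
    // Because the callee is not inlined, the fields typically have to be
    // written to a stack slot first so that the address of v is valid.
    return length_sq_ref(v);
}
```

Compiling something like this with optimizations on and reading the assembly is the easiest way to see whether the stores show up on your target.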
If the current values of the struct's fields are in registers, there are two possibilities: either the calling function and the struct are so small, and so well optimized, that no memory was ever allocated for the struct, or there is an allocation and it is simply dirty and out of date. That means either update it and call, or spill and call.
You want to call a function that is not expecting its arguments to be in registers, and you don't have unlimited registers on this hardware at this time, so I don't understand all the hand-wringing about either option. I guess what I'm saying is that this is all being treated like "because we assume optimization and we know how optimization works, we're entitled to have what's important in registers all the time so things will go faster, so this must be a bug and we have to fix it."
The actual solution is to inline the callee and rely on the compiler, switch to asm and guarantee it by hand, or create a new language whose register-calling or data-flow semantics are different from what you have now. The conversation that's taking place here sounds to me like relying on undefined behavior, something we used to do because we knew we could rely on it, but you can't any more.
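For what it's worth, the "inline it and rely on the compiler" option looks roughly like the sketch below. `Vec3`, `dot`, `dot_by_value` and `project` are hypothetical names. Note that `inline` is only a hint, so whether the call actually disappears is up to the optimizer. The by-value variant is a separate alternative that was not proposed above: on some ABIs a small struct of floats can be passed in registers, on others it cannot.

```cpp
struct Vec3 { float x, y, z; };

// Definition is visible to the caller and marked inline. If the optimizer
// does inline it, no address of a Vec3 ever needs to be materialized.
inline float dot(const Vec3& a, const Vec3& b) {
    return a.x * b.x + a.y * b.y + a.z * b.z;
}

// By-value alternative: how the arguments are passed is decided by the
// platform ABI, not by the C++ source.
float dot_by_value(Vec3 a, Vec3 b) {
    return a.x * b.x + a.y * b.y + a.z * b.z;
}

float project(float ax, float ay, float az, float s) {
    Vec3 a{ax * s, ay * s, az * s};  // computed in registers
    Vec3 b{1.0f, 0.0f, 0.0f};
    return dot(a, b);                // candidate for inlining; no spill required if inlined
}
```

Neither form gives a guarantee at the language level, which is the point being made above: if you need one, it has to come from asm or from a language with different calling semantics.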