The point was that the amount of optimization work that went into making the circuit for factoring 21 only ~100x larger than the one for factoring 15 is not feasible for larger numbers. So, the general rule should be that you need ~500x more gates for a similar increase in the number to be factored, not just ~100x.
The more plausible amount of optimization is less optimization. Or, more accurately, the benefits of optimization at large sizes is expected to be less beneficial than it was for the N=21 circuit.
I think the idea is the minimal implementation will be unstable and unreliable. I don’t know the details, but there’s much work and thought on quantum error correcting qubits - where you hook up N qubits in a clever way to function as one very stable qubit. Terms such as “decoherence time” make appearances. You can imagine this quickly expands into an awful lot of qubits.
I don't get this part
If author already produced "115x", how can optimizations make it worse?