The Axiom of Choice says that I can have an infinite number of sets and "choose" a value from each of them. A choice from a set is basically what a bit is (give or take the question of probabilistic weighting of the values in the set, which is important in the finite case but can't save you here, so it's a distraction). Invoking the Axiom of Choice, which is only interesting in the infinite case, injects an infinite number of bits into the proof that uses it, since it involves an infinite number of choices.
At best the invocation has an acceptance procedure, but it does not come with a procedure for making the choices. Probably in most cases it doesn't even have an acceptance procedure, which is to say, if we were presented with the choices (or a finite subset of them, since we can't handle the infinite set) we couldn't even tell whether they are "correct". So it is not just like the primes case, which as I mentioned is distinctly finite in bits even if the set is infinite in cardinality. The Axiom of Choice is an infinite (even uncountably infinite or more) number of selections, with no way given to represent them in any finite manner. In many cases, if not most or all, the Axiom of Choice is invoked when we can't even define the sets we are choosing from.
So, from my computer science, information theory point of view, proofs involving the Axiom of Choice proceed along nicely, finite statement here, finite statement there, finite logic done on these finite statements, everything's all nice and finite and comprehensible, and then BANG, an infinite number of unknowable bits are dumped on me, and then we proceed along like nothing weird has happened and do some more finite logic on finite values based on this infinite pile of unknowable bits. It is what it is, but what it is is strange. I mean, it's not news to mathematicians that it can be a bit strange, but I think this "infinite number of bits" view is one of the better ways to see how strange it is.
Another way to look at it: the AoC is generally invoked without even a theoretical decision procedure for making the choices. It just has a demonstration that such a choice exists. So it amounts to "and then god chooses a value in these sets for us in accordance with our criteria". Lowercase on purpose. A more conventional term in computer science is "oracle". Every invocation of the AoC is basically adding a mathematical entity of such incredible power that we can't determine anything about how it does its thing, or whether it is correct. Again: legal. But it is something I think is taken more casually than I would like, and I think doing mathematics without scattering infinitely knowledgeable (if only in very constrained ways) and powerful oracles around everywhere, with the same casualness as one might use a disjunction or an implication, is an underrespected discipline within mathematics in general.
An equivalent way to phrase the Axiom of Choice is: there exists a function that takes in any non-empty set, and returns an element of that set.
IE, this exists: choose :: NonEmptySet a -> a
That's it. All of this stuff about "take a choice from a countable number (IE, list) of sets" is fluff; once you have choose, you can make a function that takes in an Iterator (NonEmptySet a) and produces a (NonEmptySet a), such that the output contains one element from each input set.
oneFromEachSet :: Iterator (NonEmptySet a) -> NonEmptySet a
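In a finite toy model this construction is just mapping choose over the stream. A hedged sketch, with several stand-ins: Data.List.NonEmpty plays the role of both NonEmptySet and a (countable) Iterator, the result is a non-empty list of representatives rather than a true set, and the concrete choose here is an assumption of convenience; the axiom only asserts that some such function exists, it does not construct one.

```haskell
import qualified Data.List.NonEmpty as NE
import Data.List.NonEmpty (NonEmpty(..))

-- Toy model: NonEmpty stands in for NonEmptySet, and a NonEmpty of
-- NonEmptys stands in for Iterator (NonEmptySet a).

-- A concrete stand-in for "choose": take the first listed element.
-- This is writable only because the model is concrete and ordered;
-- the axiom gives no such recipe in general.
choose :: NonEmpty a -> a
choose = NE.head

-- Collect one representative from each set by applying choose down
-- the stream. Laziness means this also works on an infinite stream,
-- which is exactly why it only ever reaches countably many sets.
oneFromEachSet :: NonEmpty (NonEmpty a) -> NonEmpty a
oneFromEachSet = NE.map choose
```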
From a programmatic viewpoint that would be describing the axiom of countable choice, since such recursion can only occur a countably infinite number of times.
Even if we ignore that by allowing for syntax where we fan out uncountably many times with each recursion, the construction as shown requires the collection of sets to be orderable, to have a "first", unless we are implementing first in terms of choose itself. (Alternatively, one could argue in terms of the Axiom of Choice implying the well-ordering theorem, allowing us to pick a minimum via some such ordering (although you may need the choice function to pick one such ordering), but this won't fully help here anyway.)
But even if you do that (implement the iterator's first in terms of choose), it is not immediately obvious that you have really covered everything some full statements of the axiom of choice do. The axiom of choice is generally phrased in terms of collections or families of sets, not sets of sets, but obviously the choose function takes in a set, not a more general family or collection.
It is easy enough to dismiss certain non-set collections of sets, like multisets of sets, since "choose", being a function, will return the same value every time for a repeated input, and in the above formulation we are returning a set of results, so such duplicates won't matter. We could even handwave away a version that returns a multiset without too much trouble.
But the more expansive versions of the axiom of choice allow us to do something like "choose one element from each set in the class of non-empty sets." This is a real problem, as the "class of non-empty sets" is pretty well known not to be a set, and thus trying to implement the iterator's first in terms of it would not even be possible in that scenario.
"First" in this case was just getting the first element of the iterator, it wasn't related to choosing a value from the set.
The Axiom of Choice is equivalent to the following:
1) "For any set A, there exists a function from non-empty elements of the power set of A to elements of A."
Which is equivalent to:
2) "There exists a function that takes a non-empty set as input, and returns an element of that set"
2 -> 1, since we can take the function 2 defines, and use it to define the function from 1.
1 -> 2 since we can define 2 as such:
For a set A, take the power set of A. Then take the function that 1 claims exists. Use that function to get an element of A, since A is an element of its power set.
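Spelled out as a sketch (here \(\mathcal{P}(A)\) denotes the power set of \(A\), and \(f_A\) and \(g\) are the hypothetical choice functions from formulations 1 and 2; nothing below constructs them, it only records what they are asserted to satisfy):

```latex
% Formulation 1: for each set A, a choice function on its non-empty subsets.
\forall A \;\exists f_A : \mathcal{P}(A) \setminus \{\varnothing\} \to A
  \quad\text{such that}\quad f_A(S) \in S \text{ for all non-empty } S \subseteq A

% Formulation 2: a single function picking an element of any non-empty set.
\exists g \quad\text{such that}\quad g(S) \in S \text{ for every non-empty set } S

% 2 -> 1: restrict g to the non-empty subsets of A:
f_A = g|_{\mathcal{P}(A) \setminus \{\varnothing\}}

% 1 -> 2: A is an element of its own power set, so define
g(A) = f_A(A)
```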
But back to your core point:
But I agree that Iterator is a bad fit for the axiom of choice, since we may have uncountably many sets to pick from.
Here's a way to phrase it programmatically that allows for uncountable sets-of-sets:
The axiom of choice claims that given a set of non-empty sets, you can construct a new set which contains an element from each of the sets.
pickOneFromEach :: Set (NonEmptySet a) -> NonEmptySet a
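A hedged finite sketch of this phrasing, with the same caveats as before: Data.Set stands in for a mathematical set, Data.List.NonEmpty for NonEmptySet, the concrete choose is an assumption the axiom does not actually supply, and for simplicity the result is a possibly-empty Set (it is non-empty whenever the input collection is).

```haskell
import qualified Data.List.NonEmpty as NE
import Data.List.NonEmpty (NonEmpty(..))
import qualified Data.Set as Set
import Data.Set (Set)

-- Stand-in choice function: take the first listed element. Only
-- writable because this toy model is concrete and finite.
choose :: NonEmpty a -> a
choose = NE.head

-- One representative from each set in the collection. Because the
-- result is a Set, any duplicate picks collapse, matching the point
-- above that repeated inputs to choose don't matter.
pickOneFromEach :: Ord a => Set (NonEmpty a) -> Set a
pickOneFromEach = Set.map choose
```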