One of the principles for the C language is that you should be able to use C on ...

flatfinger · on April 15, 2020

Different implementations are used for different purposes. If 20% of implementations are used for purposes where a feature would be useful, which of the following would be best:

1. Have 10% of implementations support the feature one way, and 10% support it in an incompatible fashion.

2. Require that all compiler writers invest the time and necessary to support the feature without regard for whether any of their customers would ever use it.

3. Specify that implementations may either support the feature or report that they don't do so, at their leisure, but that implementations which claim to support the feature must do so in the manner prescribed by the Standard.

When C89 was written, the Committee decided that rather than recognizing different categories of implementation that support different sets of features, it should treat the question of what "popular extensions" to support as a Quality of Implementation which could be better resolved by the marketplace than by the Committee.

IMHO, the Committee should recognize categories of Safely Conforming Implementation and Selectively Conforming Program such that if an SCI accepts an SCP, and the translation and execution environments satisfy all documented requirements of the SCI and SCP, the program will behave as described by the Standard, or report in Implementation-Defined fashion an inability to do so, period. Any other behavior would make an implementation non-conforming. No "translation limit" loopholes.

apotheon · on April 16, 2020

That's obviously true, but at the same time the specifics of how one chooses to set criteria for inclusion in the standard should probably keep in mind the social consequences. If the intended consequence (e.g. ensuring that implementation is easy enough and desired enough to end up broadly included for portability) and the likely consequence (e.g. reduced standardization of C capabilities in practice, with rampant relianced by developers on implementation-specific behavior to the point almost nobody writes portable code any longer) differ too much, it's time to revisit the mechanisms that get us there.

flatfinger · on April 16, 2020

What is meant by "portable code"? Should it refer only to code that should theoretically be usable on all imaginable implementations, or should it be expanded to include code which may not be accepted by all implementations, but which would have an unambiguous meaning on all implementations that accept it?

Historically, if there was some action or construct that different implementations would process in different ways that were well suited to their target platforms and purposes, but were incompatible with each other, the Standard would simply regard such an action as invoking Undefined Behavior, so as to avoid requiring that any implementations change in a way that would break existing code. This worked fine in an era where people were used to examining upon precedent to know how implementations intended for certain kinds of platforms and purposes should be expected to process certain constructs. Such an approach is becoming increasingly untenable, however.

If instead the Standard were to specify directives and say that if a program starts with directive X, implementations may either process integer overflow with precise wrapping semantics or refuse to process it altogether, if it starts with directive Y, implementations may either process it treating "long" as a 32-bit type or refuse to process it altogether, etc. this would make it much more practical to write portable programs. Not all programs would run on all implementations, but if many users of an implementation that targets a 64-bit platform need to use code that was designed around traditional microcomputer integer types, a directive demanding that "long" be 32 bits would provide a clear path for the implementation to meet its customers' needs.

apotheon · on April 17, 2020

> What is meant by "portable code"? Should it refer only to code that should theoretically be usable on all imaginable implementations, or should it be expanded to include code which may not be accepted by all implementations, but which would have an unambiguous meaning on all implementations that accept it?

That's a good question. I'm not sure I know. I could hazard a guess at what would be "best", but I'm not particularly confident in my thoughts on the matter at this time. As long as how that is handled is thoughtful, practical, consistent, and well-established, though, I think we're much more than halfway to the right answer.

> Historically, if there was some action or construct that different implementations would process in different ways that were well suited to their target platforms and purposes, but were incompatible with each other, the Standard would simply regard such an action as invoking Undefined Behavior, so as to avoid requiring that any implementations change in a way that would break existing code.

If I understand correctly, that would actually be "implementation-defined", not "undefined".

> a directive demanding that "long" be 32 bits would provide a clear path for the implementation to meet its customers' needs

There are size-specific integer types specified in the C99 standard (e.g. `uint32_t`). I use those, except in the most trivial cases (e.g. `int main()`), and limit myself to those size-specific integer types that are "guaranteed" by the standard.

flatfinger · on April 17, 2020

> If I understand correctly, that would actually be "implementation-defined", not "undefined".

That is an extremely common myth. From the point of view of the Standard, the difference between Implementation Defined behavior and Undefined Behavior is that implementations are supposed to document some kind of behavioral guarantee with regard to the former, even in cases where it would be impractical for a particular implementation to guarantee anything at all, and nothing that implementation could guarantee in those cases would be useful.

The published Rationale makes explicit an intention that Undefined Behavior, among other things, "identifies areas of conforming language extension".

> There are size-specific integer types specified in the C99 standard (e.g. `uint32_t`). I use those, except in the most trivial cases (e.g. `int main()`), and limit myself to those size-specific integer types that are "guaranteed" by the standard.

A major problem with the fixed-sized types is that their semantics are required to vary among implementations. For example, given

    int test(uint16_t a, uint16_t b, uint16_t c) { return a-b > c }

some implementations would be required to process test(1,2,3); so as to return 1, and some would be required to process it so as to return 0.

Further, if one has a piece of code which is written for a machine with particular integer types, and a compiler which targets a newer architecture but can be configured to support the old set of types, all one would need to do to port the code to the new platform would be to add a directive specifying the required integer types, with no need to rework the code to use the "fixed-sized" types whose semantics vary among implementations anyway.