Email List: Xaustin-review-lX

Defect in XSH System Interfaces/General Information/Error Numbers

To: yyyyyyyyyyyyyyy@xxxxxxxxxxxxx
Subject: Defect in XSH System Interfaces/General Information/Error Numbers
From: yyyyyyyy@xxxxxxxxxxxx
Date: Sat, 19 Jul 2003 22:05:32 +0100 (BST)
        Defect report from : Marc Rochkind , -

(Please direct follow-up comments to yyyyyyyyyyyyyy@xxxxxxxxxxxxx)

@ page 0 line 0 section System Interfaces/General Information/Error Numbers 
editorial {errors}

Problem:

Edition of Specification (Year): 2003

Defect code :  1. Error

In the section "System Interfaces/General Information/Error Numbers" it states 
"Some functions return an error number directly as the function value." It 
would be helpful if the term "error number" were defined in "Base 
Definitions/Definitions."

getaddrinfo and getnameinfo are specified to return a "value," not an error 
number, and this is an important distinction. This term, "value," should be in 
the Definitions as well, but clearly, before doing so, a more precise term 
should be invented. (Or not; see below.)

strerror and strerror_r are specified to take an "error number" as their first 
argument, which is fine, as the careful reader would know not to pass in a 
"value." Well, maybe not, because "strerror() shall map any value of type int 
to a message," which seems hard to implement, until one realizes that a return 
of, say, "Unknown value," completely meets the spec. The term "value" should 
not be used there, however. On all the implementations I tried, passing the 
"values" returned by getaddrinfo and getnameinfo to strerror does yield a 
message, as required, but the message is totally unrelated to the actual error. 
Surely this is not a useful implementation, and the spec should steer 
implementors away from it.
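
To illustrate the mismatch described above, here is a minimal sketch that 
takes a getaddrinfo return code and prints what both strerror and gai_strerror 
(the function actually specified for these codes) make of it. The helper names 
lookup_numeric and report_lookup_error are hypothetical, and AI_NUMERICHOST is 
used only so that a failure can be provoked without a name server. The strerror 
text is implementation-dependent, so no particular output is claimed.

```c
#define _POSIX_C_SOURCE 200809L
#include <netdb.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Hypothetical helper: returns the raw getaddrinfo() result for a host
 * string that must already be a numeric address (no DNS lookup). */
int lookup_numeric(const char *host)
{
    struct addrinfo hints;
    struct addrinfo *res = NULL;

    memset(&hints, 0, sizeof hints);
    hints.ai_family = AF_UNSPEC;
    hints.ai_socktype = SOCK_STREAM;
    hints.ai_flags = AI_NUMERICHOST;

    int rc = getaddrinfo(host, NULL, &hints, &res);
    if (rc == 0)
        freeaddrinfo(res);
    return rc;              /* 0 on success, otherwise an EAI_* code */
}

/* Hypothetical helper: prints both interpretations of the same code. */
void report_lookup_error(const char *host, int rc)
{
    /* The function specified for EAI_* codes. */
    fprintf(stderr, "%s: gai_strerror says: %s\n", host, gai_strerror(rc));
    /* Treating the code as an errno-style error number: a message comes
     * back, but it need not have anything to do with the failure. */
    fprintf(stderr, "%s: strerror says:     %s\n", host, strerror(rc));
}
```

Calling report_lookup_error("not-an-address", lookup_numeric("not-an-address")) 
should show the two messages side by side on a given implementation.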

A "value" must instead be passed to gai_strerror, which is specified to take an 
"error value." It's very clearly worded, but, still, it ought to use exactly 
the same term as getaddrinfo and getnameinfo do, not an approximation, as the 
reader could easily confuse the two terms "error number" and "error value," 
which are, of course, completely different things; the latter is the same as a 
"value."

Merely touching up the terminology and adding some definitions doesn't really 
address the problem. The real problem is that there are three functions 
(getdate is one) in the entire SUS that don't follow the "error number" scheme, 
and they could easily be changed to merge everything together.

Action:

1. Keep the EAI_* codes, but change the specification for getaddrinfo and 
getnameinfo to say that they return an error number.

2. In the section "System Interfaces/General Information/Error Numbers" and the 
specification for errno.h, make it clear that the EAI_* symbols are error 
numbers, too. (Implementations that overlap the actual values of the EAI_* 
constants with errno values will have to change.) If anyone is bothered by the 
use of "EAI_" for an errno constant, define a set of synonyms that follow the 
"E*" pattern.

3. getdate is unusual because it uses actual numbers (1 - 8), not macros for 
its error indication, which it confusingly calls a "value" and returns through 
getdate_err. I would suggest that getdate be changed to specify that errno 
shall be set to one of 8 new constants to be defined. For compatibility, we can 
leave getdate_err and the numbers 1 - 8 alone. That is, on an error, an 
implementation would have to set getdate_err and errno, and the values they are 
set to are not necessarily the same. (It's perfectly OK with me if getdate_err 
and the numbers 1 - 8 are scrapped, but others care much more than I do about 
compatibility.)
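
For reference, this is roughly what an application has to do today to report a 
getdate failure. It is only a sketch: parse_date and getdate_strerror are 
hypothetical names, and the message strings are my paraphrase of the meanings 
given for the values 1 - 8 in the getdate description, not text from any 
implementation. With the change suggested above, the private table would be 
unnecessary because strerror could be used instead.

```c
#define _XOPEN_SOURCE 700
#include <stdio.h>
#include <time.h>

/* Paraphrased meanings of getdate_err values 1..8 (index = value - 1). */
static const char *const getdate_messages[] = {
    "DATEMSK environment variable is null or undefined",
    "template file cannot be opened for reading",
    "failed to get file status information",
    "template file is not a regular file",
    "I/O error while reading the template file",
    "memory allocation failed",
    "no line in the template matches the input",
    "invalid input specification"
};

/* Hypothetical helper: maps a getdate_err value to a message. */
const char *getdate_strerror(int err)
{
    if (err < 1 || err > 8)
        return "unknown getdate error";
    return getdate_messages[err - 1];
}

/* Hypothetical helper: returns 0 on success, or the getdate_err value
 * (1..8) on failure. Note that this is not an errno-style error number. */
int parse_date(const char *text, struct tm *out)
{
    struct tm *tm = getdate(text);
    if (tm == NULL)
        return getdate_err;
    *out = *tm;             /* copy out of getdate's static storage */
    return 0;
}
```

A caller would print getdate_strerror(rc) for a nonzero rc, and must take care 
never to hand rc to strerror or perror.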

4. With the new "error numbers," implementations of strerror, strerror_r, and 
perror have a bit more work to do, but since implementations are surely 
table-driven, this should be an easy upgrade.

5. I lied about there being only three weird functions; actually we also have 
gethostbyaddr and gethostbyname, which return their own values in their own 
global, h_errno. However, these functions are obsolete, so maybe there's no 
point in fixing them. The APPLICATION USAGE and FUTURE DIRECTIONS already 
indicate that they are deprecated (not sure if SUS uses that term officially).

6. For any function that returns an error indication, where the spec currently 
doesn't say that errno shall be set, it should be modified to say so. All is OK 
where errors are defined, but there are functions like time where it is stated 
that -1 is returned on an error, but nothing is said about errno. If an 
implementation defines any errors, it should be required to set errno. An 
example of the way to do it is uname, which specifies that errno shall be set, 
even though the spec doesn't define any errors.
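
The difference between the two styles shows up directly in application code. 
In the sketch below (current_time and system_name are hypothetical wrappers, 
and the -1 "no error number available" result is my own convention), the 
uname wrapper can simply return errno, while the time wrapper has to guess 
whether errno means anything.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <sys/utsname.h>
#include <time.h>

/* Hypothetical wrapper around time(). Returns 0 on success, an error
 * number if the implementation happened to set errno, or -1 if the call
 * failed without leaving any error number behind. */
int current_time(time_t *now)
{
    errno = 0;                      /* so a stale value is not reported */
    time_t t = time(NULL);
    if (t == (time_t)-1)
        return errno != 0 ? errno : -1;
    *now = t;
    return 0;
}

/* Hypothetical wrapper around uname(). The spec says errno shall be set
 * on failure, so it can be returned without any guessing. */
int system_name(struct utsname *name)
{
    if (uname(name) == -1)
        return errno;
    return 0;
}
```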


Now, with all these changes, we have only one pool of distinct "error numbers," 
and they are delivered to an application in only one of two ways: as a return 
value, or as a value of errno.
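
As a sketch of what the two delivery mechanisms look like to an application, 
the following uses posix_memalign (which returns an error number directly) 
and open (which sets errno). The wrapper names alloc_aligned and open_readonly 
are hypothetical. Both end up yielding a number from the same pool, so one 
reporting path based on strerror serves both.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Way 1: the error number is the function's return value. */
int alloc_aligned(void **out, size_t alignment, size_t size)
{
    return posix_memalign(out, alignment, size);    /* 0 or an error number */
}

/* Way 2: the function returns -1 and the error number is in errno.
 * This wrapper converts it to the same "0 or error number" convention. */
int open_readonly(const char *path, int *fd_out)
{
    int fd = open(path, O_RDONLY);
    if (fd == -1)
        return errno;
    *fd_out = fd;
    return 0;
}

/* One reporting path works for both, because both are error numbers. */
void report(const char *what, int err)
{
    if (err != 0)
        fprintf(stderr, "%s: %s\n", what, strerror(err));
}
```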

Closing comment: One reason that a simple, easy-to-understand error-reporting 
mechanism is so critical is that in many cases testing can't be used to check 
that the application is showing (or logging, whatever) the correct error codes, 
since the circumstances under which they are generated are too difficult to 
trigger and because the programmer's system may not even implement them.
