At the start of this thread I pointed out some serious problems with
the change proposed in XCU ERN 10 and expressed the opinion that
mandating "accurate" c_nlink values (in relation to the files in the
archive) as it does would be "madness". David Korn responded with "I
completely agree with Geoff Clare that mandating accurate c_nlink
values in the archive is madness." None of the other contributors to
the thread disagreed. (Note that when Paul Eggert said "if the
filesystem is not being changed while pax is running, then the link
counts should be accurate" he was not using "accurate" in the same
sense as David and me - he was referring to use of the file's st_nlink
value.)

I assumed that this would mean the end of that proposed change. I think
the other participants in the thread made the same assumption, as it
was not mentioned again - the discussion concentrated instead on finding
a viable alternative.
Now, against all expectation, it has been reaffirmed. The latest
minutes say:

    ERN 10 A/M
    This was revisited and additional rationale added as to why this
    approach is being taken.
    The change in the wording for c_nlink for TC/XXX/XXX was chosen to allow
    old archivers to process archives created to conform to this standard.
    Other proposals to date would (a) either break compatibility
    or (b) require pax to maintain extra state for files that it
    does not need to do so.

Of these reasons one is lame, and the other is plain wrong.
(a) I don't see why ensuring old utilities can read new cpio archives
correctly is of great concern. Under the same circumstances those
old utilities do not themselves create archives that they can read
correctly. In any case the alternatives proposed to date would only
require a very slight change in order to ensure compatibility.
(Just require c_nlink to be set to its maximum value, or maybe
require it to be "greater than or equal to the number of links in the
archive referencing the file" if vendors really want to be allowed
the option of calculating accurate c_nlink values.)
(b) If the need to maintain extra state is undesirable, this is an
argument *against* the current aardvark, not for it. Calculating
accurate c_nlink values means that when writing an archive pax
must store *every* pathname that is to be placed on the archive,
together with either the individual dev/ino values for each pathname
or a separate count of unique dev/ino values. (The latter would
only be viable if pax reads the files from a filesystem snapshot.)
On reading, the need to maintain state would be the same as in old
utilities.
Contrast this with an alternative that allows c_nlink to be set to
its maximum value, where no state needs to be maintained on writing,
and the worst case on reading is to have to remember *one* pathname
for each dev/ino. Thus if there are any hard links in the archive,
the current aardvark requires maintaining more state on writing than
the alternative requires on reading, and more state on reading than
the alternative requires on writing.
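
The comparison in (b) can be illustrated with a minimal Python sketch. It is only a model, not how any pax implementation works: the function names, the (pathname, dev, ino) tuples, and the "extract"/"link" actions are all hypothetical, and the maximum value assumes c_nlink is a six-digit octal header field.

```python
from collections import Counter

# Assumption: c_nlink is a six-digit octal header field, so its maximum is 0o777777.
C_NLINK_MAX = 0o777777


def write_accurate(entries):
    """Writer under the current proposal: c_nlink counts links in the archive.

    entries: iterable of (pathname, dev, ino) tuples (hypothetical model).
    Returns (headers, number of entries that had to be buffered).
    """
    buffered = list(entries)  # no header can be emitted until every pathname is known
    links = Counter((dev, ino) for _, dev, ino in buffered)
    headers = [(path, dev, ino, links[(dev, ino)]) for path, dev, ino in buffered]
    return headers, len(buffered)


def write_max(entries):
    """Writer under the alternative: stream headers, keeping no state."""
    for path, dev, ino in entries:
        yield (path, dev, ino, C_NLINK_MAX)


def read_alternative(headers):
    """Reader under the alternative: remember one pathname per dev/ino."""
    first_path = {}  # (dev, ino) -> first pathname seen for that file
    actions = []
    for path, dev, ino, _nlink in headers:
        key = (dev, ino)
        if key in first_path:
            actions.append(("link", first_path[key], path))
        else:
            first_path[key] = path
            actions.append(("extract", path))
    return actions, len(first_path)
```

In this model write_accurate has to hold every entry before it can emit the first header, write_max holds nothing, and read_alternative holds at most one pathname per distinct dev/ino pair.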

Now, after reading the above and having read and understood the
serious problems I described at the beginning of this thread, does
anyone still support the current aardvark? If so, please say so
here - don't wait until the next teleconference.
--
Geoff Clare <yyyyyyy@xxxxxxxxxxxxx>
The Open Group, Apex Plaza, Forbury Road, Reading, RG1 1AX, England