Dibyendu Majumdar edited this page Jul 21, 2018 · 23 revisions

Notes

portability

Newsgroups: comp.lang.c
Path: nntp.gmd.de!xlink.net!howland.reston.ans.net!gatech!udel!news.mathworks.com!yeshua.marcam.com!uunet!allegra!alice!dmr
From: [email protected] (Dennis Ritchie <7549-15328> 0112710)
Subject: Re: Advanced Books
Message-ID: <[email protected]>
Organization: AT&T Bell Labs, Murray Hill, NJ
Date: Sat, 5 Nov 1994 05:09:57 GMT
Lines: 30

Keith Goatman quoted David Mohorn, who doubted that cool games
like DOOM and Wolfenstein could be written in ANSI C.

As has been mentioned here before, DOOM is almost
completely in ANSI C, with a tiny but time-intensive part of
the rendering algorithm in assembler.  Of course the
interfaces to the screen, the keyboard, the mouse, and the soundbox
use interfaces that are not defined by ANSI and are platform-specific.
Probably some fraction of this is written in non-portable C and some
in assembler.

This is lesson 1: do as much as possible portably,
isolate the rest.

Here's the fun part of the story.  I was looking through
old mail a couple of months ago and found a letter (sent in
March) from a person named John Carmack, who praised the lcc compiler
by Chris Fraser of our group and Dave Hanson of Princeton.
Carmack found lcc nicer than gcc ("This has saved me SO much time
(it was fun too)").  Having seen DOOM overtake most of
the leisure time in our lab over the summer, I was now
prepared to appreciate the significance of Carmack's letter.
So I sent him some mail acknowledging his thanks.

By return mail, we got a copy of the beta version of DOOM II.

This is lesson 2: portability often pays off in unexpected ways.

	Dennis Ritchie
	[email protected]

lvalues



Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP

Path: utzoo!mnetor!seismo!rutgers!princeton!allegra!alice!dmr
From: [email protected]
Newsgroups: comp.lang.c
Subject: lvalues
Message-ID: <[email protected]>
Date: Fri, 16-Jan-87 01:27:09 EST
Article-I.D.: alice.6539
Posted: Fri Jan 16 01:27:09 1987
Date-Received: Mon, 19-Jan-87 23:58:26 EST
Organization: AT&T Bell Laboratories, Murray Hill NJ
Lines: 37


The question of lvalues has arisen recently both in the context
of the silly C book review, and also in relation to the operand
of ++.  The term is old (it comes from BCPL or earlier) and
just denotes the things that can appear on the left (`l') of
an assignment.

The White Book and the current ANSI draft both waffle about whether
the term is formal or descriptive; they introduce it by, respectively,
"an expression referring to an object [which is a] manipulatable
region of storage;" and "an expression that designates an object."

It might cause less confusion if the definition were explicitly syntactic,
and only certain lvalues were permitted by the semantics to be
assigned to.  In this scheme, an lvalue (eliding precedence) is defined
as one of

	identifier
	( lvalue )
	lvalue . identifier
	* expression

Also, by applying equivalence rules,

	expression[expression]     =>    *(expression + expression)
	expression->identifier     =>    (*expression).identifier

Only some lvalues can appear on the left of `=' (e.g.: not array,
not const, not function).  Even more restrictions apply to operands
of `&': not register, not bit-field.

This suggestion doesn't change the language, but it makes it
clearer what an lvalue actually is.  Both the old and new
reference manuals make it hard for the reader to enumerate the possible
lvalues.

		Dennis Ritchie
		research!dmr

Joy of Reproduction

Message-ID: <bnews.research.314>
Newsgroups: net.lang.c
Path: utzoo!decvax!harpo!npoiv!alice!research!dmr
X-Path: utzoo!decvax!harpo!npoiv!alice!research!dmr
From: research!dmr
Date: Thu Nov  4 07:31:07 1982
Subject: Joy of reproduction
Posted: Thu Nov  4 02:30:06 1982
Received: Thu Nov  4 07:31:07 1982

Some years ago Ken Thompson broke the C preprocessor in the following
ways:
  1) When compiling login.c, it inserted code that allowed you to
  log in as anyone by supplying either the regular password or a special,
  fixed password.

  2) When compiling cpp.c, it inserted code that performed the special
  test to recognize the appropriate part of login.c and insert the
  password code.  It also inserted code to recognize the appropriate
  part of cpp.c and insert the code described in way 2).

Once the object cpp was installed, its bugs were thus self-reproducing,
while all the source code remained clean-looking.  (Things were even set
up so the funny stuff would not be inserted if cc's -P option was used.)

We actually installed this on one of the other systems at the Labs.
It lasted for several months, until someone copied the cpp binary
from another system.

Notes:
  1)  The idea was not original; we saw it in a report on Multics
  vulnerabilities. I don't know of anyone else who actually went to
  the considerable labor of producing a working example.

  2) I promise that no such thing has ever been included in any distributed
  version of Unix.  However, this took place about the time that NSA
  was first acquiring the system, and there was considerable temptation.

		Dennis Ritchie

Operator ++

Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP
Path: utzoo!watmath!clyde!burl!ulysses!allegra!alice!dutoit!dmr
From: [email protected]
Newsgroups: net.lang.c
Subject: Where did ++ come from?
Message-ID: <[email protected]>
Date: Sat, 21-Jun-86 04:22:22 EDT
Article-I.D.: dutoit.2140
Posted: Sat Jun 21 04:22:22 1986
Date-Received: Sun, 22-Jun-86 04:24:10 EDT
Lines: 36

Phaedrus@eneevax guessed that a lot of notation in C came from PDP-11
assembly language, and Chris Torek's reply did indeed drag me out
of my torpor.

Nothing in the C syntax came from the PDP-11, because all the relevant
parts were imported from B, which was in use before the PDP-11
existed.  In fact things are somewhat the other way around; the reason
the Unix PDP-11 assembler resembles B (and C) more than does DEC's, is
that I wrote the first Unix PDP-11 assembler, in B, before we had a DEC
assembler.  It was written from the machine description.  It used * and
$ rather than @ and # because the former were analogous respectively to
the B notation and to other assembly languages I knew, and (equally)
because @ and # were the kill and erase characters.

As to ++ and --:  these were Thompson inventions as far as I know,
or at least the idea of using them in both prefix and postfix form.
No doubt the autoincrement cells in the PDP-7 contributed to the idea,
but there was a significant generalization, or rather isolation of the
significant operations into ++ -- and *, as Chris pointed out.

If you haven't heard of autoincrement cells, here is the idea: certain
locations (010-017) in low memory in the PDP-7 (and also the -8, but
just one cell, probably 010), acted like ordinary memory locations,
unless indirection was applied through them.  In that case, after the
indirect reference, 1 was automatically added to them.  It was useful
for stepping through arrays, especially because these machines lacked
index registers.

* came from the version of BCPL we were using.  (Pure BCPL used "rv"
for "*" and "lv" for "&").

By the way,  B had assignment versions of all the binary operators,
including === and =!=.  Since it didn't have &&, the question of =&&
did not arise.  The ones missing from C were dropped for lack of interest.

	Dennis Ritchie

Pure and Applied C

Message-ID: <bnews.research.330>
Newsgroups: net.lang.c
Path: utzoo!decvax!harpo!npoiv!alice!research!dmr
X-Path: utzoo!decvax!harpo!npoiv!alice!research!dmr
From: research!dmr
Date: Mon Dec 27 06:19:21 1982
Subject: Pure and applied C
Posted: Mon Dec 27 01:36:35 1982
Received: Mon Dec 27 06:19:21 1982

John Levine wondered whether, in the call bar(x++) with x global,
bar would see the old or the incremented value of x.  The parameter
is clearly the old value; the question is whether the compiler might
be allowed to do the increment after the call.

Pure C gives little reassurance either way. "Otherwise, the order of
evaluation of expressions is undefined.  In particular the compiler considers
itself free to compute subexpressions in the order it believes most
efficient, even if the subexpressions involve side effects."

Applied C is a little more cautious.  Steve Johnson brought up this very
point when he was writing PCC, and I said that as far as I knew he was
entitled to delay the increment.  We agreed, though, that the result
might surprise people, and he decided not to rock the boat.
	Dennis Ritchie

Operator precedence

Message-ID: <bnews.research.310>
Newsgroups: net.lang.c
Path: utzoo!decvax!harpo!npoiv!alice!research!dmr
X-Path: utzoo!decvax!harpo!npoiv!alice!research!dmr
From: research!dmr
Date: Fri Oct 22 03:39:32 1982
Subject: Operator precedence
Posted: Fri Oct 22 01:04:10 1982
Received: Fri Oct 22 03:39:32 1982

The priorities of && || vs. == etc. came about in the following way.
Early C had no separate operators for & and && or | and ||.
(Got that?)  Instead it used the notion (inherited from B and BCPL)
of "truth-value context": where a Boolean value was expected,
after "if" and "while" and so forth, the & and | operators were interpreted
as && and || are now; in ordinary expressions, the bitwise interpretations
were used.  It worked out pretty well, but was hard to explain.
(There was the notion of "top-level operators" in a truth-value context.)

The precedence of & and | were as they are now.

Primarily at the urging of Alan Snyder, the && and || operators were
added. This successfully separated the concepts of bitwise operations and
short-circuit Boolean evaluation.  However, I had cold feet about the
precedence problems.  For example, there were lots of programs with
things like
	if (a==b & c==d) ...
In retrospect it would have been better to go ahead and change the precedence
of & to higher than ==, but it seemed safer just to split & and &&
without moving & past an existing operator. (After all, we had several
hundred kilobytes of source code, and maybe 3 installations....)
	Dennis Ritchie

Sign extension

Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP
Path: utzoo!linus!genrad!mit-eddi!mit-vax!eagle!alice!research!dmr
From: [email protected]
Newsgroups: net.lang.c
Subject: re type casting
Message-ID: <[email protected]>
Date: Thu, 9-Jun-83 23:54:56 EDT
Article-I.D.: research.353
Posted: Thu Jun  9 23:54:56 1983
Date-Received: Fri, 10-Jun-83 11:37:43 EDT
Lines: 22

decvax!betz wondered about the construction

	char c;
	...  (unsigned) c;

His example was more complicated but this is the essence.  What is supposed
to happen is that the character is promoted to int, then cast to unsigned.
In other words, the sign extension is not prevented (which is what he
wants to accomplish, as portably and cheaply as possible).

In other words the v7 and DECUS compilers are wrong.  Unfortunately
they are not wrong by accident, or at least my compiler isn't.
To my shame I put in the construction as a special hack to accomplish
essentially the same thing as Betz wanted.  (This was before the unsigned
char type went in.)  In such ways do one's past sins come back to haunt one.

If you really want portability to compilers that don't have unsigned char,
I'm afraid you'll have to use the explicit mask.  At that, it may
not be too bad.  The 11 code generator would need to generate the mask
instruction anyway, and the Vax -O optimizer is smart enough to get rid of it.

		Dennis Ritchie

foo()*0

Message-ID: <bnews.research.329>
Newsgroups: net.lang.c
Path: utzoo!decvax!harpo!eagle!mhtsa!alice!research!dmr
X-Path: utzoo!decvax!harpo!eagle!mhtsa!alice!research!dmr
From: research!dmr
Date: Tue Feb  1 02:08:40 1983
Subject: foo()*0
Posted: Mon Jan 31 02:52:38 1983
Received: Tue Feb  1 02:08:40 1983

A couple of years ago I changed my C compiler not to throw out
0*x, 0&x, and the like where  x  is an expression with side effects.
I believed then and now that anyone who depended on such things was
mad, and the recent examples have not convinced me otherwise.
However, it was much easier to change the compiler than to attempt
to argue the implausibility of each carefully crafted example.
This is known in the trade as "covering your ass."

The change occurred post-v7 so it is not visible outside Bell.

		Dennis Ritchie

enums

Message-ID: <bnews.research.315>
Newsgroups: net.lang.c
Path: utzoo!decvax!harpo!npoiv!alice!research!dmr
X-Path: utzoo!decvax!harpo!npoiv!alice!research!dmr
From: research!dmr
Date: Mon Nov  8 03:18:21 1982
Subject: enums
Posted: Mon Nov  8 02:16:38 1982
Received: Mon Nov  8 03:18:21 1982

There has been a lot of grousing about the uselessness of the enum type
in C, most of it justified under the circumstances.  The circumstances
are that all versions of PCC that I know of are buggy in their treatment
of this type.

Enums were intended to be entirely equivalent to ints; just a way, really,
of defining names for constants understood by the compiler and subject
to the normal scope rules.

There was a clear choice here: enums as utterly separate collections of atoms,
more or less as in Pascal, or as ways of naming integers.  I chose the
latter after some waffling.  Unfortunately, some of the waffle batter
got mixed in with PCC and has stayed there.

		Dennis Ritchie

Type conversion

Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP
Path: utzoo!linus!decvax!harpo!eagle!allegra!alice!research!dmr
From: [email protected]
Newsgroups: net.lang.c
Subject: bug in type conversion
Message-ID: <[email protected]>
Date: Wed, 4-Jan-84 00:32:17 EST
Article-I.D.: research.1021
Posted: Wed Jan  4 00:32:17 1984
Date-Received: Thu, 5-Jan-84 01:18:44 EST
Lines: 14

Mike O'Brien points out that in the C compilers he has available, the
expression
	i *= d;
where  i  is int and  d  is double is evaluated in fixed point, and
wonders why.  The answer: it is a compiler bug.  I fixed it in
a post-V7 version of the 11 compiler, and it is fixed in the current
System V compiler (by "current" I mean the one I tried;
I don't know what is being shipped at this instant.)

The manual is reasonably clear and unambiguous on the point, but
it's not surprising that people search for definition problems when
the compilers are unanimously wrong.

		Dennis Ritchie

Casts

Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP
Path: utzoo!watmath!clyde!burl!ulysses!allegra!alice!dutoit!dmr
From: [email protected]
Newsgroups: net.lang.c
Subject: conversion of short to unsigned int
Message-ID: <[email protected]>
Date: Wed, 27-Mar-85 00:25:47 EST
Article-I.D.: dutoit.2026
Posted: Wed Mar 27 00:25:47 1985
Date-Received: Thu, 28-Mar-85 00:16:25 EST
Lines: 31

Mike Wescott wondered why, after

	unsigned short us = -3;
	short s = -3;

the comparison

	(unsigned int)s == us

should yield true (this, evidently, on a VAX).  The reason is that
his compiler, along with those of a great many people,
has a bug.  The short  s  should indeed promote to 0xfffffffd (an int)
and then be cast to unsigned (same bits, in 2's complement), and compare
unequal with the 0xfffd stored in  us .

This is another instance where a reasonably clear, if complicated,
description in the manual appears hopelessly confused because the
compiler doesn't implement the manual.

Incidentally, Ken Turkowski's remarks about casts: "... a cast, saying
that  s  is considered unsigned rather than signed.  It [a cast] is NOT
a conversion"  are quite wrong.  Casts specify conversions, not
requests to reinterpret a variable as containing some type other than
its own.  Many casts, especially those involving unsigned, or pointers,
do indeed not actually change any bits in the value.  This fact may have misled
Ken, and probably also the compiler writer.  The way to think of casts
is to imagine assigning their operands to temporary variables with
the type specified by the cast, and then using the temporary in the larger
expression in which the cast occurs.

	Dennis Ritchie

ANSI C

Relay-Version: version B 2.10 5/3/83; site utzoo.UUCP
Path: utzoo!watmath!clyde!burl!ulysses!allegra!alice!research!dmr
From: [email protected]
Newsgroups: net.lang.c
Subject: X3J11 thoughts
Message-ID: <[email protected]>
Date: Mon, 23-Jul-84 01:10:35 EDT
Article-I.D.: research.1036
Posted: Mon Jul 23 01:10:35 1984
Date-Received: Mon, 23-Jul-84 06:32:59 EDT
Lines: 166

A few comments on issues raised in net.lang.c.  (I do read it).
First, I am in general pleased with the work of the
X3J11 ANSI committee, and am especially content that they are now the
ones who have to worry about the kinds of questions raised in this group,
and that I don't have to rewrite the manual.

Henry Spencer's summary of Larry Rosler's Usenix presentation
on the current state of things was excellent.

I have only one serious qualm about the way things are proceeding in X3J11:
it concerns argument declarations in function declarations.  This
is the only important change in the language as usually implemented.
Let's concede that this should have been done long ago; the only
interesting question is whether it is useful to do it now.
Recognizing that it was not practical to demand that all existing
programs be changed to declare function arguments, the committee
leans to allowing declarations but not requiring them.  For example:

   OLD					   NEW

extern	double sin();			extern double sin(double);
...					...
main() {				main() {
	double x = sin(1);			double x = sin(1);
}					}

In the new version the argument of sin will be coerced to double.
In the old version (still legal syntactically) there is no such
coercion, the program is just wrong.  Lint will tell you so, but
not everyone uses lint, and not everyone even has it.  The problem
is that because both programs will be accepted by most new compilers,
there is ample opportunity for confusion.  Thus, users of new
compilers will soon come to have include files that
nicely declare the arguments of sin() and other functions, and to depend
on the coercions.  Often their programs will appear to compile
properly under old compilers, but won't work.

The other problem with the proposal is that by allowing a mixture
of the old and new syntax, the compiler can't be sure whether actual
arguments were declared and coerced at all call sites; this cuts
off some useful optimizations.  For example, the float->double widening
in arguments is very costly if you use a software implementation
of IEEE floating point.  If the compiler could be absolutely sure
that each caller knew that a function argument was declared float
everywhere, it wouldn't have to convert.

The committee had three choices:

1) Leaving things alone, with a subchoice of allowing argument declarations
   for checking purposes (no coercion);
2) The proposed scheme;
3) Requiring functions to be fully declared including all arguments
   (like Pascal, A68, indeed all other modern languages in which
   the question comes up).

Choice 1 with advisory declarations is not worth the trouble.
Leaving things completely alone was quite possible.  We managed to muddle
along so far.

Choice 3 is in an obvious way the correct one.  It has some costs
in complexity (see below).  The only problem is that it is
utterly impractical because it breaks every C program in existence.

Choice 2's problem is that it is neither fish nor fowl; it trips on
the same technical complexities of variable argument lists encountered
by choice 3 and complicates the language (e.g. "int f(void)" had to be
invented to make it work).  Rather than clearly stating
that getting argument types right is the programmer's responsibility
(as with 1) or that a mandatory previous declaration will coerce
and check the actual arguments (as with 3), it leaves everyone somewhat
confused as to what will happen at any particular call.	

All in all, I have to think that 3 is best but impossible, and that 1
is marginally better than 2.  Supporters of 2 are secretly hoping to be
able to go for 3 in the future.  Unfortunately, I suspect that
instead of having either 1 or 3 forever we will have a mishmash forever.
Appendix B of the current draft says that the compiler is entitled
to warn you if "a function is called but no prototype has been supplied";
this still seems to let you say   extern double sin();  ... sin(1);
with no warning.

Variable argument lists:
One of the problems you get into when argument types are supposed
to be known in advance, and even when they're not, is what to
say about the printf family in the language description.
The old manual said that actual arguments (after some widening)
had to agree in type and number with formals.  Unstated, but implicit
in all implementations, was that somehow printf had to work.
I don't know of any way to formalize this in the context of C.  A good
side effect of having syntax to notify the compiler that a function has an
unknown number and type of argument is that everyone is on notice
that something funny is going on.  A bad effect is that programmers
will come to believe that such things are in any way portable.
They are not.  Unless someone comes up with a brilliant invention,
neither ANSI nor I will promise anything in writing.  Suppliers
of C compilers and libraries are responsible for making printf work.
Users can't expect to do it themselves reliably.  Macros like va_args
(which came from BTL, not Berkeley) improve things in practice,
and can often make variable-argument functions more exportable.
What you will not get is a description of the complete semantics.
It's just too machine dependent.

Open-ended structures:
	Someone else asked about structure declarations with
things like  char x[1];  at the end, where the intent is to have
a fixed header with variable stuff tacked on at the end.
Once again, you are not likely to find a discussion of this in a C
reference manual.  Writers of language descriptions (even me) like
to have a firm idea of what various declarations of objects mean
and what operations can be performed on the objects.  Unless
people have a good alternative to unions or PL/I's iSUB or a better
idea please don't ask for formal blessing on this.

Enums:
	Was this the place I grossed out at Usenix?  I did say "botch."
In the current proposed standard, enum types are ints and there is no
restriction on their use, except that Appendix B says that a compiler
may warn you if you assign something to an enum variable except
a value of that enum type.  This is very close to my original design.
The choice with enums (as has been reported) was between

1) making them a neater way of specifying integer constants

2) making each a unique type as in Pascal

I decided against the second choice because to make them useful would
have required larger language changes than I was prepared for
(arrays indexed by enum values, arithmetic on sparse
values, that sort of thing).  I proceeded to put enums as integers
into the PDP-11 compiler, while publicly worrying about the choice,
and saying that it might be nice if lint warned about implausible
enum assignments.  At just that instant the Sys III compiler was being
completed, the same program that took a trip to California
to become the BSD compiler.  And unfortunately, it incorporated
halfway thoughts about what enums should be.  (So did the
manual; it said that enums were a unique type but also were ints).
Let it be recorded that earlier Sys III and BSD compilers are buggy and
incorporate no useful realization of enums.

typeof:
	is a good deal.  Write the committee.

Grace Hopper:
	I think Kuenning meant Jean Sammet.

How to complain:
	If you feel strongly about something in the standard, it is
advisable to write to the committee instead of grousing here.
Some of them read this group, but I doubt if they save the submissions
away and take them to meetings.  Try a real, paper letter. Pick one of

	Larry Rosler
	Room 1337
	AT&T Bell Laboratories
	190 River Rd.
	Summit NJ 07901

	X3 Secretariat: CBEMA
	Suite 500
	311 First St NW
	Washington DC 20001

Letters to X3 should probably refer prominently to X3J11.
Suggestions should be specific and to the point.

		Dennis Ritchie

volatile



Path: utzoo!mnetor!uunet!lll-winken!lll-tis!ames!mailrus!tut.cis.ohio-state.edu!rutgers!bellcore!faline!thumper!ulysses!andante!alice!dmr
From: [email protected]
Newsgroups: comp.lang.c
Subject: volatile isn't necessary, but it's there
Message-ID: <[email protected]>
Date: 6 Apr 88 05:40:36 GMT
Organization: AT&T Bell Laboratories, Murray Hill NJ
Lines: 87

Since a good many messages have ardently defended the 'volatile'
type qualifier, it might be worth pointing out a few things.

Most important, the notion is not, so far as I know, under serious attack
in X3J11; there is every reason to believe it will be in ANSI C.
In my noalias diatribe I took a passing swipe at it (a position
I will defend below), but on the whole, I am willing to leave volatile
alone and concentrate on getting rid of noalias.

Volatile is of use only with optimizing compilers; those that don't
do some kind of data-flow analysis can ignore it.  More bluntly,
it is intended to be used in cases where your compiler will do something
other than what your program plainly asks it to do.  People seem to be
thinking of three categories of use.

1)  Accessing memory-mapped device registers

2)  Special cases involving automatic variables in a routine
    that calls setjmp

3)  Shared memory, and also interrupt routines.

The X3J11 documents (the dpANS itself and the Rationale) specifically
mention the first two, but carefully avoid talking about the third.

I have no real gripe about the first; I just think it is unnecessary.
It seems just as reasonable for purveyors of optimizing compilers for machines
with memory-mapped IO to have a compiler flag that says "please
be cautious about cacheing things in this routine."
Alternatively, they could accept a #pragma in the code.
It is true that putting volatile in the language encourages
everyone to do the job in the same way, but it is by no means
clear that, on balance, `volatile' makes C a better language.

There is a real cost in having the feature in the language.
It is just one more peculiar thing to learn and be confused
about.  If there is one thing I have learned from reading this
newsgroup, and other popular reactions to C, it is that people
have trouble understanding the language.  Features that do nothing
but request a compiler not to compile your program incorrectly
are not really what C needs.

(I'm especially amused by some of the more extreme positions
stated in favor of `volatile.'  May I be permitted to assert
that it is possible to write a successful operating system
without `volatile,' even on machines with memory-mapped IO?)

I am much more worried about the second justification.  It simply
caters to broken compilers.  Because some machines find it
hard to handle setjmp/longjmp properly, X3J11 ruled that "[following
a longjmp], [a]ll accessible objects have values as of the time
longjmp was called, except that the values of objects of automatic storage
duration that do not have volatile type and have been changed between
the setjmp invocation and longjmp call are indeterminate."
The Rationale is fairly explicit about the apologia.
This is just a botch; the committee should have insisted
that, if necessary, the compiler recognize setjmp and handle its caller
specially.  They edged around this anyway; what other use
is the insistence that setjmp be a macro?  (A footnote in the
Rationale poses this question, too.)

The third hope for volatile, namely shared memory, is in some ways the
most interesting because it nibbles at the edge of mechanisms
that will become more important in the future.  Nevertheless,
as several have pointed out, the Standard conspicuously avoids
the extensions needed to make shared memory work
(e.g. semaphores).  The dpANS even says, "what constitutes
an access to an object that has volatile-qualified type is
implementation-defined," and the sections that discuss
what volatile actually does mean are correspondingly inexplicit.
If you hope to find in the Standard that "extern volatile mutex; ++mutex"
has an iron-clad meaning, you'll be disappointed.
Thus, using volatile for shared memory may be syntactically
portable, but it is not semantically portable, because it
has no defined semantics.

To summarize, using volatile for device registers is plausible;
using it for longjmp is a rank copout; using it for shared memory
is premature.

Has anyone else noticed that a lot of the more peculiar things that X3J11
has added (volatile, and especially noalias) are there for the
benefit of compiler writers and benchmarkers, and not for the user?
(I know how it happens, though; after all, I invented 'register.')

		Dennis Ritchie
		research!dmr
		[email protected]

noalias

Path: utzoo!utgpu!water!watmath!clyde!bellcore!faline!thumper!ulysses!andante!alice!dmr
From: [email protected]
Newsgroups: comp.lang.c
Subject: Re: no noalias not negligible
Message-ID: <[email protected]>
Date: 22 May 88 08:53:50 GMT
Organization: AT&T Bell Laboratories, Murray Hill NJ
Lines: 58
Posted: Sun May 22 04:53:50 1988

Since the noalias issue has surfaced again, I offer a few thoughts on the
issue.

The comments of Hough, and the views of all the people on X3J11 who
wanted some way to express the "noalias" concept, are worth paying
attention to.  When a function with two pointer arguments is called,
the pointers are allowed to point to overlapping things, and this
inhibits otherwise plausible optimizations (vectorization especially)
in the function.  Other possibilities for pointers to overlapping
things occur, like externals vs. parameters, but in practice
Hough's example represents the problem that purveyors of vector
machines worry most about.

The problem with the noalias proposal, as embodied in the January
draft, was not at all that it attacked a nonexistent problem, but
that it did far too much.  If it had merely provided a way of
saying, "I promise that this pointer can be trusted not to point to
data accessed by another path," and if the scope of the promise
were limited reasonably, it would have been accepted without
any real quarrel; there might have been mutterings from mossbacks
like me about balancing the burden of language complexity against
the benefit, but no outcry.

Instead, the concept was illegitimately bound to the notion
of data type, and was made very dangerous; the rules as constituted
would have forced programmers into promises they didn't understand
and couldn't keep.  Some of the examples are in the screed I
posted some months ago.

I don't think there are easy answers to the problem.  As it
stands, C is hard to vectorize because of aliasing.  Fortran
is easier because of some global rules: for example, two parameters,
or a parameter and a COMMON, are not allowed to be aliased.
X3J11 was understandably unwilling to introduce Fortran-style
rules because it would represent a subtle and dangerous change
in the interpretation of existing programs.  Indeed, the Fortran
rules are subtle and to some extent dangerous even for Fortran.
Many years ago I was involved in a large system (Altran)
written in Fortran, and its most stubborn bugs owed to unsuspected
violation of Fortran's aliasing (rather, noaliasing) rules.

I think there is little doubt that the best solution for C is to
use a #pragma, and that it would have been best for X3J11 to suggest one.
Because I thought it was absolutely essential to get rid of the
January version of noalias, and no variations on it worked any better,
I made a calculated decision not to propose an alternative; no idea
seemed attractive enough to avoid further controversy and consequent
distraction from the problems with the draft's version.  Gwyn,
by the way, seems to be correct in observing that the rules for #pragma,
as written, prohibit using it to make promises about aliasing.
Thus making a formal #pragma proposal would have opened up a wrangle about
#pragmas in general.  There was not enough time to do the job properly.
If all this had happened two years ago, something could have been worked
out, but it was too late for this standard.

	Dennis Ritchie
	research!dmr
	[email protected]

volatile

Path: utzoo!attcan!uunet!lll-winken!lll-lcc!ames!nrl-cmf!ukma!gatech!rutgers!bellcore!faline!thumper!ulysses!andante!alice!dmr
From: [email protected]
Newsgroups: comp.lang.c
Subject: optimization and volatile
Message-ID: <[email protected]>
Date: 23 Dec 88 08:03:37 GMT
Organization: AT&T Bell Laboratories, Murray Hill NJ
Lines: 24

For the record, I don't recall any transformation made by the PDP-11
compiler or peephole optimizer that caused problems with device register
references. Perhaps someone else has an example?  I'll try it out.

To repeat myself from last year, I think it would
have been plausible for X3J11 to do without 'volatile,' but certainly
this would have required some words in the standard discussing the
permissible effects of varying levels of optimization, and probably
mandating compiler flags.  This is unpleasantly non-linguistic (but so
too is the difference between hosted and non-hosted environments).

It is certainly desirable that programs run fast, and optimization
techniques evidently aid this goal; it is also desirable that programs
mean what they seem to say.  Notions like device registers and
asynchronous access to data (interrupts or multitasking) introduce
serious conflicts between these goals, and the conflict needs to
be relieved somehow (even if not perfectly).

All in all, 'volatile' is not a bad compromise.  It trades additional
language complexity and necessarily fishy semantics against improved
visibility (both to the human and to the compiler) of the fishy parts.

		Dennis Ritchie
		research!dmr
		[email protected]

ANSI C

Path: utzoo!utgpu!water!watmath!clyde!bellcore!faline!thumper!ulysses!andante!alice!dmr
From: [email protected]
Newsgroups: comp.lang.c
Subject: Evaluating ANSI C
Message-ID: <[email protected]>
Date: 18 May 88 07:44:43 GMT
Organization: AT&T Bell Laboratories, Murray Hill NJ
Lines: 88
Posted: Wed May 18 03:44:43 1988

Since the subject has come up, this is a good time to record my
thoughts on ANSI C and the work that X3J11 has done.  In brief,
I think the result is commendable.  I concur in the belief of those
who watch such things that X3J11 managed to improve on what they
started with, and that this is an unusual accomplishment for a large
committee that lasts for 5 years or so.  In particular, they successfully
resisted the usually inexorable pressure to add features and options.

The committee had certain explicit goals, among them to bring the
language specification up to date with reality, since much had changed
in the 10 years since the original definition; to add a very few things
deemed necessary (function prototypes); and to specify things more
completely than K&R did.  They also wanted to standardize the library,
something that has become important not only because of the drift
in the System V vs. BSD worlds, but even more because of the
vastly increased use of C on non-Unix systems.

Perhaps as important as all the rest was a goal that was not stated
explicitly, namely to supply the stability, legitimacy and general
cachet that is possible only when an official committee mulls over
something for sufficient years, and complies with all the legalities
required to get the result accepted as an Official Standard by the
appropriate Official Bodies.  The need for this can be questioned,
but the fact that many people want it to happen, and that it tends to
occur regardless of desires, can't be denied.

I have only a few worries about the specification of the language.

First, the introduction of the new function declaration mechanism.
The new scheme is better than the old, but the change is going
to cause trouble, and the need to accommodate both ways confuses the
language specification and will confuse users.  I made my peace with
the change some years ago; it is better to do it than not.

Second, I don't think type qualifiers have been fully digested.
I never did object very strongly to volatile, even though I maintain
that it is not onerous to live without it.  The remaining qualifier,
const, suffers from tension between conflicting views of what
it is supposed to mean.  One possible view is that const things
are real constants.  If it prevailed, then one would expect that
things of type const could appear where constants are expected
(case labels and so forth).  Perhaps even the notion of pointer
to constant would become suspect.  This was not the view that the
committee eventually adopted, although (perhaps unconscious) sentiment
for it remains; instead, a more implementation-minded approach was
taken, namely that const means something that can be put in ROM
(hardware or write-protected memory).  Nevertheless, the tension
remains, and figured heavily in the committee's arguments over seemingly
unrelated things (including noalias).  Even now, questions of whether
a pointer to const can be assigned to an ordinary pointer, or
whether these two types are compatible (and when compatibility matters)
are lively issues.  It is clear that the rules on qualifiers are
to some extent artificial--they could be decided in several ways.
Good rules are not arbitrary, they are forced by the logic of the
design.

Some of the things that people are complaining about were completely
necessary, in particular the insistence that only a fixed set of
library routines is defined, that initial _ is reserved for the
implementation, and that all other names are reserved for the user.
It is just not possible to write a standard that permits the implementation
to intrude into the user's name space (by letting it define names
internal to its library that the user might use by accident), or
to give a defined, standard way for the user to replace system
routines.  The latter would constrain implementations too much.
At the same time, the consequences of such rules are misunderstood.
A program that calls "read" is not defined by the standard, but
I assure you that the supplier of your Unix system will arrange
that it works.

X3J11 did miss some opportunities.  Perhaps the most obvious lack
in the language is a scheme for variadic arrays.  However, the proposals
I have seen are awkward and don't fit smoothly, so it is not surprising
that nothing was done.

Another thing that must await the future is a genuine rethinking
of integer arithmetic, not just the fiddles that they did.

All in all, I think X3J11 did an excellent job.  When they began,
several years ago, I was somewhat apprehensive about what would result,
but also decided that I did not have the stamina to become involved in
their activities; I would have to trust their good sense.  I have never
regretted the decision, and I'm pleased with the outcome.


	Dennis Ritchie
	research!dmr
	[email protected]

noalias

Path: utzoo!mnetor!uunet!husc6!mailrus!umix!umich!mibte!gamma!ulysses!andante!alice!dmr
From: [email protected]
Newsgroups: comp.lang.c
Subject: noalias comments to X3J11
Message-ID: <[email protected]>
Date: 20 Mar 88 08:37:58 GMT
Organization: AT&T Bell Laboratories, Murray Hill NJ
Lines: 333

Reproduced below is the long essay I sent as an official comment
to X3J11.  It is in two parts; the first points out some problems
in the current definition of `const,' and the second is a diatribe
about `noalias.'

By way of introduction, the important thing about `const' is that the
current wording says, in section 3.3.4, that a pointer to a
const-qualified object may be cast to a pointer to the plain object,
but "If an attempt is made to modify the pointed-to object by means of
the converted pointer, the behavior is undefined."  Because function
prototypes tend to convert your pointers to const-qualified pointers,
difficulties arise.

In discussion with various X3J11 members, I learned that this section
is now regarded as an inadvertent error, and no one thinks that
it will last in its current form.  Nevertheless, it seemed wisest
to keep my comments in their original strong form.  The intentions
of the committee are irrelevant; only their document matters.

The second part of the essay is about noalias as such.  It seems likely
that even the intentions of the committee on this subject are confused.

Here's the jeremiad.

				Dennis Ritchie
				research!dmr
				[email protected]
----------

This is an essay on why I do not like X3J11 type qualifiers.
It is my own opinion; I am not speaking for AT&T.

     Let me begin by saying that I'm not convinced that even
the pre-December qualifiers (`const' and `volatile') carry
their weight; I suspect that what they add to the cost of
learning and using the language is not repaid in greater
expressiveness.  `Volatile,' in particular, is a frill for
esoteric applications, and much better expressed by other
means.  Its chief virtue is that nearly everyone can forget
about it.  `Const' is simultaneously more useful and more
obtrusive; you can't avoid learning about it, because of its
presence in the library interface.  Nevertheless, I don't
argue for the extirpation of qualifiers, if only because it
is too late.

     The fundamental problem is that it is not possible to
write real programs using the X3J11 definition of C.  The
committee has created an unreal language that no one can or
will actually use.  While the problems of `const' may owe to
careless drafting of the specification, `noalias' is an
altogether mistaken notion, and must not survive.

1.  The qualifiers create an inconsistent language

     A substantial fraction of the library cannot be
expressed in the proposed language.

     One of the simplest routines,

        char *strchr(const noalias char *s, int c);

can return its first parameter.  This first parameter must
be declared with `const noalias;' otherwise, it would be
illegal (by the constraints on assignment, 3.3.16.1) to pass
the address of a const or noalias object.  That is, the type
qualifiers in the prototype are not merely an optional
pleasantry of the interface; they are required, if one is to
pass some kinds of data to this or most other library routines.

     Unfortunately, there is no way in X3J11's language for
strchr to return the value it promises to, because of the
semantics of return (3.6.6.4) and casts (3.3.4).  Whether
the stripping of the const and noalias qualifiers is done by
cast inside strchr, or implicitly by its return statement,
strchr returns a pointer that (because of `const') cannot be
stored through, and (because of `noalias') cannot even be
dereferenced; by the rules, it is useless.  (Incidentally, I
think this observation was made by Tom Plum several years
ago; it's disconcerting that the inconsistency remains.)

     Although the plain words of the Standard deny it, plastering
the appropriate `non-const' cast on an expression to
silence a compiler is sometimes safe, because the most probable
implementation of `const' objects will allow them to be
read through any access path, and will diagnose attempts to
change them by generating an access violation fault at run
time.  That is, in common implementations, adding or taking
away the `const' qualifier of a pointer can never create any
bugs not implicit in the rule `do not modify a genuine const
object through any access path.'

     Nevertheless, I must emphasize that this is NOT the
rule that X3J11 has written, and that its library is inconsistent
with its language.  Someone writing an interpreter
using X3J11/88-001 is perfectly at liberty to (indeed, is
advised to) carry with each pointer a `modifiable' bit, that
(following 3.3.4) remains off when a pointer to a const-
qualified object is cast into a plain pointer.  This implementation
will prevent many of the real uses of strchr, for
example.  I'm thinking of things like

        if (p = strchr(q, '/'))
                *p = ' ';

which are common and innocuous in C, but undefined by
X3J11's language.

     A related observation is that string literals are not
of type `array of const char.'  Indeed, the Rationale (88-004
version) says, `However, string literals do not have
[this type], in order to avoid the problems of pointer type
checking, particularly with library functions....'  Should
this bald statement be considered anything other than an
admission that X3J11's rules are screwy?  It is ludicrous
that the committee introduces the `const' qualifier, and
also makes strings unwritable, yet is unable to connect the
two conceptions.

2. Noalias is an abomination

     `Noalias' is much more dangerous; the committee is
planting timebombs that are sure to explode in people's
faces.  Assigning an ordinary pointer to a pointer to a
`noalias' object is a license for the compiler to undertake
aggressive optimizations that are completely legal by the
committee's rules, but make hash of apparently safe
programs.  Again, the problem is most visible in the
library; parameters declared `noalias type *' are especially
problematical.

     In order to write such a library routine using the new
parameter declarations, it is in practice necessary to
violate 3.3.4: `A pointer to a noalias-qualified type ...
may be converted to ... the non-noalias-qualified type.  If
the pointed to object is referred to by means of the converted
pointer, the behavior is undefined.'  Thus, the problem
that occurs with `const' is now much worse; there are no
interesting and legal uses of strchr.

     How do you code a routine whose prototype specifies a
noalias pointer?  If you fail to violate 3.3.4, but instead
try to rewrite the declarations of temporary variables to
make them agree in type with parameters, it becomes hard to
be sure that the routine works.  Consider the specification
of strtok:

        char *strtok(noalias char *s1, noalias const char *s2);

It retains a static pointer to its writable, `noalias' first
argument.  Can you be sure that this routine can be made
safe under the rules?  I have studied it, and the answer is
conditionally yes, provided one accepts certain parts of the
Standard as gospel (for example that `noalias' handles will
NOT be synchronized at certain times) while ignoring other
parts.  It is a very dodgy thing.  For other routines, it is
certain that complete rewriting is necessary: qsort, for
example, is full of pointers that rove the argument array
and change it here and there.  If these local pointers are
qualified with `noalias,' they may all be pointing to different
virtual copies of parts of the array; in any event,
the argument itself may have a virtual object that might be
completely untouched by the attempt to sort it.

     The `noalias' rules have the assignment and cast restrictions
backwards.  Assigning a plain pointer to a const-
qualified pointer (pc = p) is well-defined by the rules and
is safe, in that it restricts what you can do with pc. The
other way around (p = pc) is forbidden, presumably because
it creates a writable access path to an unwritable object.
With `noalias,' the rules are the same (pna = p is OK,
p = pna is forbidden), but the realistic safety requirements are
completely different.  Both of these assignments are equally
suspicious, in that both create two access paths to an
object, one manifestation of which might be virtual.

     Here is another way of observing the asymmetry: the
presence of `const type *' in a parameter list is a useful
piece of interface information, but `noalias type *' most
assuredly is not.  Given the declaration


        memcpy(noalias void *s1, const noalias void *s2, size_t n);

what information can one glean from it?  Some committee
members apparently believe that it conveys either to the
reader or to the compiler that the routine is safe, provided
that the strings do not overlap.  They are mistaken.
Perhaps the committee's intent is not reflected in the
current words of the Standard, but I can find nothing there
that justifies their belief.  The rules (page 65, lines 19-20)
specify `all objects accessible by these [noalias]
lvalues,' which is the entirety of both array arguments.

     More generally, suppose I see a prototype

        char *magicfunction(noalias char *, noalias char *);

Is there anything at all I can conclude about the requirements
of magicfunction? Is there anything at all I can conclude
about things it promises to do or not to do?  All I
learn from the Rationale (page 52) is that such a routine
enjoins me from letting the arguments overlap, but this is
at variance with the Standard, which gives a stronger
injunction.

     Within the function itself, things are equally bad.  A
`const type *' parameter, though it presents problems for
strchr and other routines, does usefully constrain the function:
it's not allowed to store through the pointer.  However,
within a function with a `noalias type *' parameter,
nothing is gained except bizarre restrictions: it can't cast
the parameter to a plain pointer, and it can't assign the
parameter to another noalias pointer without creating
unwanted handles and potential virtual objects.  The interface
MUST say noalias, or at any rate DOES say noalias, so
the author of the routine has all the grotesque inventions
of 3.5.3 (handles, virtual objects) rubbed in his face, like
it or not.

     The utter wrongness of `noalias' is that the information
it seeks to convey is not a property of an object at
all.  `Const,' for all its technical faults, is at least a
genuine property of objects; `noalias' is not, and the
committee's confused attempt to improve optimization by pinning
a new qualifier on objects spoils the language.
`Noalias' is a bogus invention that is not necessary, and
not in any case sufficient for its apparent purpose.

     Earlier languages flirted with gizmos intended to help
optimization, and generally abandoned them.  The original
Fortran, for example, had a FREQUENCY statement that didn't
help much, confused people, and was dropped.  PL/1 had
`normal/abnormal' and `uses/sets' attributes that suffered a
similar fate.  Today, these are generally looked on as
adolescent experiments.

     On the other hand, the insufficiency of `noalias' is
suggested by Cray's Fortran compiler, which has 20 separate
keywords that control various details of optimization.  They
are specified by an equivalent of #pragma, and thus, despite
their oddness, can be ignored when trying to understand the
meaning of a program.

     Perhaps there is some reason to provide a mechanism for
asserting, in a particular patch of code, that the compiler
is free to make optimistic assumptions about the kinds of
aliasing that can occur.  I don't know any acceptable way of
changing the language specification to express the possibility
of this kind of optimization, and I don't know how much
performance improvement is likely to result.  I would
encourage compiler-writers to experiment with extensions, by
#pragma or otherwise, to see what ideas and improvements
they can come up with, but I am certain that nothing resembling
the noalias proposal should be in the Standard.

3.  The cost of inconsistency

     K&R C has one important internal contradiction
(variadic functions are forbidden, yet printf exists) and
one important divergence between rule and reality (common
vs. ref/def external data definitions).  These contradictions
have been an embarrassment to me throughout the years,
and resolving them was high on X3J11's agenda.  X3J11 did
manage to come up with an adequate, if awkward, solution to
the first problem.  Their solution to the second was the
same as mine (make a rule, then issue a blanket license to
violate it).

     I'm aware that there are distinctions to be made
between `conforming' and `strictly conforming' programs.
Although the X3J11 rules for qualifiers are inconsistent,
and therefore most nominally X3J11 compilers will ignore, or
only warn about, casts and assignments that X3J11 says are
undefined, people will somehow survive.  C has, after all,
survived the vararg and the extern problems.

     Nevertheless, I advise strongly against sanctifying a
language specification that no one can possibly embody in a
useful compiler.  This advice is based on bitter experience.

4.  What to do?

     Noalias must go.  This is non-negotiable.

     It must not be reworded, reformulated or reinvented.
The draft's description is badly flawed, but that is not the
problem.  The concept is wrong from start to finish.  It
negates every brave promise X3J11 ever made about codifying
existing practices, preserving the existing body of code,
and keeping (dare I say it?) `the spirit of C.'

     Const has two virtues: putting things in read-only
memory, and expressing interface restrictions.  For example,
saying

        char *strchr(const char *s, int c);

is a reasonable way of expressing that the routine cannot
change the object referred to by its first argument.  I
think that minor changes in wording preserve the virtues,
yet eliminate the contradictions in the current scheme.

1)   Reword page 47, lines 3-5 of 3.3.4 (Cast operators), to
     remove the undefinedness of modifying pointed-to
     objects, or remove these lines altogether (since casting
     non-qualified to qualified isn't discussed explicitly
     either.)

2)   Rewrite the constraint on page 54, lines 14-15, to say
     that pointers may be assigned without taking qualifiers
     into account.

3)   Preserve all current constraints against modifying
     non-modifiable lvalues, that is things of manifestly
     const-qualified type.

4)   String literals have type `const char []'.

5)   Add a constraint (or discussion or example) to assignment
     that makes clear the illegality of assigning to an
     object whose actual type is const-qualified, no matter
     what access path is used.  There is a manifest constraint
     that is easy to check (left side is not const-
     qualified), but also a non-checkable constraint (left
     side is not secretly const-qualified).  The effect
     should be that converting between pointers to const-
     qualified and plain objects is legal and well-defined;
     avoiding assignment through pointers that derive ultimately
     from `const' objects is the programmer's responsibility.


     These rules give up a certain amount of checking, but
they save the consistency of the language.