Thursday, 22 January 2009

C and C++ are CRAZY!

I’ve learned a lot of crazy things about C and C++ in the last few years, as far as parsing goes. The one that’s always blown me away the most is that parsing C++ is undecidable: you can’t properly parse C++ without doing template instantiation, but template instantiation is turing-complete.

But every time I think I’ve gotten to the point where C and C++ can’t faze me anymore, I find something new. Like this, from the Wikipedia article about operator precedence in C and C++:

The binding of operators in C and C++ is specified (in the corresponding Standards) by a factored language grammar, rather than a precedence table. This creates some subtle conflicts. For example, in C, the syntax for a conditional expression is:

logical-OR-expression ? expression :
conditional-expression

while in C++ it is:

logical-or-expression ? expression :
assignment-expression

Hence, the expression:

e = a ? b : c = d

is parsed differently in the two languages. In C, this expression is parsed as:

e = ((a ? b : c) = d)

which is a semantic error, since the result of a conditional-expression is not an lvalue. In C++, it is parsed as:

e = (a ? b : (c = d))

which is a valid expression.

WHAT?? I had to try this out and determine whether it was true or some crazy Wikipedia anonymous editor who was trying to drive me mad. So I wrote the simple test program:

int main() {
int e = 1, a = 2, b = 3, c = 4, d = 5;
e = a ? b : c = d;
return e;
}

…and attempted to compile it first with gcc as a .c, then with g++ as a .cc, and sure enough:

$ gcc -o /tmp/test /tmp/test.c
/tmp/test.c: In function 'main':
/tmp/test.c:4: error: lvalue required as left operand of assignment
/tmp/test.c:4: error: expected ';' before 'return'
$ g++ -o /tmp/test /tmp/test.cc
$

The world is insane. Luckily Gazelle will be the sword in your right hand, your weapon against a world that conspires against your sanity.

Incidentally, it’s more and more clear that C++ is not a superset of C. Though it’s true enough that most people using the language don’t have to care, for language implementors the differences are significant.

0 comments: