Linus vs C++, again (realworldtech.com)
275 points by hernan7 on June 10, 2010 | 199 comments


As I get older and grumpier I tend to appreciate Linus' point of view more and more. It's easy to get swept up in arguments about the expressiveness of a language, particularly in small examples. However, I think in the long run it's better to have very explicit code. The less jumping around and inference I have to do to figure out what a block of code does, the more likely it is that I understand it and that I can quickly verify that it does what it needs to do. If that makes code a little more verbose, I think it's usually still worth it.


   write(fd, buf, size);
Can you list all the caveats, possible side effects, and reasons for failure of this simple C function call? Hint: remember, fd can be a file, as well as an NFS-mounted file, pipe, network or local socket, FIFO, device, etc. Hint 2: I doubt anyone can give a comprehensive description of the possible consequences of this call.

Bottom line being, C can be incredibly hard to understand, or C++ can be clean, succinct and straightforward. Personally, I'm for the second, although I know probably 99.9% of all existing C++ code is just awful.


In addition to the other already-good answers, I want to add the following answer, which I believe is even closer to the "real" answer Linus had in mind.

Of course you cannot look at the function and immediately know all consequences. The real point is that there is a procedure you can follow which will let you determine the answer.

1. Locate the "write" function. There will only be one, because C has no namespaces.

2. Read the "write" function.

3. Repeat recursively as needed (including for macros).

For C++, the equivalent would be something like fd->write(buf), and the procedure is:

1. Determine the type of 'fd'.

2. Determine what subtypes you could have there and which you might actually have in hand. (Or is the method not virtual, in which case it doesn't matter?)

3. Determine what the write method does. In order to do so, you need intimate knowledge of any operator overloading on every value used in the write method.

4. Figure out what "buf" is and whether it magically overloads other operators.

write(fd, buf, size) resolves to one function with three arguments that themselves can't be that magical, and usually one basic approach to the question of memory management. fd->write(buf) involves classes, inheritance, potentially overloads, interfaces that 'buf' may correspond to and the potential need to follow a chain of some number of functions just to see whether that was constructed automatically into another type, endless permutations of how memory may be handled, and so on and so forth. The assembler that the C code will generate will basically push three arguments and call a function; the assembler the C++ generates is effectively unbounded in complexity. This assembler complexity directly corresponds to complexity that must be understood in order to understand the line of code.
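To make that concrete, here is a minimal C++ sketch (all class and function names are invented for illustration) of why the C++ procedure is harder: the call site alone cannot tell you which write body will execute.

    #include <cstddef>

    struct Fd {                                   // hypothetical base class
        virtual ~Fd() {}
        virtual void write(const char* buf, std::size_t n) = 0;
    };

    struct SocketFd : Fd {
        void write(const char* buf, std::size_t n) { /* socket path */ }
    };

    struct PipeFd : Fd {
        void write(const char* buf, std::size_t n) { /* pipe path */ }
    };

    void flush(Fd* fd) {
        fd->write("hello", 5);  // which override runs depends on the dynamic
    }                           // type of *fd, which is invisible right here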

In general I'd prefer the C++, but when writing a kernel where every twitchy detail counts for everything and the slightest bit of "wrong" could be a rootable security bug, I see the counterarguments.


This comment made things a bit clearer for me. This seems to also be an argument in favor of a Lisp-1 over a Lisp-2 (same symbol found in multiple packages) as well as an argument against use of generic methods (have to understand the types to find the right implementation). Something to think about.


Lisp-1 vs Lisp-2 is about whether functions and other variables are in the same namespace or not. Packages are orthogonal to this.

Also, generic functions in Common Lisp don't suffer from nearly the same amount of complexity as can be found in C++: there is always only one signature (lambda-list) for any function name, whether generic or not, there are no implicit conversions, no memory-allocation details, value/pointer/reference distinction, constness, virtualness, overloading of assignment operators etc.
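For contrast, a small C++ sketch (all names hypothetical) of the implicit-conversion machinery that has no counterpart in CLOS: a converting constructor lets a call site silently build a temporary, so the value you see is not what the function receives.

    #include <cstdio>

    struct Celsius {
        double v;
        Celsius(double d) : v(d) {}    // converting constructor, not 'explicit'
    };

    void report(Celsius c) { std::printf("%.1f C\n", c.v); }

    int main() {
        report(21.5);   // a Celsius temporary is constructed silently
        report(70);     // int -> double -> Celsius, also silent; nothing
        return 0;       // at the call site says 70 wasn't Fahrenheit
    }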


By your wording, aren't CL packages considered to be just an implementation of a namespace? Is there another way to declare symbol namespaces in CL (outside of rolling our own mechanisms)?

I was under the impression that the function signature was not the issue raised... but the fact that the same function name could lead to completely different behaviors depending on context. In the case of generic functions, the specific function implementation is usually (ignoring some of the other matching features) determined by the input types. The input types define the context.

All that said, I realize now that this might be less of a problem in CL if you just group your defmethods in the same area. If the problem is grepping for the definition, it is easy enough to rearrange things so that the related functions are in a close-enough place. In C++, the classes try to own their methods -- with the exception of warily-regarded "friend" methods.


Finding a method definition is quite easy in CL: just use the function 'find-method':

http://www.cs.cmu.edu/Groups/AI/html/hyperspec/HyperSpec/Bod...

Though grep would probably do just fine too, since method definitions are defined with 'defmethod' instead of 'defun', so you can see at a glance whether something is generic or not.


About packages and namespaces:

A package is essentially a collection that maps names to symbols. In CL there are (at least) two namespaces: one for functions, the other for variables. The reason for this is that when you write the form

    (fun #'a b)
you and the compiler can be sure that 'fun' and 'a' are both meant to be functions. There are both advantages and disadvantages to this.


That's userland code. In the kernel, write(2) very quickly finds the open file based on the fd and passes the call off to extremely explicit code that handles the write().

Stripping off ambiguity is almost the write syscall handler's only job.

It's worth mentioning here that not only is the convenience vs. explicitness tradeoff different in application code than in bare-metal systems programming code, but that that specific example of convenient app-level API is also an infamous Unix failure. In reality, app code cannot safely assume that an fd is just an abstract bucket you can read, write, close, and seek in.


I've been meaning to ask this for a very long time: What does the (2) mean? I see (n) in lots of descriptions of calls, but I've never found out what it means.


It refers to sections of the man pages. They are organized like this:

1. General Commands

2. System Calls

3. Subroutines

4. Special Files

5. File Formats

6. Games

7. Macros and Conventions

8. Maintenance Commands


and it's used like this: man 2 write


Exactly what I wanted to know. Thanks.


You got a lot of good answers here, but just so you know: you really only have to remember that open(2) is a syscall and fopen(3) is a library call. That's the important distinction.


The Unix manual is divided into sections (http://en.wikipedia.org/wiki/Man_page#Manual_sections). Section 2 is for system calls. Specifying the section number allows you to disambiguate which section of the manual you want to look in if a man page exists in multiple sections (e.g., "man 1 write" vs. "man 2 write").


It is where it lives in man. man write returns write(1), an application that sends a message to another user; man 2 write returns write(2), the system call mentioned above (or at least it does on Ubuntu).


It's the "man" section. According to "man man", 2 is the section for system calls. Often you'll have manpages for the same name under different sections, for example sync(1) is a command while sync(2) a system call (which, unsurprisingly, the sync command uses).


It's the man page section number. I believe this dates back to when the man(uals) were actually printed out. You'd go to the numbered section to disambiguate entries that share a name.

http://www.december.com/unix/ref/mansec.html


And the disambiguation is still useful.


That same write call can exist in C++ too, with exactly the same consequences. Everything ugly in C can be done in C++, but C++ lets you be so much uglier.

  int i = 42;
  foo->bar(i);
In C, we can tell that foo is either a "struct S* foo" or "union U* foo" which has a member "X (*bar)(Y)". Type X is unknown from this context, but Y is some type compatible with int. We will call the function which bar points to with a single value 42. The value of i will be unchanged.

In C++, it could be the same. Or foo could be not a pointer at all, but some object with "operator->". bar() might take its first parameter by reference and end up changing i. bar() might have additional parameters with default values. bar might not even be a function, but some object with "operator()". etc. etc.

Most of the code I write is C++, so I'm not against it -- but I definitely understand the point that C requires less context.
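A contrived but compilable sketch of those possibilities (every name here is made up): foo is not a pointer at all, and bar() quietly modifies i through a reference.

    #include <cstdio>

    struct Widget {
        void bar(int& x, int step = 1) { x += step; }  // by-reference, default arg
    };

    struct Handle {                    // not a pointer: operator-> forwards
        Widget w;
        Widget* operator->() { return &w; }
    };

    int main() {
        Handle foo;
        int i = 42;
        foo->bar(i);                   // looks like the C case, but i is now 43
        std::printf("%d\n", i);
        return 0;
    }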


This is one reason why idioms and patterns are important. I don't think there are many reasonable C++ coders out there who would overload operator-> in such a way as to introduce such ambiguity.


Unfortunately that particular overload comes up a lot when writing "smart pointers". Smart pointers are crazy difficult to write and fail in all kinds of subtle ways.
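For instance, here is a bare-bones reference-counted pointer, just to show where operator-> appears. This is only a sketch: it is not thread-safe, and production classes like boost::shared_ptr handle many more edge cases, which is exactly where the subtle failures live.

    template <typename T>
    class RefPtr {
        T* p;
        int* count;                           // shared reference count
    public:
        explicit RefPtr(T* raw) : p(raw), count(new int(1)) {}
        RefPtr(const RefPtr& o) : p(o.p), count(o.count) { ++*count; }
        RefPtr& operator=(const RefPtr& o) {
            if (this != &o) {                 // without this check,
                release();                    // self-assignment is a disaster
                p = o.p; count = o.count; ++*count;
            }
            return *this;
        }
        ~RefPtr() { release(); }
        T* operator->() const { return p; }   // the overload in question
        T& operator*() const { return *p; }
    private:
        void release() { if (--*count == 0) { delete p; delete count; } }
    };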


I've never had any problem with Qt or Boost smart pointers. They may be delicate to write, especially in multithreaded use cases, but once it's written it works rather well.


Hard to understand?

Really? Write a block of data of "size" bytes to the file/pipe/socket/device... pointed to by the file descriptor "fd"?

It is extremely simple to understand; it does one very simple thing, always the same way. I doubt you can make it simpler.

I have been using this all my life for NFS-mounted file, pipe, network or local socket... Never had any problems, and I have done very complex things.

Of course, in C++ you will use the very same function under a different calling convention; you can make it part of a class, whatever, but it is going to use the same backend internally (because the OS only has one). All you can add from C++ code is a lot of abstraction (complexity) on top.

Of course C++ could be clean, but the question is: what will happen if we let it into the kernel? Linus thinks it won't work. I agree with him.


I've actually done this. Wrote an Infiniband kernel driver, in C++.

Once we overcame the obvious difficulties, including "new" being used as a variable name in a library, there were no technical obstacles. (Had no one in history EVER written C++ on Linux before?)

The argument FOR C++ in the Linux kernel is the same argument for using C++ anywhere else: object abstractions are useful. The project took far less time. The bugs were fewer. The code was of course more readable, because of the tremendous wealth of context that C++ provides through strong typing.

C people argue it's easier to use simple constructs, since they are instantly understandable. What is NOT understandable is WHY the code is putting an int into an array. Those types have no obvious meaning beyond the line of code they are in.

In an object-typed language, you CAN learn the scenery and rapidly become familiar with the object set. The argument above (write) is largely silly: right-click and GoToDefinition works in almost any IDE. Yes, there may be more than one possibility; the IDE will display them all. So navigation through code is vastly simpler than grepping for names.

Anyway, in 18 months we had an Infiniband layer integrated into Linux, not too hard. We had to invent fundamental kernel/user page primitives that were missing. Interesting to note: Windows had all the driver support we needed; we didn't have to invent anything.


He's just pointing out that the first argument to write() is the exact same kind of abstraction as the "foo" in "foo->perform()".


C++ is one of the few languages where in "foo->perform()" the -> and () can be something completely different than what you expect. There's almost nothing in that simple statement you can be completely sure of.


They can be, but they shouldn't be. Operator overloading is useful when you want to do something that is semantically similar but implementationally different.

Even mathematics has different processes for the same operators on different types. If you see 'a * b' you should be able to assume that it's multiplying two variables, but if those two are real numbers it's different than if they're two matrices of real numbers, for example.

Implementing a Matrix class that overloads * to implement matrix multiplication seems perfectly fine to me: it's just the same as implementing a Multiply(a, b) method. Someone who writes a Multiply method could in theory make it do anything, but by reading the word 'Multiply' we assume that is what it does. Just as someone who makes a Multiply() function that does something other than multiply is an idiot, so is someone who overloads operator * to do something other than multiply.
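For example, a minimal (hypothetical) 2x2 matrix type where the overload keeps its mathematical meaning; the operator is just another spelling of Multiply(x, y):

    struct Mat2 {                    // tiny 2x2 matrix, row-major
        double a, b, c, d;
    };

    Mat2 operator*(const Mat2& x, const Mat2& y) {
        Mat2 r = { x.a * y.a + x.b * y.c,  x.a * y.b + x.b * y.d,
                   x.c * y.a + x.d * y.c,  x.c * y.b + x.d * y.d };
        return r;
    }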


The STL overloads the bit-shifting operators for output and the dereference operator for accessing the current item in iterators.


That's a true and valid argument but beware that any C++ advocate could knock it down by saying "then the kernel code standard should reject things like that".


And when you reject things like that you end up with C anyway.


Meh, this is a straw-man. Conventionally rejecting overridden -> and () operators is a very long reach from ending up back at C. Disallowing operator overriding in general (as opposed to specific operators), namespaces, and classes would be much closer, and it still wouldn't be all that close to C (reinterpret_cast, templates...) There is a goodly sized amount of gray area between the two. Or did you mean your "things like that" to be much broader than my interpretation?


Similar thing with C macros.

(Note that my point here is that for sane/good code, you normally can be sure that it does what it looks like -- for C++ just the same way as for C.)


I have to disagree with you with regards to write(). The write(2) man page is deceptively simple, check out open(2), specifically the NOTES section. I count 12 occurrences of the string "NFS".

My favorite part of the whole man page is the Linus quote:

    "The  thing  that  has always disturbed me about O_DIRECT is that the whole interface is
    just stupid, and was probably designed by a deranged monkey on  some  serious  mind-con-
    trolling substances." -- Linus
But that's not a language failing. A language that actually prevents you from writing complicated interfaces is probably entirely unsuitable for... almost anything.


Just a few questions as an example:

* If you write to an NFS-mounted file and write() returns an error, does it mean the block wasn't written to the remote disk?

* If you write to a TCP socket and write() returns OK, does it mean data was received by your network peer?

* UNIX pipes: how many times is your block being copied before it reaches your peer's buffer supplied to read()?

And don't get me started on signals.
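And even before those semantic questions, the call itself needs care: write() may be interrupted by a signal (EINTR) or perform a short write. The standard POSIX idiom is to loop, sketched here without the logging a real program would want:

    #include <unistd.h>
    #include <errno.h>

    /* Write all of buf, retrying on short writes and EINTR. */
    ssize_t write_all(int fd, const char *buf, size_t size) {
        size_t done = 0;
        while (done < size) {
            ssize_t n = write(fd, buf + done, size - done);
            if (n < 0) {
                if (errno == EINTR)
                    continue;            /* interrupted by a signal: retry */
                return -1;               /* real error: caller inspects errno */
            }
            done += (size_t)n;           /* short write: keep going */
        }
        return (ssize_t)done;
    }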


When we were taught an OO course, we had Smalltalk first, and then we moved to C++. The professor enumerating all the rules, ifs, and buts was like the operator of a German machine gun mowing down wave after wave of good-natured Allied programmers. Or so it felt, and classmates have concurred with my feelings.

C++ has too many rules and exceptions. While it gives some sort of control to a genius, in the hands of a good-natured, well-meaning coder it all goes to hell; not in the beginning, but in the long run.

All misgivings noted, compilers have gone to the future and back. They help a lot.

C has few rules, and what's more, if you use the UNIX/ISO conventions there are very few ways you can go wrong. C++ is an ugly duckling of the era of 586-type machines, when you tried to do things in binary, efficiently and with style, and yet were still found wanting.

Explicit is always best, because you always refactor. Say what you mean, write what it does. This way it's easier to read, and easier to coax into doing something else. In the end good coders achieve things by correcting a few things here and there. If you have some sort of magic that developers can't trust to be exactly what they expect, they can't code honestly and without fear. And fear, as we know, is the mind-killer. So it is a catch-22.

I am totally 100% with Linus on that one.


I can show you a lot more awful looking C code than awful looking C++ code.

Edit: Well, seems you don't believe it.

Take a look here: http://www.google.com/codesearch?hl=en&lr=&q=lang:c%...

And here: http://www.google.com/codesearch?hl=en&lr=&q=lang:c+...

Just browse a bit through both, check a few different projects. Check also some of the big projects. (Apache, GCC, glibc, Linux, LLVM, clang, WebKit, Chromium, etc.)


It's all just header comments...


Ugh, in C under UNIX this could mean two things:

* If some moron created a "write()" macro, it means expand that macro (preferably it should also mean email the director of HR to suggest the author of the macro consider having the company pay for their MBA, to make sure they aren't allowed to touch code again)

* Otherwise, if a header has a declaration for write, use that; either way, emit instructions to put "fd, buf, size" on the stack and call write.

In C++ it could mean:

* There's a class "write" which has a constructor that takes three arguments

* There's a function in the global namespace called "write" that takes three arguments

* There's a macro called write

* There's a method called write in the local class. That method could be invoked virtually or non-virtually. It could be inherited from a parent class.

* write could be an instance of a class that has "operator ()" defined

I probably left out some more. In fact, there's likely a firm somewhere that thinks "what could write(a, b, c) mean in C++" is a wonderful interview question (perhaps they could ask it to all those losers who don't know what "explicit" keyword does but still have the gall to think they're competent enough to work for them!)
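Two of those readings, side by side, in a compilable sketch (names invented; keep this far away from <unistd.h>):

    // Reading 1: 'write' is a class whose constructor takes three arguments.
    struct write {
        write(int fd, const void* buf, unsigned long n) { /* ... */ }
    };

    // Reading 2: 'write' is an object of a class with operator() defined.
    struct Writer {
        void operator()(int fd, const void* buf, unsigned long n) const {}
    };
    // Writer write;        // this would shadow the class above

    void demo() {
        write(0, "hi", 2);  // with Reading 1, this constructs and discards a
    }                       // temporary; with Reading 2, it's a function call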

In a UNIX system, with right headers included, what write does is specified by the POSIX API. POSIX is one of the best defined and cleanest APIs. It has existed before the world of IDEs and plug-and-play libraries. I can write non-blocking C network code for a UNIX OS with vi, from muscle memory after reading through man pages and working through Richard Stevens books.

Writing Java NIO code requires: an IDE to prevent RSI, traversing Javadocs to understand the non-intuitive APIs and searching mailing lists through Google to e.g., find out that Java NIO selector doesn't let me use edge triggered epoll because it would involve "tight coupling" (read: it might be difficult for somebody writing code on an AS/400 to use it).

JDK7 NIO2 potentially changes this (I can implement internals of a selector myself). I'm also sure I'll be able to target JDK7 with a Perl6 compiler... that I'll use to implement the firmware for my flying car, in which I'll travel pick up Hans Reiser when he's paroled from prison.

Note: in this case, it has nothing to do with the language. The java.util.concurrent API is very well defined because it was written by great programmers (Doug Lea and Joshua Bloch) who are adept at API design. Unfortunately, Java doesn't "force you" to create a clean API the way C does: there are no design patterns, no IDEs, no built-in tools for literate programming in C; you either build a clean API, or no one will use it. While I'm in no way of Joshua Bloch's caliber, I can relate to him when he says that he stayed with imperative C until finding Java.

While Java doesn't force you to build a clean API, C++ almost makes it impossible to build a clean API: witness boost::spirit ("generic programming", C++'s idiom for extending the language), compare with lex/yacc (external DSLs) and parser combinators (internal DSLs) in Haskell or Scala.

I want to like C++. I have programmed it for a living before and will almost certainly do so again: there's a certain combination that requires low-level code with no memory management and Object Orientation; my chosen specialty (distributed systems) often requires that combination (fortunately, not always: Erlang and JVM languages have been used to build some incredibly impressive systems).

Perhaps Go and D could come along and step up to that challenge, but I am skeptical: Modula-3, despite influencing other languages, hasn't been able to step up to that plate. I feel C++0x, Intel's collections for C++, and some parts of boost, the STL, and tr1 (e.g., tr1::unordered_map, boost::scoped_ptr) are very cool and useful. The Boost Graph library is simply awesome and has no equivalent.

It seems, though, as if C++ was built by warring hordes: one horde that wanted generic programming and thought OO was pointless, another horde that wanted OO but thought generic programming was pointless, and yet another that hated both. None of them won or lost; each side said "mission accomplished" and developers were treated as "collateral damage". I couldn't care less about which one of those to use: I am productive doing OO programming in Perl, Python, Scala and Java. I am also productive doing generic programming with CLOS in Common Lisp or with type classes (or their equivalents) in statically typed languages. Likewise, I am perfectly productive writing imperative code in C. I just want clean, easy-to-program-to APIs with understandable error messages (either at run time or compile time; I am not picky about dynamic vs. static typing -- they're tools, means to an end), and C++ doesn't allow for that.


How do you explain the success of ActiveRecord? Do you think it's illusory? The exception that proves the rule? Something else?

http://blog.objectmentor.com/articles/2009/07/13/ending-the-...

"The fact that it took decades for the industry to arrive at something as useful as ActiveRecord in Rails is due primarily to the attitude that some language features [in this case, meta-programming] are just too powerful for everyone to use. "


I think ActiveRecord has exactly the sort of problems that Linus describes, when you get to large codebases and large teams. For example, what does 'save!' do? Well, it doesn't just update the row in the database; it also runs any validations on the object, and any before/after validation/save hooks. So, in general, you need that context to know whether a patch that updates something is correct or reasonable. And if you've got any triggers in your DB, there's a whole other area where something unobvious might be happening.

There's also lots of subtle ways for the DB and a particular object or set of objects in memory to get out of sync and cause hard-to-understand bugs.

ActiveRecord is definitely one of those tools that "make easy things easy". But, at the same time, it's not hard to get overly clever and make a mess with it as well.


"a tool that make simple things easy" - thank you for this =)


I was quoting Larry Wall who said that a programming language should "make easy things easy and hard things possible". (Or something close to that... the internet seems a little uncertain on his exact wording.)


I usually hear it in the form of a denouncement of tools that "make easy things easier and hard things impossible". I think that perfectly describes a bad abstraction layer.


Haskell is claimed to make hard things easy, and impossible things happen. (Though, who knows, perhaps easy things will become impossible with it?)


Not before making easy things hard, though.


You get used to it. The stuff that's easier in other languages than in Haskell is mostly done by hiding distinctions.


Though not completely. Some stuff is made more complicated than necessary because of the choice of syntax. E.g. variadic functions in Haskell vs Scheme. Or the Haskell record syntax.


A little nitpick: "the exception that proves the rule" is commonly misused as above.

It means that by there being an explicit exception, a rule must exist for there to be an exception. For example:

"Special leave is given for men to be out of barracks tonight till 11.00 p.m."; "The exception proves the rule" means that this special leave implies a rule requiring men, except when an exception is made, to be in earlier. The value of this in interpreting statutes is plain.

http://en.wikipedia.org/wiki/Exception_that_proves_the_rule


ActiveRecord, like much of Rails, is all about context. ActiveRecord is designed to create a domain-specific language for interacting with a relational database. By Linus' reasoning, if you had a project where interacting with a relational database was a significant and pervasive part of the app, then it may be worth the tradeoff of bringing in the context and assumptions of a system like ActiveRecord.

In a huge, sprawling project in which the relational database is a small portion, it may not be worth trading the power of ActiveRecord for the accompanying loss of explicitness and clarity.


Interesting question. To be honest I'm kind of on the fence with AR. It makes trivial SQL very simple but I find myself forced to drop down all the time into raw SQL for a lot of cases and I also find that AR queries with a lot of options (order, group, select, limit etc) aren't much of an improvement in readability or abstraction over the raw SQL.

What's more, I finally gave up after a year of waiting for this pretty fundamental bug in the postgres driver to be fixed and finally just hacked around it: https://rails.lighthouseapp.com/projects/8994/tickets/2622-p...


" I also find that AR queries with a lot of options (order, group, select, limit etc) aren't much of an improvement in readability or abstraction over the raw SQL."

This was how I felt when I first scratched the surface, but by using find(...) instead of raw SQL you leave open other options for the future, like using :include


I think a lot of this will improve with Rails 3's new query syntax. It looks a lot like LINQ for .NET.

Yeah, a lot of times you're using the same keywords you might use in SQL, but the new chainable extensions are really nice for building up queries and reusing common chunks (either in LINQ or in Rails 3)

I should note that this new relational syntax is provided by a library called Arel (http://github.com/nkallen/arel)


If the kernel was written to the common standards of Ruby code, it would come apart completely.

--> That doesn't mean Ruby is bad. Ruby is great. ActiveRecord is a great model for MVC web development. The standards of Ruby aren't good for more performance-oriented, "hard core" development: Ruby's a pretty mature language, but Rubinius, the Ruby-in-Ruby project, still isn't production-level. One might guess that compilers/interpreters require a different standard than web frameworks. Neither is bad though (writing a website in C/C++ would be onerous).

If the space shuttle code were written according to the standards of Linux code, it would come apart completely too.

It's a shame that Linus had to frame things as good language versus bad when it's a matter of the right tool for the right job.

Still, the proof's in the pudding, and anyone who can write a kernel in C++ or Ruby would sure give those languages a boost (and these are the languages I like best).


There really wasn't much in terms of "good and bad" overall, just "good and bad" for Linux. At the end he says (paraphrased) "I'm not saying this is for every project, but if you are going for more than C, skip C++ and go for <list of features that Ruby qualifies for>".


To put it more plainly, he said that for any possible job, C++ is never the best tool for it.

  ∀job, ∃L | L(job) > C++(job)
This, plus stating that C++ is very complicated, amounts to saying that C++ is bad. Because even if C++ is "good enough" for a wide range of jobs, its complexity makes it take longer to learn than several simpler, more specialized languages.

My personal opinion is that C++ is best only when legacy code is involved, or when the team just won't learn other languages. I can understand them. Learning C++ is such an investment that they are more likely to "throw good learning after bad", or may think that learning another language will be as difficult as learning C++.

I may change my mind when I bother to look at LLVM or V8. Perhaps.


ActiveRecord is godawful as soon as you want to use it for something that wasn't in mind when people designed it. I'm working on a Ruby application that has to talk to a legacy database with composite keys and does so via ActiveRecord. Oh, the pain.

As others say in this thread: ActiveRecord makes easy/simple things very easy. Unfortunately, it makes harder things damn near impossible. To aggravate matters, people that only use it for what it was designed for, tend to blame you, instead of acknowledging that it may very well be that their favorite tool doesn't support a particular kind of use.


I disagree with the assumption that a good tool will support all or even 90+% of use cases. You lose a lot by piling in feature after feature.

I much prefer a few different tools that do a few different things really well. I don't use active record if I have to deal with legacy schemas. I'll use data mapper or roll my own. If I get to design my own schemas then I love the benefits AR gives me.


> However, I think in the long run it's better to have very explicit code.

You should love (parts of) Haskell then. You even need to specify whether your code can have any side-effects there.

(Haskell's Type-classes are awfully implicit, on the other hand. Though not nearly as bad as overloading in C++. You do not need to abuse bit-shifting for sending stuff to streams in Haskell. If you want/need to, you can just make up your own new line-noise operator. That new operator won't be used anywhere else, and is thus perfectly grep-able.)


As with most things, there are tradeoffs. If the expressive code was written with a good level of abstraction, it should be pretty straightforward to figure out what it does. Of course, it means that you need to understand those abstractions before you understand the code, and you might have to do more jumping around to figure them out. But in the long run, it makes it much easier to remind yourself how certain features work.

On the other hand, you can write obtuse and difficult to understand code in expressive and non-expressive languages. I suppose a point could be made that bad code in say Haskell or Lisp might be worse than bad code in Java.


"If the expressive code was written with a good level of abstraction, it should be pretty straightforward to figure out what it does." Abstractions are necessary, of course, but they also leak. Good judgment in these issues is one of the hallmarks of an experienced programmer. I'm just finding in my own code that I'm gravitating towards less abstraction, not more.

Clojure in its current incarnation is a great example of this problem, IMO. It provides a very expressive and highly abstract interface to the JVM and Java libraries, but once you hit a stack trace the abstraction comes tumbling down and you have to start picking through the mixed Java/Clojure stack trace to figure out what went wrong. Throw macros into the mix and things get even hairier. I've found that in some cases it's better to just bang out vanilla Java. Sure it takes more LOC to get the same things done, but the result is often dead easy to understand and debug.

Like everything else in engineering though, it depends a lot on exactly what you're trying to build.


Gotta disagree.

Good stack traces are about the maturity of the compiler and tool support; they have absolutely nothing to do with clarity of the language. Clojure is simple, and Clojure opts for explicit context over implicit context almost everywhere (dynamic binding being a big exception); this is exactly what Linus argues for. All you have in Clojure are functions and values. How much more straightforward can you get? No objects, no hidden behaviors, no private variables, everything in namespaces, etc.

After two years of hacking on Clojure, I haven't really found macros to obfuscate anything. But that's because I learned not to use them unless I need them.

But, yeah, I'm looking forward to the community growing and providing better error messages and tools for deciphering raw Clojure stacktraces.


You're looking at Clojure from the point of view of the language specified by the docs though, and I'd agree as far as that goes.

From another point of view, Clojure, like all higher-level languages, is just an abstraction over an underlying machine. From this angle and in its current state it's (IMO) a pretty leaky abstraction.

I think it's fair to say that this has more to do with the implementation than the design though.


C is just an abstraction over the underlying machine. The amount of tooling required to make C programming bearable is staggering. How much leakier can you be?


> One of the absolute worst features of C++ is how it makes a lot of things so context-dependent - which just means that when you look at the code, a local view simply seldom gives enough context to know what is going on.

Good point. Very good point.

Having spent a few years developing for the Linux kernel, I can say that the most cluttered code I dealt with was the network stack, exactly because it was done in a C++-ish way. It makes extensive use of tables of function pointers, which is basically an analog of a C++ virtual table. A socket, depending on its type, gets a pointer to a different table, and that table defines the actual flow of, say, the recv() call in the kernel. The concept is very elegant, and it translates into more compact code, but it also makes tracing the code by hand hard.
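The pattern looks roughly like this (a simplified sketch; the real kernel tables, such as struct proto_ops, carry many more hooks):

    #include <cstddef>

    struct sock;                              // forward declaration

    struct sock_ops {                         // a hand-rolled "vtable"
        int (*recv)(sock* sk, void* buf, std::size_t len);
        int (*send)(sock* sk, const void* buf, std::size_t len);
    };

    struct sock {
        const sock_ops* ops;                  // installed at socket creation
    };

    static int tcp_recv(sock* sk, void* buf, std::size_t len) { return 0; }
    static int udp_recv(sock* sk, void* buf, std::size_t len) { return 0; }

    int kernel_recv(sock* sk, void* buf, std::size_t len) {
        return sk->ops->recv(sk, buf, len);   // which recv? the table decides,
    }                                         // and it was filled in elsewhere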

So, yeah, Linus got a point there :)


I started designing a language where you had lexically scoped "contexts" to specify the meanings of words. In general a package/library would be a context, and could declare its dependency on other contexts. Any words with conflicting interpretations would have to use its full global specifier to compile.


Perl does this, Perl 6 especially.


Sadly x86/arm/mips/sparc/power processors aren't Perl machines and Perl6 thusly can't be used to make a kernel for a Unix derivative.


One could probably get Scheme to do this and use it to make a kernel. But what would be the point of making it a Unix machine then? Make it a Scheme machine.


They aren't C machines, either. But I agree that Perl is probably harder to compile to native code than C.


Yes, I think we all wept when we first learned that.


Actually, you refute his point. If they were actual classes instead of ad-hoc vtables, the flow would be clear.


No, he didn't. A call to ipv4_recv() is obvious, a call to impl->recv() less so.


[deleted]


This post makes me wish that I had enough karma to downvote.


You can save your spiteful single clicks. That post sinks its own ship plenty well.


It got deleted; a little context so that I can understand?


I was under the impression things were easier to understand without context.


Maybe if you find it easier to work with figments of your own imagination rather than operating on actual data, perhaps.


I think your example proves that it is how features are used, not their mere availability in a language, that creates the problems Linus refers to.


I think Linus is making the right call here. At a 10,000-foot view, the real argument is about style. The style of C happens to be one that lends itself to easier grokking of code amongst hundreds of developers (as he illustrated well). Sure, C++ may be a more modern tool with some more powerful features, but using it may not be the best thing organizationally.

This kind of thinking is refreshing and is seemingly rare in classrooms. There, you mostly learn 'what to use' and not 'how to determine what you need'. Just as I'd expect, overdesign is the most consistent issue I come across in co-workers' code. 'Write code to be read' is a metric that seems to have gone out of style. Code that does X-Y-Z should, in my opinion, look like code that does X-Y-Z. Otherwise the reader has to invest time and energy figuring out the design. There had better be a good reason to justify a hard-to-grok design, because it will continue to be a drain for the life of the project.


I actually rather like this quote:

    Anybody can say "yes". Somebody needs to say "no"


It's interesting to note that open source works in reverse of enterprise, where any manager can say no, but only the boss can say yes...


It corresponds to Brooks's idea (law?) that one architect with a coherent view is good, while many are a disaster.


Linus has a touch of what I think makes Steve Jobs great: the ability to say no and stick by those convictions. No argument from me that Linux is wildly successful and one of the most important tech developments in the last 10 years.

However, I always cringe when I read some of his comments that seem to reject any element of forward progress. While C++ certainly has its share of issues, Linux will never evolve if the programming paradigm stays in the 60s. My gut feel is there is a lot of opportunity to do better over the next ten years.


It is very silly to say that something is bad because it started a long time ago. I do not see how any of the modern languages of the 21st century would be even remotely suitable for writing an OS kernel. Considering that most of those languages are at least partially interpreted and require GCs, a kernel written in them would be painfully slow.

Just because something is old does not mean it is worse, and just because something is new it does not mean it is better.

If you think about it, the modern languages are clearly made for programming higher on the stack (i.e., at application level). That makes a lot of sense. After all most programming is done at the application level, and it would make sense to write a language that makes that type of programming easier. And of course many modern languages are very successful in that respect. But it is silly to assume that just because they are successful at the application level they would make good kernel programming languages.

The reason why C++ is the language being considered is that it is also a relatively low level language which means that it can potentially replace C for kernel level programming. That of course does not mean that it should, and Linus does a good job of pointing out the issues with C++.


>. I do not see how any of the modern languages of the 21th century would be even remotely suitable to write an OS kernel with.

Not true. You just have to be smart enough to implement the stack and structure properly. Have a look at some of the experimental operating systems out there. Singularity is especially worth a look.


Yes, and it still hasn't solved its performance problems, which span all of the OS's functionality. Performance is something we might not care about at a high level, but in a kernel it's a first priority, and a very difficult problem to solve.


> I do not see how any of the modern languages of the 21th century would be even remotely suitable to write an OS kernel with.

Lots of people write lots of kernels in lots of languages. I have some (small) contributions to one in D, for example. While still explicitly a systems language, there's still lots of room for people to write kernels in whatever they want.


The counterargument would be that it is silly to reject something because it is new. I've written lots of C++ to test SoCs at a lower level than the Linux kernel. Discounting OOP techniques because of complexities in C++ is shortsighted.


From Linus's post: "And the best way to avoid communication is to have some "culture" - which is just another way to say "collection of rules that don't even need to be written down/spoken, since people are aware of it". Sure, we obviously have a lot of documentation about how things are supposed to be done, but exactly as with any regular human culture, documentation is kind of secondary."

He seems to suggest that sticking with C keeps the focus on attracting people who like to build things. There are surely very productive programmers who use C++ and other newer languages, but sometimes embracing new developments means opening the door to people who are just following fads. Just as an example, the kernel has managed to avoid the following developments: UML, XML, etc. Not saying that UML and XML don't have their legitimate uses, but sometimes keeping things bare and simple has the effect of keeping posers out.

As a sidenote, I don't recall any recent groundbreaking progress in languages for systems-level programming (Singularity from Microsoft might be an exception). At some point people were trying things like Oberon and Lisp machines, but I don't recall any new developments which explore radically different ways of building computer systems. For better or for worse we seem to have settled on a local optimum of C and Unix-based systems.


I don't think anyone can surprise us with something big now. It was easy to come up with a new, revolutionary OS in 80s/90s - you could write one yourself. Right now, anyone writing an OS for the OS itself (not for research) has failed before they start (unless they've got a large experienced team and loads of money to burn). You have hardware you cannot easily access, loads of applications you cannot make a compatibility layer for, etc. etc.

When you created a lisp machine, it was a lisp machine - everyone knew why and how you did it (to some extent). If you wanted to run it, you ported "the program" to it and all was good. Right now you have users expecting at least the stuff they can get from other popular systems - and that's years of work away for any new project.

The best you can probably do, as far as "progress" goes is to create a new research system and port the good stuff back to mainstream (singularity, plan9, etc.)

To some extent we know all there is to know right now... New languages, peripherals, and usage ideas are still being created. But for a new systems language you need to either create or port a system and gather some followers. That's a big step.

Edit: Actually I believe that there's a new hardware change coming that will make current computers ineffective in some way. We could switch to new hardware, new languages, new systems... we could start from scratch :) I don't believe the current ways of parallelising computation can do that.. but if someone constructs a concurrent bus/memory/cpu that could invalidate some of the normal programming ideas.


As someone contributing to a new OS, these are hard problems, for sure. At the same time, I personally see a pretty big future where the web enables lots of new systems work.

See, even though Joe Blow may use Windows, he still uses Linux when he Googles. The web allows people to not have to worry about what OS the servers are running... so I think there's lots of room for innovation on the OS front, but on the server side of the cloud.


Fully agreed, though I would love to be proven wrong by someone coming out of the blue and creating something from scratch.


If anyone does manage that, it will be someone who was never told it was impossible to do.


I'd like to see Linus tackle a systems programming language now that he's tackled version control. I'd think that would be entirely within his domain.

But C++ is not the answer here. It wants to be all things to everyone (low-level, high-level, object-oriented, functional, etc, etc) and is therefore useful for almost every kind of project but well matched to none of them.


Linux is FOSS, so if there are huge benefits to using C++ then the people who advocate such things should simply take the tree, fork it and start doing some C++ stuff. If their arguments are correct it should soon be fantastic and much better than Linux.

Also, there is nothing wrong with programming paradigms from the 60s as evidenced by the popularity of Clojure and other functional languages that borrow heavily from LISP.


The clojure analogy actually works in favor of them switching to C++. Clojure takes the good ideas of Lisp and bolts extra convenience (and modernization) on top, just like C++ adds extra convenience to straight C.


C++ is not just C with extra stuff. C++ programming and C progamming are different enterprises. The Linux kernel is full of all sorts of explicit "this block of memory is a struct txaz_fooblock and here's how we link them together" that C++ programmers simply don't write.

This is before you even get into things like the template-y standard C++ library.


C++ played the C-as-a-gateway card very heavily early on; in the 80s that was a brilliant strategy, now not so much. If you're interested in efficiency, straight C does fairly well (also because building foreign function interfaces for C code is much easier); if you're interested in correctness, then some variant of Haskell or ML will serve you with actual type checking; and if speed of writing code is what you're after, then some dynamic language will do.


Agreed. C++ is more like Bjarne Stroustrup bolts everything he's ever heard of onto C than "C with classes".

Blatantly plagiarized from: http://www.cvaieee.org/html/humor/programming_history.html


I'm not making value judgements (yet). I'm just saying that, for instance, programming with templates, or modeling events as objects that are stored and passed around, or just basic nuts-and-bolts stuff like what your go-to code is for an array of elements; these are very different in C++ than they are in C. The idioms are different, the libraries are different, the principles are different.


True. I hadn't considered that.


Clojure actually removes a lot of things from Lisp, so it might not be a very good analogy to C and C++. For example, Clojure doesn't have an object-oriented framework like CLOS, does not have reader macros, and currently has a much smaller set of built-in functions. Not to say it hasn't added new stuff; it has, e.g. persistent data structures and deep integration with the JVM. Also, importantly, not all valid Lisp programs are valid Clojure programs. In short, it's a clean break from Lisp which carries the useful ideas and the spirit of Lisp forward.


No. It isn't about avoiding "modern" conveniences, it's about avoiding the specific baggage (a highly contextually-dependent grammar and operator overloading are two examples given) that C++ drags with it.

He ends with: "But C++? I really don't think the "good features" of it are very good at all. If you leave C behind, do it properly and get some real features that matter. GC, some concurrency support, dynamic code generation, whatever."


Well, he did create git. Not that your point isn't valid, just pointing out Linus is actively helping forward progress.


Excellent point, git is certainly innovative. It's a great example of someone smart seeing a problem with his workflow and fixing it (although this isn't 100% what happened, the whole bitkeeper/closed source debacle played a hand). I guess my point is that I haven't seen too much development in the core kernel lately, especially wrt helping use multicore better.


Not sure I'd call intertwingling the repository/workspace like that, or conflating branches and repositories, "progress". But I suppose it's better than svn/cvs/etc that it's largely taking its users from.

(disclaimer: I work on a VCS that's written in C++ and uses a real database, nice object-oriented libraries, and nice C++ abstractions :)


The repository/workspace issue I can agree with (though there is an environment variable that you can use to specify a repository outside the current directory), but what do you mean with "conflating branches and repositories"?

In git, branches are not a special entity. They are a property of the contents of the repository. The commits form a DAG, which is also a tree, and trees often have branches. A repository might have dozens of branch points that are never referenced by name, but that doesn't mean they don't exist.

You can even consider the whole set of repositories for a single project as forming a logical DAG in this manner. However, because of the distributed nature of git, there will be branches that are not visible to you.

Given this, what does it mean to somehow "separate" the concept of repositories from branches?


Perhaps a better way of saying that would be that they picked a very unique (re-)definition of what a branch is (something like "pair<repository * , commit * repository::*>").

I tend to think of a branch more as "a bunch of commits that go together", which seems to fit very nicely with common usage where you have a release branch, dev branch, etc. It just seems very bizarre that people working on the "dev branch" are actually working on entirely separate branches just because they're in different offices which use separate local mirrors in case the internet breaks.


If git has defined 'branch' to mean something different from other VCS, then I think git got it right and everyone else is wrong.

The distributed nature of git (and any DVCS) means developers on the 'dev branch' are working on different branches, even if they all call it 'dev' in their local repositories. Their local development may have started from the same parent, but it's not the same branch. If you send the commits to someone else, they'll be completely separate from the receiver's dev branch until merged.

It seems to me that your idea of the "dev branch" is in git a purely non-technical thing. That is, a branch in some blessed repository that points to the so-far merged efforts of all the developers' dev branches, representing 'current' development.

I think this is a good thing, and I'm yet to be convinced otherwise.


> (disclaimer: I work on a VCS that's written in C++ and uses a real database, nice object-oriented libraries, and nice C++ abstractions :)

Wait, you can't just write that and end your comment! What VCS do you work on?, if you're at liberty to share.


I work on monotone ( http://monotone.ca/ ) which is older and I think conceptually cleaner, and that was a reference to this earlier rant (last paragraph): http://lwn.net/Articles/249460/ . We've also gotten a few comments about the code being very nice, so I doubt it's really all that messy and unmaintainable.


If C is 1960, C++ is 1961. It'd be nice to join the 21st century, or even 1980.


C++ is from the early 1980s. (That doesn't make it a better language, though.)


Metaphor.


I prefer my metaphors sharp and cutting.


He's not rejecting progress. He explicitly says that C isn't for everything. But he wants real progress: a language with GC, concurrency, etc., not just C-with-classes-plus-etc-etc-etc


It is actually good as it is, like scientific progress: if someone wants to make a new kernel in .NET or Lisp, they could fork it and prove it's better that way. But please don't force everybody to follow you.

MS created Vista using 2000s paradigms. It was slow as a snail.

Experiments could fail too.


> Linux will never evolve

Do you terribly mind elaborating on this rather bold statement?



Great way to edit my comment, what I actually said was:

> Linux will never evolve if the programming paradigm stays in the 60s

C is a great language, but I firmly believe that OOP creates more maintainable code that is more robust. Sure, a simple C program is easy to understand, but the kernel isn't an easy program. Don't get me wrong, C++ has major downsides, but C isn't a magic bullet.


The kernel does use object-oriented paradigms.


The kernel is also shockingly simple. Sure there are some tricky bits, but writing kernel code is very straight-forward. In fact every time I've written kernel code it has always been a "it can't really be this simple" type moment -- but some of that code is still running production machines today so... it must have been.

NOTE: this is not me bragging, it's me suggesting the mystique is a bit undeserved.


I definitely agree. It is the simplicity of the POSIX userspace APIs and the Linux kernel internals that make me enjoy Linux so much.

I still think there is some art to writing simple, efficient code, though. Most system-level code looks quite simple when it's done, but it may have taken several iterations to get it fast, clean, and small enough (particularly on embedded devices -- I once had to find 200 spare bytes to fit a bug fix in a 256KB firmware by hand optimizing various ancient parts of the code base).


Well, the kernel is so shockingly simple in part BECAUSE no one let C++ in. And it is a testament to Linus' great leadership and censorship that it remains that way.


Let's put it this way without any subjectivity: every production kernel that you folks use every day is written in C. NT is, Solaris is, Darwin is, and Linux is.

Not one production-level kernel that is in wide use as a general-purpose operating system uses C++. I don't believe that to be a coincidence.

(Nitpicker's corner: Darwin's device tree subsystem is written in Embedded C++, an extremely cut-down version of C++ that's more like "C with classes" than C++).


That's perhaps because all these kernels were started before C++ was mature? The first C++ standard is from 1998, and stable STL/C++ support is even more recent.

FYI: in NT, many device drivers are written in C++.


I actually liked his point about context very much. In fast moving projects changes are visualised in terms of diffs, and the closer the code is to what will be executed, the more chances that obvious flaws will be caught. btw related discussion in an older thread - http://news.ycombinator.com/item?id=1318489


It is funny how a very simple and innocent-looking question by a "newbie" started a big discussion. It is like watching somebody drop a lit candle on a bed.

    Can you use C++ in Linux kernel?
    In windows kernel you can, with some restrictions.


What are the chances he's just a troll? A simple google search would show many similar debates.

I'm pretty sure you could cause the same in a month or two, by asking "I know there are projects using different languages to create kernel modules - even crazy ones like haskell. Can I use C++ in the kernel?"


Newbies? Not Google before asking a question on a newsgroup? What's the world coming to?

It's gotten to the point now that when I can't find a question asked previously, I start getting nervous, and 3/4ths of my initial post to a group starts off by explaining how I really did Google, honest! This is what I tried, and I still couldn't find anything, so if it's been asked before, please just point me to the right place, sorry!


Yes, but it is good to have such debates. It allows the Linux community to continuously reassess its choice of a C-only environment. Such introspection is necessary, and when people are ready to move on, I think the community will.


There's not much "Linus vs C++" in there (except for several random bashes with nothing to back them up). It's mostly Linus sharing his POV as the maintainer of the kernel as to why C++ would be harder to work with in a project that has many contributors, due to code being more context-dependent (i.e., member functions, function overloading).


There are so many large-scale C++ projects that prove Linus wrong every day. One of them you probably use every day; hint: it's a search engine.

Whatever language you use, you need rigor and diligence. You need to manage the project and follow up on developers.

You need to agree on a subset of the language, that will become your local dialect. I'm pretty sure that Linus doesn't accept "any C source code".

C++ has got more features than C. That's neither intrinsically good nor bad.

I could argue about the merits of templates and what they enable you to do, and someone would tell me "it's hard to understand".

Well, any language you don't know is "hard to understand". C++ is not "C with classes" anymore. It's something else. It's a different language.

In the end what really matters is the quality of the developers and the quality of your process. The language is just how you implement your concepts.


The main problem with C++ is non-technical. Based on some experience interviewing a dozen people a week for several months, C++ programmers tend to routinely overestimate their proficiency with the language, while C programmers tend to have the opposite self-assessment. To put it differently: on average, C++ devs are cocky (everyone is a guru) and C devs are basically modest.


I agree with you. I often interview people who consider themselves "C++ expert" but struggle to use the STL's algorithms correctly.


I would inject into your list of what really matters something Linus said: communication. Having each member of the team (1000 contributors to the kernel by Linus's estimation) in a different corner of the world is a huge consideration.


This leaves me with a burning curiosity.

Linus, if you could change the Linux kernel to another language, would you?

In a perfect world (i.e. migration concerns and such aside), if you were changing the Linux kernel to another language, what language would you choose? If you feel C is still the #1 choice, what would the #2 choice be?


I would like to hear Linus's opinion on Go.


Linus gives his opinion on Go later in the referenced thread:

http://www.realworldtech.com/forums/index.cfm?action=detail&...

> Hey, I think Go picked a few good and important things to look at, but I think they called it "experimental" for a reason. I think it looks like they made a lot of reasonable choices.

> But introducing a new language? It's hard. Give it a couple of decades, and see where it is then.


Or D.

I had thought D was designed to be, among other things, a reasonable next-generation replacement for projects that would otherwise use something like C.


Yep. I'm actually contributing a bit to an exokernel written in it.


Wow, Linus is getting older: he didn't flame the poor guy.

I'm a bit concerned that Linus doesn't write code anymore but still decides how code is written, etc... Happily, he has some experience.

Linus has good reasons; nobody is going to rewrite Linux in C++ anyway. This has been debated to death.

Why am I still losing my time commenting on this?


It's true that, in theory, I can't know what a C++ function call means.

But that's also the case in C. In C I can assign a function pointer to a variable. When the function is called through the variable, I don't know which function is called.

The language is almost never the problem; its usage is. There's always context, and how much of it depends mostly on the complexity of the software. And regardless of which language you use, you have to define a convention for its usage, in C too.
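To make the parallel concrete, here is a minimal sketch (hypothetical names; it compiles as both C and C++) of how a call through a function pointer hides the callee just like a virtual call does:

    #include <stdio.h>

    /* Two candidate implementations with the same signature. */
    static int write_disk(const char *buf, int len) { return printf("disk: %.*s\n", len, buf); }
    static int write_net(const char *buf, int len)  { return printf("net: %.*s\n",  len, buf); }

    int main(int argc, char **argv) {
        (void)argv;
        /* 'emit' is bound at run time; the call site below gives no
           hint which implementation actually runs -- the same "context"
           problem as a virtual call, just spelled in C. */
        int (*emit)(const char *, int) = (argc > 1) ? write_net : write_disk;
        return emit("hello", 5) < 0;
    }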


Interesting point about function pointers. Maybe Linus is saying (in many more words) to use virtual dispatch only in appropriate places but not as a basis for an entire language? He would be in disagreement with every language designer who has added some kind of first class dispatch or pluggable modules. Not that such disagreement would be surprising, but that seems to be what he's saying. He suggested looking to other languages for GC and concurrency support, but in that thread I didn't see any desire for dispatch or modularity beyond what C offers.


I've seen some pretty clean C++ code and some ugly C code. I think what's important is how a programmer uses a language rather than which language he uses. I mean, if someone enjoys writing clever one-liners, he/she will probably do it in any language, be it Perl or C. However, the important point here is that C does a good job and the programmers are used to it and like it. THEY, the people who submit patches, like it, period.


Even if the Linux kernel were rewritten in Erlang, it wouldn't change the fact that the vast majority of *NIX systems-level code is entrenched in C. The most popular LAMP-stack ingredients are also written in C. Face it; it's a good language.


Non sequitur?


Regarding C and C++ I sometimes think of them like this:

If you want a language that's like C but better, tough, just use C.

If you want a language that is better than C, do not use C++.


It makes a lot of sense if you think about it as optimising code readability for the case of seeing it through the lens of

  diff -pu -c 3
which is basically what Linus does all day. This also implies that the argument has little meaning in the context of projects that aren't comparable in scale (both code size and contributor pool size).


Could Linus explain spinlock.h, then? Depending on which type you give to the PICK_OP macro, it performs a different operation. It's basically a C template. It seems a bit weird to complain about type-context-dependent code and then use it in one of the most important parts of the kernel.

Here's the code:

    #define TYPE_EQUAL(lock, type) \
        __builtin_types_compatible_p(typeof(lock), type *)

    #define PICK_OP(op, lock)                               \
    do {                                                    \
        if (TYPE_EQUAL((lock), raw_spinlock_t))             \
            __spin##op((raw_spinlock_t *)(lock));           \
        else if (TYPE_EQUAL(lock, spinlock_t))              \
            _spin##op((spinlock_t *)(lock));                \
        else __bad_spinlock_type();                         \
    } while (0)


"And the best way to avoid communication is to have some "culture" - which is just another way to say "collection of rules that don't even need to be written down/spoken, since people are aware of it". Sure, we obviously have a lot of documentation about how things are supposed to be done, but exactly as with any regular human culture, documentation is kind of secondary."

This is so true. I have zero problems with context sensitivity and difficulty reading C++ if it's all consistently done according to my preferred standards (which I mostly inherited from an especially well-managed employer), avoiding the "bad practices" that introduce these problems, like namespaces and such. Yet if I dive into some poorly organised and written code, I can easily spend a whole day debugging a trivial problem, simply because the time is wasted trying to understand what is happening.

The sad truth is that the latter case is more common. It's not C++'s fault, though (although it could just /not/ try to provide every imaginable feature); it's more bad management than anything else. Though I can't help but wonder if that is just unavoidable for inherently difficult-to-manage projects like this...


Many of the arguments he gave are mostly based on the fact that there aren't many powerful tools around which can do what he requests (and which, of course, you would want to have).

Like grepping for all usages of some function. Or getting a code snippet in a way that it comes with all the necessary context. Or whatever.

Though, with recent developments (particularly on clang), these things may become much easier. Of course, you will not be able to do magic (like checking exactly which virtual function implementation gets called at which place), but it will be trivial to check, for example, all calls to std::string::size, etc.

In the same way, you could also implement a small helper tool for sharing code snippets which adds some meta information about the context. So that when you share code like 'a += "foo";' it carries the meta information that a is a std::string.

LLVM/clang is in any case also something of a counterexample to his arguments about maintainability. I would say that one reason working on/with the LLVM/clang code is so much nicer than working on/with the GCC code is that it is written in C++.


You're wrong. I've actually had to maintain big C++ projects. They have a zillion classes, all fully different, but each one has several Read() and Write() functions. Some are virtual, some are not. Some take two parameters, some three, some four. And they even do fully different things! Which Read() gets called depends on anything and everything. I claim you just can't figure out what some piece of code does in any way other than by actually single-stepping through the debug build. Any time the debug build doesn't reflect the source, you're fully lost. Note that Linus's usage scenario is "read the chunk of code outside of the whole source base", while you're arguing "if I use some hypothetical very clever tool, the tool would maybe be able to show me what some line in the chunk of code actually does." I've actually built some "very clever tools" over the past twenty years. And I wouldn't want to have to rely on them on really big projects.


Btw., I was describing basically two tools:

1. A replacement for grep to search through the code, where you can search for "std::string::size()" or something like that and it will show you exactly all calls to that function. This tool is really trivial to implement. (Probably someone has done it already.)

2. A tool which adds metadata to a code snippet: exactly the contextual information needed to understand the snippet. I.e., this depends entirely on your code. If it is bad code, full of macros, operator overloading and other tricks, there will be a lot of context around; otherwise, not.
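For what it's worth, tool 1 really is close to trivial with today's clang LibTooling and AST matchers. A rough sketch (matcher and driver APIs drift between clang releases, so treat this as an outline, not a drop-in program):

    // find_size_calls.cpp -- a "semantic grep" for std::string::size()
    // built on clang's AST matchers (LibTooling).
    #include "clang/ASTMatchers/ASTMatchFinder.h"
    #include "clang/ASTMatchers/ASTMatchers.h"
    #include "clang/Tooling/CommonOptionsParser.h"
    #include "clang/Tooling/Tooling.h"
    #include "llvm/Support/raw_ostream.h"

    using namespace clang;
    using namespace clang::ast_matchers;
    using namespace clang::tooling;

    static llvm::cl::OptionCategory Cat("find-size-calls");

    // Match member calls to size() on any std::basic_string
    // instantiation (std::string, std::wstring, ...).
    StatementMatcher SizeCall =
        cxxMemberCallExpr(
            callee(cxxMethodDecl(hasName("size"),
                                 ofClass(hasName("::std::basic_string")))))
            .bind("call");

    struct Report : MatchFinder::MatchCallback {
      void run(const MatchFinder::MatchResult &R) override {
        const auto *Call = R.Nodes.getNodeAs<CXXMemberCallExpr>("call");
        if (!Call) return;
        Call->getExprLoc().print(llvm::outs(), *R.SourceManager);
        llvm::outs() << ": call to std::string::size()\n";
      }
    };

    int main(int argc, const char **argv) {
      auto Options = CommonOptionsParser::create(argc, argv, Cat);
      if (!Options) return 1;
      ClangTool Tool(Options->getCompilations(), Options->getSourcePathList());
      Report Cb;
      MatchFinder Finder;
      Finder.addMatcher(SizeCall, &Cb);
      return Tool.run(newFrontendActionFactory(&Finder).get());
    }

Unlike grep, this sees through typedefs, macros, and overloads, because it queries the compiler's own view of the program.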


What exactly is your point? You have the same problem if you use virtual function tables in C.


I know no tool would solve every problem, but I can't help wondering: if someone developed an IDE that truly understood C++, would that allow C++ to be used by more projects? Seeing grep referenced for code searches reminds me that grep has no knowledge of the context of anything in a searched file. It almost seems like C is the only choice if you can't go beyond context-less tools.

//a grep specifically for C++ code - hum


One problem with C++: It is hard to build tools for, because parsing it is nearly impossible. Clang and GCC plugins may change this situation, though.


Are you implying C++ isn't widely used? I'm sure that isn't what you truly believe...


oh no, I know it is widely used (Visual C++ and its ilk). I was thinking more of these huge C open-source projects.


C as a context-free language? Bullshit. How many global variables are there? How many functions with ridiculous names like htons?

Edit:

I realize that "context-free language" was a poor choice of words; I meant it in the sense that Linus did. His claim that you can 'look at a piece of code and understand what it does' is laughable in the face of accepted C programming styles.


I'm not really sure you understand what a context-free language[1] is. I suppose global variables can be seen as being context-sensitive, but functions with weird names definitely don't qualify a language as context-sensitive.

[1] http://en.wikipedia.org/wiki/Context-free_grammar


No, what he means is that something like

    foo * bar;
Can mean two different things. If foo is a defined type (typedef int foo, for example), then that declares a variable bar of type pointer-to-foo. If it is not, then we're multiplying the variables foo and bar.
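To spell out the two readings (a toy example; compiles as C or C++):

    /* Same token sequence, two different parses, depending on what
       'foo' happens to be at that point in the file. */

    void as_declaration(void) {
        typedef int foo;
        foo * bar;          /* declares bar: pointer to foo */
        (void)bar;
    }

    void as_multiplication(void) {
        int foo = 6, bar = 7;
        foo * bar;          /* expression statement: the product is
                               computed and discarded (compilers warn) */
    }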


Yes, that is context-sensitive, but it's pretty tame. First of all, if I look at that, I can tell it's probably a pointer declaration. After all, it doesn't make much sense to multiply two variables and then not do anything with the result, does it?

On the other hand, it can mean a variety of things in C++. Are we multiplying numbers? Is foo or bar an instance of a class that overloads the * operator? Is there a global overload of the * operator? Are we declaring a pointer?


But in a well-written program the use of overloads would make sense, just as much as the function call ExitProgram(); should be clear in the context of a program. Now, it's possible that this function actually creates a new process that then plays tic-tac-toe. But a decent program will do its best to be consistent with what you expect. Likewise, if I see A * B, I expect that we're multiplying A and B. What does multiplication mean? That depends on the application, but I expect it to make sense.

And with a decent IDE you should easily be able to get the def for the '*', just like you'd get the def for any other function.

The problem isn't in the C++ language, but rather the C expectation that many developers have.


I think he means that many global variables create a big context. Which, as you point out, is not the same as being context-sensitive.


The "context-free"dom you reference here is purely syntactic. Linus here is certainly discussing the semantics of the language, since he is discussing reading and understanding the code. For that you need at least a (perhaps mental) syntax tree, of which there may be many for a given program, and which requires semantic context to disambiguate.


Ironically, this is an overloaded usage of the words "context-free language." (Whether you intended it or not.)


Yes. C and C++ are about equally bad in their usage of side-effects.


I'm immediately skeptical of people who think modularity, inheritance, and polymorphism are bad things. Linus never comes out and says as much, but by indicating that he doesn't want to jump around to learn how a bit of code works, he implies it by desiring as few dependencies within code as possible.

I certainly appreciate Linus's genius and the revolution he brought to computing, but we live in a different world than when he first took up C programming. Computers are far more capable than they were decades ago, and we've taken advantage of that by making code much more flexible than in years past. The result is far more code reuse than early OO programmers ever thought possible.

Could this possibly be the world's first programmer generation gap?


Who said C isn't modular? You're confusing language features with programming paradigms. C can be, and is, modular and flexible. Objects are not just a language construct: a struct with corresponding functions is just as much an object. And he didn't claim those were bad things. He claimed that the context dependency, the massive size of the language, the differing implementations, etc., contribute to an environment that's not good for kernel development. On that count, I must agree.
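For the record, the kernel itself is full of exactly this idiom. A minimal sketch (the names here are made up for illustration, but compare the kernel's real struct file_operations):

    /* A "class" in plain C: data plus a table of operations. */
    struct device;

    struct device_ops {
        int (*read)(struct device *dev, char *buf, int len);
        int (*write)(struct device *dev, const char *buf, int len);
    };

    struct device {
        const struct device_ops *ops;   /* an explicit, greppable vtable */
        void *priv;                     /* implementation-specific state */
    };

    static inline int device_write(struct device *dev, const char *buf, int len)
    {
        return dev->ops->write(dev, buf, len);   /* a "virtual" call, spelled out */
    }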


It absolutely could not be the world's first programmer generation gap, because we have had so many before :-)


That is a huge problem for communication. It immediately makes it much harder to describe things, because you have to give a much bigger context. It's one big reason why I detest things like overloading - not only can you not grep for things, but it makes it much harder to see what a snippet of code really does.

I think this is a limitation of our using flat text to program in. As long as our primary means of communication are linear speech and flat text, we will have this communication overhead. Collaborative systems that display the context to all concerned can overcome it. In fact, this is already happening! (Exercise for the reader.)


I came here to quote that same piece, but for a different purpose. I am in complete agreement with him.

I think writing grep-friendly code is a good thing. I do my best to make all my Perl code easily grepped; it just makes debugging infinitely easier.


Would a search tool other than grep change how you write code?


Probably not. I just said "grep" because it's such a basic standard; I really meant "easily searchable from a command line."


What about a syntax-aware search? If this were as responsive as grep, wouldn't that be better? I think so. I use one in Smalltalk all the time.


Somewhere down the list, someone asks about Go:

Hey, I think Go picked a few good and important things to look at, but I think they called it "experimental" for a reason. I think it looks like they made a lot of reasonable choices.

But introducing a new language? It's hard. Give it a couple of decades, and see where it is then.

Linus

(http://www.realworldtech.com/forums/index.cfm?action=detail&...)


Because the Linux kernel is so important, and so difficult to replace, people will put up with whatever language inconveniences they must in order to get their contribution accepted.

Therefore, I don't see Linus' choice as having any larger significance.

I already know that C++ is a vastly superior language to C for small to medium scale programs where computational time+space efficiency is paramount.


Come again, please? All the computational time+space efficiency of C++ comes from the "C" part, not the "++" part!


You misunderstand - I'm just qualifying the experience I'm speaking from.

I've used both languages, but only when time and space efficiency can't be sacrificed.

I've never worked on a large program in either.

Under those conditions, I know from experience that C++ is better. It allows me to abstract safely (I benefit from compile-time type checking) with little or no run-time cost.
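A trivial illustration of what I mean (a sketch, not from any real project):

    /* Type-checked at compile time and typically inlined, so there is
       no run-time cost over the MAX(a, b) macro it replaces. */
    template <typename T>
    inline const T &max_of(const T &a, const T &b) {
        return (a < b) ? b : a;
    }

    /* max_of(1, 2) is fine; max_of(1, 2.0) fails to compile instead of
       silently converting -- checking the C macro cannot do. */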


Excuse me, but C++ is only vastly superior to C if you are a C++ programmer and do not know C. I am a C++ programmer, and I would never make such a claim. And for time and space efficiency? That doesn't make sense.


I know C and C++ very well. I don't use either language unless I need to be as efficient as possible.


Original Linus rant on C++:

http://lwn.net/Articles/249460/


Does that Linus post originally come from some mailing list? What is the name of the list? (Google gives me only links to realworldtech, which is weird.)


The post is from the forums attached to the realworldtech website.


He makes the same mistake as many people: thinking that because operator overloading is in the language, you have to use it. The same goes for templates.

Most successful C++ projects never use some features. Some even use C macros instead of templates, like wxWidgets. It both works and compiles fast.


Is there anything in C++ that I couldn't do in C, from an application POV?


Linus rules! That's right, C ain't for everything


This rings hollow. It's like saying "C++ feature X can go horribly wrong when used without adult supervision, therefore C++ is unusable." This is especially silly coming from someone who favors a professional stunt language like C.

If somebody uses too much overloading, reject their patch. If somebody makes a template tarpit, abuses operator(), or gets too fancy with operator overloading, send it back to 'em.

Context dependency bad? I like having the superclass define a contract so that you can ignore which subclass is being used. And C++ even rubs your nose in the name of the superclass, to avoid duck typing (which Python/Ruby use with near impunity anyway).

Polymorphism also gives you what amounts to functional programming without touching C's awful function pointer syntax. I suspect a lot of Linux code uses baroque switch/case or "if" statements just to avoid the pain of function pointers.

Edited.


I'll deal with this in order of decreasing insanity:

1) Polymorphism is NOT functional programming. Using function pointers, which you seem to detest, is one aspect of functional programming. If you've only used C++, then you don't know what you're missing from languages like Lisp or JavaScript. (I should know. I was once like that.)

2) I seriously doubt the Linux kernel avoids function pointers. Ever heard of dispatch tables? Good C hackers know that smarter data structures make for simpler code. (See the sketch after this list.)

3) Any feature that can go horribly wrong will go horribly wrong when you have a large project with lots of contributors. Everybody who successfully uses C++ on a large project has a huge "coding standards" document to keep the project from crashing and burning. I know Google has one, and I've seen others, too. Even I have one for my own personal C++ projects!

4) What Linus meant by "context dependency" is that you have to know the object's type in order to know what the function will do and you even have to figure out which version of the function will be called, based on the arguments. This is hell when you get a small patch in the middle of a larger function.
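On point 2, a dispatch table is just an array of function pointers indexed by some tag. A toy sketch (hypothetical names):

    enum opcode { OP_ADD, OP_SUB, OP_COUNT };

    static int op_add(int a, int b) { return a + b; }
    static int op_sub(int a, int b) { return a - b; }

    /* Smarter data structure, simpler code: the table replaces a
       switch statement at every call site. */
    static int (*const dispatch[OP_COUNT])(int, int) = { op_add, op_sub };

    int eval(enum opcode op, int a, int b) {
        return dispatch[op](a, b);
    }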


1) You can pass around worker objects and apply them to things.

2) Declaring, initializing, and using dispatch tables requires a boatload of syntax, or GTK+-style macro hell. To the best of my recollection, Linux only goes to the trouble for a few large, complex subsystems, like networking and filesystems. Many other areas could probably benefit from it but cannot afford the price.

3) True. But once you climb the hill, you get to offload a lot of crap onto the compiler. Forever.

4) Linus already applies his +18 Axe of Correction to large functions and nesting depth. Anybody who reviews patches without a color-highlighting code browser deserves what they get.


I think his main point vis-a-vis C and C++ is that the kernel has a unified C culture and adding C++ would import a bunch of different C++ cultures. As a C++ developer in a C++ shop, I can appreciate this. Our shop has a very solid and consistent C++ culture. We can read each others' code because we all use the same language features and coding style. You can't do coherent C++ development without having this unity. The only way to develop a unified C++ culture for the Linux kernel would be for it to be grown slowly over time, starting from a small group of developers, the same way the kernel was. That can't and won't happen. If C++ is allowed into the kernel, a bunch of extremely skilled C++ programmers from different C++ cultures will start to contribute kernel patches written in C++, and it will be a tower of Babel.

However, I don't buy his argument that C++ is unsuitable because it's linguistically a high-context language. Part of this problem of context is solved by having disciplined, humble coders who know the limits of human intelligence. (I do not think these are new requirements that would exclude existing kernel contributors. C has, and the kernel uses, macros.) Part of the problem is solved by having a unified culture. The rest of the problem of "context" in C++ is really what would be called "state" in C, and of course you can't read state in the text of a C program. If you could magically create a unified and disciplined culture of C++ usage in the kernel, it would be fine. But that can't happen, so C++ is out of the question.


Thanks, nice to read a sober judgment of the situation. As both a C and C++ programmer, I really appreciate your point. I would, however, argue that some effort should be made to enable the creation of some common culture. I am sure that some unbiased eyes could find subprojects where C++ (or other compiled OO/functional languages) would be better suited than C. I strongly believe that the paradigms all have their uses, and that in time the Linux project would be more powerful if it used the technology available. After all, isn't that one of the great strengths of openness?


GCC now allows C++ code. It will be interesting to see how they keep C and C++ styles unified and coherent.


> Polymorphism also gives you what amounts to functional programming without touching C's awful function pointer syntax.

Which aspects of functional programming are you talking about?


Probably operator() and passing around instances of classes that implement that operator.

For me it isn't better than function pointers; also, because of the method/function distinction, we get two different, incompatible kinds of functions.


The operator() does not bring you any closer to functional programming. It's just syntactic sugar, maybe even nice sugar.


> Which aspects of functional programming are you talking about?

Passing around things that do work on other things -- easy with C++ worker objects, painful with C function pointers (or worse, pointers to structs full of function pointers).
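A sketch of the kind of thing meant here (a hypothetical example): the worker object carries its own state, and the compiler can see straight through the call:

    #include <algorithm>
    #include <vector>

    /* A worker object: state plus behaviour in one value. The C
       equivalent is a function pointer plus a separate context struct
       threaded through every call. */
    struct ScaleAndClamp {
        int factor, limit;
        int operator()(int x) const {
            int y = x * factor;
            return y > limit ? limit : y;
        }
    };

    void demo(std::vector<int> &v) {
        /* operator() is an ordinary member function, so it is usually
           inlined; a call through a function pointer often is not. */
        std::transform(v.begin(), v.end(), v.begin(), ScaleAndClamp{3, 100});
    }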



