The fallacy of high-level languages

There’s been a meme going around the open source community for a while now: that programming in C is somehow dirty, distasteful and, worst of all, inefficient compared to programming in a high-level language such as C# or Python.

Its detractors will tell you how much longer it takes to program anything in C.  They’ll point at how much C code it takes to do something as simple as create a GObject sub-class compared to the equivalent in Python or C#.  They’ll also probably complain that everything in C has to be compiled first, which takes up even more time.

No argument would be complete without them pointing out that C has no standard types for strings, let alone linked lists, binary trees, associative arrays, etc. and that you have to spend all that time implementing your own.  They’ll probably make a point about how C’s static type system means that even if you have an array type, you need to know in advance what types you’re going to put into it and can’t just mix and match.

Don’t feel tempted at this point to counter with a discussion about how great and flexible pointers are.  You’ll receive a lashing about how they’re even more evil than people who talk in the theatre.  The rant about the C problems of uninitialised memory, out of bounds pointer errors and segmentation faults is a timeless classic.  Especially when they get to the bit about how much time was lost debugging them.

And do you know what?

I simply do not agree with them.

I cannot think of a single project where the majority of time was wasted writing GObject header files compared to the single line of Python I needed.  I can think of lots where I’ve sat for hours trying to figure out which class I needed to derive from, or reworking the code after I realised I derived from the wrong class to begin with.  The high-level language doesn’t make this any easier.

As to the number of projects where I’ve needed to write a linked list or hash table implementation because C lacked a convenient dynamic array or associative array type like Python?  If it takes you any time to write that kind of code, you’re doing it wrong.  I’ve spent far more time realising that the structure is a performance bottleneck, and planning on the whiteboard a faster alternative.  Neither language helps with this whiteboard time.
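
To give a sense of scale, here is roughly what that kind of code looks like.  This is only a minimal sketch of a singly linked list of integers; the names `struct node`, `list_push`, `list_length` and `list_free` are made up for illustration and don’t come from any library.

```c
#include <assert.h>   /* only for the caller-side checks mentioned below */
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical names, chosen for this sketch only. */
struct node {
    int value;
    struct node *next;
};

/* Prepend a value.  Returns the new head, or NULL if allocation
 * fails, in which case the existing list is left untouched. */
struct node *list_push(struct node *head, int value)
{
    struct node *n = malloc(sizeof *n);
    if (n == NULL)
        return NULL;
    n->value = value;
    n->next = head;
    return n;
}

/* Count the nodes by walking the list once. */
size_t list_length(const struct node *head)
{
    size_t count = 0;
    for (; head != NULL; head = head->next)
        count++;
    return count;
}

/* Release every node in the list. */
void list_free(struct node *head)
{
    while (head != NULL) {
        struct node *next = head->next;
        free(head);
        head = next;
    }
}
```

A caller pushes values with `head = list_push(head, 42)`, checks the result against NULL, and hands the head to `list_free` when done.  None of this says anything about whether a linked list is the right structure in the first place, which is where the whiteboard time goes.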

And all those pointer issues?  This comes down to the tools that you’re using.  If you’re writing in a language and not using its development environment properly, then it’s little surprise that you’re not being as efficient in it.  gdb, valgrind, gprof and gcov are your friends.  Use them well.  I’ve spent enough time dealing with other language-specific issues to believe that pointers aren’t any more evil than (for example) monkey patching.

The vast majority of my time on any new project is first of all spent thinking, and on a more mature project it’s figuring out what I did wrong and how I need to rework it.

Yes, the next biggest use of my time is working out what the best way is to express that.  If I’m writing in C, that means I’m deciding whether it needs a linked list, or a hash table, or some other fancy structure.  But if I’m writing in Python, believe me I spend just as much time normalising my class structure and coming up with all sorts of insane Pythonic ways of doing things.

If you’ve ever written in Perl and not lost a day or two to optimising your regular expressions, or eliminating code to arrive at the shortest possible expression, you’ve never written in Perl.

Refactoring takes you just as long in Python as it does in C.  Just because refactoring C code leaves you getting segfaults doesn’t mean that refactoring Python code is painless: suddenly your class structures don’t match anymore.

Proponents of test-driven-development, AGILE, LEAN and ANEMIC programming methodologies will probably argue that it’s easier to practice their religion with a high-level language.  I’m not buying that either; I’ve managed to write several large software projects in C that have a comprehensive test suite – including testing for allocation failures.
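
For anyone wondering what testing for allocation failures can look like, one common approach is to route allocations through a wrapper that the test suite can tell to fail on the Nth call.  The following is a sketch under that assumption, not a description of any particular project’s test suite; `test_malloc`, `fail_at`, `struct pair`, `pair_init` and `check_alloc_failures` are all hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical test hook: fail the Nth allocation, counting from 1.
 * A value of 0 means "never fail". */
static unsigned long fail_at = 0;
static unsigned long alloc_calls = 0;

void *test_malloc(size_t size)
{
    alloc_calls++;
    if (fail_at != 0 && alloc_calls == fail_at)
        return NULL;            /* injected failure */
    return malloc(size);
}

/* Code under test: copies two strings into a pair.  Returns 0 on
 * success, or -1 on failure with nothing left allocated. */
struct pair {
    char *first;
    char *second;
};

int pair_init(struct pair *p, const char *a, const char *b)
{
    p->first = test_malloc(strlen(a) + 1);
    if (p->first == NULL)
        return -1;
    p->second = test_malloc(strlen(b) + 1);
    if (p->second == NULL) {
        free(p->first);         /* undo the first allocation */
        p->first = NULL;
        return -1;
    }
    strcpy(p->first, a);
    strcpy(p->second, b);
    return 0;
}

/* Make each of pair_init's two allocations fail in turn and check
 * that it reports the error and cleans up.  Returns the number of
 * injected failures that were exercised. */
int check_alloc_failures(void)
{
    int handled = 0;
    for (unsigned long n = 1; n <= 2; n++) {
        struct pair p = { NULL, NULL };
        alloc_calls = 0;
        fail_at = n;
        assert(pair_init(&p, "a", "b") == -1);
        assert(p.first == NULL && p.second == NULL);
        handled++;
    }
    fail_at = 0;                /* back to normal behaviour */
    return handled;
}
```

Running the same loop under valgrind would additionally show whether any of the failure paths leak.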

Ah, Rapid Application Development I hear you say?  Well, the only people I’ve ever heard say how great RAD is are people who’ve never had to support the software that was written rapidly, or debug issues with it years later.  RAD is great when v2 is going to be a complete rewrite, and v3 a complete rewrite again.  Very few websites have upgrades without announcing a completely new codebase.

It’s certainly true that it’s faster to mash up some code in a high-level language.  I use shell scripts and bits of Perl for this kind of thing all the time, and I frequently even do basic mock-ups or essays in Python.  But ultimately it all tends to be throw-away code, that I don’t really ever intend to take seriously or attempt to support later on.

For larger projects, I just don’t see any difference in the time it takes to write.

I’d like to cite an example.  The Git and Bzr revision control systems are about the same age; one of them is written in C and the other is written in Python.  It hasn’t taken any less time to write the one in Python than it has taken to write the one in C.  The one in Python doesn’t have extraordinary features that the one in C lacks.

C# fans would point out how much faster it is to write UI code.  Really?  Then why isn’t Banshee that much more dazzlingly better than Rhythmbox?  Sure they’re different, but there’s nothing there that suggests one language is better than the other.

And do you know what?  I trust code written in C far more than I do any higher level language.  No, that’s probably not fair.  I trust C programmers far more than I do programmers of other languages.  If you tell me I have the option of choosing a program or library written in C over one written in Python or C#, I’ll take the C one every time.

66 Comments

  1. Alex Turner says:

    You seem to be addressing two issues here:

    1) C as a language is as good as any other language

    2) The lack of available standard libraries in C is not a barrier to entry, because it’s easy to write your own.

    C is a fine language. Pointers are awesome. Dealing with your own memory allocation is the job of any responsible programmer; anyone who disagrees simply hasn’t tried to run an 8GB Java VM and then cried when their VM spends 90% of CPU cycles doing garbage collection. If you can’t hack pointers, please for the sake of humanity, stop programming and become an artist. I have come across so much code written by idiots that didn’t understand memory allocation it makes me want to change careers regularly. It’s like being a graphic designer and coming into a job where the CEO’s five year old did the corporate branding, and they ask you to ‘just work with it’. It’s just impossible and horrible. Debugging C is no slower than debugging anything else; in fact I would argue that it is FASTER. Static typing is your friend, not your enemy. The compiler catches things that are stupid before you even run them. If the compiler can catch them, then so can your IDE, which means they are pointed out to you the second after you wrote them, not six months later when someone finally runs that code because QA didn’t catch it.

    Compilers are slow? Really? Ever worked in Java that isn’t crippled by make and other stupid OSS build tools (can anyone really argue that automake/autotools aren’t the ugliest thing ever) that compile every file in a separate process? I think my 20,000 line Java project compiles in less than five seconds. Last time I compiled C++, it was about the same size and took around 20 minutes (though that was a while ago).

    Let’s face it though, Object Oriented is a useful paradigm, and C just can’t do it. For goodness sake, if you are going to pick a language like C, pick C++ or Objective C. They might have flaws, but you do at least get inheritance which is a very useful concept.

    Debugging someone else’s Perl is a living nightmare. Ever try to find memory leaks in mod_perl? *shiver*

    As for libraries. I can’t believe you actually have a job working for a business. It has taken years to come up with the optimal implementations we have in modern libraries today. You can’t just sit down on the weekend and recode these things in five minutes. Ever tried to implement a library to read a PDF? It takes a week just to read and understand the spec. Why bother installing somebody else’s operating system, after all, it’s just a bunch of libraries, surely you could hack together your own in a weekend? Seriously? Have you thought about that for more than two seconds?

    I work in Java. For ONE single reason. Libraries. I have access to a massive collection of high quality object oriented libraries that make developing new web based applications a piece of cake and have largely been standardized around. It means my employer doesn’t have to pay me a year’s salary to re-implement the wheel.

    Java has one major problem, the garbage collector (We’ll just ignore the fact that Sun spent the first 10 years making a hash of it, and coming up with embarrassing UI implementations). Syntactically it’s pretty similar to C++. It’s statically typed, and better yet it avoids the pain of multiple inheritance. The Java world seems to have a good number of mature programmers, who understand that dependency injection is a useful thing, and that testing is good, and that you need an ORM tool that doesn’t suck if you are going to be successful building Enterprise software. Perl doesn’t have a good ORM tool, Python doesn’t have one (a bunch of smart people implemented Zope DB, which is a great idea, but horribly impractical, and the implementation is so lacking in basic tools, it’s embarrassing), and neither does PHP (DBO is a joke), and I don’t remember reading much about database access tools in C lately – though that may be because I’m just not in that community. The people who work in languages like Python and Perl are really far behind the curve, and frankly, too busy implementing mostly pointless libraries to catch up. PHP comes out of the box with zero connection pooling for MySQL that works. I mean can anyone argue that that ISN’T stupid?

    If you are a typical Canonical employee, we can see why Linux is so far behind OS X, and will probably never catch up.

    If I were the people who wrote most of the libraries underpinning what Canonical do, I would be calling for your head on a platter based on your comments. You are calling these people’s work trivial.

  2. Mike says:

    I have to say it’s funny to see this appear again. As someone mentioned it comes up every time there’s a “new” language on the horizon. I remember well the ASM vs C argument, and back then when it all kicked off, ASM was still better, the code generation of C was terrible and you could write much “better” code in ASM than “C” any day. These days “C” is pretty close to ASM, so much so you’re sometimes hard pressed to beat it (particularly with CPUs being so damn complex now).

    C# (my language of choice these days) has the same issues C had in the old days, it’s quite a bit slower than C, and actually takes a long time to learn properly. I’ve spent 2 or 3 years coding in C# now and I’m pretty good at knowing what’s good and bad about it, but I still get surprised sometimes at just how bad the JIT can be in places. I think new programmers are more productive in C# because a lot of grunt work is done for them, but you need some experience with the language to get the best out of it, and even then I’ll spend just as long trying to optimise it as I did with C or even ASM in some respects. I realise the JIT is still evolving but it sucks in some important places. However, the bit I really like about it is that long after my program has finished and shipped, it’ll still be getting optimised and faster as M$ or the Mono guys keep updating the JIT – and I’ll never have to lift a finger.

    That said, I do still get dirty with pointers in C# for real speed ups, but that doesn’t affect the JIT just memory management, and managed memory is great for the little things, but sucks when you need to manage important data. Although again this will improve.

    I guess what I’m trying to say is that we’ve seen this all before. A new language comes along that we don’t want to have. It’s slower than what we’re used to, we can’t find the functions we need in the libraries so we’re more productive “the way we were”. However eventually the compilers (and JIT) will improve and speed will be comparable, and we’ll get enough experience with it that we’ll finally make the switch. I’m not a big Python fan, but I love C#. The lack of header files alone is a godsend, not to mention the rapid build times. Sure there are times where I still want to drop down into assembler and even the C# guys recognise this by allowing you to plug in native DLLs so you don’t lose out completely.

    Give it time… new languages are always hard to swap over to, you have to be convinced that they work as well as your current choice, but it’ll happen eventually.

  3. Brandon says:

    I often hear an argument that “higher-level” languages and massive standard libraries / frameworks keep you from reinventing the wheel or wasting time repeating your code. Typically, it’s from people who implement software that has a well-known solution with mechanisms that can be reused from implementation to implementation. Programmers who write this kind of software see the same problems over and over and don’t want to solve the same problem twice. They let other people create faster databases, better data structures, and more useful platforms for them to build on top of. This saves a lot of time and effort and allows everyone who needs a solution to share in developing it and debugging it.

    For those of us implementing novel solutions that have not been explored before, choosing the best cookie-cutter library is not a very important part of the job. Most of our time is spent drawing up solutions on white-boards and writing algorithms and data structures that have just the right performance and memory characteristics. When there is a library that does what I need, I usually end up plugging it in as a prototype to get my system functioning and then throw it out and replace it with my own after I pare down which features I actually use from it and which ones are causing it to be overcomplicated.

    In response to the comment about libraries, C has the most comprehensive set of libraries available to it, hands-down. Sure, the standard libraries on a system are very minimal but a C compiler can link with libraries spanning across 30+ years from FORTRAN, C, and C++, or anything else that compiles to your system’s library format. Venerable libraries such as ARPACK and LINPACK are common examples.

    I concur that run-time systems like Microsoft’s CLI and more expressive languages such as Python make the task of programming easier. But when the simple task of writing out the program is not your greatest challenge, you just don’t care about those extra convenient features provided by more complicated systems. They may only give you the extra complication and none of the advantages.

    I decided to put in my 2 cents because I see a lot of web developers and application developers who think that’s all anyone ever writes. Contrarily, there are a lot of developers working on custom embedded systems, new hardware, scientific simulation, development tools, and high-performance libraries who would see things very differently.

  4. glok twen says:

    the fundamental flaw i have with this post is that the author appears to conclude choice of language is a valid way to judge or infer programmer capability. there are many flaws with this logic i could point to. but i’ll keep it brief. i’d put language choice as one of the lesser, near useless indicators of capability. near the top would be testing habits (rigorous unit test experience and peer-reviewed unit test condition/results walk-throughs); ability/experience in getting to clear requirements and functional designs; and naturally also standard measures of logical capability and symbolic transformations.

    the most valuable part of the post was laying out the various pros and cons for when any given project might lean toward one language versus the other. and this was largely evident via the comments. the conclusion of the op to me is utterly useless.

  5. I happen to have written a very similar essay (click through my name, I think).

    The key element is the fact that most of the time that you spend on your project is not spent churning out repetitive code that benefits from a bunch of language features. Most of the code that you write is not this sort of repetitive code either. The important decisions influence the structure of your program and are language-neutral. If you make these decisions well, your program will flourish in any language if you have competent programmers. If you make them poorly, it will flounder in any language. But the little gains that you get by having a language that does fancy type magic, pointer checking, built-in objects, yada yada…they simply don’t matter compared to all of the other factors. Unless, of course, you just absolutely must release constantly. But you _do_ get a performance penalty.

  6. Kieran says:

    @glok twen:

    I don’t think the OP was saying that writing code in C causes it to be better, but rather that there appears to be a correlation between good programming/programmers and programs written in C.

    As you say, important factors are things like testing habits and experience, but at least in my experience the projects with the best unit tests and code reviews are often written in C. It’s a correlation, not necessarily a causation though.

  7. Matt Elmer says:

    On fluency in C and programmer competence:
    http://www.joelonsoftware.com/articles/LeakyAbstractions.html

    From an implementor of novel solutions:
    http://archive.gamespy.com/legacy/articles/devweek_c.shtm

  8. Darcy says:

    Use the right tool for the job, and keep an open mind. That’s my 2 cents. Quick and dirty 10 dialog recipe app? VB. PC game? C++. Solid state micro chip for specific embedded need, like a sensor? Assembler. If every experienced programmer would simply embrace the real world and use the right tool for the job, they wouldn’t treat programming choices like religion. Just get real, and be real.

  9. Darcy says:

    Use the right tool for the job, and keep an open mind. That’s my 2 cents. Quick and dirty 10 dialog recipe app? VB. PC game? C++. Solid state micro chip for specific embedded need, like a sensor? Assembler. If every experienced programmer would simply embrace the real world and use the right tool for the job, they wouldn’t treat programming language/compiler choices like religion. Just get real, and be real.

  10. glok twen says:

    @Kieran – agree i was commenting on an implication of the last paragraph. also agree on the difference between causation and correlation which the world often confuses.

    in that case, then, i’ll say that in the OP’s opinion, evidence seen indicates a correlation between c programmers and “trust”-worthiness. the evidence i have seen is that there is no reliable correlation between language coded in and “trust”. instead, the factors more highly correlated with trust that i’ve observed are the ones listed.

  11. Steve says:

    C is great. Python is great. Ruby is great. Heck, Java is great (I feel dirty saying that) … assuming you have good programmers. I agree that bad programmers can hide in higher level languages much easier than in lower level ones and it’s important to have good engineers working on a code base no matter what language it is in.
    I do have a hard time agreeing that higher level languages don’t give you any productivity improvement. If that is really the case there is a lot of wasted time and effort writing and updating higher level languages … and by higher level languages I mean anything above assembly. If you’re using the right tools assembly should be just as fast and easy to write as Ruby! If it isn’t that easy, you must be doing it wrong … right? :-)

  12. JanC says:

    Actually, using pointers is much easier to understand in assembler than it is in C, so you’re right that higher level languages are the wrong way to develop software… ;-)

    PS: of course that’s because the C syntax for using pointers is probably the worst and most inconsistent of any language in common use…

  13. xteraco says:

    I haven’t read the comments, but I agree with this post. It seems that new fads are driving people to forget how powerful our old languages are. School teaches industry standard languages. CEOs decide the industry standard language, CEOs decide what language their coders use. CEOs have no business making this type of decision. Now because n00bs all learn Java as their first language (in schools), and learn that it is “right”, they will accept nothing else.

    I need to start a blog, maybe I can meet more people like the original poster! =D

  14. Mughal says:

    As far as the programmer is concerned, the language is only the tool; it depends how we utilize it. So in this regard assembly is the best, giving us the full capability to handle the PC’s power.

  15. Jason Carr says:

    This is certainly interesting, though I have to disagree. With any programming language comes a learning curve, and I’m not convinced that the learning curve is any greater for a language like C# than C (in fact, I would argue in the opposite direction). The closer we come to representing common spoken language and real life within our programming languages, the easier and faster programming will be. Clearly, C is a lot farther away from this than many modern programming languages.

    To say that it is just as quick or quicker to develop in C as it is to develop in C# is to throw away much of the progress that has been made in recent years. It is painfully obvious that developing a straightforward Windows (or any other GUI-based) application is much, much quicker using a modern, truly object-oriented programming language than using C, for many reasons. Some of them include fewer lines of code, better representations of real world objects, easier to read code, better designer tools (IDE GUI tools), and event-based development. C was not designed for these approaches.

    Straight up it is quite odd and misplaced to argue that C is just as fast or faster for development than modern languages. We all struggle with finding our comfortable favorite languages and approaches and sticking to them, but in this industry we do not have the luxury of sitting back and relaxing where we are at. I’m afraid that’s what you’re encouraging and fighting for here. It seems pretty silly to me.

  16. Niklas Cholmkvist says:

    But isn’t the C# language owned by Microsoft?
