Wednesday, October 5, 2011

Perl vs. Python dog fighting

So Python is the hot new thing and I, as an old Perl dog, should make a statement on the subject.  You're guessing I'm going to tell you why Perl is better?  You guessed right!  Specifically, I want to address the arguments that Python encourages better programming practices or that it is better for teaching.  This is true.  However, as educators (in the field of bioinformatics but also elsewhere) we need to consider what our ultimate goal is: an easy training process, or skilled, well-trained people at the end of it.  The analogy that immediately comes to mind is naturally fighter training in the Israel Air Force.  In the first decades of Israeli history, when the threat of annihilation was imminent and the culture was very much influenced by the Soviet model, Israeli fighter pilots took lots of "stupid" risks during combat training.  As a result, some classes lost as many as 50% of their fighter pilots over their 20 years of service, and training accidents were more common than KIAs.  The other result was that Israeli pilots could win fights against less well trained pilots even when heavily outnumbered.  This practice has been reversed in more recent decades (after stabilization of the state and the influence of American culture...) and training accidents are now much less common.  We cannot judge whether Israeli pilots are still as good at dogfighting because (a) the last heavy air-to-air fighting was in 1982 and (b) dogfighting is less important with modern fighter jets.

Nevertheless, the point remains and the analogy is useful.  Perl requires harder training but will ultimately be more powerful.  Python is nicer for teaching but more limited.  Python is a more constrained language with a stricter syntax and object-oriented design, which are attractive qualities for the teaching of programming.  Perl is the "Pathologically Eclectic Rubbish Lister" (Larry Wall).  It doesn't restrict the programmer from using bad programming practices.  And it is true that you can find a lot of badly written Perl code out there.  But this is not because Perl is a bad language.  It is because some Perl programmers lack a good education in programming.  Guns don't kill people.  People kill people.  A well written program can be written in any language if the programmer knows how to structure the code intelligently and chooses syntax that makes the code clear and understandable (rather than exploiting the notorious possibilities for code obfuscation in Perl).  So it is up to us as educators to produce students well educated in good coding practices.

The flip side of this coin is that Perl is much more versatile, flexible, and powerful than Python (and other alternatives).  It allows more efficient development of new code, i.e., shortening the time and effort from the moment you sit down to write a bit of code to the moment you can run it and get your results.  This is highly desirable for practical programming such as short scripting for bioinformatic analysis.  So as educators of practical programmers we should make the effort to teach good programming practices while using the most powerful language available for our students.  At the present time, especially for bioinformatics, this language is Perl.  Teaching constrained languages to "force" our students into good practices is taking the easy way out as educators and handicapping the future potential of the people we train.


  1. I like Perl much better than Python but don't think it's a good language to start with. Old school C is my preference, as it forces you to actually understand some details before you start programming. Where I work now they start with Java, which is probably also OK. Scheme is also good for teaching, although I'm not a great fan of its syntax.

  2. Thanks for your comment, Ran. You do have a point there. But when I was writing this post I was thinking about molecular biology students, who will most likely never want to do anything other than scripting. For them C or Java are a waste of time and energy. They should have good Perl teachers that will be able to properly explain the details using Perl. For example, C is good for learning about pointers. But you can also get a good enough understanding of them from the use of references in Perl, if taught well. My bottom line is - it all comes down to good teaching rather than choosing an inappropriate language because it "forces" the students to learn...

  3. How many of the molecular biology students actually write their own scripts after the course? My guess is very few, and those few would have benefited from learning other languages. Still, I may be wrong here. I haven't got much experience with biology students learning to program.

  4. Oh well - I was thinking about those who will program. The course is meant for them. Those that won't ever program might as well save the time and skip this class. From my experience with those that do apply what they learned - Perl does the job!

  5. As Eyal knows, I favor Python for teaching.
    I will start with my own pedigree: first exposed to programming through BASIC, formal training in Pascal, then moved to C during my master's, and used C for 99% of what I did, including teaching biology undergrads and small stuff for which a scripting language would have been better suited. Nowadays I hardly program anymore, but when I do it's a mix of shell and R, with very little Python, Perl and C.
    My reasons for supporting (some would say pushing) Python are that:
    - I agree with Eyal that C or Java are not useful for most biologists, even among the subset of biologists who program. Those who really need these big languages or others can learn them or take specialized courses, but they are not the subject of the discussion "which language to teach biologists?".
    - In my generation, Pascal was the major teaching language, and I believe that we gained a lot from the rigor and constraints of Pascal. I do not believe that we lost much from the fact that it was hardly used in research or industry. When I transitioned to C, it was relatively painless, and I benefited from the good habits that Pascal had forced on me. Similarly, Python enforces good habits, which Perl does not. After training in Python, it will be easy to transition to Perl if that's what's used in your new environment. The inverse transition is also easy, BTW, which limits the importance of these discussions in a way. But I believe that, like Fortran or BASIC in another age, Perl will let you develop (maybe lead you to develop) bad habits which will stick. Yes, you know you should stop smoking, but it's not that easy.
    - Finally, I favor Python over more exotic choices (Ruby, OCaml, whatnot) because it is nonetheless one of the major languages of bioinformatics, with a thriving and modern Biopython project, and easy interfacing with R. Moreover, it is one of the main languages of data science, which shares many of the aims and issues of bioinformatics. And for what it's worth, in every discussion I've had with computer science professors about teaching programming, the mere mention of Perl sends them running to Java and C++, whereas Python is welcomed, because apparently it allows one to teach all the important concepts of a modern programming language well. I'm not a computer scientist, but that is my experience.
    Finally, I think that the analogy is flawed. We will not lose 50% of the students and keep 50% of good programmers; we will teach 100% of the students to program quick and dirty. I do not think that Perl is "harder training". That would be teaching C (which I have done, and which is a bad idea). Perl is not harder to learn than Python, it just imposes less structure.
    Long rant, but you asked for it. ;-)
    I will refrain from offering a better analogy, because of my personal lack of experience in combat, for which I am grateful.

  6. Note added in proof: Dennis Ritchie, of Kernighan and Ritchie, the book which taught me what a modern programming language was like, just passed away.