Why you should probably not bother about Ruby’s speed


2011/03/21 Xavier Noëlle

The expressiveness and power of Ruby make it a really efficient programming language. Its dynamic nature brings lots of enjoyable benefits (introspection, dynamic class reopening…); it is also partly responsible for its performance, which is sometimes believed to be poor.

Is Ruby really slow? While it is obviously slower than most natively compiled languages, is the performance gap really worth noticing, and should it prevent you from using this great language?

If all you’re coming for is pure performance benchmarks, you may directly skip to the last part of this article (“Lies, damned lies, and benchmarks”), since this article does not aim at benchmarking Ruby.

ruby --version

Depending on your OS or the way you installed Ruby on your computer, you may be facing two different major releases of Ruby. If you take a look at Ruby’s history, you will notice that the 1.8.x branch started in 2003 and was joined by the 1.9.x branch at the very beginning of 2009. Since then, these two versions have coexisted, while differing in a number of ways.

The most interesting change is related to the interpreter itself: YARV, the mainstream Ruby 1.9.x interpreter, now compiles Ruby code to bytecode and executes that, rather than constantly parsing and executing unoptimized Ruby code. Together with a few other optimizations, this gave most Ruby scripts a great speed boost (about three times faster on average, according to the benchmarks that are legion on the Web: for example, here and here).
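To make the bytecode step a little more concrete, here is a minimal sketch that compiles a one-line snippet and prints the resulting instructions. It assumes an MRI/YARV-based Ruby (1.9 or later): the RubyVM constant is specific to that implementation and is not expected to exist on JRuby or Rubinius. The snippet and the variable names are only illustrative.

```ruby
# Assumption: running on MRI/YARV (Ruby 1.9 or later). RubyVM is
# implementation-specific and is not available on other interpreters.
source = "1 + 2"

# Compile the source once into a YARV instruction sequence (bytecode).
iseq = RubyVM::InstructionSequence.compile(source)

# Human-readable listing of the instructions the VM will execute.
listing = iseq.disasm
puts listing

# Execute the compiled instruction sequence.
result = iseq.eval
puts result
```

The exact listing varies from one Ruby version to another, so it is best read as an illustration of what the VM works with rather than as a stable format.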

Sadly, most beliefs about Ruby are based on the old 1.8.x branch, which has been the most widely used branch so far but is gradually being replaced by newer versions; a great effort is being made to migrate major gems (Ruby packages, such as Rails) and applications to the 1.9.x branch. The new Ruby interpreter comes with a revision of the language itself, which makes this migration harder than it seems and explains why some people are still late to the party.

One language to rule them all, and in the runtime bind them

Aside from the well-known MRI and YARV, which ship with Ruby 1.8.x and Ruby 1.9.x respectively, some other mature Ruby interpreters are available and quite widely used. They feature additional or structural refinements that make them better suited than the mainstream interpreters in some situations. Two of the most notable challengers are:

JRuby: an interpreter based on the Java Virtual Machine, providing an easy integration with Java libraries as well as the obvious advantages of running inside the JVM;
Rubinius: an interpreter written in C++, which features Just-in-time (JIT) native machine code compilation.

These interpreters outperform the standard interpreters on some specific problems (one of them being parallelism efficiency, which will be the main topic of an upcoming article) and may be worth benchmarking when performance becomes a concern, before deciding to drop Ruby.
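If you want to try this, one possible approach is to run the same small script under each interpreter and compare the lines it prints. The sketch below is only an illustration: the workload method is a hypothetical placeholder for the code you actually care about, and a single timed run is far too crude to draw conclusions from.

```ruby
require 'benchmark'

# RUBY_ENGINE is defined by Ruby 1.9+, JRuby and Rubinius; fall back to
# "ruby" for older interpreters that do not define it.
engine = defined?(RUBY_ENGINE) ? RUBY_ENGINE : 'ruby'

# Hypothetical workload: replace it with the code path you want to compare.
def workload
  (1..200_000).inject(0) { |sum, i| sum + i * i }
end

# Wall-clock time, in seconds, of one run of the workload.
elapsed = Benchmark.realtime { workload }

# Tag the measurement with the interpreter name and version.
report = format('%s %s: %.4f s', engine, RUBY_VERSION, elapsed)
puts report
```

Running the workload several times and keeping the best or median time would give more trustworthy numbers, especially on interpreters that need a warm-up phase before their JIT kicks in.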

C for hotspots, Ruby for everything else

Most of the standard Ruby library, as well as a number of gems (Ruby packages), is written in pure C. Indeed, Ruby features easy interactions with C and C++: you can use the Ruby API in your C code and your Ruby code can, in turn, easily retrieve results from your C code. This is roughly achieved by using one or more C files, a file composed of a few lines of Makefile-like syntax which will build a library for you, a function call and… that’s it!^[1]

Knowing that, it would be wise to reconsider Pareto’s principle: most of the time, 20 % of your code is responsible for 80 % of your application’s runtime. Rewriting these slow parts in a compiled language like C, while keeping the structural parts human-readable by using a high-level language, should be the way to go when possible (i.e. most of the time); that is exactly the Ruby way of thinking. Moreover, when profiling your applications, you will often notice that the slow parts are completely independent of the language you’re using.
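Before rewriting anything, it helps to measure where the time actually goes. The following is a rough sketch of that idea using only the standard Benchmark module; the three stages are hypothetical stand-ins for the parts of a real application, and a dedicated profiler would give finer-grained results.

```ruby
require 'benchmark'

# Hypothetical pipeline stages; in a real application these would be your
# own methods (parsing, computing, rendering...).
stages = {
  'parse'   => lambda { (1..20_000).map { |i| i.to_s } },
  'compute' => lambda { (1..300_000).inject(0) { |acc, i| acc + i % 7 } },
  'render'  => lambda { (1..20_000).map { |i| "row #{i}" }.join("\n") }
}

# Time every stage, then compute the total runtime.
timings = stages.map { |name, stage| [name, Benchmark.realtime { stage.call }] }
total = timings.inject(0.0) { |sum, (_, seconds)| sum + seconds }

# Print the stages from slowest to fastest, with their share of the total.
timings.sort_by { |_, seconds| -seconds }.each do |name, seconds|
  share = total > 0 ? 100.0 * seconds / total : 0.0
  puts format('%-8s %.4f s (%5.1f %%)', name, seconds, share)
end
```

Only the stage at the top of such a list is a candidate for optimization, whether that means a better algorithm or, as a last resort, a C rewrite.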

And believe me: most of the time, you won’t even need C at all! I have never really needed to rewrite parts of my applications in C myself, even when raw performance was a vital concern. In fact, the only time I really thought I needed it led to a complete waste of time, since I was trying to improve performance by focusing on the wrong problem. This is not to say “never use C” but rather “use C when all other reasonable options have failed”; mixing languages may reduce the readability and maintainability of your source code.

Lies, damned lies, and benchmarks

If you expected benchmarks in this article, you will be disappointed. The only general truth that should come to your mind is the following ranking^[2] (sorted by decreasing performance):

Natively compiled languages (C, C++, Fortran, Pascal…)
Byte-compiled languages (Ruby 1.9.x, Python, Java…)
Interpreted languages (PHP, Shell-scripting languages, Ruby 1.8.x…)

Everything else is usually a waste of time for general applications, especially when you are allowed to mix compiled chunks of code into your Ruby or Python application. Keep in mind that the time you save by using a more powerful language (in terms of expressiveness, readability, standard structures…) can be invested in serious algorithmic and parallelization optimizations, which will make a greater difference most of the time.
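As a small illustration of the algorithmic side of this argument, here are two ways of finding the duplicated values in an array. Both method names and the sample data are made up for this sketch. The first version rescans the whole array for every element (quadratic time), while the second remembers what it has already seen in a Set (roughly linear time).

```ruby
require 'set'

# Quadratic version: for every element, rescan the array to count it.
def duplicates_slow(items)
  items.select { |item| items.count(item) > 1 }.uniq
end

# Linear version: remember what has already been seen in a Set.
def duplicates_fast(items)
  seen = Set.new
  dups = Set.new
  items.each do |item|
    # Set#add? returns nil when the item was already present.
    dups << item unless seen.add?(item)
  end
  dups.to_a
end

sample = [3, 1, 4, 1, 5, 9, 2, 6, 5, 3, 5]
p duplicates_slow(sample).sort  # => [1, 3, 5]
p duplicates_fast(sample).sort  # => [1, 3, 5]
```

On a handful of elements the two are indistinguishable, but the gap grows with the size of the input, and this kind of change is independent of the language the code is written in.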

Well, if you really want benchmarks (because I know you’ll be upset otherwise), you may want to follow this link (but I warned you!).

Conclusion

I hope this article has convinced you that considering Ruby a slow language is mostly false and generally irrelevant. For those of you who need extreme performance, a natively compiled language like C or C++ will certainly perform better in some cases.

But if you don’t fit in this category and, nevertheless, plan to drop Ruby^[3] just because you think it won’t be fast enough for you, you may want to reconsider your thoughts…

[1] This will be discussed more thoroughly in an upcoming article about Ruby and parallelism.
[2] Languages are categorized with respect to their current main implementation (e.g. MRI for Ruby 1.8, YARV for Ruby 1.9…).
[3] The same reasoning applies to Python as well.

Categories: A few words about... Tags: benchmarks, Ruby, speed


mart-e

2011/06/06 at 18:38


Interesting (and certainly not wrong) way of seeing things.
I’m learning C++ to improve the speed of my programs, but maybe I’ll change my mind and stop using this terrible language (I hate segmentation faults!).
Xavier Noëlle

2011/06/27 at 15:30


Glad you like this article, but don’t get me wrong: my point was not to state that C++ is a terrible language (I love C++ :-)), but rather to fight against the conventional wisdom that using a natively compiled language is the only way to get decent performance.

Be sure to read my forthcoming articles about distribution using Ruby, and an easy way to use C++ inside Ruby 😉
Quentin Raynaud

2011/09/09 at 16:35


I read the whole article and found it interesting. Only one point bothers me: the lack of real data comparing the speed of scripting languages. I won’t deny that Ruby has made some huge improvements in speed; this is common knowledge.

But I might argue a little with the point made. Even if Ruby has acquired a large community over time, Python still has a much bigger one. Google and other companies are putting a lot of effort into making it more and more efficient. I believe that it is still on par with Ruby, if not better. And you also forgot to mention another scripting language: JavaScript. It is common to put it in the “interpreted language” category. That would be very wrong.

You all know that browsers are fighting over JavaScript performance. This has allowed JavaScript to make a huge comeback. And when I say huge, I really mean it. Three years ago, it was even slower than PHP; really, it was poor. Now, it is not only byte-compiled, it goes even further than that: it is just-in-time compiled when this makes sense. It uses inference too. There are a lot of optimizations in JavaScript that make it able to compete even with C in some cases!

It might seem a somewhat crude language at first glance. If you dig deep into JavaScript, you might find it is a lot more subtle than you thought. And it is really powerful: probably as much as Ruby is, probably even more. And it is definitely more efficient than Python or Ruby for the time being.

So, I would suggest that people think about the huge win it represents for web development: one language for both the server and client sides. When you write the code that checks your latest form, you can make it validate the form both on the client before submitting and on the server after that. Sure, you’ll need some sort of header function to collect the data, which will be different on each side, but you get my point.

Better, there already are HTTP servers out there that allow you to write your JavaScript server-side and client-side code in a single file, letting you choose where each function should be processed (server-side or client-side) without even having to worry about how the data will be sent from one to the other. Each of those function calls is simply replaced on each side to ensure the data is sent over the network at the appropriate time and processed as it should be. Depending on your needs, you can use either a synchronous or an asynchronous call, allowing you to fine-tune your performance on the client side.

Well, I’m not telling you JavaScript is ready to become the next answer to everything, but I’m definitely telling you that it’s bad to forget it in such an article!
