Happy Birthday R! R is turning 20 years old next Saturday

Articles From: Jozef's Rblog

Originally posted on February 22, 2020 on Jozef’s Rblog:
https://jozef.io/r921-happy-birthday-r/

Here is how much bigger, stronger and faster it got over the years

Excerpt

It is almost the 29th of February 2020! A day that is very interesting for R, because it marks 20 years since the release of R v1.0.0, the first official public release of the R programming language.

The first release of R, 29th February 2000

The first official public release of R happened on the 29th of February, 2000. In the release announcement, Peter Dalgaard notes:

“The release of a current major version indicates that we believe that R has reached a level of stability and maturity that makes it suitable for production use. Also, the release of 1.0.0 marks that the base language and the API for extension writers will remain stable for the foreseeable future. In addition we have taken the opportunity to tie up as many loose ends as we could.”

Today, 20 years later, it is quite amazing how true the statement about the API remaining stable has proven. The original release announcement and full release statement are still available online.

You can also still download the very first public version of R. For instance, for Windows you can find it on the Previous Releases of R for Windows page. It still runs, even under Windows 10.

Faster – R today versus 20 years ago?

With the 20th birthday of R approaching, I was curious how much faster the implementation of R got with each new version. I wrote a very simple benchmark that solves the Longest Collatz sequence problem for the first 1 million numbers with a brute-force-ish algorithm.

I then executed it on the same hardware using 20 different versions of R, starting with the very original 1.0, through 2.0 and 3.0, all the way to today’s development version.

Benchmarking code

Below is the code snippet with the implementation to be benchmarked:

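# Count the steps needed for n to reach 1 under the Collatz rules
col_len <- function(n) {
  len <- 0
  while (n > 1) {
    len <- len + 1
    if ((n %% 2) == 0)
      n <- n / 2
    else {
      # 3n + 1 is even for odd n, so do the halving right away
      # and count it as a second step
      n <- (n * 3 + 1) / 2
      len <- len + 1
    }
  }
  len
}

# Time 10 repetitions; gc() runs before each one so that leftover
# garbage from the previous run is less likely to affect the timing
res <- lapply(
  1:10,
  function(i) {
    gc()
    system.time(
      max(sapply(seq(from = 1, to = 999999), col_len))
    )
  }
)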
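As a quick sanity check, the function can be run on a few starting values whose Collatz step counts are well known. This is only a sketch and not part of the original benchmark; `col_len` is repeated so the snippet runs by itself. The odd branch applies two steps at once (3n + 1 is always even for odd n, so the halving follows immediately), which is why it increments `len` a second time.

```r
# Same function as in the benchmark, repeated to keep the snippet self-contained
col_len <- function(n) {
  len <- 0
  while (n > 1) {
    len <- len + 1
    if ((n %% 2) == 0)
      n <- n / 2
    else {
      n <- (n * 3 + 1) / 2
      len <- len + 1
    }
  }
  len
}

col_len(1)       # 0 steps: already at 1
col_len(3)       # 7 steps: 3, 10, 5, 16, 8, 4, 2, 1
col_len(27)      # 111 steps
col_len(837799)  # 524 steps, the longest chain for a start below one million
```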

Results

Now to the interesting part, the results. The chart below shows boxplots of the time required to execute the code, in seconds, with R versions on the horizontal axis.

[Figure: Execution time by R version. Vertical axis: time (seconds), 0 to 1250. Horizontal axis: R versions from 1.0.0 to devel.]
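A sketch of how the timings collected in `res` might be reduced to the numbers a boxplot draws. The reduced `upper` limit and the three repetitions are assumptions chosen so that the snippet finishes quickly; the benchmark itself uses 999999 and ten repetitions.

```r
# Same function as in the benchmark, repeated to keep the snippet self-contained
col_len <- function(n) {
  len <- 0
  while (n > 1) {
    len <- len + 1
    if ((n %% 2) == 0)
      n <- n / 2
    else {
      n <- (n * 3 + 1) / 2
      len <- len + 1
    }
  }
  len
}

upper <- 2000  # hypothetical reduced limit, not the value used in the article
res <- lapply(
  1:3,         # hypothetical reduced number of repetitions
  function(i) {
    gc()
    system.time(max(sapply(seq(from = 1, to = upper), col_len)))
  }
)

# system.time() returns a named vector; "elapsed" is wall-clock seconds
elapsed <- sapply(res, function(x) x[["elapsed"]])

median(elapsed)   # the line in the middle of each box
fivenum(elapsed)  # minimum, lower hinge, median, upper hinge, maximum
```

Running the same script under each installed R version and collecting `elapsed` per version would give one box per version.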

We can see that the median time to execute the above code to find the longest Collatz sequence amongst the first million numbers was:

  • February 2000: More than 17 minutes with the first R version, 1.0.0
  • January 2002: A large performance boost came already with the 1.4.1 release, which cut the time almost 4x, to around 4.5 minutes
  • October 2004: Even more interestingly, my measurements showed another big improvement with version 2.0.0, to just 168 seconds, less than 3 minutes. However, I was not able to get such good results for any of the later 2.x versions
  • April 2014: Another speed improvement came 10 years later, with version 3.1 decreasing the time to around 145 seconds
  • April 2017: Finally, the 3.4 release brought another significant performance boost. From this version on, the time needed to perform this calculation is less than 30 seconds
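
To put the medians listed above side by side, the speedup relative to 1.0.0 can be worked out in a few lines of R. The inputs are rounded from the figures quoted in the list ("more than 17 minutes" is taken as 1020 seconds and "less than 30 seconds" as 30), so the ratios are approximations, not measured results.

```r
# Approximate median times in seconds, rounded from the list above
median_secs <- c(
  "1.0.0" = 17 * 60,   # "more than 17 minutes", so a lower bound
  "1.4.1" = 4.5 * 60,  # "around 4.5 minutes"
  "2.0.0" = 168,
  "3.1"   = 145,       # "around 145 seconds"
  "3.4"   = 30         # "less than 30 seconds", so an upper bound
)

# How many times faster each version is than 1.0.0
speedup <- median_secs[["1.0.0"]] / median_secs
round(speedup, 1)
```

Under these rounded inputs, 1.4.1 comes out roughly 3.8 times faster than 1.0.0, consistent with the "almost 4x" in the list, and 3.4 comes out at least 34 times faster.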

Visit Jozef’s Rblog to read more about the history and development of R: https://jozef.io/r921-happy-birthday-r/
