About the Site
Why (another) Website about R?
Usually, my response to any question that begins with “why” and has to do with some act or behavior on my part tends to be along the lines of a) “why not?”, b) “why anything?”, and/or c) “because I can.” However, considering that this website is about the R programming language, something that I admire, support, and enjoy a great deal, I feel a slightly more detailed explanation is warranted. To that end, I consider this site to be loosely oriented toward three main purposes:
- A homage to what, in my opinion, is one of the most rewarding programming languages there is.
- An attempt to delineate the distinct benefits that come with using R as (one of) your primary data-focused languages, as well as an attempt to counteract false beliefs and foolish assumptions that have propagated across the internet over the years. In other words, propaganda!
- A launching pad where I can document both my own and others' attempts to explore, extend, and better understand the language, both in the past and going forward.
I'll attempt to touch on each of these briefly and in order...
First, I'll start by discussing what I personally appreciate about the language. Whenever I reflect on this topic, I think about the character Ed Harris plays on HBO's Westworld, who obsessively roams the theme park seeking “A DEEPER LEVEL OF THE MAZE.” Much like Ed Harris, minus the wanton violence and sadism of course, I keep realizing that there is always another level of understanding to reach and vast swaths of information that I remain ignorant of, which makes R both a challenge and incredibly rewarding.
A number of examples help illustrate this point. Notably, there is a paper hosted on the R-project website itself that explores this very theme (courtesy of Paul Johnson). During a talk, Hadley Wickham himself noted that before he started writing Advanced R, he didn't really understand what “advanced R” was. (I recommend taking my word for it, since the sound is terrible.) This is especially notable given that he first began developing R packages in the early 2000s and the book was published in 2014!
The final example I'll give centers on the complete and utter flexibility of the language, which allows a user to do, well, just about anything. I won't delve deeply into this here, but as a tease, look at the base-R documentation for `?asNamespace` by pasting that line of code into your console, and tell me what other language is so chock full of esoteric functions and cryptic warnings from the core development team. Needless to say, there is a reason so many statisticians and scientists who happen to come across R while pursuing activities related to their own domains come to the realization that statistical programming is actually more rewarding than anything their field could provide!
Moving on to purpose number two: why should you use R for your data programming needs? I'd argue because it was designed to do, and lives up to doing, exactly that: letting you program with data easily and efficiently. As John Chambers, designer of the S language (the precursor to R) and current member of the R Core Team, put it, the purpose of the language is:
“To turn ideas into software, quickly and faithfully.”
Along with the two principles:
“The first principle I propose is that our Mission, as users and creators of software for data analysis, is to enable the best and most thorough exploration of data possible. [and 2] ...computations and the software for data analysis should be trustworthy: they should do what they claim, and be seen to do so.”
Once you get the hang of its syntax and its idiosyncrasies, R will let you slice through your data unlike much else that I've encountered. As long as you are comfortable with a computer keyboard, R will let you code nearly as fast as you can think. Compare this to SAS, where the user must specify whether the intended action is a 'data step' or a 'proc step', imperatively write out each command followed by the requisite semicolon, and finish it all off with a 'run' statement.
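To give a rough sense of what I mean, here is a minimal sketch in base R using the built-in `mtcars` data set. The variable names (`cars`, `efficient`, `mpg_by_cyl`, `fit`) are just placeholders I picked for illustration.

```r
# Built-in example data; any data frame would work the same way.
cars <- mtcars

# Filter rows and pick columns in a single expression.
efficient <- cars[cars$mpg > 25, c("mpg", "cyl", "wt")]

# Mean miles-per-gallon by number of cylinders, via the formula interface.
mpg_by_cyl <- aggregate(mpg ~ cyl, data = cars, FUN = mean)

# Fit a linear model and pull out its coefficients.
fit <- lm(mpg ~ wt, data = cars)
coef(fit)
```

Each step is one expression, with no surrounding step declarations or terminating statements.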
In terms of misconceptions such as “R is hard to learn” (no) or “R is slow” (well yes, but... see Efficient R Programming), I plan to take a more nuanced approach to dealing with both imagined and actual shortcomings of the language (yes, they do exist). Briefly though, I think the strongest counterargument to criticisms of R lies in the truly extensible nature of the language. In other words, if there is some aspect of the language that you don't like, then by all means, write a package or contribute to an ongoing project that attempts to fix said issue. R is incredibly accessible for non-programmers who want to begin programming, and while understanding R's S3 system or what lexical scoping means will make things easier, neither is by any means necessary.
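For the curious, both ideas fit in a few lines. This is only a sketch: `describe` and `make_counter` are hypothetical names I made up for illustration, not functions that ship with R.

```r
# S3: a generic dispatches on the class of its first argument.
describe <- function(x, ...) UseMethod("describe")
describe.default <- function(x, ...) "something else"
describe.data.frame <- function(x, ...) {
  paste("a data frame with", nrow(x), "rows")
}

describe(mtcars)  # dispatches to describe.data.frame
describe(1:10)    # falls back to describe.default

# Lexical scoping: a function looks up free variables in the
# environment where it was defined, not where it is called.
make_counter <- function() {
  count <- 0
  function() {
    count <<- count + 1  # modifies `count` in the enclosing environment
    count
  }
}

counter <- make_counter()
counter()
counter()
```

You can write plenty of useful R without ever defining a generic or a closure, but knowing they exist explains a lot of the behavior you will run into.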
Even if you don't have the time or inclination to work on something yourself, chances are that other people who share your frustrations will take up the challenge. For example, Matt Dowle wanted a more efficient way to work with large data sets (useR talk, 2014), along with a simpler system for subsetting and other data frame operations. So, along with others, he created the data.table package, which now ranks among the most popular R packages of all time. How? As the authors note, mainly by studying and taking apart base-R code and then altering a few key features (e.g., update by reference; see the SO answer by M. Dowle) to vastly enhance efficiency and speed.
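As a small illustration of update by reference, here is a sketch that assumes the data.table package is installed. The table `dt` and its columns are made up for the example.

```r
library(data.table)

# A tiny example table; the column names are placeholders.
dt <- data.table(id = 1:3, x = c(10, 20, 30))

# Record where the table lives in memory before modifying it.
before <- address(dt)

# `:=` adds a column in place instead of copying the whole table.
dt[, y := x * 2]

# The table is still the same object in memory.
identical(before, address(dt))
```

With a plain data frame, the equivalent `df$y <- df$x * 2` is free to copy the object, which is part of what becomes costly on large data sets.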
Finally, I think the third purpose of this site is more or less self-explanatory. Instead of covering the same ground that has been covered by others many times before, and far better than I could ever hope to do, I want to use this platform as a base from which I can explore whatever area of the R language, and of other languages as they relate to R, most interests me at the moment. I hope there is something here that at least approaches some semblance of interesting and/or useful.