In this post I want to highlight the wonderful “Excel vs R: A Brief Introduction to R” by Jesse Sadler. It is full of useful, practical advice on using R in place of Excel (or any other spreadsheet) for simple data analysis.

I use Excel – a lot! I’m a biologist, so what did you expect? However, many of my colleagues use R for their data analysis and I’ve long seen many of the benefits, although the steep learning curve put me off for a long time. Earlier this year I attended the wonderful Data Carpentry course run by the University of Cambridge, which I would highly recommend – find something similar near you here.

Jesse’s post starts by highlighting some of the good reasons spreadsheets became so dominant – primarily their ease of use. It also highlights their major weakness – little, or even no, transparency over what has been done to the data. The post works through an example and produces some pretty basic (ugly) plots, but these very clearly highlight the advantages of moving to R. So if “data frames” (tables), functions, packages and piping are a mystery to you – read the post. I’m sure you’ll be inspired to give R (or better still RStudio) a go, and maybe leave those spreadsheets behind?

The tutorial covers:

  • Basics of the R command line
  • A description of RStudio and R packages for data manipulation
  • A worked example that covers manipulating numbers, strings and dates, as well as graphical representation of the newly created data frames with the ggplot2 package

PS: I’m still learning and my focus is on using R to do the spreadsheet tasks that I most commonly repeat. For me this means analysis of monthly usage data for my Genomics Core facility. When I need to quickly analyse some tabular data I’ll probably still dive into Excel and save myself the effort.