Let’s All Go Down to the Barplot!

I’m back for another pean to ANOVA-loving experimental ecology. Barplots (or point-range plots – but which is better is a discussion for a different time) are the thing! And a friend of mine was recently asking me how to do a decent barplot with ggplot2 for data of moderate complexity (e.g., a 2-way ANOVA).

So I tried it.

And then I hit my head against a wall for a little while.

And then I think I figured it out, so, I figured I’d post it so that there is less head-smashing in the world.

To do a boxplot is simple. And may statisticians would argue that they are more informative, so, really, we should abandon barplots. Take the following example which looks at the highway milage of cars of various classes in two years.


#a simple qplot with a boxplot geometry - n.b., I like the bw theme
qplot(class, hwy, fill=factor(year), data=mpg, geom="boxplot", position="dodge")+theme_bw()

A few notes. The x-axis her is class. The fill attribute splits things by year (which is continuous, so we need to make it look like a factor). And the final position=”dodge” really is the key. It splits the bars for different years apart, and it shall come into play again in a moment.

This code produces a perfectly nice boxplot:

Lovely! Unless you want a barplot. For this, two things must happen. First, you need to get the average and standard error values that you desire. I do this using ddply (natch). Second, you’ll need to add in some error bars using geom_errorbar. In your geom_errorbar, you’ll need to invoke the position statement again.

#create a data frame with averages and standard deviations
hwy.avg<-ddply(mpg, c("class", "year"), function(df)
return(c(hwy.avg=mean(df$hwy), hwy.sd=sd(df$hwy))))

#create the barplot component
avg.plot<-qplot(class, hwy.avg, fill=factor(year), data=hwy.avg, geom="bar", position="dodge")

#add error bars
avg.plot+geom_errorbar(aes(ymax=hwy.avg+hwy.sd, ymin=hwy.avg-hwy.sd), position="dodge")+theme_bw()

This produces a perfectly serviceable and easily intelligible boxplot. Only. Only... well, it's time for a confession.

I hate the whiskers on error bars. Those weird horizontal things, they make my stomach uneasy. And the default for geom_errorbar makes them huge and looming. What purpose do they really serve? OK, don't answer that. Still, they offend the little Edward Tufte in me (that must be very little, as I'm using barplots).

So, I set about playing with width and other things to make whisker-less error bars. Fortunately for me, there is geom_linerange, that does the same thing, but with a hitch. It's "dodge" needs to be told just how far to dodge. I admit, here, I played about with values until I found ones that worked, so your milage may vary depending on how many treatment levels you have. Either way, the resulting graph was rather nice.

#first, define the width of the dodge
dodge <- position_dodge(width=0.9) 

#now add the error bars to the plot
avg.plot+geom_linerange(aes(ymax=hwy.avg+hwy.sd, ymin=hwy.avg-hwy.sd), position=dodge)+theme_bw()

And voila! So, enjoy yet another nifty capability of ggplot2!

Great! I will say this, though. I have also played around with data with unequal representation of treatments (e.g., replace year with class or something in the previous example) - and, aside from making empty rows for missing treatment combinations, the plots are a little funny. Bars expand to fill up space left by vacant treatments. Try qplot(class, hwy, data=mpg, fill= manufacturer, geom="boxplot") to see what I mean. If anyone knows how to change this, so all bar widths are the same, I'd love to hear it.

First you get the urchins, then you get the power, then you get the data!

Well, it’s been a few weeks since I put the revised version of my diversity-disturbance experiment out in the field. I completed sampling on Monday, and, *whew*, my power analysis paid off! Although not entirely how I expected…

First off, kicking the maximum urchin density up to 50 per cage (as opposed to 16) was indeed helpful. Even after 1 week, there was a strong urchin effect. And after three weeks, well…

What is perhaps most interesting is what one sees in looking at the relationship between the number or urchins added to a cage and the change in the amount of cover. I sampled by counting the amount of cover of benthic sessile species under points in a fixed grids. Looking at the change in total number of points we see the following relationship:

A far cry from the results in trial 1. What’s fascinating, however, is to look at the points in the 0-16 urchin range in trial 2. Take a gander at the above figure, and, as you can see, there appears to be no real pattern – another scattershot. It really did require high densities of the spiny little buggers to generate a strong grazing effect. The guidance from my power analysis was right on!

What is perhaps the most exciting (and puzzling) is my preliminary results regarding diversity and disturbance. The following only really shows up when looking at algal cover and algal species richness (i.e., number of species in a plot). It shows that, yes, the effect of urchin grazing appears to change with initial algal species richness – but not in the way one would expect. Basically, low diversity plots with low number of urchins GREW. High diversity plots with few urchins in them LOST COVER. However, at high densities of urchins, everything got chomped, and pretty evenly.

What does this mean? I have no clue as of right now! As soon as the data was collected, I began feverishly entering and analyzing. Why such a rush? I had to decide whether to move the cages and run yet a third trial, or, just cut off the fencing on the cages, and use the weighted frames to mark the plots in order to follow how the community in each plot recovers. After seeing the preceeding graph, and weighing many costs and benefits, I’m sticking with my single trial (the data from trial 1 is being treated as pilot data and won’t be used in the final analysis). So, yesterday was a crazy day of cutting hundred of cable ties and herding thousands (about 1500) of urchins underwater.

I’ll be following up with recovery, but for now, the really heavy lifting is finis. And it looks quite intriguing, although I emphasize that this is all VERY preliminary.

I’ve Got the Power!

There is nothing like that pain of taking a first look at fresh precious new data from a carefully designed experiment which took months of planning and construction, 8 divers working full days, and a lot of back-and-forth with colleagues, and then you find absolutely no patterns of interest.

Yup, it’s a real takes-your-breath-away-hand-me-a-beer kind of moment. Fortunately, I was on my way to a 4th of July dinner with my family up in the mountains of Colorado, so, I wasn’t quite going to commit hari-kari on the spot. Still, the moment of vertigo and neausea was somewhat impressive.

Because, let’s admit it, as much as science is about repeated failure until you get an experiment right (I mean, the word is “experiment” not “interesting-meaningful-data-generating-excersise”), it still never feels good.

So, round one of my experiment to test for feedbacks between sessile species diversity and urchin grazing showed bupkiss. Not only was there no richness effect, but, well, there wasn’t even a clear effect that urchins mattered. High density urchin treatments lost about as much cover and diversity of algae and sessile inverts as treatments with low or no urchins in them. What had gone wrong? And can this project be salvaged?

Yeah, I can see a pattern here.  Sure...

Yeah, I can see a pattern here. Sure...

Rather than immediately leaping to a brilliant biological conclusion (that was frankly not in my head) I decided to jump into something called Power Analysis. Now, to most experimental scientists, power analysis is that thing you were told to do in your stats class to tell you how big your sample size should be, but, you know, sometimes you do it, but not always. Because, really, it is often taught as a series of formulae with no seeming rhyme nor reason (although you know it has something to do with statistical significance). Most folk use a few canned packages for ANOVA or regression, but the underlying mechanics seem obscure. And hence, somewhat fear-provoking.

Well, here’s the dirty little secret of power analysis. Really, at the end of the day, it’s just running simulations of a model of what you THINK is going on, adding in some error, and then seeing how often your experimental design of choice will give you easily interpretable (or even correct) results.

The basic mechanics are this. Have some underlying deterministic model. In my case, a model that described how the cover and diversity of sessile species should change given an initial diversity, cover, and number of urchins. Then, add in random error – either process error in each effect itself, or just noise from the environment (e.g., some normal error term with mean 0 and a standard deviation). Use this to get made-up data for your experiment. Run your states and get parameter estimates, p values, etc. Then, do the same thing all over again. Over many simulations of a particular design, you can get an average coefficient estimates, the number of times you get a significant result, etc. “Power” for a given effect is determined by the number of “significant” results (where you define if you’re going with p=0.05 or whatnot) divided by the number of simulations.

So that’s just what I did. I had coefficient estimates (albeit nonsignificant ones). So, what would I need to detect them? I started by playing with the number of replicates. What if I had just put out more cages? Nope. The number of replicates I’d need to get the power into the 0.8 range alone (and you want as close to 1 as possible) was mildly obscene.

So, what about number of urchins? What if instead of having 0 and 16 as my lowest and highest densities?


Shown above are two plots. On the left, you see the p value for the species richness * urchin density interaction for each simulated run at a variety of maximum urchin densities. On the right you see the power (# of simulations where p<0.05 / total number of simulations). Note that, for the densities I worked with, power is around 0.3? And if I kicked it up to 50, well, things get much nicer.

As soon as I saw this graph, the biology of the the Southern California Bight walked up behind me and whapped me over the head. Of course. I had been using a maximum density that corresponded to your average urchin barren. The density of urchins that can hang out, and graze down any new growth. But that's not the density of urchins that CREATED the barren in the first place. Sure enough, looking at our long-term data, this new value of 50 corresponded to a typical front of urchins that would move through an area and lay waste to the kelp forest around it.

Which is what I had been interested in the first place.


So, round two is up and running now. The cages are stocked with obscenely high densities of the little purple spikey dudes. And we shall see what happens! I am quite excited, and hopeful. Because, I've got the Power!

(and so should you!)

Simple Data Visualization

OK, so, I know I already raved about one Hadley Wickham project and how it has changed my life last week. But what can I say, the man is a genius. And if you are using R (and let’s face it, you should be) and you want simple sexy graphs made quick, the man has you covered once again.

I first learned about ggplot2 while scanning through some slides of the LA Area RUG meetings (that I missed – I still haven’t been to any!) by the folks from Michael Driscoll.

And I liked what I saw – ggplot2 and lattice (which I admit, I had kind of avoided) seemed more intuitive than I thought. Then I ran into a series of articles on ggplot2 from the Learning R blog and I was hooked. Still am. And why I ask?

Let’s consider a task – you have some data split into four groups. For each group, you want to plot a fitted regression between two covariates. You want this split into panels, with common scales, and nicely placed axis labels. Also, you want it to be purty. While you can do this with the standard graphics package (and, I admit, I sometimes like the austerity of the standard graphics), it would take a for loop, tons of formatting instructions, and a number of steps where you could easily mess the whole thing up. Not to mention that you might have to change a good bit if you want to add another group.

Here is how easy it is with ggplot2. Note, there are only two lines of actual graphing code. The rest is just making up the data.


#create data - x is a predictor, g is a group, y is the response
g<-as.factor(c("A", "B", "C", "D"))

#i love expand grid, as it easily creates full
#matrices of all combinations of multiple variables

#add a little error in to our results
z$y<- rnorm(length(z$x),z$x*c(3,15,29, 50)[unclass(z$g)]+5, sd=200)

#this first bit just sets up the basic plotting, with colors and panels
a<-qplot(x,y,group=g, colour=g, data=z, facets=~g)

#now, add a fitted line with the error around it shown.
# Also, change the theme.  And plot the whole thing!
a+stat_smooth(method="lm", se=T, group=g, fill="black")+theme_bw()

All of which yields the following pretty figure:

Nice, eh?

And that stat_smooth statement can take lots of other arguments - e.g., glms (I've tried, and it looks great!)

So check it out - even for just casual data exploration, there's some real clarity to be found. And I look forward to trying out other products by Prof. Wickham!

Sometimes, you just need to use a plyr

I haven’t posted anything about R-nerdery in quite some time. But I have to pause for a moment, and sing the praises of a relatively new package that has made my life exponentially easier. The plyr package.

R has the capability to apply a single function to a vector or list using apply or mapply, or their various derivatives. This returns another vector or list.

This is great in principal, but in practice, with indexing, odd return objects, and difficulties in using multiple arguments, it can get out of hand for complex functions. Hence, one often resorts to a for loop.

Let me give you an example. Let’s say I have some data from a simple experiment where I wanted to look at the effect of adding urchins, lobsters, or both on a diverse community of sessile invertebrates – mussels, squirts, etc. Heck, let’s say, I had one gazillion species whose responses I was interested in. Now let’s say I want a simple summary table of the change in the density each species – and my data has before and after values. So, my data would look like this.

Urchins Lobsters Before.After Replicate Species Density
Yes No Before 1 1 30
Yes No After 1 1 10
Yes No Before 1 2 45
Yes No After 1 2 23
Yes Yes Before 15 Gazillion 10
Yes Yes After 15 Gazillion 9

So, each row represents the measurement for one species in one replicate either before or after the experiment. Now, previously, to get an output table with the mean change for each species for each treatment, I would have had to do something like this:

#first, create blank list of vectors waiting to be filled
#as well as blank vectors to remember what treatment 
#combinations we're looking at
for(i in 1:gazillion) mean.list[[paste("sp",i,sep=".")]]==vector()

#then, a for loop for each combination
for (i in levels(my.data$Urchins)){
  for (j in levels(my.data$Lobsters)){
    #get the right subset of the data
    sub.data<-subset(my.data, my.data$Urchins==i & my.data$Lobster==j)

    #now loop over all species
    for (k in 1:gazillion){ 
    sub.data<-subset(sub.data, sub.data$Species==k)
    before.data<-subset(sub.data, sub.data$Before.After=="Before")
    after.data<-subset(sub.data, sub.data$Before.After=="After")
	c(mean.list[[paste("sp",i-3,sep=".")]], mean(after.data[k]-before.data[k])


#then turn it into a data frame to match back up to the right values

Ok, did that exhaust you as much it did me? Now, here's how to get the same result using ddply (more on why it's called ddply in a moment)

#a function where, if you give it a data frame with species 1 - gazillion
#will give you the mean for each species.  Note, the which statement
#lets me look at before and after data separately
#Also note the Change= in the return value.  That gives a new column name
#for multiple return columns, just return a vector, with each 
#new column getting it's own name

tab.means<-function(df){ return(Change = 
			- df$Density[which($Before.After=="Before")]) }

#now use ddply
mean.data<-ddply(df, c("Urchins", "Lobsters", "Species"), tab.means))

#ok, in fairness, each species doesn't have it's own column.  But, we can 
#just use reshape
mean.data<-reshape(mean.data, timevar="Species", idvar=c("Urchins", "Lobsters", direction="wide)

Um. Whoah. Notice a small difference there. Now, perhaps the earlier code could have been cleaned up a bit with reshape, but still, 6 lines of code versus 18 - and most of that 18 was bookkeeping code, not something doing real useful computation.

Soooo.... Check out plyr! It will make your life a happier, more relaxed place. Enjoy!

(p.s. I'm noticing wordpress is ignoring all of my code indenting - anyone have any quick fixes for that?)

when NOT to MANOVA

And now its time for a multivariate stats geek out.

The statistics that we use determine the inferences we draw from our data. The more statistical tools you learn to use, the more likely you are likely to slip on a loose bit of data, and stab yourself in the eyeball with your swiss-army-knife of p values. It’s happened to all of us, and it is rarely due to willful misconduct. Rather, it’s a misunderstanding, or even being taught something improperly.

I was given a sharp reminder of this recently while doing some work on how to breakdown some repeated measures MANOVAs- useful for dealing with violations of sphericity, etc. However, I fell across this wonderful little ditty by Keselman et al – Statistical Practices of Educational Researchers: An Analysis of their ANOVA, MANOVA, and ANCOVA Analyses (yes, as an Ecologist, I love reading stats papers from other disciplines – I actually find it often more helpful than reading ones in my own discipline).

Now, everything in my own work was A-OK, but I found this note on the usage of MANOVA fascinating. (bolded phrases are mine)

In an overwhelming 84% (n = 66) of the studies, researchers never used the results of the MANOVA(s) to explain effects of the grouping variable(s). Instead, they interpreted the results of multiple univariate analyses. In other words, the substantive conclusions were drawn from the multiple univariate results rather than from the MANOVA. With the discovery of the use of such univariate methods, one may ask: Why were the MANOVAs conducted in the first place? Applied researchers should remember that MANOVA tests linear combinations of the outcome variables (determined by the variable intercorrelations) and therefore does not yield results that are in any way comparable with a collection of separate univariate tests.

Although it was not indicated in any article, it was surmised that researchers followed the MANOVA-univariate data analysis strategy for protection from excessive Type I errors in univariate statistical testing. This data analysis strategy may not be overly surprising, because it has been suggested by some book authors (e.g., Stevens, 1996, p. 152; Tabachnick & Fidell, 1996, p. 376). There is very limited empirical support for this strategy. A counter position may be stated simply as follows: Do not conduct a MANOVA unless it is the multivariate effects that are of substantive interest. If the univariate effects are those of interest, then it is suggested that the researcher go directly to the univariate analyses and bypass MANOVA. When doing the multiple univariate analyses, if control over the overall Type I error is of concern (as it often should be), then a Bonferroni (Huberty, 1994, p. 17) adjustment or a modified Bonferroni adjustment may be made (for a more extensive discussion on the MANOVA versus multiple ANOVAs issue, see Huberty & Morris, 1989). Focusing on results of multiple univariate analyses preceded by a MANOVA is no more logical than conducting an omnibus ANOVA but focusing on results of group contrast analyses (Olejnik & Huberty, 1993).

I, for one, was dumbstruck. This is EXACTLY why more than one of my stats teachers have told me MANOVA was most useful. I even have advised others to do this myself – like the child of some statistically abusive parent. But really, if the point is controlling for type I error, why not do a Bonferroni or (my personal favorite) a False Discovery Rate correction? To invoke the MANOVA like some arcane form of magic is disingenuous to your data. Now, if you’re interested in the canonical variables, and what they say, then by all means! Do it! But if not, you really have to ask whether you’re following a blind recipe, or if you understand what you are up to.

This paper is actually pretty brilliant in documenting a number of things like this that we scientists do with our ANOVA, ANCOVA, and MANOVA. It’s worth reading just for that, and to take a good sharp look in the statistical mirror.

H. J. Keselman, C. J. Huberty, L. M. Lix, S. Olejnik, R. A. Cribbie, B. Donahue, R. K. Kowalchuk, L. L. Lowman, M. D. Petoskey, J. C. Keselman, J. R. Levin (1998). Statistical Practices of Educational Researchers: An Analysis of their ANOVA, MANOVA, and ANCOVA Analyses Review of Educational Research, 68 (3), 350-386 DOI: 10.3102/00346543068003350

Baby Got Stats!

I was completely tickled last year with the oh so amusing Statz Rappers. It kept me and my nerdy stats friends laughing for days. Rapping. Stats. The Internet.

Good times.

But little did I know that rapping about statistics was really just hitting its stride on youtube. This is Why We Plot began my trip down the rabbit hole. Quite a nice effort. I followed this with that Stats Rap – not bad, and oh so mellow. Also I love the equations.

But the pièce de résistance is Baby Got Stats from Dorry Serev in the biostats department at JHU. Oh. Dear. Lord. I laughed. I wept. I even put some of the lyrics into my .sig file. Enjoy.

(note, mildly nsfw? maybe?)

There are a ton more – just follow the related links. It’s kind of amazing. And if anyone gets a yen to start doing some multivariate SEM or Bayesian raps, I want to know!