Rafe Sagarin and Passion in The Life Scientific

How to process something like Rafe Sagarin’s senseless passing? For those who never encountered him, Rafe was a wonderful marine ecologist whose body of work was as varied and eclectic as can be, from classic papers on long-term effects of climate change in Science and Ecological Monographs, to books on how the natural world can help us deal with risk, like terrorist attacks, to working at Biosphere 2. He’s even got a wonderful YouTube channel that works, as he always did, outside of the box.

Rafe was one of those vivacious, dynamic, iconoclastic people that you are always delighted to encounter in life. I was fortunate to know him through the Western Society of Naturalists, where his charm and vim earned him the annual slot of auctioneer to raise funds for students. I really only knew him as a colleague at meetings, and yet he is someone who has served as an example for myself and so many other early career scientists.

It was his passion. Every talk I heard him give, every conversation we ever had – scientific, personal, or with others about the difficulties of being early career – every interaction I saw him have with anyone around him, all of it was infused with an infectious passion. His passion for the natural world was infectious. It infused everything. It was in his voice. It was in his worldview. It was in how he shaped his talks and storytelling. It was in that damned silly shirt he wore at every WSN.

Rafe’s uncompromising passion for science and nature was an inspiration. I looked forward to seeing what he thought of next because there is no joy in the world like watching someone follow their passion. And now I will not be able to again.

But that passion. That is what I will always hold onto. Because the life scientific can get you down as it grinds forward. But holding onto our passion, our vim, our vigor, our joie de vivre, rooted in our love of the natural world – that is what can provide an endless wellspring of joy in every moment. It’s something that has helped me forge onwards in my own career and in my life in general.

Rafe will always serve as an example of that passion and the myriad of directions it can lead. I will hold onto that. I hope that his passion will long serve as an example for our whole community. It will for me.

Spicier Analyses with Interactive R Leaflet Maps



Who wants to make a kickass, public-friendly, dynamic, online appendix with a map for their papers? ME! (and you, right?) Let’s talk about a cool way to make your data sing to a much broader audience than a static image.

Last time, I mentioned that I had also been playing with RStudio’s Leaflet package. For those who don’t know, Leaflet is a javascript library for generating interactive maps. I don’t know javascript (really), but I do know R, so this library is incredibly powerful for someone like me – particularly when paired with RStudio’s HTMLwidgets or Shiny for publishing things.

So, what is leaflet? How does it work? Let’s start with a simple example. First, install the thing!

if (!require('devtools')) install.packages('devtools')
devtools::install_github('rstudio/leaflet')

Now, at its core, leaflet makes maps, and then puts data on them. It uses magrittr’s pipe syntax, which may be a bit daunting at first, but I think you’ll see that it’s all pretty straightforward.

Let’s start with a super-basic simple map.

library(leaflet)
leaflet() %>% addTiles()

Oh! Map! Scrollable! Easy to use! And you can zoom down to streets. What’s going on here?

First, leaflet() starts the whole process. You can feed it arguments of data frames and such for later use. Second, addTiles says “OK, go to a map server that feeds up image tiles, and grab ’em!” It defaults to OpenStreetMap, but there are others. For example, ESRI has a tile server with satellite imagery.

leaflet() %>% 
  addTiles(urlTemplate="http://server.arcgisonline.com/ArcGIS/rest/services/World_Imagery/MapServer/tile/{z}/{y}/{x}")
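
Relatedly, you can control where the map starts. Here’s a quick sketch using setView() – the Boston coordinates and zoom level are just an arbitrary example of mine, not from any analysis:

#start centered on Boston at a city-level zoom
leaflet() %>% 
  addTiles() %>%
  setView(lng = -71.06, lat = 42.36, zoom = 10)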

There are a number of other arguments and functions you could string together to zoom around, start in a certain area, or manipulate the basic map widget in different ways, but I’d like to focus on two common applications for folk like us. First, site maps! Let’s say you had some awesome data from over 100 sites around the world, and you wanted to show it off. Maybe you wanted people to really be able to zoom in and see what the environment was like around your data (i.e., why I went and found the ESRI tile server). How would you do this?

Well, leaflet provides a few different ways to put this stuff on a map. You can add circles, markers, circlemarkers, and much more. Let’s give an example on a basic map. Note the formula interface for supplying values to latitude and longitude.

#fake some data
set.seed(100)
myData <- data.frame(Latitude = runif(100, -80,80), 
                     Longitude = runif(100, -180,180),
                     measurement1 = rnorm(100, -3, 2), 
                     Year = round(runif(100, 1975,2015)))
#map it!
leaflet() %>% 
  addTiles() %>%
  addCircles(data=myData, lat = ~Latitude, lng = ~Longitude)

Awesome! But we know there’s information in that thar data. There’s a Year and a Measurement. How can we communicate this information via our map? Well, there are a few ways. Note those circles had a color and a size? We can use that. Circles and markers are also clickable. So, we can add some additional information to our data and feed that into the map. Let’s start with a popup message.

#round for readability
myData$info <- paste0("Year: ", myData$Year, 
                      "<br> Value: ", round(myData$measurement1, 3))

Note the use of some HTML in there. That’s because leaflet displays things in a web browser, so, we need to use HTML tags for text formatting and line breaks.

OK, great. Now what about a color? This can get a bit trickier with continuous values, as you’ll need to roll your own gradients. Kind of a pain in the ass. If you just wanted one or two colors (say, positive or negative), you can do something simple (a one-line sketch follows this paragraph), or even feed a single value to the color argument – say, “red” – but let’s get a little fancy. The classInt library is perfect for gradient construction. Here’s an example. We’ll start with grabbing a palette and making it a color ramp.
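
First, that promised one-liner for the simple two-color case – my own quick sketch, splitting on the sign of the measurement:

#quick-and-dirty: one color for positive values, another for negative
twoColors <- ifelse(myData$measurement1 > 0, "red", "blue")

Now, on with the gradient.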

#grab a palette
library(RColorBrewer)

pal <- brewer.pal(11, "Spectral")

#now make it more continuous 
#as a colorRamp
pal <- colorRampPalette(pal)

Great! Now we can use classInt to map the palette to our measurement values.

#now, map it to your values
library(classInt)
palData <- classIntervals(myData$measurement1, style="quantile")

#note, we use pal(100) for a smooth palette here
#but you can play with this
myData$colors <- findColours(palData, pal(100))

Now that we have info, colors, and, well, let’s say we want to make the circles bigger – let’s put it all together!

#Put it on a map!
newMap <- leaflet() %>% 
  addTiles() %>%
  addCircleMarkers(data=myData, lat = ~Latitude, lng = ~Longitude,
             color = ~colors, popup = ~info, radius = 8)

newMap

Very cool! With different properties for zooming, filling, sizing points, different styles of markers, and more, I think the possibilities for online appendices for papers – or outreach objects – are pretty endless here. Go forth, ye, and play!

Also, check out the code in this gist!


MEOW! It’s Marine Ecoregions in R!

So, I’m on paternity leave (yay! more on that another day – but HOMG, you guys, my daughter is amazing!) and while my daughter is amazing, there are many hours in the day and wee morning hours where I find myself rocking slowly back and forth, knowing that if I stop even for a second, this little bundle of cute sleeping away in the wrap will wake and howl.

So, what’s a guy on leave to do? Well, why not learn some new R tricks that have been on my list for, in some cases, years, but that I have not had time for otherwise. In particular, it’s time to ramp up some of my geospatial skills. As, really, I can type while rocking. And I need to do something to stay awake. (One can only watch Clone Wars for so long – and why is it so much better than the prequels?)

In particular, I’ve really wanted to become a more proficient map-maker. I’ve been working on a few global projects lately, and wanted to visualize the results in some more powerful, intuitive ways. Many of these projects have used Spalding et al.’s 2007 Marine Ecoregions of the World classification (or MEOW) as a basis. So, wouldn’t it be cool if we could make some nice plots of those regions, and maybe fill them in with colors according to some result?

Where to start? Well, to begin, how does one get the geographic information in to R? Fortunately, there’s a shapefile over at Marineregions.org.

Actually, heck, there are a LOT of marine region-like shapefiles over there that we might all want to use for different maps. And everything I’m about to say can generalize to any of those other shapefiles!

Oh, for those of you as ignorant as me, a shapefile is a geospatial file that has information about polygons with some additional data attached to them. To futz with them, we need to first load a few R libraries

#for geospatial tools
library(rgdal)
library(maptools)
library(rgeos)

#for some data reduction
library(dplyr)

These are wonderful tools that will help you load and manipulate your shapefiles. Note that I’ve also loaded up dplyr, which I’ve been playing with and finally learning. I’m a huge fan of ye olde plyr library, but dplyr has really upped my game, as it’s weirdly more intuitive – particularly with pipes. For a tutorial on that, see here – and perhaps I’ll write more about it later.

OK, so, libraries aside, how do we deal with the file we get at Marineregions.org? Well, once we download and unzip it into a folder, we can take a look at what’s inside

#Let's see what's in this shapefile!
#Note - the paths here are relative to where I
#am working in this file - you may have to change them
ogrInfo("../../MEOW-TNC", "meow_ecos")
## Source: "../../MEOW-TNC", layer: "meow_ecos"
## Driver: ESRI Shapefile number of rows 232 
## Feature type: wkbPolygon with 2 dimensions
## Extent: (-180 -89.9) - (180 86.9194)
## CRS: +proj=longlat +datum=WGS84 +no_defs  
## LDID: 87 
## Number of fields: 9 
##         name type length typeName
## 1   ECO_CODE    2     19     Real
## 2  ECOREGION    4     50   String
## 3  PROV_CODE    2     19     Real
## 4   PROVINCE    4     40   String
## 5   RLM_CODE    2     19     Real
## 6      REALM    4     40   String
## 7   ALT_CODE    2     19     Real
## 8 ECO_CODE_X    2     19     Real
## 9   Lat_Zone    4     10   String

OK, cool! We can see that it’s an ESRI shapefile with 232 rows – that’s 232 ecoregions. Each row of data has a number of properties – Province, Realm (which are both higher-order geospatial classifications), some numeric IDs, and information about latitudinal zone. We can also see that it’s in the WGS84 projection – more on projections another time – and that it’s chock full of polygons.

OK, that’s all well and good, but let’s load and plot the sucker!

#get an ecoregions shapefile, and from that make a province and realm shapefile
#from http://www.marineregions.org/downloads.php
#http://www.marineregions.org/sources.php#meow
regions <- readOGR("../../MEOW-TNC", "meow_ecos")
## OGR data source with driver: ESRI Shapefile 
## Source: "../../MEOW-TNC", layer: "meow_ecos"
## with 232 features and 9 fields
## Feature type: wkbPolygon with 2 dimensions
plot(regions)

[Figure: plot of the 232 marine ecoregions of the world]

WHOAH! Cool. Regions! In the ocean! Nice! What a beautiful simple little plot. Again, well and good. But… what can we do with this? Well, a quick call to class (shown below) tells us that regions is a SpatialPolygonsDataFrame. Which of course has its own plotting methods, and such. So, you could fill some things, make some borders, overlay – the sky’s the limit. But, there are two things I want to show you how to do to make your life more flexible.
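
For the record, that class() call and its output:

class(regions)
## [1] "SpatialPolygonsDataFrame"
## attr(,"package")
## [1] "sp"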

Higher Order Geographic Regions

The first is to look at Provinces and Realms. In general, when you have shapefiles, if you want to make aggregated polygons, you have to go through a few steps. Let’s say we want to look at Provinces. A province is composed of many ecoregions. Fortunately, there’s a function to unite SpatialPolygons (that’s the class we’re dealing with here that’s part of the SpatialPolygonsDataFrame) given some identifier.

#Unite the spatial polygons for each region into one
provinces <- unionSpatialPolygons(regions, regions$PROVINCE)

OK, great. But we still need to add some data to that, as this provinces object is just a bare SpatialPolygons object. To do so, let’s make a new reduced data frame using dplyr.

#Make a data frame that will have Province level info and above
prov_data <- regions@data %>%
  group_by(PROVINCE) %>%
  summarise(PROV_CODE = PROV_CODE[1], REALM = REALM[1], RLM_CODE=RLM_CODE[1], Lat_Zone=Lat_Zone[1])

Bueno. We now have a much smaller data frame that is for Provinces only. The last step is to make a new SpatialPolygonsDataFrame by joining the data and the polygons. There are two tricks here. First, make sure the right rows in the data are joined to the right polygons. For that, we’ll use a join statement. Second, the new data frame has to have row names matching the names of the polygons. I don’t often use this, but in making a data frame, you can supply row names. So, here we go:

#merge the polygons with the new data file
#note the row.names argument to make sure they map to each other
library(plyr) #for join()

provinces <- SpatialPolygonsDataFrame(provinces, 
                                      data=data.frame(
                                        join(data.frame(PROVINCE=names(provinces)),
                                             prov_data),
                                        row.names=row.names(provinces)))
## Joining by: PROVINCE

Not gorgeous, but it gets the job done. We can of course do this for realms as well.

#######
#make realms shapefile
########
#make spatial polygons for realms
realms <- unionSpatialPolygons(regions, regions$REALM)

#make new data
realm_data <- regions@data %>%
  group_by(REALM) %>%
  summarise(RLM_CODE = RLM_CODE[1],  Lat_Zone=Lat_Zone[1])

#merge the two!
realms <- SpatialPolygonsDataFrame(realms, 
                                   data=data.frame(
                                     join(data.frame(REALM=names(realms)),
                                          realm_data),
                                     row.names=row.names(realms)))
## Joining by: REALM

Excellent. So – did it work? And how different are these three different spatial things anyway? Well, let’s plot them!

#########Plot them all
par(mfrow=c(2,2), mar=c(0,0,0,0))
plot(regions, main="Ecoregion", cex.main=5)
plot(provinces, lwd=2, border="red", main="Province")
plot(realms, lwd=2, border="blue", main="Realm")
par(mfrow=c(1,1))

[Figure: ecoregions, provinces, and realms plotted side by side]

Lovely.

ggplot ’em! I admit, I’m a ggplot2 junkie. I just find it the fastest way to make publication-quality graphs with little fuss or muss. Or to make something quick and dirty to send to colleagues. But, you can’t just go and plot a SpatialPolygonsDataFrame in ggplot2 with ease and then use it as you will. So what’s a guy to do?

I will admit, I’m shamelessly gacking the following from https://github.com/hadley/ggplot2/wiki/plotting-polygon-shapefiles. It provides a three-step process where what you do, essentially, is turn the whole mess into a data frame, with the polygons providing points for plotting geom_path or geom_polygon pieces.

Step 1: you need an id column in your data. Let’s do this for both ecoregions and provinces.

regions@data$id = rownames(regions@data)
provinces@data$id = rownames(provinces@data)

OK – step 2 is the fortify function. Fortify converts an R object into a data frame for ggplot2. In this case –

library(ggplot2)
regions.points = fortify(regions, region="id")
## Regions defined for each Polygons
provinces.points = fortify(provinces, region="id")
## Regions defined for each Polygons

Great! Now that we have these two new fortified data frames that describe the points we’ll be plotting, the last thing to do is to join the points with the actual, you know, data! For that, I like to use join:

regions.df = join(regions.points, regions@data, by="id")
provinces.df = join(provinces.points, provinces@data, by="id")

What’s great about this is that, from now on, if I have another data frame that has Ecoregion or Province as one of its headings, I can use join to add a new column to my regions.df or provinces.df for plotting. For example, say I ran a linear model where Ecoregion was a fixed effect, so that I have a coefficient for each Ecoregion, and I’ve turned the coefficient table into a data frame with Ecoregion as one of the columns. As long as the identifying column has the same name in my new data frame and my plotting data frame, join will do the rest.
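
For instance, a quick hypothetical sketch – modelCoefs here is invented purely to show the join:

#hypothetical: one coefficient per ecoregion from some model
modelCoefs <- data.frame(ECOREGION = regions$ECOREGION,
                         coef = rnorm(nrow(regions)))

#as long as the identifying column names match, join adds the new column
regions.with.coefs <- join(regions.df, modelCoefs, by = "ECOREGION")

From there, regions.with.coefs could be plotted with fill mapped to coef.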

But, for now, I can show you how these would plot out in ggplot2. To do this, we use geom_polygon to define an area that we can fill as we want, and geom_path to stroke the outside of the areas and do with them what you will.

#####Make some ggplots for later visualization
base_ecoregion_ggplot <- ggplot(regions.df) + theme_bw() +
  aes(long,lat,group=group) + 
  geom_polygon(fill=NA) +
  geom_path(color="black") +
  coord_equal() 


base_province_ggplot <- ggplot(provinces.df) + theme_bw() +
  aes(long,lat,group=group) + 
  geom_polygon(fill=NA) +
  geom_path(color="black") +
  coord_equal() 

See that fill=NA argument? That’s where I could put something like a coefficient from that joined data, or temperature, or whatever I’ve tacked on to the whole shebang. Let’s see what they look like in ggplot2.

base_ecoregion_ggplot + ggtitle("Ecoregions")

[Figure: ecoregions plotted in ggplot2]

base_province_ggplot + ggtitle("Provinces")

[Figure: provinces plotted in ggplot2]

So what’s the advantage of putting them into ggplot2? Well, besides using all of the graphical aesthetics for your polygon fills and paths, you can add points (say, sites), lines, or whatnot. One example: let’s say you wanted to visualize the borders of land (and countries!) on the map with ecoregions. Cool! Let’s get the world map, turn it into a data frame, and then add a geom_path with the world map on it.

library(maps)
worldmap <- map('world', plot=F)
worldmap.df <- data.frame(longitude =worldmap$x,latitude=worldmap$y)

base_province_ggplot+
  geom_path(data=worldmap.df, aes(x=longitude, y=latitude, group=NULL), color="darkgreen")

[Figure: provinces with world coastlines overlaid]

The possibilities really are endless at this point for cool visualizations.

EDIT – OK, here’s a cool example with filled polygons using a random ‘score’ to determine fill and RColorBrewer for pretty colors!

#let's make some fancy colors
library(RColorBrewer)

#Make a data frame with Ecoregion as an identifier
thing <- data.frame(ECOREGION = regions$ECOREGION,
                    score = runif(nrow(regions), -100, 100))

#merge the score data with the regions data frame
regions.df2 <- merge(regions.df, thing)

#plot!
ggplot(regions.df2) + theme_bw() +
  aes(long, lat, group=group) +
  geom_polygon(mapping=aes(fill=score)) +
  geom_path(color="black") +
  coord_equal() +
  scale_fill_gradientn(colours=brewer.pal(11, "Spectral"))

[Figure: ecoregions filled by a random score using the Spectral palette]

Next time, Leaflet! If I can figure out how to post its output. And for those of you who don’t know Leaflet, prepare to be wowed.

Also, all code for this example is in a gist over here!

Space and SEMs

One question that comes up time and time again when I teach my SEM class is, “What do I do if I have spatially structured data?” Maybe you have data that was sampled on a grid, and you know there are spatial gradients. Maybe your samples are clustered across a landscape. Or at separate sites. A lot of it boils down to worrying about the hidden spatial wee beasties lurking in the background.

I’m going to stop for a moment and suggest that before we go any further you read Brad Hawkins’s excellent Eight (and a half) deadly sins of spatial analysis where he warns of the danger of throwing out the baby with the bathwater. Remember, in any modeling technique, you want to ensure that you’re capturing as much biological signal as is there, and then adjust for remaining spatial correlation. Maybe your drivers vary in a spatial pattern. That’s OK! They’re still your drivers.

That said, ignoring residual spatial autocorrelation essentially causes you to think you have a larger sample size than you actually do (remember the assumption of independent data points), and as such your standard errors are too tight, and you may well produce overconfident results.

To deal with this in a multivariate Structural Equation Modeling context, we have a few options. First, use something like Jon Lefcheck’s excellent piecewiseSEM package and fit your models with mixed model or generalized least squares tools that can accommodate spatial correlation matrices as part of the model. If you have non-spatial information about structure, I’ve started digging into the lavaan.survey package, which has been fun (and is teaching me a lot about survey statistics).

But, what if you just want to go with a model you’ve fit using covariance matrices and maximum likelihood, like you do, using lavaan in R? It should be simple, right?

Well, I’ve kind of tossed this out as a suggestion in the ‘advanced topics’ portion of my class for years, but never implemented it. This year, I got off of my duff and worked it up, and now have both a solid example and a function that should make your lives easier – all wrapped up over at github. And I’d love any comments or thoughts on this, as, to be honest, spatial statistics is not where I spend a lot of time. Although I seem to be spending more and more time there these days… silly spatially structured observational datasets… that I seem to keep creating.

Anyway, let’s use as an example the Boreal Vegetation dataset from Zuur et al.’s Mixed Effects Models and Extensions in Ecology with R. The data shows vegetation NDVI from satellite data, as well as a number of other covariates – information on climate (days where the temperature passed some threshold, I believe), wetness, and species richness. And space. Here’s what the data look like, for example:

# Boreality data from http://www.highstat.com/book2.htm
# Mixed Effects Models and Extensions in Ecology with R (2009). 
# Zuur, Ieno, Walker, Saveliev and Smith. Springer
boreal <- read.table("./Boreality.txt", header=T)

#For later
source("./lavSpatialCorrect.R")

#Let's look at the spatial structure
library(ggplot2)

qplot(x, y, data=boreal, size=Wet, color=NDVI) +
  theme_bw(base_size=18) + 
  scale_size_continuous("Index of Wetness", range=c(0,10)) + 
  scale_color_gradient("NDVI", low="lightgreen", high="darkgreen")

[Figure: the boreal data in space – point size shows wetness, color shows NDVI]

So, there are both clear associations of variables, but also a good bit of spatial structure. Ruh roh! Well, maybe it’s all in the drivers. Let’s build a model where NDVI is affected by species richness (nTot), wetness (Wet), and climate (T61) and richness is itself also affected by climate.

library(lavaan)

## This is lavaan 0.5-17
## lavaan is BETA software! Please report any bugs.

# A simple model where NDVI is determined
# by nTot, temperature, and Wetness
# and nTot is related to temperature
borModel <- '
  NDVI ~ nTot + T61 + Wet 
  nTot ~ T61
'

#note meanstructure=T to obtain intercepts
borFit <- sem(borModel, data=boreal, meanstructure=T)

OK, great, we have a fit model – but we fear that the SEs may be too small! Is there any spatial structure in the residuals? Let’s look.

# residuals are key for the analysis
borRes <- as.data.frame(residuals(borFit, "casewise"))

#raw visualization of NDVI residuals
qplot(x, y, data=boreal, color=borRes$NDVI, size=I(5)) +
  theme_bw(base_size=17) + 
  scale_color_gradient("NDVI Residual", low="blue", high="yellow")

[Figure: NDVI residuals in space]

Well… sort of. A clearer way to see this, and one that I like, is just to look at the signs of the residuals.

#raw visualization of sign of residuals
qplot(x, y, data=boreal, color=borRes$NDVI>0, size=I(5)) +
  theme_bw(base_size=17) + 
  scale_color_manual("NDVI Residual >0", values=c("blue", "red"))

[Figure: sign of NDVI residuals in space]

OK, we can clearly see the positive residuals clustering on the corners, and negative ones more prevalent in the middle. Sort of. Are they really? Well, we can correct for residual spatial autocorrelation once we know its degree, via Moran’s I. To do this, there are a few steps. First, calculate the spatial weight matrix – essentially, the inverse of the distance between any pair of points. Distant points should have a lower weight in the resulting analyses than nearer points.

#Evaluate Spatial Residuals
#First create a distance matrix
library(ape)
distMat <- as.matrix(dist(cbind(boreal$x, boreal$y)))

#invert this matrix for weights
distsInv <- 1/distMat
diag(distsInv) <- 0

OK, that done, we can determine whether there was any spatial autocorrelation in the residuals. Let’s just focus on NDVI.

#calculate Moran's I just for NDVI
mi.ndvi <- Moran.I(borRes$NDVI, distsInv)
mi.ndvi

## $observed
## [1] 0.08265236
## 
## $expected
## [1] -0.001879699
## 
## $sd
## [1] 0.003985846
## 
## $p.value
## [1] 0

Yup, it’s there. We can then use this correlation to calculate a spatially corrected sample size, which will be smaller than our initial sample size.
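
In equation form, the correction implemented in the code below is

$$n_{eff} = n \times \frac{1 - I_{observed}}{1 + I_{observed}}$$

where $I_{observed}$ is the observed Moran’s I from above.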

#What is our corrected sample size?
n.ndvi <- nrow(boreal)*(1-mi.ndvi$observed)/(1+mi.ndvi$observed)

And given that we can get parameter variances and covariances from the vcov matrix, it’s a snap to calculate new SEs, remembering that the variance of a parameter has the sample size in the denominator.
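
Concretely, since the variances from vcov were computed with the full sample size $n$, the corrected SE just rescales them by $n/n_{eff}$ before taking the square root:

$$SE_{corrected} = \sqrt{\widehat{Var}(\hat{\theta}) \times \frac{n}{n_{eff}}}$$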

#Where did we get the SE from?
sqrt(diag(vcov(borFit)))

##    NDVI~nTot     NDVI~T61     NDVI~Wet     nTot~T61   NDVI~~NDVI 
## 1.701878e-04 2.254616e-03 1.322207e-01 5.459496e-01 1.059631e-04 
##   nTot~~nTot       NDVI~1       nTot~1 
## 6.863893e+00 6.690902e-01 1.617903e+02

#New SE
ndvi.var <- diag(vcov(borFit))[1:3]

ndvi.se <- sqrt(ndvi.var*nrow(boreal)/n.ndvi)

ndvi.se

##    NDVI~nTot     NDVI~T61     NDVI~Wet 
## 0.0001848868 0.0024493462 0.1436405689

#compare to old SE
sqrt(diag(vcov(borFit)))[1:3]

##    NDVI~nTot     NDVI~T61     NDVI~Wet 
## 0.0001701878 0.0022546163 0.1322207383

Excellent. From there, it’s a hop, skip, and a jump to calculating a z-score and ensuring that this parameter is still different from zero (or not!)

#new z values
z <- coef(borFit)[1:3]/ndvi.se

2*pnorm(abs(z), lower.tail=F)

##     NDVI~nTot      NDVI~T61      NDVI~Wet 
##  5.366259e-02  1.517587e-47 3.404230e-194

summary(borFit, standardized=T)

## lavaan (0.5-17) converged normally after  62 iterations
## 
##   Number of observations                           533
## 
##   Estimator                                         ML
##   Minimum Function Test Statistic                1.091
##   Degrees of freedom                                 1
##   P-value (Chi-square)                           0.296
## 
## Parameter estimates:
## 
##   Information                                 Expected
##   Standard Errors                             Standard
## 
##                    Estimate  Std.err  Z-value  P(>|z|)   Std.lv  Std.all
## Regressions:
##   NDVI ~
##     nTot             -0.000    0.000   -2.096    0.036   -0.000   -0.044
##     T61              -0.035    0.002  -15.736    0.000   -0.035   -0.345
##     Wet              -4.270    0.132  -32.295    0.000   -4.270   -0.706
##   nTot ~
##     T61               1.171    0.546    2.144    0.032    1.171    0.092
## 
## Intercepts:
##     NDVI             10.870    0.669   16.245    0.000   10.870  125.928
##     nTot           -322.937  161.790   -1.996    0.046 -322.937  -30.377
## 
## Variances:
##     NDVI              0.002    0.000                      0.002    0.232
##     nTot            112.052    6.864                    112.052    0.991

See! Just a few simple steps! Easy-peasy! And there are a few changes – the effect of species richness is no longer so clear, for example.

OK, I lied. That’s a lot of steps. But they’re repetitive. So, I whipped up a function that should automate this, and produce useful output for each endogenous variable. I need to work on it a bit, and I’m sure issues will come up with latents, composites, etc. But, just keep your eyes peeled on the github for the latest update.

lavSpatialCorrect(borFit, boreal$x, boreal$y)

## $Morans_I
## $Morans_I$NDVI
##     observed     expected          sd p.value    n.eff
## 1 0.08265236 -0.001879699 0.003985846       0 451.6189
## 
## $Morans_I$nTot
##     observed     expected          sd p.value    n.eff
## 1 0.03853411 -0.001879699 0.003998414       0 493.4468
## 
## 
## $parameters
## $parameters$NDVI
##             Parameter      Estimate    n.eff      Std.err   Z-value
## NDVI~nTot   NDVI~nTot -0.0003567484 451.6189 0.0001848868  -1.92955
## NDVI~T61     NDVI~T61 -0.0354776273 451.6189 0.0024493462 -14.48453
## NDVI~Wet     NDVI~Wet -4.2700526589 451.6189 0.1436405689 -29.72734
## NDVI~~NDVI NDVI~~NDVI  0.0017298286 451.6189 0.0001151150  15.02696
## NDVI~1         NDVI~1 10.8696158663 451.6189 0.7268790958  14.95382
##                  P(>|z|)
## NDVI~nTot   5.366259e-02
## NDVI~T61    1.517587e-47
## NDVI~Wet   3.404230e-194
## NDVI~~NDVI  4.889505e-51
## NDVI~1      1.470754e-50
## 
## $parameters$nTot
##             Parameter    Estimate    n.eff     Std.err   Z-value
## nTot~T61     nTot~T61    1.170661 493.4468   0.5674087  2.063171
## nTot~~nTot nTot~~nTot  112.051871 493.4468   7.1336853 15.707431
## nTot~1         nTot~1 -322.936937 493.4468 168.1495917 -1.920534
##                 P(>|z|)
## nTot~T61   3.909634e-02
## nTot~~nTot 1.345204e-55
## nTot~1     5.479054e-02

Happy coding, and I hope this helps some of you out. If you’re more of a spatial guru than I, and have any suggestions, feel free to float them in the comments below!

Positive Multifunctionality ≠ All Functions Are Positive


I was dismayed this morning to read Bradford et al.’s recently accepted paper Discontinuity in the responses of ecosystem processes and multifunctionality to altered soil community composition in PNAS for several reasons.

The paper itself is really cool. They manipulated community complexity and nutrient conditions in the Ecotron, and then looked at five soil ecosystem functions. They then looked at whether complexity influenced multifunctionality (as well as N), and found that, indeed, it did! They went on and, as recommended in our paper on multifunctionality, analyzed single functions to understand what was driving that multifunctionality relationship, and then…

Then they fall off the boat completely.

Disappointment #1
They find that, while some functions were affected positively, some were not, and one was affected negatively. They conclude, therefore, that multifunctionality metrics are not useful.

…multifunctionality indices may obscure insights into the mechanistic relationships required to understand and manage the influence of community change on ecosystem service provision.

The mismatch between our community and fertilization effects on multifunctionality and the individual processes, however, cautions against using the framework as a predictive tool for achieving desired levels of functioning for multiple, specified ecosystem services.

What is frustrating about this is that the authors completely miss what multifunctionality actually tells us.

I’m going to say this once very simply, and then in much more detail –

high multifunctionality ≠ every single function performing well

To quote from my own work, multifunctionality is “simultaneous performance of multiple functions.” No more, no less. A positive relationship between a driver and multifunctionality does not imply a positive relationship between that driver and every function being monitored, but rather that said driver increases the performance of more functions than it decreases.

Some More Detail
Indeed, in the example in Byrnes et al. 2014, we look at the data from the German BIODEPTH experiment. Some of the functions have a positive relationship with richness. Some do not. One has a trending negative relationship. But, put together, multifunctionality is a powerful concept that shows us that, if we are concerned with the simultaneous provision of multiple functions, then, yes, biodiversity enhances multifunctionality.

In our paper, we advise that researchers look at single functions – precisely because they are likely not all related to a driver in the same way. We state

The suite of metrics generated by the multiple threshold approach provide powerful information for analysing multifunctionality, especially when combined with analyses of the relationship between diversity and single functions.

We say this because, indeed, one has to ask – is the driver-MF relationship as strong as it could be? Why or why not? How can we pull the system apart into its component pieces to understand what is going on at the aggregate level?

The approaches are not in opposition, but rather utilizing both provides a much more rich picture of how a driver influences an ecosystem – both through an aggregate lens and a more fine-scale lens. The similarities and differences between them are informative, not discordant.

Disappointment #2
UPDATE: See comments from Mark and Steve below. This #2 would appear incorrect – a tale of crossed paths not seen. While I cannot find anything in my various inboxes regarding communication, it’s possible either a bad email address was used, or it went missing in my transition between NCEAS and UMB. If this is the case, I’m in the wrong on this. It’s an interesting quandary how we resolve these things outside of the literature, and worth pondering in this, our modern age of email. I leave my comments below for the sake of completeness, and as there are still some ideas worth thinking about. But I wish that email hadn’t disappeared somewhere into the ether! Now my disappointment is more with technology!

Perhaps the bigger bummer is that, despite this being a big critique of the idea of multifunctionality – something our group spent a *huge* amount of time trying to figure out how to quantify in a meaningful and useful way – as far as I know, none of us were contacted about this rebuttal. The experiment and the analysis of the experiment are excellent, and the paper gets into some really cool stuff about soil biocomplexity and ecosystem multifunctionality. But the whole attacking-multifunctionality-as-a-useful-concept thing?

That entire controversy could have been resolved with a brief email or two, tops. For this group to go so far off base is really kind of shocking, and dismaying.

Dismaying because the advice that would seem to stem from this paper is to go back to just looking at single functions individually and jettison the concept of multifunctionality (no other alternative is provided). That places us squarely back in 2003, with fragmented, different types of analyses being used in an ad hoc manner without a unifying framework – precisely what we were trying to avoid with our methods paper.

And all it would have taken to prevent is a little bit of communication.

References
Bradford, M. A., S. A. Wood, R. D. Bardgett, H. I. J. Black, M. Bonkowski, T. Eggers, S. J. Grayston, E. Kandeler, P. Manning, H. Setälä, and T. H. Jones. 2014. Discontinuity in the responses of ecosystem processes and multifunctionality to altered soil community composition. PNAS. link

Byrnes, J. E. K., L. Gamfeldt, F. Isbell, J. S. Lefcheck, J. N. Griffin, A. Hector, B. J. Cardinale, D. U. Hooper, L. E. Dee, and J. Emmett Duffy. 2014. Investigating the relationship between biodiversity and ecosystem multifunctionality: challenges and solutions. Methods in Ecology and Evolution. 5: 111-124. doi: 10.1111/2041-210X.12143

Getting it Right – after Publication: A Multifunctional Journey?

Sometimes, you have to publish a paper to have the hit-yourself-in-the-head revelation of the real right answer.

Earlier this year, along with a great cohort of colleagues, I birthed a really neat piece summarizing how you can look at simultaneous change in multiple ecosystem functions. We were interested in how changes in diversity can affect the simultaneous performance of multiple functions, but, really, anything can be put on the X axis – warming, fertilization, lemony-fresh-scentedness – it doesn’t matter.

This was a problem that had been vexing the field of biodiversity-ecosystem-function – or, really, anyone who wanted to look at multiple functions. Our group spent multiple sessions bashing our heads against a wall trying to derive a solid analytic strategy to look at changes in so-called multifunctionality, and in the end came up with something that I’m pretty proud of. The basic idea of our approach was to look at the slope of the relationship between a predictor and the number of functions ≥ some threshold of their maximum – but then do it for lots of thresholds. Why is it important to look at lots of thresholds? Well, if you look at the lines for each choice of threshold, you get something like this:

[Figure: number of functions ≥ a threshold vs. diversity, one line per threshold]

Note, that’s from the multifunc R package vignette, and it’s the analysis from our paper.

Anyway, you can see how the slope and intercept change with different threshold choices. We eventually looked at how slope changes with threshold, and used that to divine a fingerprint of multifunctionality. But, a plot of threshold v. slope – it’s kind of abstract, and can be hard to parse. It was the best we could do, though, as we thought and thought about it.

While working on a recent analysis, I began to wonder – one of the key numbers we want is something like, how does multifunctionality change with the addition or removal of one species? We can look at how one function changes with diversity – but we still don’t have a good something with our predictor on the X-axis. And yet, the question I want to answer is key if we want to think about the consequences of, say, losing species for a multifunctional world.

Discussing this with Jon Lefcheck, I was suddenly struck by something he had done in a figure on a manuscript. He drew a plot like the one I showed above, only, he also put a line across the top at the maximum number of functions observed in an experiment. I noticed that as my eye moved across the plot – from low to high diversity – I could see the color change along the line, indicating that the maximum number of functions were able to hit progressively higher levels of function. At low levels, nothing was able to hit any threshold. At high levels, a few were. So…one could in theory plot diversity against the highest threshold that all of the observed functions could hit all at once.

Lines drawn on top of the same figure for 5 functions and for 2 functions. Let your eye wander along them and note the change in color. Also, note the weird optical illusion at F=2. Yes, that line is actually straight.

Moreover, if I were to, say, draw a line at a lower number of functions, I would see the same pattern – but the relationship would change. Now, higher thresholds could be reached by a lower number of functions – but, eyeballing it, it looked like the linearity changed. In some ways, it made me think of how, in the BEF literature, if we have only one function, we get a saturating curved relationship between diversity and function. But for multifunctionality, a few of us have long wondered if we might get a more linear relationship.

So, for the German example, I decided to whip up some simple code to explore the relationship between diversity, number of functions, and maximum threshold those functions can achieve. I used the fitted model, and then just calculated which thresholds could achieve some number of functions at a given level of diversity, and grabbed the maximum. The results are interesting.
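
Here’s the flavor of that code – a self-contained sketch where simulated saturating curves stand in for the actual fitted model from the German data:

library(ggplot2)

#simulate predicted performance of 5 functions along a
#richness gradient, each rescaled to a proportion of its maximum
set.seed(42)
div <- 1:16
nfunc <- 5
pred <- sapply(1:nfunc, function(i) {
  p <- div / (div + runif(1, 1, 5)) #a saturating curve
  p / max(p)
})

thresholds <- seq(0.01, 0.99, 0.01)

#the highest threshold that at least f functions meet
#at one level of diversity
maxThresh <- function(perf, f) {
  ok <- thresholds[sapply(thresholds, function(tr) sum(perf >= tr) >= f)]
  if (length(ok) == 0) NA else max(ok)
}

res <- expand.grid(Diversity = div, nFuncs = 1:nfunc)
res$maxThreshold <- mapply(function(d, f) maxThresh(pred[d, ], f),
                           res$Diversity, res$nFuncs)

ggplot(res, aes(x = Diversity, y = maxThreshold, color = factor(nFuncs))) +
  geom_line() + theme_bw() +
  labs(y = "Highest achievable threshold", color = "Number of\nfunctions")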

[Figure: highest achievable threshold vs. diversity for different numbers of functions]

You can see that fewer functions can simultaneously achieve a higher threshold – this is predictable. But there’s a suggestion that the curvilinearity of the relationship switches from linear to concave-up as more functions are considered. That it’s linear to begin with is notable, as with few functions I would expect more concave-down-ness. And you kinda get that if you run it down to one function, but this site in general, in earlier papers, didn’t have an incredibly strong saturating relationship compared to some other classic examples.

Overall, while it takes a bit of a moment to realize what’s going on, I think this is a far more interpretable graph than what we presented in the paper. I haven’t subjected it to the same in-depth can-this-be-fooled simulations that we did for the MEE paper, but, I have to admit…I kind of like this, and think it might be the answer we tried to get at oh so long ago.

A Kelpy Time Machine

This is x-posted from my guest post at Deep Sea News about the new Floating Forests citizen science project I’m part of. If you’re excited about it, go check out http://www.floatingforests.org/!

There’s something I’ve always wanted. Something that would take kelp forest science to a new level. Something that would let me do the kind of work that I dream about, quietly, secretly, peering into the mysteries of kelpiness.

It is something impossible.

It is a time machine.

And now I have one. But with a catch. For you, my dear friends, are its pilot.

OK, OK, first, why use a time machine as a tool of science, and not, I don’t know, to right wrongs and make the world a better place? Well, first off, temporal paradoxes, y’all. Come on. Be realistic. Has Doctor Who taught us nothing? Or decades of Star Trek? Or Primer? (seriously, see that last one, as it will hurt your brain in a good way).

But then, why, kelp? What? Hear me out.

There are so many things that we as marine biologists want to know about the state of ecosystems in the past. For many things, we hardly ever have records that are older than ten years, and usually for just one small area. This is really problematic – especially for something like Giant Kelp.

You see, Giant Kelp (that’s Macrocystis to you!) is kind of the bad-ass of the algae world. It’s pretty damn huge – up to 60m long in some places – and grows up to a foot or two a day. It’s so damn big, we can see it from orbit1.

Photo courtesy of the Seaweed Industry Association (https://seaweedindustry.com/)

It’s also everywhere – from Alaska to Baja to South Africa to much of South America to New Zealand to the sub-Antarctic islands. And, according to the Lane et al. phylogeny, it’s all one species2. It gets *around*. And wherever it is, it feeds, houses, and nourishes much of the life in the sea around it.

So, where does the time machine come in, and why are you piloting it?

Simply put – Keeeelllppp Innnnnn Spaaaace!

The Landsat family of satellites have been orbiting earth, taking photos of the whole globe twice a month since the early 1980s. Photos where we can see kelp. They are a time machine we as scientists can use to go back and see how kelp has changed over thirty years.

Can we see the near-extinction of giant kelp from Tasmania? Can we see it moving around the coast of South Africa? Has it been walking away from the warming equator? Is there more or less now than there was in the past?

OK, this sounds pretty cool. But where do you, our time-traveling pilots come in?

Basically computers suck.

OK, no, really, we love them. *pets shiny Apple laptop* BUT – they cannot tell kelp from waves or clouds in the Landsat images. My collaborators (the real remote sensing part of this team) have tried. It’s a no-go. We actually need people – kelp hunters, if you will, to peer at the millions of images of potential kelp habitat and help us discover these Floating Forests.

We need you.

Working with the Big Momma of online citizen science, Zooniverse, my collaborators from the Kelp Ecosystem Ecology Network (KEEN) and I have created an online citizen science project called Floating Forests. In it, we ask you to take a look at images. Help us cull out the bad ones (they’re satellites – they take pictures cloudy, rain, or (often still cloudy) shine, and sometimes take pictures of more than we expect).

And if you see kelp, circle it!

We’ll give you images of coastline – like this July 2001 picture of Santa Cruz on the left. You circle the kelp beds, and that yields data about kelp abundances, as you can see on the right.

It’s really quite meditative to watch images of the land and sea whisk by, pausing periodically to lasso a green bed of kelp.

Maybe you’ll see a wave-swept vista of Point Lobos. Maybe you’ll see the Googleplex (I did – by accident). Maybe you’ll see somewhere you’ve never dreamed of traveling, but with that big kelp bed sitting there, you know the destination of your next dive trip.

So, please, go hop into the pilot’s seat of this magical kelpy time machine. Poke around. And ask questions! We scientists are there to talk to you! What are you waiting for?

Yes, this is the entrance to a time machine.

1. Cavanaugh, K., D. Siegel, D. Reed, and P. Dennison. 2011. Environmental controls of giant-kelp biomass in the Santa Barbara Channel, California. Marine Ecology Progress Series 429:1–17. doi:10.3354/meps09141
2. Macaya, E. C., and G. C. Zuccarello. 2010. DNA barcoding and genetic divergence in the Giant Kelp Macrocystis (Laminariales). Journal of Phycology 46:736–742. doi:10.1111/j.1529-8817.2010.00845.x

The Beatles and Kelp

For a blog post somewhere else, I’m trying to re-work the Beatles’ classic Kelp!, er, I mean, Help!. But my first draft got all urchin-y. I love it, though, so I thought I’d post it here.

KELP! (from the perspective of a growing urchin)

Kelp! I need some algae.
Kelp! not just any algae
Kelp! You know I need some kelp!

When I was younger (So much younger than) so much younger than today
(I never needed) I never needed any growing kelp in any way
(Now) But now these days are gone (These days are gone), I don’t eat just diatoms
(I know I’ve found) Now I find I’ve changed my mind and opened up my jaws

Kelp me if you can, I’m hungry now
And I do appreciate kelp being ’round
Kelp is great food for me oh right now
Won’t you please, please kelp me

(Now) And now my spines have changed in oh so many ways
(My Aristotle’s lantern) My Aristotle’s lantern is big and ready to graze
(And) And every now (Every now and then) and then I feel so insecure
(I know that I) I know that I just need kelp like I’ve never needed it before

(chorus)

When I was younger (So much younger than) so much younger than today
(I never needed) I never needed any growing kelp in any way
(Now) But now these days are gone (These days are gone), I don’t eat just diatoms
(I know I’ve found) Now I find I’ve changed my mind and opened up my jaws

Kelp me if you can, I’m hungry now
And I do appreciate kelp being ’round
Kelp is great food for me oh right now
Won’t you please, please kelp me, kelp me, kelp me, ooh