Groupapalooza: Adapting Food Web Trophic Group Methods for Defining Bacterial “Species”

The following are some notes on a technique I’m developing for a cool collaboration between me, Jen Bowen, and David Weisman. I think it has some generality to it, and I’d love any feedback from the more mathematical crowd… I also wrote it to make sure I knew what I was doing – translating scribbled equations to code to results – so it does freeflow a bit. It may change based on feedback – consider this a working document.

So. Away we go.

What do food webs and determining the identity of bacterial species based on sequences and co-occurrence data have in common? How can bacterial ‘species’ advance basic food web research?

Networks. And AIC scores.

Let me explain.

I’ve long been a huge fan of Allesina and Pascual’s 2009 paper on deriving trophic groups de novo from food web networks. In short, they say that if you have a simple binary network (a eats b, or a doesn’t eat b), you can use information theory to determine trophic groups within a network. I’ve applied their methods in the past to kelp forests, and seen some interesting things, and Ed Baskerville has a great paper on using the technique for Serengeti food webs.

So how does this connect to bacteria?

I’m working on an analysis where my collaborators have surveyed bacterial communities at a number of different sites. We want to know the abundance of different species at different sites. However, how to define a bacterial ‘species’ is a tricky question. OK – let me poorly explain my understanding of bacterial taxonomic definitions (don’t kill me, Jen!) Let’s say you amplify and sequence a sample. You may get a number of different representative sequences from that sample. And you can get a measure of the abundance of each sequence type.

Now, on to species – looking at any pair of sequences (looong sequences of many base pairs), you may find two that are, say, one base pair different from each other. Are these two ‘sequences’ independent species or not? What if they differed by 2 base pairs? What about 3? 4? Now, a researcher can define an ‘operational taxonomic unit’ or OTU as all sequences that are no more than X% different from each other – and X is up to them. Thus, once you define your percent similarity, you can sum up all of the sequences in each OTU, and get the abundance of each “species” in each plot.

This is somewhat unsatisfying. I mean, what if you had two sequences that were 98% similar, but all of sequence A was in one plot, and all of sequence B was in another plot. Now you tell me – is this one species or two?

Let’s take that one step further. Let’s suppose A and B are both in a plot. But sequence A has 10x the abundance of sequence B. Furthermore, in a second plot, both are present, but sequence B is 10x more abundant. Again, one species or two?

The approach I want to lay out here answers this using a slight modification of Allesina and Pascual’s framework. Namely, we’re going to look at patterns of association, sequence similarity, and abundances to define OTUs.

The Association Part
At the core of Allesina and Pascual’s framework is the following observation. Let’s say you are dealing with a food web. You’ve got all sorts of directed connections of species A eating species B. Now, let’s say you want to define two trophic groups. Definitions of predator, prey, etc., are not important here. Just that in each group, you’ll have one set of species that eats species in the other group, and vice-versa. Like in this diagram:

So far, so good, yes? Now, the question is, which of these is a better descriptor of the structure of the network, after penalizing for complexity. I.e., we want a general schema. Is the amount of information lost by grouping things a-ok, given that we’ve reduced the complexity of our model of how the world works?

A&P derive a wonderful formula for this. It involves two pieces. First, for each A -> B connection between groups we’ve made, we can derive a probability of producing that particular graph with those species assigned into exactly those groups. L(ab) is the number of links going from species in A to species in B, and S(i) is, say, the number of species in group i. We define p(ab) as L(ab) / [S(a)S(b)]. Writing L for L(ab), the probability of a given chunk of the network – say, A -> B – given p(ab) can be defined as

p(network | p(ab)) = p(ab)^L * (1-p(ab))^(S(a)S(b) – L)

Which implies that the likelihood of p(ab) given the network is the same.

Likelihood(p(ab) | network) = p(ab)^L * (1-p(ab))^(S(a)S(b) – L)

or

Log-Likelihood = L*log(p(ab)) + (S(a)S(b) – L)*log(1-p(ab))

Cool, right?

Let’s call one of those LLs LL(a->b). Now, the Log-Likelihood of a given network configuration with groups is just

LL(all p(ij) | whole network) = LL(a->b) + LL(b -> a) + LL(a -> a) + LL(b -> b)

where LL(a->b) is one of those log-likelihood calculations above. We’ll call this LL(network) for future use.

Now, what about this comparison and penalty for complexity? Here’s where things get even better. We know that there are S total species, and k^2 probabilities, where k is the number of groups. So, voila, we have an AIC for a group structure’d network

AIC = -2 * LL(network) + 2S + 2k^2

and as the AIC for each configuration captures how much information is lost by a particular grouping, we can directly compare different grouping schemas. Note that the AIC for the baseline network, where every species is its own group (so k = S and LL(network) = 0), is just 2S + 2k^2.
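To make sure the pieces above fit together, here is a minimal Python sketch of the block log-likelihoods and the AIC. It is my own illustration rather than A&P's code: the function names, the toy web, and the convention that `adjacency[i][j] == 1` means "i eats j" are all assumptions. Within-group blocks use S(a)S(a) possible links, as in the formula above.

```python
import math

def block_log_likelihood(links, possible):
    """Log-likelihood of one block: L*log(p) + (S(a)S(b) - L)*log(1 - p)."""
    if possible == 0 or links == 0 or links == possible:
        return 0.0  # empty block, or p is exactly 0 or 1, so the block has probability 1
    p = links / possible
    return links * math.log(p) + (possible - links) * math.log(1 - p)

def network_aic(adjacency, groups):
    """AIC = -2 * LL(network) + 2S + 2k^2 for a binary web and one grouping."""
    n_species = len(adjacency)
    labels = sorted(set(groups))
    ll = 0.0
    for a in labels:
        eaters = [i for i in range(n_species) if groups[i] == a]
        for b in labels:
            eaten = [j for j in range(n_species) if groups[j] == b]
            links = sum(adjacency[i][j] for i in eaters for j in eaten)
            ll += block_log_likelihood(links, len(eaters) * len(eaten))
    return -2 * ll + 2 * n_species + 2 * len(labels) ** 2

# Toy web (hypothetical): species 0 and 1 each eat species 2 and 3.
toy_web = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
]

# Compare one big group, two trophic groups, and every species on its own.
for grouping in ([0, 0, 0, 0], [0, 0, 1, 1], [0, 1, 2, 3]):
    print(grouping, round(network_aic(toy_web, grouping), 2))
```

In this toy case the two-group structure has the lowest AIC (16, versus roughly 27.99 for one group and 40 for four singleton groups), which is the behavior we want: it loses no information and uses far fewer probabilities than the ungrouped web.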

So what does this have to do with bacteria?!?!

OK, ok, hold your horses. Let’s think about sequences and their associations with a site as a link. Let’s consider both sequences and sites as nodes in a network. So, if one sequence associates with one site, that’s a directed link from sequence to site. It’s a bipartite graph. Now, instead of searching through all possible group structures, our groups are defined by OTUs that are created from different levels of sequence similarity. We can calculate the LL for each group -> site association the same as we calculated the LL for A -> B before. The difference is, however, that there are fewer probabilities over the whole network. Instead of there being k^2 probabilities, there are k*r where r is the number of replicate plots we’ve sampled. So

AIC = -2 * LL(OTU network) + 2S + 2k*r
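Here is a rough Python sketch of the bipartite version, assuming a presence/absence matrix with sequences as rows and plots as columns, and a list giving the OTU of each sequence. The function name and the toy data are made up for illustration. Each OTU -> plot block has S(j) possible links, one per sequence in the OTU.

```python
import math

def otu_network_aic(presence, otu_of):
    """AIC = -2 * LL(OTU network) + 2S + 2*k*r for one OTU assignment.

    presence[i][q] is 1 if sequence i was found in plot q, else 0.
    otu_of[i] is the OTU label of sequence i.
    """
    n_sequences = len(presence)
    n_plots = len(presence[0])
    otus = sorted(set(otu_of))
    ll = 0.0
    for otu in otus:
        members = [i for i in range(n_sequences) if otu_of[i] == otu]
        for q in range(n_plots):
            links = sum(presence[i][q] for i in members)
            if 0 < links < len(members):  # otherwise p is 0 or 1 and the term is 0
                p = links / len(members)
                ll += links * math.log(p) + (len(members) - links) * math.log(1 - p)
    return -2 * ll + 2 * n_sequences + 2 * len(otus) * n_plots

# Hypothetical data: sequences 0 and 1 only occur in plot 0, sequences 2 and 3 only in plot 1.
presence = [
    [1, 0],
    [1, 0],
    [0, 1],
    [0, 1],
]

print(otu_network_aic(presence, [0, 0, 1, 1]))            # two OTUs
print(round(otu_network_aic(presence, [0, 0, 0, 0]), 2))  # everything lumped together
print(otu_network_aic(presence, [0, 1, 2, 3]))            # every sequence its own OTU
```

In a real analysis, `otu_of` would come from clustering at each level of sequence similarity, and we would call this once per similarity threshold.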

The beauty of this approach is that instead of having to search through group structures, we have 1 grouping per degree of sequence similarity. Granted, we can have tens of thousands of groups, so, it’s still a moderately heinous calculation (go-go mclapply!), but it’s not so bad.

But, what about that abundance problem?

So, until now, I’ve been talking about binary networks, where links are either 1 or 0. As far as I know, no one has derived a weighted-network analog of the A&P approach. On the other hand, here, our network weights are real abundances. Because of this, we can calculate a Likelihood of species with some set of abundances in a plot being part of the same group. Then,

LL(OTU group A -> 1 Plot) = LL(network) + LL(sequences having the observed pattern of abundances in that plot if they are in the same group)

I’m making this jump from the probability of species in one group being in that group and connecting to one plot = probability of species connecting to plot * probability of species having that pattern of densities.

p(network & abundance) = p(network) * p(abundances)

OK, so, how do we get that p(abundances) aka L(parameters | observed abundances)?

I’m going to throw out a proposal. I’m totally game to hear others, but I think this is reasonable.

If two sequences are indeed the same OTU, they should respond in similar ways to environmental variation. Thus, you should have an equal probability, if you were to sample random individuals from a group in one plot, of drawing either species. So, in the figure below, on the left, the two sequences (in red), even though they both associate with this one site, are different OTUs. Or, rather, it is highly unlikely they are from the same OTU. On the right, they are likely from the same OTU.

This is great, as we now have a parameter for each group-plot combination: the probability of drawing an individual with one of the sequences within a group. And we’re defining that probability as 1/number of sequences in a group. It’s rolling a die. And we’re rolling it as many times as we have total ‘individuals’ observed. So, for each sequence, we have a probability of drawing it, and a number of dice rolls…and we should be able to calculate a p(sequence | p(i in j in plot q)) which is the same as Likelihood(p(i in j in plot q) | sequence). I’ll call it Likelihood(abundance ijq). Here a(iq) is the abundance of species i in plot q, A(jq) is the abundance of all species in group j in plot q, and S(jq) is the number of sequence types in group j in plot q:

Likelihood(abundance ijq) = dbinom(a(iq) | size=A(jq), p=1/S(jq))

Log that, sum over all species in all plots, and we get LL(abundance).

We’ve added k*r more parameters (one per group-plot combination), which adds another 2*k*r to the penalty, so, now,

AIC = -2 * LL(OTU network) -2 * LL(OTU abundances) + 2S + 4k*r
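As a sanity check on the abundance piece, here is a small Python sketch of LL(abundance) and the combined AIC. It follows the dbinom formulation above, with two assumptions of mine: only sequences actually present in a plot count toward S(jq), and the abundance matrix has sequences as rows and plots as columns. All names and numbers are illustrative.

```python
import math

def log_dbinom(x, size, prob):
    """Log binomial probability mass, like R's dbinom(x, size, prob, log = TRUE)."""
    if prob == 1.0:
        return 0.0 if x == size else -math.inf
    return (math.log(math.comb(size, x))
            + x * math.log(prob) + (size - x) * math.log(1 - prob))

def abundance_log_likelihood(abundance, otu_of):
    """Sum of log dbinom(a(iq) | size=A(jq), p=1/S(jq)) over sequences and plots."""
    n_plots = len(abundance[0])
    ll = 0.0
    for otu in set(otu_of):
        members = [i for i in range(len(abundance)) if otu_of[i] == otu]
        for q in range(n_plots):
            present = [abundance[i][q] for i in members if abundance[i][q] > 0]
            if not present:
                continue  # this OTU is absent from plot q
            total = sum(present)  # A(jq)
            for a in present:     # a(iq), with S(jq) = len(present)
                ll += log_dbinom(a, total, 1 / len(present))
    return ll

def full_aic(ll_network, ll_abundance, n_sequences, n_otus, n_plots):
    """AIC = -2*LL(OTU network) - 2*LL(OTU abundances) + 2S + 4*k*r."""
    return -2 * ll_network - 2 * ll_abundance + 2 * n_sequences + 4 * n_otus * n_plots

# Two sequences in two plots (made-up counts).
lopsided = [[100, 10], [10, 100]]  # A dominates plot 0, B dominates plot 1
even = [[55, 55], [55, 55]]        # A and B equally common everywhere

print(abundance_log_likelihood(lopsided, [0, 0]))  # one OTU, very unlikely pattern
print(abundance_log_likelihood(even, [0, 0]))      # one OTU, plausible pattern
print(abundance_log_likelihood(lopsided, [0, 1]))  # two OTUs
```

The lopsided pattern gets a far lower log-likelihood under a single OTU than the even pattern does, which is the 10x scenario from earlier. A sequence that is the only member of its OTU in a plot contributes 0, since p = 1.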

Aaand…. that’s it. I think. We should be able to use this to scan across all OTU structures based on sequence similarity, calculate an AIC for each, and then use the OTU structure with the smallest AIC as our ‘species’.

Now, we could of course add additional information. For example, what if we knew some environmental information about plots, etc. We could probably use that to create groups of plots, rather than just use individual plots.

I also wonder if this can be related to a more general solution for weighted networks, and get back to A&P’s original formulation for food webs. Perhaps assuming that all interaction strengths are drawn from the same distribution with the same mean and variance. That should do it, and be relatively simple to implement. Heck, one could even try different distributional assumptions.

References
Allesina, S. & Pascual, M. (2009). Food web models: a plea for groups. Ecol. Lett., 12, 652–662. 10.1111/j.1461-0248.2009.01321.x

Baskerville, E.B., Dobson, A.P., Bedford, T., Allesina, S., Anderson, T.M. & Pascual, M. (2011). Spatial Guilds in the Serengeti Food Web Revealed by a Bayesian Group Model. PLoS Comp Biol, 7, e1002321. 10.1371/journal.pcbi.1002321

Scallopocalypse

Everyone has been pretty shocked by the devastation wreaked by Sandy. Here in New England, we also got a Nor’easter following a few days later. That’s a lot of intense storm action in a short period of time.

So I was quite curious as I ventured out into the field last weekend to see how things looked. I went on a potential field site scouting trip to UMB’s field station in Nantucket. Nantucket of course got a good dose of Sandy, although it largely passed southwest. The Nor’Easter may have been worse.

What I found while just walking about on the shoreline was pretty incredible. It was Scallopocalypse.

Let me include a video here of what one saw looking across the beach so you can get a sense of what was going on.

This was taken in Madaket. It was a bit more dramatic in other parts of the island – because scallop fishermen had come on shore, scooped up the scallops (many of which were the seed for next year, and too small for now) and taken them back out to the scallop grounds. Here’s what things looked like by the lab.

All over, the scallop grounds had come to shore.

But the huge flux of biomass onto shore was impressive. And it wasn’t just scallops, but a ton of seagrass as well, much of which was matting over fringing salt marshes.

Still, the huge amount of energy and nutrients coming into the shoreline ecosystem driven by storms gave me a lot of pause. I mean, those scallops that weren’t saved did end up in the coastal foodweb. Birds were definitely looking fat and happy, and we’d find piles like this with flocks of birds nearby:

The whole thing really got my brain going, with two big questions

1) So, what is the fate of all of this influx of stuff into the shoreline? How will the influx of energy alter the structure and dynamics of the food web? Will the smothering of the marsh matter? It is winter, when things are slower. How quickly will everything be decomposed? Will the effects be lagged until the springtime? Or will they affect the system now? I think of Gary Polis’s work on how food web structure is shaped by the influx of energy on small islands. I know this is a BIG island, but, still, the point stands, this is a big flux of biomass and nitrogen. And it’s not just plant matter, but animal protein.

2) How will climate change alter the frequency of this subsidy? What would the consequences of a regime with regular small subsidies and occasional big ones versus regular big subsidies be? This stems largely from my thinking about the increase in the size of the ‘largest storm of the year’ in California coastal systems that’s been the basis of my previous work. But, models and analysis from the Knutson group seem to show that, while hurricanes and cyclones in the Atlantic aren’t getting more frequent, the size of each one is getting bigger. So, similar pattern. If small subsidies are coming in every year now due to the occasional passing hurricane or Nor’easter, but the size of those same storms in the future is going to get larger, then having this kind of big Scallopocalypse/subsidy could get more frequent. Particularly as northern Atlantic waters get warmer (which they are – Nixon et al. 2004), this could be an interesting and perhaps not so well investigated climate effect – the increased strength of coupling between marine and terrestrial food webs.

Oh, and random 3) What role will invasive algae play in increasing the impacts of storms on the amount of material coming on land? This may lead nowhere, but I noticed a lot of material (not scallops) that had washed on land had the invasive Codium fragile attached to it. I know that subtidal kelps can do this to mussels as well (Witman’s work), but there’s no kelp here. Is Codium becoming a drag (har har) and increasing the energy and nutrient flow from sea to land?

All in all, an interesting trip with a lot to chew on for future research. And a great setting!

Refs
Knutson, T. R., J. L. McBride, J. Chan, K. Emanuel, G. Holland, C. Landsea, I. Held, J. P. Kossin, A. K. Srivastava, and M. Sugi. 2010. Tropical cyclones and climate change. Nature Geoscience 3:157–163.

Nixon, S. W., S. Granger, B. A. Buckley, M. Lamont, and B. Rowell. 2004. A one hundred and seventeen year coastal water temperature record from Woods Hole, Massachusetts. Estuaries 27:397–404.

Polis, G. A., and S. D. Hurd. 1995. Extraordinarily high spider densities on islands: flow of energy from the marine to terrestrial food webs and the absence of predation. Proceedings of the National Academy of Sciences, USA 92:4382–4386.

Polis, G. A., W. B. Anderson, and R. D. Holt. 1997. Toward an integration of landscape and food web ecology: the dynamics of spatially subsidized food webs. Annual Review of Ecology and Systematics 28:289–316.

Witman, J. D., and T. H. Suchanek. 1984. Mussels in Flow – Drag and Dislodgement by Epizoans. Marine Ecology Progress Series 16:259–268.

Does Synthesis Ecology Exist as a Scientific Discipline?

Does Synthesis Ecology exist? Is it a discipline? If so, what is it? If not, why not?

As a part of the Trends in Ecological Analysis and Synthesis symposium here at NCEAS, several postdocs past and present organized by Jennifer “Firestarter” Balch got together and sent this survey to the last 15 years of NCEAS postdocs. The survey asks what current and former NCEAS postdocs thought were the most important contributions in Synthesis Ecology and what they thought were the most exciting future directions in Synthesis Ecology.

And then a small storm erupted.

While Jennifer modified a definition of Synthesis Ecology from the NCEAS mission statement (“Synthesis Ecology is the integration and analysis of existing data, concepts, or theories to find emergent patterns and principles that address major fundamental questions in ecology and allied fields. “), even amongst the postdocs, no one could agree whether or not Synthesis Ecology existed as a Thing. Was it a discipline? Was it a technique? Would you feel comfortable calling yourself a Synthesis Ecologist? What is it?

Even amongst the authors on the analysis of the survey, there was little agreement. We sat down one morning, a group of current and former NCEAS postdocs, and tried to hash this issue out. Amusingly, the room was divided, largely along generational lines, as to whether it was or was not a field. We argued it around for a while, posing different definitions and finding little agreement.

Really, there are more questions and points of reflection than answers. Here are some relevant points that I pulled from our conversation. They’re what I latched on to, and are even argued amongst the participants in the group, so, no answers here.

• What is a Field Of Science? The definition I threw out that everyone seemed comfortable with was that a field is a unique way of asking and answering questions about the world. The confluence of Asking and Answering is key. A methodology is just a way of answering.
• Does a Field need to have a unique theory associated with it? Or not?
• By analogy, how is Genomics a field? Why is Genomics not just a technique or methodology within Genetics? Similarly, Geography has had this debate about Geographic Information Science, which has indeed emerged as its own field. Along the same line, Molecular Biology – a field we are all well familiar with – has gone through the same set of questions.
• One objection was that Synthesis Ecology doesn’t have a single field system – it is a collection of techniques that answer larger questions. And yet, is that not similar to Theoretical Ecology? How is one a discipline and the other not?
• If it is a field, a defining emergent characteristic MUST be the crossing of disciplinary boundaries – either within ecology or outside of ecology

So, I wish I could say I had an answer for you.

OK, that’s a cop-out – I do have my own answer (Not reflective of the group! In fact, I hope they have some pointed answers and counterpoints to this!). Yes, I do think Synthesis Ecology is a field. Synthesis Ecology is the field in ecology defined by the combination of heterogeneous streams of data & concepts to ask and answer questions underpinned by either ecological theory and/or application that cannot be addressed by any single investigation or dataset.

OK, after pondering THAT and the above points and thinking about the pieces you’ve read over the last 15 years, I open this discussion to you: Is Synthesis Ecology in and of itself a field? And please, be polite!

Update: See also Karen McLeod’s excellent post, Beyond crunching data: The power of ideas

A Need to Understand Climate Change’s Indirect Effects

We know that warming, storms, drought, acidification, and the myriad of other effects of climate change will impact natural ecosystems. Most of our studies have concentrated on direct effects, though. For example, if you change temperature, you alter herbivore grazing rates. But what about indirect effects? For example, I’ve found that increased intense storm frequency may remove kelp which will have an indirect effect on the structure of kelp forest food webs.

So, I did a little experiment. I went to Web of Knowledge and searched the following term: “climate change” AND “impact”. I got 21,310 entries. Then I searched again using this query: “climate change” AND “impact” AND “indirect effect”.

The search returned 35 entries.

Surely, this must be a mistake. So instead of “indirect effect” I went with just “indirect”. 506. Better. If I took out the word impact I went up to 1,202. So, at maximum, 5.6%.

OK, maybe this was because I was looking at EVERYTHING. So I filtered it down to just Environmental Sciences and Ecology. “climate change” AND “impact”: 9,248. “climate change” AND “impact” AND “indirect”: 173. Removing impact got me to 689. Only 7.5%.
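For anyone who wants to check the arithmetic, here it is in a few lines of Python. I am assuming the two "at maximum" percentages divide the broadest "indirect" count (with "impact" dropped) by the "climate change" AND "impact" count, since that is what reproduces the numbers quoted above; the variable names are mine.

```python
# Search counts quoted in the text.
all_impact = 21310          # "climate change" AND "impact", all fields
all_indirect_effect = 35    # ... AND "indirect effect"
all_indirect = 506          # ... AND "indirect"
all_indirect_broad = 1202   # "climate change" AND "indirect" (no "impact")

eco_impact = 9248           # same searches, Environmental Sciences and Ecology only
eco_indirect = 173
eco_indirect_broad = 689

share_all = all_indirect_broad / all_impact
share_eco = eco_indirect_broad / eco_impact

print(f"all fields, broadest: {share_all:.1%}")   # 5.6%
print(f"ecology, broadest:    {share_eco:.1%}")   # 7.5%
print(f"all fields, 'indirect effect': {all_indirect_effect / all_impact:.2%}")
print(f"all fields, 'indirect':        {all_indirect / all_impact:.1%}")
print(f"ecology, 'indirect':           {eco_indirect / eco_impact:.1%}")
```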

I’m guessing there are other, more careful ways of filtering, but, either way, I’m pretty surprised that even at this point, the study of the indirect effects of climate change still accounts for so little of our knowledge. Pretty interesting. Although I’m heartened by the fact that this literature seems to be increasing exponentially.

Food Web Structure and Changing Diversity at Two Levels

This is part of a larger series of open notebook posts about how food web structure modifies the effects of predator extinctions. For an introduction and list of other posts, see here.

OK, last bit on two-level food webs for the moment. I’ve examined how food web structure can change the effects of predator or prey extinctions on both top-down and bottom-up control. A number of folk (including me) have theorized that changes in diversity at two trophic levels should interact – that the consequences of predator diversity loss should change as prey species are lost.

So bearing in mind our master food web and the little 2-level sliver of it we’re thinking about, let’s interrogate this idea. (And yes, that use of the word interrogate goes out to Scott Richmond).

Let's zoom in on one part of the general food web

Thinking about Extinction in Our Framework Thus Far

Thinking about what I’ve put together thus far, I’m not so certain that changing diversity at two trophic levels should influence predation or energy transfer beyond our understanding of what’s happening with one trophic level. The simple probabilistic equations that I’ve shown to describe energy transfer and predation both rest on thinking about the average consequences for individual species. Each equation rests on taking a mean value of the probability of, say, being eaten over all prey when predators go extinct. If prey are going extinct as well, that shouldn’t affect the outcome.

Why? Think about it this way. Let’s say you have a food web of 3 predators and 3 prey, and each trophic level is losing one species. For prey species a, the probability that it will be eaten does not change. This is because implicit in asking the question of what are the consequences of extinction for species a, we are asking what are the consequences for species a in all food webs in which it exists. Thinking further, what is, say p(eaten) for a species that does not exist? It’s not 1, but it cannot be 0 either. We just don’t think about it. This argument works as well when thinking about energy transfer.

So, I’d argue, that to understand p(eaten) we simply use the equations derived to understand p(eaten) under predator loss and to understand p(energy) we use the equations derived to understand energy transfer under prey species loss.

Well that chain of logic is uncomfortable. I don’t like where it led at all. I guess tacitly it suggests that maybe the variance of p(eaten) and p(energy) should somehow change… But I haven’t so much thought about variance other than thinking it would work in a similar way to means. Maybe I’m missing something. What is the proper way to calculate variances here? How do simultaneous extinctions affect this variance?

Still, even for the mean value of p(eaten) I’m not so sure. Let’s go draw some webs and see if this plays out.

Webs show that my Logic is Correct. Great.

Let’s start with our 2 predator, 2 prey food web with 1 of each going extinct.

Aaaaand – yeah, those results match exactly with what would be predicted from p(eaten) and p(energy) looking at predator or prey loss independently. The variance is larger – doubled, actually (from 0.125 to 0.25). Interesting. What about something more radical, say, a 3 predator, 3 prey web with 2 predators and 1 prey going extinct.

Yup, still the same as the single-level results, although, here the variance only increases slightly (by a factor of 1.03125).

So, clearly, the single-level results are true for the mean. The variance is still…yeah, I don’t quite have that figured out.
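If you want to poke at this yourself, here is a brute-force Python sketch that enumerates every extinction scenario for a small web. The web is an assumption on my part (P1 eats a and b, P2 eats only b), chosen because it reproduces the 2-predator, 2-prey numbers quoted above. I am also assuming that p(eaten) is the fraction of surviving prey with at least one surviving predator, that p(energy) is the fraction of surviving predators with at least one surviving prey, and that the variance is the sample variance across scenarios.

```python
from itertools import combinations
from statistics import mean, variance

# Hypothetical 2-predator, 2-prey web: P1 eats a and b, P2 eats only b.
WEB = {"P1": {"a", "b"}, "P2": {"b"}}

def enumerate_extinctions(web, predators_lost, prey_lost):
    """Return per-scenario p(eaten) and p(energy) over all extinction combinations.

    Assumes at least one predator and one prey survive in every scenario.
    """
    predators = sorted(web)
    prey = sorted(set().union(*web.values()))
    p_eaten, p_energy = [], []
    for dead_pred in combinations(predators, predators_lost):
        for dead_prey in combinations(prey, prey_lost):
            live_pred = [p for p in predators if p not in dead_pred]
            live_prey = [x for x in prey if x not in dead_prey]
            # Fraction of surviving prey that still have a surviving predator.
            p_eaten.append(mean(
                [1.0 if any(x in web[p] for p in live_pred) else 0.0 for x in live_prey]))
            # Fraction of surviving predators that still have a surviving prey.
            p_energy.append(mean(
                [1.0 if any(x in web[p] for x in live_prey) else 0.0 for p in live_pred]))
    return p_eaten, p_energy

eaten_pred_only, _ = enumerate_extinctions(WEB, predators_lost=1, prey_lost=0)
eaten_both, energy_both = enumerate_extinctions(WEB, predators_lost=1, prey_lost=1)

print(mean(eaten_pred_only), variance(eaten_pred_only))  # 0.75 0.125
print(mean(eaten_both), variance(eaten_both))            # 0.75 0.25
print(mean(energy_both))                                 # 0.75
```

The mean stays at 0.75 whether or not prey are lost too, while the variance across scenarios doubles. Swapping in a different `WEB` dictionary is an easy way to try larger webs.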

Comparison with the Experimental Literature

So, this result, that you can predict the average effects of changing diversity at two trophic levels at the same time by looking at the results for changing diversity of just one trophic level – does it agree with the experimental literature? Let’s think about one of my favorite examples – Lars Gamfeldt’s excellent 2005 Ecology Letters piece.

In this paper, Lars (LARS!) shows that he wishes I was working on the paper we are collaborating on rather than writing this entry.

Sorry, rather, Gamfeldt shows that prey and consumer species richness can interact. The key quote from the abstract is “…prey richness did not increase resistance to consumption when consumers were present. Instead, our results indicated enhanced energy transfer with simultaneous increasing richness of consumers and prey.”

I find this heartening. Here, p(eaten) was determined by consumers, as predicted. The second statement is curious as well and hearkens to Figure 4 of the paper where total biovolume (predators and prey) is clearly the highest when all 3 predators and prey are present. This is clear evidence that energy transfer into this food web is at its highest here. It drops, though, as consumer richness, but not prey richness, changes. Which, actually, we’d predict based on our initial examination of energy transfer in the presence of predator loss alone. So…Gamfeldt’s results do appear to echo what I’ve shown here. And anything with fewer than 3 consumers shows a consistent relationship for producer loss.

Ah ha. So…I admit, intuitively, I still think that under loss at both levels, p(energy) and p(eaten) should be products of the results from both the prey and predator equations together. But they don’t appear to be (otherwise for the 2-2 web with 1 loss at each level, we’d have p(eaten) and p(energy) = 0.5625). Hrm. This bears more thinking – at least for p(energy) why one does not have to incorporate diversity at both trophic levels. Clearly there’s something a little more complex that needs to be represented in a general equation for p(energy | Er, Ep) though. And likely p(eaten) as well. Hope to come back to that later.

That, and I’m starting to (unsurprisingly) see that some meta-analysis to compare predictions to observed results is going to be necessary, and that figuring out the right metric is going to be non-trivial.

Prey Loss in Different Food Web Structures: We’ve Been Here Before

This is part of a larger series of open notebook posts about how food web structure modifies the effects of predator extinctions. For an introduction and list of other posts, see here.

OK, only two more entries (I think) on simple two-level food webs before we jump into the great unknown (and you’ll see how unknown it is). So far I’ve been talking about the consequences of losing predator species for predation and energy transfer. But, what about losing prey? And what about losing both? In this entry, I’m going to show that we already know how to think about prey loss and food web structure. We just have to stand on our head. So, keeping our “Master Food Web” in mind, and that we’re zooming in on a particular component, let’s think about loss of prey.

Let's zoom in on one part of the general food web

Who will be eaten?

First, the obvious nice result. If prey go extinct, this does not change the probability that the prey trophic level will be under control. No predation links have been deleted. Therefore, p(eaten) is still 1.

Huh? Yeah, really. Think about it. This is an obvious answer, but, well. It’s a rather nice one!

What’s the probability that energy will get to predators?

So, energy transfer is the thing to focus in on. Sure, energy will still enter the prey trophic level, but the probability of energy getting to some of the predators after some prey go extinct may now be 0. Oops!

The wonderful thing is, p(energy) can be defined using exactly the same framework as p(eaten). We just have to stand on our heads. Let’s first look at a familiar 2-predator 2-prey web with 1 extinction.

It’s quite similar to what we’ve seen before with predator loss with a mean p(energy) of 0.75 across all prey extinction scenarios. However, what’s different is that we’re now interested in what PREDATORS have no prey, not what prey have no predators. This implies that we can merely flip the equations from before, replacing predators and prey as follows.

Let’s assume Er extinctions of our resource species (i.e., prey), Sr is our maximum resource species richness, and Di from before now becomes the out degree of predator i – their number of prey. We can simply revisit our earlier framework for the following two equations.

p(energy | Er) = 1-dh(0; Di, Sr-Di, Sr-Er)   (4)

$p(energy | E_{r}) = 1-\sum_{D=1}^{S_{r}}{dh(0; D, S_{r}-D, S_{r}-E_{r})F(D)}$     (5)

Simple, no? And no extra explanation needed!
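OK, maybe a tiny bit of extra explanation. Here is equation (4) as a Python sketch, using only the standard library. The helper names are mine, and the example web (one predator eating both prey, the other eating just one) is an assumption that happens to reproduce the mean p(energy) of 0.75 quoted above.

```python
from math import comb

def dhyper_zero(hits, misses, draws):
    """dh(0; hits, misses, draws): chance of drawing zero 'hits' when taking
    `draws` items without replacement from a pool of hits + misses."""
    return comb(misses, draws) / comb(hits + misses, draws)

def p_energy(out_degree, prey_richness, prey_extinctions):
    """Equation (4): chance that a predator keeps at least one of its prey."""
    survivors = prey_richness - prey_extinctions
    return 1 - dhyper_zero(out_degree, prey_richness - out_degree, survivors)

# Hypothetical 2-predator, 2-prey web: out-degrees (number of prey) of 2 and 1.
out_degrees = [2, 1]
values = [p_energy(d, prey_richness=2, prey_extinctions=1) for d in out_degrees]
print(values, sum(values) / len(values))  # [1.0, 0.5] 0.75
```

The "draws" here are the surviving prey species, so a predator loses its energy supply only when none of the survivors are in its diet.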

A Probabilistic Approach to Predator-Prey Relationships: Worth All of the Hoo-Ha?

This is part of a larger series of open notebook posts about how food web structure modifies the effects of predator extinctions. For an introduction and list of other posts, see here.

So, in my last post (notebook entry?), I introduced a new framework for understanding how the food web structure of a predator-prey food web can affect the consequences of extinction for the ability of predators to control their prey. Nice, no? But two things stick out. 1) Really? I mean, this is pretty and such, but does this match anything we’ve seen in nature? 2) So…what if we don’t know the precise structure of a food web, but, rather, just some statistical properties, like linkage density, or degree distribution? What’s your fancy theory going to do for us now?

So, keeping our “Master Food Web” in mind, and that we’re zooming in on a particular component, I’ll address those two questions with a little simulation, a little probability math, and hopefully have you convinced, excited for more, and coming up with all of your own spin off ideas. Or, you can show me how I’m full of it.

Let's zoom in on one part of the general food web

First, a quick review for those of you who didn’t commit to memory the discussion from last time. We’re interested in knowing the change in the probability of prey being eaten given some number, E, of predator extinctions. I showed that we can calculate the average probability of being eaten for each prey item if we know its number of predators (Di for prey item i) and the diversity of predators (Sp) using the density function of the hypergeometric distribution (dh) such that

p(eaten | E) = 1-dh(0; Di, Sp-Di, Sp-E)     (1)

We can then average this over all prey to get the average probability of being eaten for all prey. And, remember, if it’s just predator extinctions we’re concerned with, then the probability of energy getting to the predator trophic level is the same as p(eaten).
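As a quick check that equation (1) does what I say it does, here is a Python sketch that compares it against brute-force enumeration of every possible set of surviving predators. Function names are mine, and the 4-predator setting is just an arbitrary test case.

```python
from itertools import combinations
from math import comb, isclose

def p_eaten(in_degree, predator_richness, extinctions):
    """Equation (1): 1 - dh(0; Di, Sp - Di, Sp - E)."""
    survivors = predator_richness - extinctions
    p_no_predator_left = (comb(predator_richness - in_degree, survivors)
                          / comb(predator_richness, survivors))
    return 1 - p_no_predator_left

def p_eaten_brute_force(in_degree, predator_richness, extinctions):
    """Fraction of survivor sets that contain at least one of this prey's predators."""
    survivors = predator_richness - extinctions
    my_predators = set(range(in_degree))  # label this prey's predators 0..Di-1
    outcomes = [any(p in my_predators for p in kept)
                for kept in combinations(range(predator_richness), survivors)]
    return sum(outcomes) / len(outcomes)

# Compare the formula and the enumeration for every Di and E with Sp = 4.
for d in range(1, 5):
    for e in range(0, 5):
        assert isclose(p_eaten(d, 4, e), p_eaten_brute_force(d, 4, e))

print(p_eaten(in_degree=1, predator_richness=3, extinctions=1))
```

The hypergeometric "draws" are the Sp - E surviving predators, and a prey item escapes predation only if none of its Di predators are among them.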

Verification
Great, so, can we trust this approach? Consider the results of a totally awesome paper by Finke and colleagues where they were able to manipulate diet breadth of three species of parasitoid. Broadly, they found that with all generalists, diversity didn’t matter bupkis for the control of prey. But with specialists, mixtures of parasitoids were better able to control prey. So, one would predict that as average number of prey consumed per predator increases, the relationship between number of predator species and the probability of prey being eaten should go from being linear to quickly saturating at 1.

So, let’s take three food webs. In web 1, we have all specialists. In web 2, we have one specialist, and two predators that eat two prey. In web 3, we have all generalists.

Based on the insights of Finke et al.’s work, we would predict that in web 1, we should have a nice linear relationship between predator species richness (that’s species left after extinction) and p(eaten). In web 2, we should start to see some curvature, as p(eaten) increases at lower levels of predator species richness. And in web 3, the only time we should see a value other than 1 is if there are no predators left – total predator extinction! And, indeed, that’s precisely what we see in the figure below.

And to see this over a much wider range of possible food webs, here’s a figure showing curves for all possible five predator, five prey food webs. I’ve colored the lines for each simulation by linkage density (that’s number of feeding links divided by total number of species). I’m plotting this with E on the x-axis again (brain flip!) for my own clarity. If you want to follow along at home, here’s the simulation code.

As you can see, the higher the linkage density, the more quickly the relationship saturates.
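For a rough idea of how such a sweep could be set up, here is a sketch, not the linked simulation code. It leans on the fact that Equation 1 only needs prey in-degrees, so it enumerates every combination of in-degrees rather than every adjacency matrix. It also assumes every prey has at least one predator. All names here are made up for the illustration.

```r
p_eaten <- function(Sp, E, prey.vec) {
  1 - mean(dhyper(0, prey.vec, Sp - prey.vec, Sp - E))
}

Sp <- 5      # predators
n.prey <- 5  # prey
E <- 0:Sp

# Every combination of prey in-degrees, assuming each prey has >= 1 predator
degs <- as.matrix(expand.grid(rep(list(1:Sp), n.prey)))

# One row per in-degree combination, one column per number of extinctions
curves <- t(apply(degs, 1, function(d) sapply(E, function(e) p_eaten(Sp, e, d))))
colnames(curves) <- paste0("E", E)

# Linkage density: feeding links divided by total number of species
link.density <- rowSums(degs) / (Sp + n.prey)

# Mean curve at each linkage density, handy for a quick look or a plot
by.density <- aggregate(curves, by = list(L = link.density), FUN = mean)
head(by.density)
```

Each row of `curves` can then be drawn as a line and colored by `link.density`.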

Getting Away from Specific Food Web Structures
OK, but, we don’t always know the diet of every single species in a food web. Indeed, we often only know the general statistical properties of a web. Robust though that data may be, how can we incorporate it into the framework we have here?

Now, Di seems to be the place where statistical properties of a food web may enter the picture. It would be great if we could somehow use the average value of Di, or maybe its mean and variance, or the power coefficient, or something. The sticking point is that this is a nonlinear equation, so, Jensen’s inequality says, “Nope!” to just plugging in something in place of Di. However, we can use the knowledge that Di is limited to be equal to or less than Sp, and that Di is discrete. We can then just apply some simple probability math – namely, how to estimate a mean from an arbitrary discrete distribution. Let’s assume that the in-degree distribution of the prey (i.e., their number of predators) follows some statistical distribution, F(D). We can then get the average value of p(eaten) with the following:

$p(eaten \mid E) = \sum_{D=1}^{S_{p}}{p(eaten \mid E, D)\,F(D)}$     (3)

So, as long as you know the degree distribution, the number of predators and prey, you can plug in anything you’d like. Simplicity itself, no? You can also use this approach to solve for other moments, such as the variance or more. I like it. OK, onwards to thinking about prey extinction, and then to more hairy territory.
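To make that concrete, here is a minimal sketch that averages the per-degree p(eaten) from Equation 1 over an in-degree distribution. The function name `p_eaten_dist`, the argument `deg.dist`, and the zero-truncated binomial used as the example distribution are all assumptions for illustration; any probability vector over D = 1..Sp would do.

```r
# deg.dist: probabilities F(D) for D = 1..Sp (must sum to 1)
p_eaten_dist <- function(Sp, E, deg.dist) {
  D <- seq_len(Sp)
  stopifnot(length(deg.dist) == Sp, isTRUE(all.equal(sum(deg.dist), 1)))
  # p(eaten | E, D) for each possible in-degree, from Equation 1
  p.given.D <- 1 - dhyper(0, D, Sp - D, Sp - E)
  # Mean of a function of a discrete variable: sum of value * probability
  sum(p.given.D * deg.dist)
}

# Illustration only: a zero-truncated binomial in-degree distribution
Sp <- 5
deg.dist <- dbinom(1:Sp, size = Sp, prob = 0.4)
deg.dist <- deg.dist / sum(deg.dist)

sapply(0:Sp, function(e) p_eaten_dist(Sp, e, deg.dist))
```

A quick sanity check is to feed in the empirical in-degree distribution of a known web. With in-degrees 2, 2, 1 and three predators, `deg.dist = c(1/3, 2/3, 0)` gives the same answer as averaging Equation 1 over the three prey directly.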

A Probabilistic Look at Predators, Prey, and Extinctions

This is part of a larger series of open notebook posts about how food web structure modifies the effects of predator extinctions. For an introduction and list of other posts, see here.

To begin tackling the question of “How does the structure of a food web influence the consequences of extinction?” let’s begin by thinking about how changes in the number of predator species in a simple 2-level predator-prey food web can influence 1) the probability of prey being eaten and 2) the amount of energy transferred to the predator group. Looking back at our “Master Food Web”, we’re zooming in on, say, the consequences of extinction in group E for group C.

Let's zoom in on one part of the general food web

I’m going to begin by thinking about whether we can solve this problem if we know the specific structure of a food web. Later on, I’ll start talking about working from food web network properties such as degree distribution.

The first thing to realize is that we’re talking about probabilities. What’s the probability of a prey item being eaten? What’s the probability of energy getting to the predator trophic level? It’s this probabilistic thinking that’s at the center of the framework I’m going to lay out.

A Probabilistic Framework for Food Webs and Predation

Let’s establish some ground rules for this framework. For an arbitrary topology (i.e., network structure), we can calculate the probability of an individual prey species being eaten quite simply. If there are any links between a prey item and any predator, that probability is 1. If there are no links, it is 0. Let’s call this p(eaten). Control of a group of prey species can therefore be described as the average value of p(eaten) across the whole group of prey. If everyone has a predator, it’s 1. If no one has a predator, it’s 0. If half of the species have a predator, it’s 0.5. And one should be able to calculate variance, etc., from that information.

And just like that, we have a new network metric. p(eaten), which, really, is p(connected to the network by an incoming edge) if you want to get technical.
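As a tiny sketch of the metric on a fixed web (no extinctions yet), assuming the web is stored as a 0/1 matrix with predators in rows and prey in columns. The matrix below is invented purely for illustration.

```r
# Rows are predators, columns are prey; 1 means "predator eats prey".
# This 3 x 4 web is a made-up example: prey 3 and 4 have no predators.
A <- matrix(c(1, 0, 0, 0,
              1, 1, 0, 0,
              0, 1, 0, 0),
            nrow = 3, byrow = TRUE)

has.predator <- colSums(A) > 0    # TRUE if a prey has any incoming link
p.eaten <- mean(has.predator)     # proportion of prey with a predator
p.var <- p.eaten * (1 - p.eaten)  # variance of the 0/1 indicator across prey

p.eaten
```

Half of the prey here have a predator, so p(eaten) comes out at 0.5.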

Before we jump into numbers, let’s look at some examples of how this p(eaten) metric works. First, a simple system with 2 predators, 2 prey. One predator eats one prey. One predator eats two prey. What’s the average p(eaten) if 1 predator goes extinct?

The above figure shows 2 different possible configurations. When the generalist is knocked out, one prey item escapes consumption. p(eaten) for that extinction scenario is 1/2. When the specialist is knocked out, neither prey item escapes consumption. p(eaten) is 1. So, what’s the average probability that a prey item will be eaten if 1 predator is going extinct? It’s just the average of results from the two scenarios: 0.75.
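The same bookkeeping can be done by brute force: list every way of removing E predators, compute p(eaten) for each, and average. A minimal sketch, assuming the web is a 0/1 matrix with predators in rows and prey in columns; `p_eaten_brute` is a hypothetical helper name.

```r
# Average p(eaten) over every way of removing E predators.
# A: predators in rows, prey in columns (1 = eats).
p_eaten_brute <- function(A, E) {
  Sp <- nrow(A)
  scenarios <- combn(Sp, E, simplify = FALSE)  # each element: who goes extinct
  per.scenario <- sapply(scenarios, function(extinct) {
    survivors <- setdiff(seq_len(Sp), extinct)
    # Share of prey still eaten by at least one surviving predator
    mean(colSums(A[survivors, , drop = FALSE]) > 0)
  })
  mean(per.scenario)
}

A <- matrix(c(1, 0,   # specialist eats prey 1 only
              1, 1),  # generalist eats both prey
            nrow = 2, byrow = TRUE)

p_eaten_brute(A, 1)   # 0.75, matching the hand calculation
```

Enumeration like this gets expensive fast as the number of predators grows, which is part of the appeal of a closed-form answer.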

To hammer this home, consider the figure below for a three predator, three prey food web with two specialists and one generalist. I’ve drawn all possible configurations for both the 1 predator extinction and the 2 predator extinction scenarios. Underneath each scenario is its p(eaten) and in bold to the left is the average p(eaten) for that number of extinctions.

A Hypergeometric Approach to Predator-Prey Relationships. Whoa. You said Hypergeometric.

OK, so, the framework should be fairly clear at this point. Now, for that arbitrary topology, what will be the probability that a prey item will be eaten if some number, E, predators go extinct? First, we can calculate this for a single prey item, let’s call it prey item i, if we know the number of predators who eat it – the prey in-degree, Di.

Let’s say there are Sp predators. Some number of them, Di, eat a single prey. If E predators go extinct, we want to know the probability that none of those that remain are predators of our prey item. This is p(not eaten). And then 1-p(not eaten) = p(eaten) for our little prey item.

OK, so, given all of the possible combinations of predator extinctions, what is p(not eaten)? How do we find it?

This is actually a special case of the hypergeometric distribution. Remember it from intro probability and statistics? No? OK. So. Briefly, let’s say you are drawing balls from an urn without replacement. Some are white, some are black. What’s the probability, if you draw X balls, that N of them will be black? This can be expressed as dh(N; # black, # white, X) since we’re using the distribution’s probability density function. If you want to get into the details of it, see here.

In the case of our prey item, N=0 – no predators are left that eat the prey. The number of draws is the number of species left after extinction: Sp-E. Black balls are predators of our prey item (Di), white balls are those that are, well, not. So, the probability that an individual prey item will be eaten given E extinctions is

p(eaten | E) = 1-dh(0; Di, Sp-Di, Sp-E)   (1)

To determine the average probability that a prey item will be eaten, we just average this over all prey.

And if you want to get all gory with this, we can actually write down a function for the probability of being eaten given E extinctions:

$p(eaten \mid E) = 1 - \overline{ \frac{ {{ S_{p}-D_{i} }\choose{ S_{p}-E }} }{ {{ S_{p} }\choose{ S_{p}-E }} }}$

N.B. In putting this together, I’ve also realized that we can use a slightly more intuitive general equation (but the gory details version is uglier). That is, rather than thinking about the probability of having 0 predators left after E extinctions, what is the probability of having all Di predators of a prey item removed by extinction? This is still p(not eaten). And it leads to the following very similar, and potentially more understandable, equation. Any preferences in the peanut gallery?

p(eaten | E) = 1-dh(Di; Di, Sp-Di, E)   (2)
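A quick numerical check that the two formulations (and the binomial-coefficient version) describe the same p(not eaten). This is only a sketch: the choice of Sp = 6 and the variable names are arbitrary.

```r
Sp <- 6
grid <- expand.grid(D = 1:Sp, E = 0:Sp)  # every in-degree / extinction pair

# Equation 1: none of the Sp - E survivors is a predator of the prey
eq1 <- with(grid, dhyper(0, D, Sp - D, Sp - E))

# Equation 2: all D predators of the prey are among the E extinct
eq2 <- with(grid, dhyper(D, D, Sp - D, E))

# Written out with binomial coefficients
closed <- with(grid, choose(Sp - D, Sp - E) / choose(Sp, Sp - E))

all.equal(eq1, eq2)
all.equal(eq1, closed)
```

The agreement follows from the identities choose(Sp, Sp - E) = choose(Sp, E) and choose(Sp - D, Sp - E) = choose(Sp - D, E - D).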

A Last Note on Energy Transfer from Prey to Predators

The nice thing about this averaged function is that it is simultaneously both the average probability that the prey trophic level will be under control by predators and the average probability that energy entering the food web via prey will make its way up to the predator trophic level. Basically, p(prey eaten) = p(energy gets to predator).

A Final Note of Wonder and some R Code

So, the thing that amazes me the most about the results above is that it all hinges on the prey. One actually doesn’t need to know anything about the diet of individual predators. Instead, one only needs to know how many things eat each prey item. This makes the framework easy to code up computationally, and easy to simulate, as instead of coming up with all possible adjacency matrices, one can just look at all possible combinations of Di given some number of total predators. To demonstrate how this can be nice, here’s the R code I use to calculate p(eaten):

```r
# Sp is the diversity of the predators
# E is the number of extinctions
# prey.vec is the in-degree (# of predators) of each prey species

pEaten <- function(Sp, E, prey.vec) {
  # see ?dhyper for more on hypergeometric distributions in R
  # dhyper() here: probability that all of a prey's predators are among the E extinct
  1 - mean(dhyper(prey.vec, prey.vec, Sp - prey.vec, E))
}

# If you find the 0 predators remaining formulation more intuitive
pEaten2 <- function(Sp.max, E, prey.vec) {
  # dhyper() here: probability that none of the Sp.max - E survivors eats the prey
  1 - mean(dhyper(0, prey.vec, Sp.max - prey.vec, Sp.max - E))
}
```

Great, and thus end-eth the big information boom. This is the kind of thinking that will underlie everything as I move forward, so read and digest this. If there is something that isn't clear, let me know. Once you start to think about the probability of connections, I think it becomes a good bit more transparent. I'll talk validation and generalization to statistical network probabilities in my next post.

Food Web Network Structure and Extinction: The Start of an Open Notebook

So, we know that species are going extinct at a pretty stunning rate. Mostly by human activities. The natural question is, will this affect the function of the natural world? You may well say ‘Duh! Of course!’ as a first instinctive pass, but, the issue isn’t so clear cut – will species that survive simply take up the slack? What’s the value of a ‘species’ anyway?

Starting in the late ’90s the field of diversity-function research has tackled this topic, largely using manipulations of plant species number. And the results are pretty conclusive – when you change plant diversity, you affect how the natural world works.

But note I said plants.

A bunch of us in the early to mid ‘aughts wondered if changes in the number of top predator species, or herbivores, or intermediate predators, or other species other than plants and algae might also alter the way the natural world worked in an analogous way. Emmett Duffy outlined a number of reasons we could expect changes in diversity at different trophic levels to produce either the same results as changes in plant diversity or maybe not!

So we went out and did the experiments, and found – well, sometimes diversity affected function one way, sometimes another. It all seemed to depend on something about each individual experiment. I was involved in this by examining whether losses of predator diversity affected the impact of herbivores on their plant or algal prey – so called trophic cascades.

And in looking at the relationship between predator diversity and trophic cascades we really did see every kind of result one could imagine.

Fortunately, there seemed to be some predictability here, which can be seen both by comparing different experiments and by looking at some of Deborah Finke’s awesome work in a variety of systems. Namely, one could predict the effect of changing levels of diversity if one knew the relative number of omnivores, specialists, or within-trophic level predation (so called intraguild predation). But these insights were all pretty qualitative. There’s no real quantitative guide here as to when diversity will do what.

Leaving this, I went off and did some work looking at how climate change may alter the network structure of food webs. And promptly did a palm-to-the-forehead. Food web network ecology has done a brilliant job of deriving metrics to describe the structure of, well, food webs. And the structure of food webs seems to influence the effects of species going extinct on trophic cascades or any other function one would care to measure.

Clearly, these two fields needed to come together.

So this is what I’m doing for my postdoc here at NCEAS. I am slowly but surely trying to figure out how to link food web network theory with biodiversity-ecosystem function.

A general food web to consider for other entries in this series.

What’s my goal? Simple. Look at the food web to the left. In it, different trophic levels have different colors. But, heck, even within a trophic level, we may split things into finer trophic groups based on their diet and types of interactions – something we do all the time qualitatively (e.g., by saying there’s a brown food web of detritivores and a green food web of consumers of living tissue) and can even now do quantitatively.

What I want to be able to do is say, let’s take this food web. If we know some structural properties of the web, can we then predict the effects of a change in diversity within any trophic group on the flow of energy and control of consumption within the web?

For example, if some number of species go extinct in F, will consumption of A increase or decrease? If some number of species in C go extinct, will G accumulate more or less energy?

I realize this doesn’t take some important things into account – interaction strengths or the abundance of each individual species. I think the former can be folded in later. I’d also note that I am trying a very different approach than qualitative modeling and think that problems of indeterminacy in predictions may be dealt with by using a probabilistic framework from the start. With respect to abundances – I’m hoping the results can translate into predictions of biomass, but, we shall see…

In the coming weeks, I’m going to try and open up my lab notebook, and lay out the theory I’m developing to answer these sorts of questions. I’ll be honest, I’m doing this for myself as much as anything. I have a lab notebook full of scribbles – some blind alleys, some promising leads. Writing it out will force me to focus my arguments and spot weaknesses (or have you spot weaknesses)! I’ve got some of this nailed, and some of it I may stumble around a bit with. And, heck, I’m always game to hear thoughts from the peanut gallery.

As I answer different pieces of the puzzle, I’ll put links to them in this entry. So, let’s start this open notebook and see where it goes!