I want to know what *YOU* think about review, preprints, and publication

As part of the OpenPub project, we’re soliciting folks to send us videos about their experience with the scholarly publication process. We want to use these to try to crowdfund the development of OpenPub – our preprint server with robust tools for discussion and interaction. Interested? Check out the full request over here and/or email me!

Filtering Out Exogenous Pairs of Variables from a Basis Set

Sometimes in an SEM for which you're calculating a test of d-separation, you want all exogenous variables to covary. If you have a large model with a number of exogenous variables, coding that into your basis set can be a pain, and you can spend a lot of time filtering out elements that aren't part of your basis set, particularly with the ggm library. Here's a solution – a function I'm calling filterExoFromBasiSet.


# Takes a basis set list from basiSet in ggm and a vector of
# exogenous variable names; drops any element whose first two
# entries (the conditionally independent pair) are both exogenous

filterExoFromBasiSet <- function(set, exo) {
    # matrix with one row per basis set element, holding the pair
    pairSet <- t(sapply(set, function(alist) cbind(alist[1], alist[2])))

    # indices of elements whose first / second member is exogenous
    colA <- which(pairSet[, 1] %in% exo)
    colB <- which(pairSet[, 2] %in% exo)

    # elements where *both* members of the pair are exogenous
    both <- intersect(colA, colB)

    # guard: set[-integer(0)] would wrongly return an empty list
    if (length(both) == 0) return(set)

    set[-both]
}

How does it work? Let's say we have the following model:

y1 <- x1 + x2

Now, we should have no basis set. But…

library(ggm)

modA <- DAG(y1 ~ x1 + x2)
basiSet(modA)
## [[1]]
## [1] "x2" "x1"

Oops – there's a basis set! Now, instead, let's filter it:

basisA <- basiSet(modA)
filterExoFromBasiSet(basisA, c("x1", "x2"))
## list()

Yup, we get back an empty list.

This function can come in handy. For example, let's say we're testing a model with an exogenous variable that does not connect to an endogenous variable, such as

y <- x1
x2 (which is exogenous)

Now –


modB <- DAG(y ~ x1, 
               x2 ~ x2)

basisB <- basiSet(modB)
filterExoFromBasiSet(basisB, c("x1", "x2"))
## [[1]]
## [1] "x2" "y"  "x1"

So, we have the correct basis set with only one element.

What about if we also have an endogenous variable that has no paths to it?


modC <- DAG(y1 ~ x1, 
               x2 ~ x2, 
               y2 ~ y2)

basisC <- basiSet(modC)

filterExoFromBasiSet(basisC, c("x1", "x2"))
## [[1]]
## [1] "y2" "x2"
## 
## [[2]]
## [1] "y2" "x1"
## 
## [[3]]
## [1] "y2" "y1" "x1"
## 
## [[4]]
## [1] "x2" "y1" "x1"

This yields the correct four-element basis set.
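As an aside, once you have the filtered basis set, each element is an independence claim to test: you fit the implied models, collect one p-value per claim, and combine them into Fisher's C, which is -2 times the sum of the logs of the p-values, compared against a chi-squared distribution with 2k degrees of freedom for k claims. Here's a minimal sketch – fisherC is just a name I'm using for illustration, and it assumes you've already collected the p-values into a vector:

# Combine one p-value per independence claim into
# Fisher's C, tested against a chi-squared distribution
# with 2k degrees of freedom
fisherC <- function(pvals) {
    C <- -2 * sum(log(pvals))
    k <- length(pvals)
    c(C = C, df = 2 * k, p.value = 1 - pchisq(C, df = 2 * k))
}

# e.g., with three made-up p-values
fisherC(c(0.64, 0.58, 0.72))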

Extracting p-values from different fit R objects

Let's say you want to extract a p-value from a linear or generalized linear model – mixed or not! – and save it as a variable for future use. This is something you might want to do if, say, you were calculating Fisher's C from an equation-level structural equation model. Here's how to extract the p-value for the effect of a variable from several different kinds of fit models. We'll start with a data set with x, y, z, and a block effect (we'll see why in a moment).


x <- rep(1:10, 2)
y <- rnorm(20, x, 3)
block <- c(rep("a", 10), rep("b", 10))

mydata <- data.frame(x = x, y = y, block = block, z = rnorm(20))
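One quick note: y and z come from rnorm, so your numbers will differ from the output shown below – it's all one random draw. If you want reproducible runs, set a seed before generating the data; the particular value here is arbitrary:

set.seed(42)  # fixes the random number stream so rnorm repeats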

Now, how would you extract the p-value for the parameter fit for z from a linear model object? Simply put, use the coefficient table from the lm object's summary:

alm <- lm(y ~ x + z, data = mydata)

summary(alm)$coefficients
##             Estimate Std. Error t value Pr(>|t|)
## (Intercept)   1.1833     1.3496  0.8768 0.392840
## x             0.7416     0.2190  3.3869 0.003506
## z            -0.4021     0.8376 -0.4801 0.637251

# Note that this is a matrix.  
# The third row, fourth column is the p value
# you want, so...

p.lm <- summary(alm)$coefficients[3, 4]

p.lm
## [1] 0.6373
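As an aside, the coefficient table is a matrix with row and column names, so you can also index by name rather than position. That's a bit safer if you later add or drop predictors, since the "third row, fourth column" position depends on the model formula:

# the same p-value, extracted by name
summary(alm)$coefficients["z", "Pr(>|t|)"]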

That's a linear model. What about a generalized linear model? (With the default gaussian family, the glm fit is identical to the lm above, so the coefficient table – and the p-value – will match.)

aglm <- glm(y ~ x + z, data = mydata)

summary(aglm)$coefficients
##             Estimate Std. Error t value Pr(>|t|)
## (Intercept)   1.1833     1.3496  0.8768 0.392840
## x             0.7416     0.2190  3.3869 0.003506
## z            -0.4021     0.8376 -0.4801 0.637251

# Again, is a matrix.  
# The third row, fourth column is the p value you
# want, so...

p.glm <- summary(aglm)$coefficients[3, 4]

p.glm
## [1] 0.6373

And what about nonlinear least squares? Same idea with nls:


anls <- nls(y ~ a * x + b * z, data = mydata, 
     start = list(a = 1, b = 1))

summary(anls)$coefficients
##   Estimate Std. Error t value  Pr(>|t|)
## a   0.9118     0.1007   9.050 4.055e-08
## b  -0.4651     0.8291  -0.561 5.817e-01

# Again, is a matrix.  
# The second row, fourth column is the p value you
# want, so...

p.nls <- summary(anls)$coefficients[2, 4]

p.nls
## [1] 0.5817

Great. Now, what if we were running a mixed model? First, let's look at the nlme package. Here, the relevant part of the summary object is the tTable:

library(nlme)
alme <- lme(y ~ x + z, random = ~1 | block, data = mydata)

summary(alme)$tTable
##               Value Std.Error DF t-value  p-value
## (Intercept)  1.1833    1.3496 16  0.8768 0.393592
## x            0.7416    0.2190 16  3.3869 0.003763
## z           -0.4021    0.8376 16 -0.4801 0.637630

# Again, is a matrix.  
# But now the third row, fifth column is the p value
# you want, so...

p.lme <- summary(alme)$tTable[3, 5]

p.lme
## [1] 0.6376

Last, what about lme4? Now, for a linear lmer object, you cannot get a p-value. But if this is a generalized linear mixed model, you are good to go (as in Shipley 2009). Let's try that here.

library(lme4)

almer <- lmer(y ~ x + z + (1 | block), data = mydata)

# no p-value! The fixed-effect coefficient table only has
# Estimate, Std. Error, and t value columns
summary(almer)@coefs

# but, for a generalized linear mixed model –
# and yes, I know this is a
# bad model but, you know, demonstration!

aglmer <- lmer(y + 5 ~ x + z + (1 | block), 
        data = mydata, family = poisson(link = "log"))

summary(aglmer)@coefs
##             Estimate Std. Error z value  Pr(>|z|)
## (Intercept)  1.90813    0.16542  11.535 8.812e-31
## x            0.07247    0.02471   2.933 3.362e-03
## z           -0.03193    0.09046  -0.353 7.241e-01

# matrix again!  Third row, fourth column
p.glmer <- summary(aglmer)@coefs[3, 4]

p.glmer
## [1] 0.7241
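One version caveat: the @coefs slot used above comes from the pre-1.0 lme4 API. In current versions of lme4, generalized mixed models are fit with glmer() rather than by passing family to lmer(), and the coefficient matrix comes from coef(summary(...)). Here's a sketch of the same extraction under the newer API, reusing the (admittedly bad) demonstration model – aglmer2 is just an illustrative name:

# current lme4: use glmer for generalized mixed models
aglmer2 <- glmer(y + 5 ~ x + z + (1 | block), 
        data = mydata, family = poisson(link = "log"))

# coefficient matrix; third row, fourth column is
# still the p-value for z
p.glmer2 <- coef(summary(aglmer2))[3, 4]

# or, by name:
coef(summary(aglmer2))["z", "Pr(>|z|)"]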

#SciO13 and Beyond

I’ve been active in using online spaces for scientific activities – blogging, tweeting, crowdfunding, and much much more – for a looong time, and I’ve found it’s benefitted me greatly as a scientist. I’ve also formed a deep love for the community I’ve found in the online science world.

And yet, until this year, I’d never been to Science Online before.

This year, I finally remedied that. And it was indeed amazing. I’ll be posting the notes from my own session on how science online can and has changed the peer review process, but I wanted to share this picture of (many of) the marine bloggers who were at the conference, and issue a challenge.

First – the challenge.

HEY MARINE SCIENTISTS WHO READ THIS BLOG (you’re quiet, but I see you in my hitlog)!!! I understand that #SciO13 may have been a bit overwhelming for you, or too broad, or something, so you didn’t register. It’s ok. Becoming more engaged with the science online world can seem like a lot. But, aren’t you a little bit curious? Well, if you are, in October, David Shiffman is setting up an amazing opportunity for you – Science Online Oceans. Go read his post, and block that weekend off on your calendar. Right now! Then come to Miami next October, and be prepared to have your world blown open as you interact with a much broader community that will help you realize the full potential of this internet thingamajig and how it can help you as a scientist.

(oh, and everyone reading should consider coming to Science Online 2014 as well)

And now, the picture, which should serve as some extra enticement. It’s only a few of the marine bloggers who were at the conference, so it’s only a small flavor of the awesomeness that was there, and the great connections and conversations that resulted. But I think you get the point.

OceanBloggers at #SciO13

Crowdfunding in the Peer Reviewed Literature

(x-posted at the #SciFund Blog)

The final version of Wheat et al.’s paper Raising money for scientific research through crowdfunding is out in Trends in Ecology and Evolution. A more hefty piece of #SciFund analysis is behind it (slowed down in no small part by my sloooow processing of new data in fancy models). The Wheat et al. paper is a lovely short piece that Rachel and Yiwei (who crowdfunded the excellent Alaska Predator Research Expedition, which has its new website over here) were gracious enough to ask Jai and me to participate in. In it, we cover the basics of crowdfunding for the academic sciences – what is it? what platforms might you use? what are some strategies for success?

Overall, this is a nice, gentle introduction that you should send to any colleague who is interested in crowdfunding, curious about what it is, or highly skeptical of the entire enterprise.

So go check it out!

Wheat R.E., Wang Y., Byrnes J.E. & Ranganathan J. (2013). Raising money for scientific research through crowdfunding. Trends in Ecology & Evolution 28(2):71–72.

Linkage: A field guide to privilege in marine science

As someone who, admittedly, benefitted a great deal from privilege growing up (it definitely lowered the barriers to my becoming a successful marine scientist), who knows that this is true of MANY ecology and evolutionary biology folk, and who now thinks a good deal about how to lower those barriers for students and mentees who come under my aegis, I say: you should all go over and check out Miriam Goldstein’s A field guide to privilege in marine science: some reasons why we lack diversity.

A Quick Note on Weighting with nlme

I’ve been doing a lot of meta-analytic things lately. More on that anon. But one quick thing that came up was variance weighting with mixed models in R, and after a few web searches, I wanted to post this, more as a note to self (and others) than anything else. Now, in a simple linear model, weighting by variance or sample size is straightforward.

#variance
lm(y ~ x, data = dat, weights = 1/v)

#sample size
lm(y ~ x, data = dat, weights = n)

You can use the same sort of weights argument with lmer. But, what about if you’re using nlme? There are reasons to do so. Things change a bit, as nlme uses a wide array of weighting functions for the variance to give it some wonderful flexibility – indeed, it’s a reason to use nlme in the first place! But, for such a simple case, to get the equivalent of the above, here’s the tricky little difference. I’m using gls, generalized least squares, but this should work for lme as well.

#variance
gls(y ~ x, data = dat, weights = ~v)

#sample size
gls(y ~ x, data = dat, weights = ~1/n)
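Why does this work? A one-sided formula handed to weights in nlme is shorthand for the varFixed() variance structure, which makes the error variance proportional to the covariate. So weights = ~v says "variance proportional to v" – exactly what lm's weights = 1/v says, since lm weights are inverse variances. Written out explicitly with varFixed() (here precomputing 1/n as its own column, a hypothetical inv.n, since varFixed wants a plain covariate):

#variance, written explicitly
gls(y ~ x, data = dat, weights = varFixed(~v))

#sample size, with the inverse precomputed
dat$inv.n <- 1/dat$n
gls(y ~ x, data = dat, weights = varFixed(~inv.n))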

OK, end note to self. Thanks to John Griffin for prompting this.

Scallopocalypse

Everyone has been pretty shocked by the devastation wreaked by Sandy. Here in New England, we also got a Nor’easter following a few days later. That’s a lot of intense storm action in a short period of time.

So I was quite curious, as I ventured out into the field last weekend, to see how things looked. I went on a potential field-site scouting trip to UMB’s field station on Nantucket. Nantucket of course got a good dose of Sandy, although the storm largely passed to the southwest. The Nor’easter may have been worse.

What I found while just walking about on the shoreline was pretty incredible. It was Scallopocalypse.

Let me include a video here of what one saw looking across the beach so you can get a sense of what was going on.

This was taken in Madaket. It had been a bit more dramatic in other parts of the island, before scallop fishermen came on shore, scooped up the scallops (many of which were the seed for next year, and too small for now), and took them back out to the scallop grounds. Here’s what things looked like by the lab.

All over, the scallop grounds had come to shore.

But the huge flux of biomass onto shore was impressive. And it wasn’t just scallops, but a ton of seagrass as well, much of which was matting over fringing salt marshes.

Still, the huge amount of energy and nutrients coming into the shoreline ecosystem driven by storms gave me a lot of pause. I mean, those scallops that weren’t saved did end up in the coastal foodweb. Birds were definitely looking fat and happy, and we’d find piles like this with flocks of birds nearby:

The whole thing really got my brain going, with two big questions:

1) So, what is the fate of all of this influx of stuff into the shoreline? How will the influx of energy alter the structure and dynamics of the food web? Will the smothering of the marsh matter? It is winter, when things are slower. How quickly will everything be decomposed? Will the effects be lagged until the springtime? Or will they affect the system now? I think of Gary Polis’s work on how food web structure is shaped by the influx of energy on small islands. I know this is a BIG island, but, still, the point stands: this is a big flux of biomass and nitrogen. And it’s not just plant matter, but animal protein.

2) How will climate change alter the frequency of this subsidy? What would the consequences be of a regime with regular small subsidies and occasional big ones versus regular big subsidies? This stems largely from my thinking about the increase in the size of the ‘largest storm of the year’ in California coastal systems that’s been the basis of my previous work. But models and analysis from the Knutson group seem to show that, while hurricanes and cyclones in the Atlantic aren’t getting more frequent, each one is getting bigger. So, similar pattern. If small subsidies are coming in every year now due to the occasional passing hurricane or Nor’easter, but the size of those same storms in the future is going to get larger, then this kind of big Scallopocalypse/subsidy could get more frequent. Particularly as northern Atlantic waters get warmer (which they are – Nixon et al. 2004), this could be an interesting and perhaps not so well investigated climate effect – the increased strength of coupling between marine and terrestrial food webs.

Oh, and random 3) What role will invasive algae play in increasing the impacts of storms on the amount of material coming on land? This may lead nowhere, but I noticed a lot of material (not scallops) that had washed on land had the invasive Codium fragile attached to it. I know that subtidal kelps can do this to mussels as well (Witman’s work), but there’s no kelp here. Is Codium becoming a drag (har har) and increasing the energy and nutrient flow from sea to land?

All in all, an interesting trip with a lot to chew on for future research. And a great setting!

Refs
Knutson, T. R., J. L. McBride, J. Chan, K. Emanuel, G. Holland, C. Landsea, I. Held, J. P. Kossin, A. K. Srivastava, and M. Sugi. 2010. Tropical cyclones and climate change. Nature Geoscience 3:157–163.

Nixon, S. W., S. Granger, B. A. Buckley, M. Lamont, and B. Rowell. 2004. A one hundred and seventeen year coastal water temperature record from Woods Hole, Massachusetts. Estuaries 27:397–404.

Polis, G. A., and S. D. Hurd. 1995. Extraordinarily high spider densities on islands: flow of energy from the marine to terrestrial food webs and the absence of predation. Proceedings of the National Academy of Sciences, USA 92:4382–4386.

Polis, G. A., W. B. Anderson, and R. D. Holt. 1997. Toward an integration of landscape and food web ecology: the dynamics of spatially subsidized food webs. Annual Review of Ecology and Systematics 28:289–316.

Witman, J. D., and T. H. Suchanek. 1984. Mussels in flow: drag and dislodgement by epizoans. Marine Ecology Progress Series 16:259–268.