How many species? I’m still not sure.

by Greg Mayer

Jerry recently discussed an article in PLoS Biology which tried to estimate the number of species on Earth by extrapolating from more or less known relationships between higher and lower level taxonomic diversity. Carl Zimmer has a piece in today’s Science Times, in which he records some dissent from the method used:

But Terry Erwin, an entomologist at the Smithsonian Institution, thinks there’s a big flaw in the study. There’s no reason to assume that the diversity in little-studied groups will follow the rules of well-studied ones. “They’re measuring human activity, not biodiversity,” he said.

Terry Erwin initiated quantitative efforts to estimate the number of species in 1982, when, in a brief paper, he estimated there might be 30,000,000 species of tropical arthropods. Erwin’s criticism of the PLoS paper is the same as one that I and WEIT reader Mickey Mortimer registered in the comments on Jerry’s piece: taxonomic ranking is conventional (i.e. an agreement among workers on a particular taxon), and too variable among taxa to be used to reliably produce quantitative estimates like this.

Carl Zimmer used a pyramid analogy to explain the method used in the PLoS paper: if you know the shape of the top of a pyramid, you can estimate the area of the base. And that’s true. But it depends on knowing how high the pyramid is, what the slope of the faces is, and that the slope is constant (at least at a suitable scale) all the way to the bottom. In the taxonomic case, the height of the pyramid is the number of ranks, which is wholly conventional and varies among groups; the slope is the relationship between higher and lower level diversity, which we can know for well-known groups but not for poorly known ones; and the constancy of that relationship (or at least its being functionally known) all the way down to species may hold in some groups, but not in others. And, crucially, the height, the slope, and its constancy may differ among taxa.

The only fact of nature here is the area of the base of the pyramid (i.e. the number of species); the shape and height of the pyramid above are human constructs.
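To make the dependence on the pyramid’s height concrete, here is a toy sketch. It is not the regression used by Mora et al.; it simply assumes a constant multiplicative "slope" between successive ranks and extrapolates one rank further down. The function name and the counts are hypothetical, chosen only for illustration: the same imaginary group, with the same 40 lowest-level named taxa, is classified once with three ranks and once with four.

```python
import math

def extrapolate_next_rank(counts):
    """Estimate the number of taxa one rank below the last entry.

    counts: number of taxa at each successive rank, top to bottom.
    Assumes (for illustration only) a constant ratio between ranks,
    taken as the geometric mean of the observed rank-to-rank ratios.
    """
    ratios = [lower / upper for upper, lower in zip(counts, counts[1:])]
    slope = math.prod(ratios) ** (1 / len(ratios))
    return counts[-1] * slope

# Hypothetical group: same top (1) and same lowest known rank (40 taxa),
# but workers on the group recognise a different number of ranks in between.
three_ranks = [1, 5, 40]
four_ranks = [1, 3, 10, 40]

estimate_three = extrapolate_next_rank(three_ranks)  # about 253
estimate_four = extrapolate_next_rank(four_ranks)    # about 137

print(round(estimate_three), round(estimate_four))
```

Nothing about the organisms differs between the two runs; only the conventional number of ranks does, yet the extrapolated "species" count changes by nearly a factor of two.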

I don’t want to be too hard on Mora et al., because any method of estimating total species diversity is subject to great uncertainty. Discussing Erwin’s estimate for tropical arthropods, E.O. Wilson wrote:

Erwin’s calculations were an important step forward in the study of biodiversity. The explicit figure he arrived at initially, however, is somewhat like an upside-down pyramid balanced on its point. At any step on the road to the final total of 30 million tropical-forest arthropods, the number of species can be shifted drastically up or down by changing assumptions. If the true total is within 10 million of that number either way, it will be sheer luck.

For quite some time, I’ve told classes that the number of species on Earth is somewhere from 3 million to 30 million. I’m going to stick with that range.


Erwin, T.L. 1982. Tropical forests: their richness in Coleoptera and other arthropod species. The Coleopterists Bulletin 36: 74-75.

Mora, C., D.P. Tittensor, S. Adl, A.G.B. Simpson, and B. Worm. 2011. How many species are there on Earth and in the ocean? PLoS Biology 9(8): e1001127.

Wilson, E.O. 1992. The Diversity of Life. Harvard University Press, Cambridge, Mass.


  1. Dominic
    Posted August 30, 2011 at 9:19 am | Permalink

    BBC Radio 4’s Material World has a podcast & they discussed the bacteria fossils WEIT covered as well as talking to one of the researchers who wrote this paper –
    Worth a listen especially for the non-expert.

  2. Torbjörn Larsson, OM
    Posted August 30, 2011 at 9:36 am | Permalink

    I hope that the critique isn’t centered on the idea that an artificial model may not have predictive ability, since any model may. Stratified data can have the darnedest patterns: “It is not known why Zipf’s law holds for most languages.”

  3. Posted August 30, 2011 at 10:22 am | Permalink

    The only fact of nature here is the area of the base of the pyramid (i.e. the number of species); the shape and height of the pyramid above are human constructs.

    I would argue that in the case of asexual species, the area of the base is a purely human construct as well.

    Even in sexually-reproducing species, there’s a lot of ambiguity, but one could at least in principle come up with an objective metric to define species boundaries and then assert that the number of distinct species according to that criterion was a “fact of nature”. (Even then, ring species present a special difficulty…)

    But I think it’s hopeless with asexually-reproducing organisms. Demarcating species in that case will always contain a large degree of subjectivity and convention.

    • whyevolutionistrue
      Posted August 30, 2011 at 11:04 am | Permalink

      Agreed for prokaryotes, and acknowledged as a problem in metaphytes and metazoa.


      • Dan Gaston
        Posted August 31, 2011 at 5:28 am | Permalink

        Also a problem among Microbial Eukaryotes, although not quite as bad (LGT probably isn’t quite as pervasive) as among Eubacteria and Archaea.

        But I would argue that species definitions period, including for Metazoa and Metaphyta, are just as much a human construct as higher order taxonomic divisions. Just as a model, they tend to reflect the biological reality a little bit closer.

  4. GaryU
    Posted August 30, 2011 at 10:40 am | Permalink

    For quite some time, I’ve told classes that the number of species on Earth is somewhere from 3 million to 30 million. I’m going to stick with that range.

    So, imagine that you’re Adam, in the Garden of Eden, and god’s busy parading all his creations past you so that you can name them. “Oh my god [intentional]! Enough with the beetles already!”

    • whyevolutionistrue
      Posted August 30, 2011 at 11:01 am | Permalink

      Which, of course, calls to mind J.B.S. Haldane’s encounter with an Anglican divine, who inquired, “Tell me Prof. Haldane what attributes of the Creator have you discerned from a study of the Creation?”, to which Haldane replied, “An inordinate fondness for beetles.”


      • Marella
        Posted August 30, 2011 at 3:32 pm | Permalink

        Only god’s one true love is obviously viruses.

        • Dominic
          Posted August 31, 2011 at 1:54 am | Permalink

          He does like to spread that love around!

  5. Posted August 30, 2011 at 11:00 am | Permalink

    One other huge problem with the paper is that it implies (as in the title) that it has given the number of species of all living things, but in fact it only estimates a hilariously crude lower bound for the number of bacteria species. See this blog/website for an analysis:

    The real number of species on earth is certainly millions higher than 9M.

  6. Gregory Kusnick
    Posted August 30, 2011 at 11:12 am | Permalink

    The only fact of nature here is the area of the base of the pyramid (i.e. the number of species); the shape and height of the pyramid above are human constructs.

    Surely the cladogram relating all those species is also a fact of nature, and isn’t that what the pyramid is attempting to model? I grant that the choice of what ranks you assign to the various branch points is somewhat arbitrary, but that doesn’t mean the branches are imaginary or that the fractal dimension or “branchiness” of the tree can’t be estimated.

    Whether Mora’s method produces a reasonable estimate is of course open to debate, but you seem to be arguing that there’s nothing there to measure.

    • Posted August 30, 2011 at 4:16 pm | Permalink

      Unfortunately, the pyramid is often a poor representation of the cladogram. For instance, Mora et al. used the Catalogue of Life database, which groups Annelida (segmented worms) into two classes: Polychaeta (sea worms) and Clitellata (earthworms and leeches). So that’s a diversity of two classes, but in actuality recent analyses (e.g. Zrzavy et al., 2009) suggest Clitellata is nested deeply within Polychaeta, or in other words clitellates are a subset of polychaetes. Even worse, Echiura (containing one class) and Sipuncula (containing two classes) are used as separate phyla, but the same analyses I mentioned show they are also nested within Polychaeta. So the branches of the cladogram are at completely different positions than the hierarchy used by Mora et al. would suggest. Instead of two tiny phyla (one containing two classes) and a huge phylum containing one large class and one smaller class, we have one phylum whose internal relationships are much messier. They could be divided into two classes (at whatever the basal split is between one polychaete group and all other annelids) or many classes. You could argue that maybe the lower levels like orders are still correct, but even here using Zrzavy et al.’s figure 2 as an example, four of Mora et al.’s seven polychaete orders are each split into three different groups spread around the tree. This makes Mora et al.’s branches joining each order imaginary, as you put it.

      And annelids aren’t an isolated case here. Just look at the chordate classes used. Reptilia is used, despite some reptiles being closer to Aves than others. Sarcopterygii is used, despite the fact their version is a paraphyletic grade compared to tetrapods, since lungfish are closer to tetrapods than the coelacanth is.

      So the primary data being used doesn’t represent the real tree of life. Instead, it’s more or less our traditional subjective classification based on a combination of diversity, disparity and actual phylogeny.

  7. Posted August 30, 2011 at 1:59 pm | Permalink

    As I was reading Jerry’s original post on this, I was wondering how estimates could be determined based on the human construct of classification.

    I couldn’t help but think of genera and species like Ginkgo and others with only one member. How could anything predictable come from comparing genera or even families when such exist? Or do they pose only a slight margin of error that really won’t affect the predictive model?

    • Manny
      Posted August 30, 2011 at 2:09 pm | Permalink

      Huh? Just kidding…great stuff here…first time visitor!

    • ChasCPeterson
      Posted August 30, 2011 at 3:01 pm | Permalink

      The authors at least address this issue:

      Different ideas about the correct classification of species into a taxonomic hierarchy may distort the shape of the relationships we describe here. However, an assessment of the taxonomic hierarchy shows a consistent pattern; we found that at any taxonomic rank, the diversity of subordinate taxa is concentrated within a few groups with a long tail of low-diversity groups (Figure 3P–3T). Although we cannot refute the possibility of arbitrary decisions in the classification of some taxa, the consistent patterns in Figure 3P-3T imply that these decisions do not obscure the robust underlying relationship between taxonomic levels.

      • Posted August 30, 2011 at 5:20 pm | Permalink

        The pattern found in 3P-3T is that most groups contain few subgroups, but a few groups contain many subgroups. That this pattern holds across different hierarchical levels doesn’t indicate the signal is natural. Some of those few groups that contain numerous subgroups (like Passeriformes) are real, but some (like Polychaeta) are not. It would often be possible to combine or split subgroups, which would change the data completely. As an example, Mora et al. group all ratite birds as Struthioniformes, so that falls as an order containing five families. But it’s often divided into several orders each containing one family, and if plotted as such would change the graph. As an aside, note they keep Tinamiformes separate whereas recent molecular studies suggest they are deeply nested within Struthioniformes/Ratites. Just another way the primary data doesn’t reflect the real tree. Depending on how much of a splitter or lumper you are, and how many paraphyletic groups you’re willing to have, the lines in 3P-3T could have any curve you want. Their shape is due to the average of how humans classify as opposed to anything reflecting deeper reality.

        • Diane G.
          Posted August 31, 2011 at 1:00 am | Permalink

          Makes sense to me. Thanks for the detailed input.

    • Diane G.
      Posted August 30, 2011 at 3:52 pm | Permalink

      As I was reading Jerry’s original post on this, I was wondering how estimates could be determined based on the human construct of classification.

      Well put, Lynn, that was what I was thinking, too. I’m still not convinced by the authors’ pattern justification (albeit, while admitting that they know a lot more about this sort of work than I ever will…).

      • Diane G.
        Posted August 30, 2011 at 3:54 pm | Permalink

        (forgot to check “notify me of new comments”…jeez, couldn’t that be the WP default for commenters?!)

      • Lynn Wilhelm
        Posted August 30, 2011 at 7:49 pm | Permalink

        Yes, Diane, lots of people know more about this than I do!
        I have to read very carefully and only hope that I don’t look like too big a fool.

        Thanks to Chas and Mickey for responding as well. I need all the help I can get.

        (subscribing now too, forgot before)

    • Dominic
      Posted August 31, 2011 at 1:58 am | Permalink

      Presumably the older the family/genus, the more attrition so the fewer surviving species. So what is it about Ginkgo biloba that has stopped it from speciating itself over the long period of its existence, or has it, only the new species became extinct? Was it restricted in range so it was unable to spread?
