Diversity Indexes

I recently began to read Charles Darwin’s “On the Origin of Species”, and it is fascinating to read the inner thought processes and ideas of the nineteenth century. One question that came to my mind is how exactly one quantifies species diversity, or any type of diversity in general. Some searching on the internet led me to the concept of Diversity Indexes, such as the Simpson Index of Diversity. If this question came to my mind, it has probably come to other people’s minds as well. Since many people who are interested in Nature, Biology, and Evolution may inevitably come into contact with Diversity Indexes, it would be prudent to explain what exactly a Diversity Index is. However, since I am not an Evolutionary Biologist (yet), nor do I have any field experience that required such concepts, it may be better for people with more field experience to explain this.


Whenever you start doing or looking up something ecology-related, you end up under a pile of indexes and formulas. I don’t know if it’s a good or a bad thing, but ecology adores statistics.


A diversity index is just a score that goes up for diverse biological communities and down for not-very-diverse communities. The simplest diversity index is just species richness: what is the number of species in that community?

Most other indices are based on richness, but also take other factors into account. For example, Simpson’s diversity index also considers species evenness: are all species approximately equally represented (e.g. 34 red oaks, 33 white oaks, and 33 sugar maples), or do one or two species dominate the community (e.g. 95 red oaks, 4 white oaks, and 1 sugar maple)? The first community is more even, so it gets a higher diversity score under Simpson’s index. Shannon’s index also considers evenness, but does a slightly different calculation to incorporate that information. I don’t normally work with diversity data, so I can’t tell you why someone would choose Simpson’s vs. Shannon’s index.
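To make the oak/maple example concrete, here is a minimal Python sketch of richness, Simpson’s and Shannon’s indices. It assumes the “1 minus the sum of squared proportions” form of Simpson’s index and the natural-log form of Shannon’s index; other variants exist (e.g. the reciprocal form of Simpson’s, or a finite-sample correction), and the function names are just illustrative.

```python
import math

def species_richness(counts):
    """Number of species with at least one individual."""
    return sum(1 for c in counts if c > 0)

def simpson_diversity(counts):
    """Simpson's Index of Diversity in the 1 - sum(p_i^2) form (an assumed variant)."""
    total = sum(counts)
    return 1.0 - sum((c / total) ** 2 for c in counts)

def shannon_index(counts):
    """Shannon index H' = -sum(p_i * ln p_i), skipping absent species."""
    total = sum(counts)
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)

# Red oak, white oak, sugar maple counts from the example above.
communities = {"even": [34, 33, 33], "uneven": [95, 4, 1]}

for name, counts in communities.items():
    print(
        name,
        species_richness(counts),
        round(simpson_diversity(counts), 2),
        round(shannon_index(counts), 2),
    )
```

Both communities have a richness of 3, but the even one scores roughly 0.67 versus 0.10 on this form of Simpson’s index, and roughly 1.10 versus 0.22 on Shannon’s.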

Other diversity indices may consider other information, such as evolutionary history (are many of the species present in a community closely related, or are they spread across the evolutionary tree?) or functional diversity (how similar or different are the ecological roles/niches of the various species in a community?).


The Jaccard Similarity Index is a basic tool for comparing two different areas in terms of which species they share and don’t share. I haven’t used it in a long time, so I’m rusty on its applications.
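For what it’s worth, the basic calculation is just shared species divided by total distinct species across both areas. A small sketch in Python, with made-up species lists and an arbitrary choice to return 0 when both lists are empty:

```python
def jaccard_similarity(site_a, site_b):
    """Shared species divided by all distinct species across both sites."""
    a, b = set(site_a), set(site_b)
    union = a | b
    if not union:
        return 0.0  # assumed convention for two empty sites
    return len(a & b) / len(union)

# Hypothetical species lists for two areas.
site_a = ["red oak", "white oak", "sugar maple"]
site_b = ["red oak", "sugar maple", "american beech", "black cherry"]

print(jaccard_similarity(site_a, site_b))  # 2 shared / 5 total = 0.4
```

Note that it only uses presence/absence, so abundance plays no role.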

As an aside, I wrote a program that will calculate four diversity indexes of your iNat observations. It takes a while to run tbh, but it is fun to see how “biodiverse” your observations are. I might try to make it run better on my holiday break later this month.


EDIT: I replied to the wrong person and moved my post accordingly


How does calculation of a diversity index take into account the fact that you will never be able to find all the species that occur on a piece of land?


There are a number of estimators that attempt to calculate diversity including unobserved species. For example, Chao1 estimates the lower bound of species richness by using the number of species observed only once or twice in the area (singletons and doubletons). You can also use the observation data to try to estimate the asymptotic species richness (i.e., how many total species you accumulate as the number of observations approaches infinity).

There are many more options, as well as assumptions and pitfalls. E.g., the detection probabilities of different types of taxa aren’t the same.
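As a rough illustration of Chao1, here is a sketch in Python. It assumes the classic form, observed richness plus f1²/(2·f2), and falls back to the bias-corrected form when there are no doubletons; published implementations differ in details, and the counts below are invented.

```python
from collections import Counter

def chao1(counts):
    """Chao1 lower-bound richness estimate from per-species observation counts."""
    observed = [c for c in counts if c > 0]
    s_obs = len(observed)
    freq = Counter(observed)
    f1, f2 = freq[1], freq[2]  # singletons and doubletons
    if f2 > 0:
        # Classic form: S_obs + f1^2 / (2 * f2)
        return s_obs + f1 * f1 / (2 * f2)
    # Bias-corrected form with f2 = 0: S_obs + f1 * (f1 - 1) / 2
    return s_obs + f1 * (f1 - 1) / 2

# Hypothetical counts: 8 species seen, 4 of them once and 2 of them twice.
counts = [10, 5, 2, 2, 1, 1, 1, 1]
print(chao1(counts))  # 8 + 16 / 4 = 12.0
```

The more singletons relative to doubletons, the more unseen species the estimator infers.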


This is a very good question, because no formula accounts for sampling artefacts: how thoroughly the area was studied by experts who knew where to look for what and were able to ID what they found (this especially concerns fungi, mosses, insects, myxomycetes, etc.). Hence, when it comes to ‘all species’, the formulas are OK for vascular plants, mammals, birds, and reptiles, or for calculating just one separate group (e.g., Coleoptera, or lichens, or myxomycetes). If you try to calculate diversity based on iNat observations, you will need to include the number of observers, work out how many of them go into more diverse areas (not only city parks and gardens), and also calculate the incalculable: for how many organism groups identification materials are available in a given territory, and how many experts cover these areas.


I worked one summer with the ecologist Robert E. Ulanowicz, who is best known for his book Ecology: The Ascendent Perspective.

Ulanowicz did some fascinating work studying the structure of trophic webs based on what species are doing, independently of what they “are”, i.e. measuring where energy and/or nutrients flow in an ecosystem, and then looking at that as a network and how its structure changes. He was looking for physical laws governing the evolution of ecosystems that are too complex, random, or chaotic to be governed by differential equations (which would be most ecosystems). He wanted measures that would hold up when one species is replaced by another that fills a similar or analogous ecological niche, which is extremely common. His work is really compelling, but the huge downside is that it is incredibly laborious to actually measure the necessary networks. Another problem is that most ecologists lack the mathematical background to understand it. It’s not that it’s that hard, just that it involves different tools (little or no calculus or statistics, but a lot of information theory), and thus tends to make more sense to a chemical engineer or thermodynamicist than to an ecologist.

His measures include forms of diversity indices that measure not numbers of species or individuals, but rather quantities of energy or nutrients flowing in the trophic network. One of the key measures he uses is derived from information theory, and can be seen as the average mutual information contained in an ecosystem’s trophic network. Mathematically, it is similar to a two-dimensional generalization of the simple entropy indices used to measure diversity (like a Shannon index, see this Wikipedia page).
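To give a feel for the “two-dimensional” part, here is a rough Python sketch. It assumes the standard information-theory definition of average mutual information applied to a matrix of flows, where entry [i][j] is the flow from compartment i to compartment j. The flow values are made up and this is not taken from his code or data.

```python
import math

def average_mutual_information(flows):
    """Average mutual information (in bits) of a square flow matrix.

    flows[i][j] is the flow from compartment i to compartment j.
    """
    total = sum(sum(row) for row in flows)
    row_sums = [sum(row) for row in flows]        # total outflow per compartment
    col_sums = [sum(col) for col in zip(*flows)]  # total inflow per compartment
    ami = 0.0
    for i, row in enumerate(flows):
        for j, t_ij in enumerate(row):
            if t_ij > 0:
                ratio = t_ij * total / (row_sums[i] * col_sums[j])
                ami += (t_ij / total) * math.log2(ratio)
    return ami

# Hypothetical networks: a strict chain (0 -> 1 -> 2) and a diffuse web
# where every compartment sends equal flow to every other compartment.
chain = [[0, 10, 0], [0, 0, 10], [0, 0, 0]]
diffuse = [[0, 1, 1], [1, 0, 1], [1, 1, 0]]

print(average_mutual_information(chain))
print(average_mutual_information(diffuse))
```

The chain comes out at 1 bit and the diffuse web at about 0.58 bits: the more tightly constrained the pathways, the higher the value, regardless of which species occupy the compartments.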

Although I find his work impractical for studying a specific ecosystem, I found it compelling for understanding the general dynamics of ecosystems. For example, he was able to hypothesize, and then scientifically test and find evidence for, broad principles, such as the idea that ecosystems tend to become more efficient in the absence of disturbance, that there is a tradeoff between efficiency of energy/nutrient cycling and adaptability or resilience, and even that in some cases you can learn something about an ecosystem’s recent history (such as how recently it was disturbed) by studying the structure of its trophic web, even if you are only observing it in the present and don’t know anything about its past.

Part of me thinks Ulanowicz’s work may actually be most influential outside of ecology. The two of us co-authored a paper and I was surprised to see it get over a hundred citations, many of them in areas far outside ecology, such as wastewater management and urban planning, which makes sense because those fields study networks a lot.

I’m not usually a fan of “mathy ecology”, but if you really want to delve into diversity indices I would highly recommend reading Ulanowicz’s work, particularly his second book (his first is a bit too mathy, his third is more philosophical and less focused on ecology). I consider him to be one of those eccentric geniuses who fundamentally changed the way I think for the better. He is also one of the few people I’ve ever encountered who has successfully applied the scientific method to “holistic” ideas, i.e. not trying to understand everything through reductionism.


My evenness score would be highly biased, though, since I only post one observation per taxon.