The future of DNA barcoding, and its use in citizen science

I am not a spider taxonomist, but I can answer a few of your questions.
First, it might help to clarify a bit about how different fields are related.
Taxonomy is the field that assigns names to things and puts species into named groups like families. Phylogenetics is a separate but related field that studies how different organisms are related to each other in the tree of life. Taxonomy originally was not based on this idea of a tree of life, but in the modern day the field attempts to make its categories match branches of the tree of life. Sometimes those relationships are unexpected or even frustrating: organisms that share similar characters can turn out to be distant on the tree of life, and so they get placed into different taxonomic groups to reflect this. It makes identifying taxonomic groups frustrating, but that’s just the way that the tree of life is.

Before the use of genetics became feasible, phylogeneticists had to rely on morphology to build their trees. If they were lucky, there were characters that made it possible to (presumably) correctly reconstruct the tree, but sometimes there were too few characters or misleading characters, so mistakes were made. Every tree is just a hypothesis, since we can never know the true tree. With the advent of genetics, phylogeneticists gained millions or billions more characters to use, and so genetics-based trees are often much more reliable than those based on morphology. When the genetics-based trees disagree with older morphology-based trees, usually it is because those morphological characters were misleading in some way (such as being convergent similarities between species). When better-supported trees come out, taxonomy might be updated to bring it in line with the best phylogenetic hypothesis, so it is constantly changing.

Hope that clarifies a bit!

As the previous response mentioned, phylogenetics uses evidence of relatedness (genetic and/or other kinds of evidence) to reconstruct the most likely history of descent from common ancestors among groups of organisms that are now considered to be separate species.

DNA barcoding, on the other hand, attempts to identify a unique combination of genetic markers for each taxon (species, subspecies, genus, etc. - I’m using species as my proxy here), which can then be used to assign unknown samples to one of those taxa by finding the closest match. Problem is, it is probably not humanly possible to collect and identify unique markers for every species on Earth (perhaps 90% of which aren’t yet even known to science). So identification via DNA barcodes will always find a closest match, but it is only an accurate match if that species has (1) previously been identified by other means, and then (2) had a unique barcode developed for it. It also assumes that the presumed markers are invariant among all the members of the species, which is not a safe assumption without a whole lot of additional sampling within that species and among closely related species.
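To make the "closest match" caveat concrete, here is a minimal sketch of nearest-match identification. Everything in it is an assumption for illustration: the 20 bp sequences are invented (not real barcodes), the species names are placeholders, and the plain proportion-of-differences distance stands in for whatever a real barcoding pipeline would use. The point is only that the search always returns some species, including for a sample whose true species is missing from the library.

```python
def p_distance(a: str, b: str) -> float:
    """Proportion of differing sites between two aligned, equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def closest_match(query: str, library: dict) -> tuple:
    """Return (species, distance) for the nearest reference barcode.

    Note that this never answers "no match": it always picks a winner.
    """
    return min(
        ((name, p_distance(query, seq)) for name, seq in library.items()),
        key=lambda pair: pair[1],
    )


# Hypothetical reference library: invented sequences, placeholder names.
LIBRARY = {
    "species_A": "ACGTACGTACGTACGTACGT",
    "species_B": "ACGTACGTTCGTACGAACGT",
    "species_C": "TTGTACCTACGAACGTACCA",
}

# A sample that differs from species_A at a single site.
known_sample = "ACGTACGTACGTACGTACGA"
# A sample from a species that is NOT in the library at all.
unreferenced_sample = "GGGTTCGAACCTAGGTTCGA"

for label, seq in [("known", known_sample), ("unreferenced", unreferenced_sample)]:
    species, dist = closest_match(seq, LIBRARY)
    print(f"{label}: closest = {species}, distance = {dist:.2f}")
```

Both samples get a species name back. Only the distance hints that the second result is not a real identification, and even that hint depends on knowing how much variation is normal within a species.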

If one chooses not to accept that populations now behaving as separate species descended over time from a common ancestor, then concepts like “squirrels are more closely related to mice than either group is to a petunia” really become meaningless, as does any taxonomy that attempts to name and classify those groups based on such relationships. If one considers every species to have had its own separate and unique origin, then all we really need for communication purposes is a separate and unique name for each one.

So does DNA barcoding have anything to do with evolution, as phylogenetics does? Like, is DNA barcoding just plain ‘analyzing DNA sequences’?


Another question: What if any of the following occur?

  • Rapid evolution, which means some organisms will evolve faster than others. This means some spiders could also evolve faster than other spiders, even those closely related, which would cause discrepancies between genetic traits and physical traits.
  • Hybridization, which means genes can be transferred horizontally (from one species to another species) rather than only vertically (from a parent of one species to the offspring of that same species). For example, two separate species of the thomisid genus Bassaniana have been documented crossbreeding.
  • Bacteria and other microorganisms can transfer genes from one organism to a completely unrelated organism, so the parent could carry these alien genes to its offspring and the offspring to their offspring and so on.

Would any of these affect phylogenetics and DNA barcoding in any way?


I don’t think that’s true. Taxonomy doesn’t necessarily have to be a place to “reconstruct the most likely history of descent from common ancestors among groups of organisms”, it can be a place to reflect morphological similarities too, like it used to be. There are obvious similarities between organisms, like how two species of jumping spider are closer to each other than to something like a scorpion or centipede–there’s no doubt about it–I just think taxonomy isn’t an aspect that “really becomes meaningless” for those who don’t believe organisms evolved from common ancestors.

Present-day DNA sequences and resulting “barcodes” would be the result of past and ongoing evolutionary processes, so yes.

Very well exemplified by rapid development of antibiotic resistance in disease organisms. And in general, any time there is an increase in selective pressure, i.e. differential survival of heritable traits, on a population, that population will evolve (i.e. undergo genetic changes over time) more quickly. The pace of evolution also has to do with generation time - organisms with several generations per year or week or hour (i.e., microorganisms) will show differential survival of heritable traits much more quickly than organisms with 2 or 3 generations per century.

All of the phenomena you list are well-known in both spontaneous and human-induced settings and, to the extent they affect portions of the genome sampled for phylogenetic or barcoding work, absolutely do affect the results of that work. Phylogeneticists are well versed in detecting and accounting for the genetic signal from past reticulation events between lineages.

As a biologist I wouldn’t really care much about morphologic similarities if I believed that they had no biological relevance. They would instead be in the realm of odd curiosities and coincidences.

This is a problem with a lot of the “difficult” groups – they are made “difficult” by the history of what traits taxonomists focused on in the past. I can’t remember the title now, but there is a published dragonfly guide where the author comments on this in the introduction: that the guide is based on visual field marks, but first, they had to key out dragonflies based on wing venation.

And yet – for those who accept evolution – it must be admitted that the whole organism, not just this or that trait, is subject to heritability, selective pressure, and genetic drift.

…and yet identifiers can get so pedantic about it when bumping back observations.

This harks back to the fungi thread a couple days ago:

“Would I expect it to have an identical (or near identical) ITS barcode sequence to other Armillaria ostoyae specimens? Yes. Because that’s how barcoding works.”

This assertion is based on assuming that “the presumed markers are invariant among all the members of the species.” It was made in direct response to my saying essentially what you are saying about population-level sampling.

Can we say the same about genomic similarities? Or for that matter, genomic differences?

I think something that should be clarified is that DNA barcoding really doesn’t seek to ID based off of a full genome sequence; barcoding regions are chosen from genes that tend to be stable within a species population but do change as species diverge. ITS gets used most often for fungi, for example, but even within fungi there are other regions that are more useful for some groups of fungi than ITS - and there are groups of fungi where ITS simply isn’t enough to differentiate species.

Basically with barcoding, we’re saying ‘this 300-600 basepair sequence is one we know is consistent within a species in this group of organisms’ and that’s really it.
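Whether a region really is "consistent within a species" is something that can be checked once several specimens per species have been sequenced. A rough sketch of that check follows, comparing the largest within-species distance to the smallest between-species distance (often called the barcode gap). The sequences and species labels are invented, and `barcode_gap` is a hypothetical helper written for this example, not a function from any barcoding package.

```python
from itertools import combinations


def p_distance(a: str, b: str) -> float:
    """Proportion of differing sites between two aligned, equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def barcode_gap(samples: list) -> tuple:
    """Return (max within-species distance, min between-species distance).

    `samples` is a list of (species_label, aligned_sequence) pairs. At least
    one species needs two or more samples, and at least two species are needed.
    """
    within, between = [], []
    for (sp1, s1), (sp2, s2) in combinations(samples, 2):
        (within if sp1 == sp2 else between).append(p_distance(s1, s2))
    if not within or not between:
        raise ValueError("need within-species and between-species pairs")
    return max(within), min(between)


# Invented 20 bp "barcodes" for several specimens of two placeholder species.
SAMPLES = [
    ("species_A", "ACGTACGTACGTACGTACGT"),
    ("species_A", "ACGTACGTACGTACGTACGA"),
    ("species_A", "ACGTACGAACGTACGTACGT"),
    ("species_B", "ACGTTCGTTCGTACGAACGT"),
    ("species_B", "ACGTTCGTTCGTACGAACGA"),
]

max_within, min_between = barcode_gap(SAMPLES)
print(f"max within-species distance:  {max_within:.2f}")
print(f"min between-species distance: {min_between:.2f}")
print("region separates these species" if min_between > max_within
      else "region does NOT cleanly separate these species")
```

If the two numbers overlap, that region cannot reliably tell those species apart, which is the situation described above for groups where ITS isn’t enough. The result is only as good as the sampling: a handful of specimens from one locality can easily understate the within-species variation.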

Hybrids are wonky. I’ve seen some sequences come back that I suspect may have been hybrids (but it would probably take more research to really decide this), and there’s at least one species of hybrid oyster mushroom in common cultivation (‘black pearl oyster mushrooms,’ a hybrid of P. ostreatus and P. eryngii) that has a sequence floating around that I’ve looked at, and the (barcode) sequence just comes back looking weird. My anecdotal, personal experience says that we probably don’t really know enough about hybrids to really judge how they present in barcoding.

All that said, you can create phylogenetic trees based off of barcode regions that show relationships between species - to a point. If you’re trying to make a tree with species that aren’t that closely related, the accuracy may go down quite a bit - at least with fungi, though I’d bet it’s similar for spiders, these sequences are usually under a thousand basepairs long so there just isn’t enough information there to connect distantly related groups.
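One way to see why short regions run out of information for distant relatives is the Jukes-Cantor correction, a standard textbook model that estimates the real number of substitutions per site from the observed fraction of differing sites. It is brought in here purely as an illustration (the post above does not name a model), and the 600 bp and 60,000 bp lengths are assumed values for a barcode-sized region versus a much larger data set. As the observed difference creeps toward 75%, repeated changes at the same site hide each other, the estimate shoots upward, and the sampling error on a few hundred sites is no longer small next to the quantity being estimated.

```python
import math


def jukes_cantor(p: float) -> float:
    """Jukes-Cantor distance: estimated substitutions per site from observed p."""
    if not 0 <= p < 0.75:
        raise ValueError("observed difference must be in [0, 0.75) for this model")
    return -0.75 * math.log(1 - 4 * p / 3)


def standard_error_p(p: float, n_sites: int) -> float:
    """Binomial sampling error of an observed proportion p over n_sites sites."""
    return math.sqrt(p * (1 - p) / n_sites)


# Observed fractions of differing sites, from close to very distant relatives.
for p in (0.05, 0.30, 0.60, 0.70):
    d = jukes_cantor(p)
    se_short = standard_error_p(p, 600)     # assumed barcode-sized region
    se_long = standard_error_p(p, 60_000)   # assumed much larger data set
    print(f"observed p={p:.2f}  estimated d={d:.3f}  "
          f"SE(p) at 600 bp={se_short:.4f}  at 60,000 bp={se_long:.4f}")
```

For close relatives the corrected distance is almost the same as the observed one, so a short region works fine. Near saturation, a small wobble in the observed value translates into a large change in the estimate, and only more sites (or slower-evolving ones) bring that wobble down.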

Rapid evolution could account for instances where sequences are wildly different, but the morphology doesn’t change much - or vice versa. Again, I just don’t know if there’s been enough research done here to really tell, maybe there is, and if there is I’d be very interested in reading it.

I remember reading that it’s quite cost-intensive to sequence the entire genome of an organism, and so I’d assume that most phylogenetics is done using partial genomes which could introduce errors if the sections used happen to not be representative for this purpose. But maybe technological improvements have made it a lot easier to compare entire genomes?

This is a pretty big topic in phylogenetics, and the field is definitely moving towards using more and more data as the cost of sequencing plummets. The cost depends on the size of the organism’s genome and how deeply one wants to sequence it. Depending on the type of sequencing it can be only a few hundred dollars per sample now for decent coverage of the whole genome.

There is definitely a broad acknowledgement in the field that a single gene can have a different phylogenetic tree than the overall species tree, and that generally with more data one is more likely to reconstruct the species tree. (In my experience, most individual parts of the genome won’t follow the species tree exactly if you are dealing with closely-related species. Mitochondrial DNA and certain other genes tend to follow the species tree more closely for a variety of reasons, so that’s why particular sequences tend to do better as barcodes.) From what I’ve seen in birds, most recent phylogenies are now being based on a larger fraction of the genome (i.e., at least ~1% of it), and usually there is not much improvement in using more than 1% of the genome unless you are working with very difficult situations where many species diverged from each other very rapidly, leaving less signal in the DNA. Papers published based on only a few genes are getting rarer unless there is a special reason for it (e.g., ancient DNA).

(Edit to add: my perspective is from birds, the situation may be different in other groups that are too diverse for it to be practical to sequence more than a few genes from every species)
