DNA barcode: what's going on?

https://www.inaturalist.org/observations/303637699
This is one of many observations that this has happened to. I found it unidentified and called it a mushroom. Within a few days, two people who know mushrooms identified the genus. A few weeks later, the observer identified the species and added a DNA barcode field and others. (I don’t see the date of the fields, but assume they’re added together with the comment and ID.) This one was identified as an existing species; others are identified as «Otidea “sp-IN01”» or the like. How and why does one get a DNA barcode, and what part does it play in identification?

The DNA barcode field is not added by the observer, only the collector’s name is. If you click on the field name you can see “Added By” and “Last Edited By”.

1 Like

As I understand it, the DNA barcode is a sort of simplified DNA analysis that just looks at a small subset of the genome that has been identified as broadly useful for separating species. It’s faster and cheaper than doing a complete genome analysis. I don’t think it necessarily works in all cases (i.e. some very closely related species may have identical barcodes). I believe the name “barcode” is intended to reflect this “quick and cheap” nature of the analysis, as in “it’s like scanning a barcode”. It looks to me like these folks uploaded the observation and then came back later to update it with the results of the DNA barcoding. Perhaps this would be a good application for a “draft mode”, where observations are uploaded to iNat but not made visible to anybody besides the observer. Though in this case, some of the info has been submitted by another user (perhaps a colleague of the observer?), so perhaps they would need a draft mode where access is restricted to members of a project.

5 Likes

Stephen Russell has a lab working on North American fungal diversity, so he does a great deal of DNA barcode sequencing. If you would like to get your mushroom finds barcoded, there are some opportunities through Mycota Lab: there are periodic “MycoBlitzes”, and there is an ongoing “open call” with free barcoding for submissions from certain states.

DNA barcoding is of particular interest for fungi because there are a great many undescribed species, even in otherwise well-surveyed regions, and it’s often unclear to exactly what taxon old species descriptions refer.

14 Likes

If you get a DNA barcode for your observation, you can certainly add it yourself. For barcoding projects where you send your specimen to a lab, like MycoBlitzes, the lab will update your observation with the barcode and ID.

7 Likes

https://www.inaturalist.org/observations/108814888

I found a worm I had hoped was a native species, so in addition to sending individuals from the same area to a group expert, we also performed COII barcoding and ran the sequence through BLAST (Basic Local Alignment Search Tool). Both the BLAST search and the expert analysis got the worm to genus. I didn’t know what to do with the COII sequence, so I just added it as a comment.

4 Likes

You can upload the barcode to one of the open databases, like GenBank or BOLD Systems (that would require creating an account), and later add the accession number to your observations, which I think is a little better looking than adding the whole sequence (probably more foolproof for future use). That way, your DNA data is also available to researchers, as is the observation.

4 Likes

Do most people add the accession number as a tag, as an observation field, or some other way?

Ooh! It includes “Caribbean - all islands”! I’ll keep that in mind next time I’m in the Dominican Republic!

2 Likes

I’m not sure, but a GenBank Accession number observation field exists ( https://www.inaturalist.org/observations?verifiable=any&field:Genbank%20Accession%20Number= ), which is probably preferable to simply adding it to the observation description.
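For anyone who wants to build that kind of search link for other observation fields, here is a minimal Python sketch. It only reproduces the URL pattern from the link above; the function name `field_search_url` is made up for illustration, and whether a given field name actually exists on iNat is something you would need to check yourself.

```python
from urllib.parse import quote

INAT_OBSERVATIONS_URL = "https://www.inaturalist.org/observations"


def field_search_url(field_name: str, value: str = "") -> str:
    """Build an observation search URL filtered on an observation field.

    Hypothetical helper: it mirrors the "field:<name>=<value>" pattern seen
    in the link quoted above. An empty value reproduces that link exactly.
    """
    # quote() percent-encodes spaces as %20, matching the original link.
    return (
        f"{INAT_OBSERVATIONS_URL}?verifiable=any"
        f"&field:{quote(field_name)}={quote(value)}"
    )


print(field_search_url("Genbank Accession Number"))
```

Called with just the field name, this should print the same link as the one posted above.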

2 Likes

Is this happening in any other kingdoms/taxa beyond fungi? Are there any other iNat field tags for that to search for?

I’m surprised no one tagged me LMAO.

Anyway, the short and quick explanation is that DNA barcoding uses a small section of DNA (usually around 300-1000 basepairs but there is some wiggling room) to compare to other sequenced organisms to try to help make a determination on species.
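To make the "compare to other sequenced organisms" step a bit more concrete, here is a toy Python sketch of percent identity between two sequences. It assumes the two sequences are already aligned and the same length, which real tools like BLAST do not require (they work out the alignment themselves). The sequences and the function name `percent_identity` are made up for illustration.

```python
def percent_identity(query: str, reference: str) -> float:
    """Percentage of positions where two pre-aligned sequences agree.

    Simplifying assumption: both sequences are already aligned and have
    equal length. This is not how BLAST scores matches; it is only the
    basic idea of "how similar are these two barcodes".
    """
    if len(query) != len(reference):
        raise ValueError("sequences must be pre-aligned to the same length")
    if not query:
        raise ValueError("sequences must not be empty")
    matches = sum(q == r for q, r in zip(query.upper(), reference.upper()))
    return 100.0 * matches / len(query)


# Made-up 10-base fragments; a real barcode is hundreds of bases long.
query = "ACGTACGTAC"
reference = "ACGTTCGTAC"
print(f"{percent_identity(query, reference):.1f}% identical")
```

A high percentage against a well-identified reference is a hint toward a name, not proof; where to draw the line differs from group to group.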

With fungi, we’re usually using ITS, the Internal Transcribed Spacer (though occasionally other gene regions like LSU or SSU will be sequenced as well). ITS is a non-coding DNA region, which basically means random mutations in the spacer don’t really cause detriment to the organism since it isn’t coding for anything. This means we can use these mutations to help determine species limits, since interbreeding populations should have a more stable ITS than groups that aren’t exchanging DNA. Hypothetically, of course; there are some groups that ITS works less well for.

This isn’t an exact science of course, we’re lacking good data to reference in some cases and in other cases we may make a few too many temp codes for a group and then when someone does describe it the temp codes get collapsed. No big deal, that’s WHY we use the temp codes.

If you see a temp code (something that looks like Mycena “sp-IN01”) that doesn’t mean it is undescribed, it doesn’t mean it is a new species - all it means is that we don’t have enough confidence in the identification to for sure give it a species name.

Sometimes this means that we’re lacking sequences of holotypes. There is an EXTREME number of fungal holotypes that lack sequences, many of them being from some of North America’s most influential mycologists like Charles Horton Peck. So, there are a lot of species that lack a type sequence to point to and use as an example. Sometimes we can figure out species based on other clues (macro and micro morphology, environment, original description location, etc.); other times we can’t, and it needs a temp code because it isn’t good science to just ASSUME we are correct about something without evidence for that thing.

As mentioned above, Stephen Russell with Mycota runs MycoBlitzes a few times a year and also has an open call going for some territories. Eventually, OMDL will probably get back to accepting specimens as well, though it is going to be of much more limited scope (we got a bit overwhelmed with samples and are still playing catch-up). There are also individuals who are happy to sequence for a fee if you aren’t located in one of the open call areas; the price varies a lot based on how much prep work and validation you’re willing to do yourself. If you are interested in paying for fungi sequences, feel free to reach out to me and I can point you in the right direction.

As far as other kingdoms of life go, barcoding is definitely possible, but I’m not aware of any mass community science efforts like this for other kingdoms. Entomology may be one of the bigger areas it would be useful for, since many insects suffer the same issues of identification difficulty that fungi do.

LMK what other questions y’all have and I’d be happy to answer

11 Likes

https://www.inaturalist.org/observations/194406876

Here is an observation that I’d like to use as a very good example of how we use some observation fields. This particular specimen is a species holotype, so it has a pretty good spread of fields filled out (though of course there are always going to be some others that may pop up, depending on the observation, usually habitat or morphology details that may not come across in photos).

Reads in Consensus (RiC): Most of the sequences you see are going to be done using Oxford Nanopore Technologies, which is a 3rd-generation sequencing tech that basically reads a string of DNA a bunch of times to get an average read (this helps account for errors). Basically, we get a file back called a FASTQ that is a bunch of individual reads; RiC is how many reads that particular file had. 182 is pretty good, though we’ve gotten the bioinformatics tweaked to the point where anything over about 20 is going to be fairly reliable.

This is what a FASTQ file looks like btw, it may help explain what I mean. Rows are each individual reads of the same organism.
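If it helps to see RiC as "how many reads are in the file", here is a rough Python sketch that counts reads in FASTQ text. It assumes the common unwrapped layout where every read takes exactly four lines (header starting with `@`, sequence, a `+` separator, quality string). The toy reads and the name `count_reads` are made up, and this is not the lab's actual pipeline.

```python
import io


def count_reads(handle) -> int:
    """Count reads in a FASTQ stream, assuming unwrapped four-line records."""
    lines = [line.rstrip("\n") for line in handle if line.strip()]
    if len(lines) % 4 != 0:
        raise ValueError("FASTQ looks truncated: line count is not a multiple of 4")
    for i in range(0, len(lines), 4):
        # Check header and separator by position. Quality strings can
        # themselves start with '@', so simply counting '@' lines is unreliable.
        if not lines[i].startswith("@") or not lines[i + 2].startswith("+"):
            raise ValueError(f"malformed record starting at non-blank line {i + 1}")
    return len(lines) // 4


# Made-up example data: three short reads of the same stretch of DNA.
TOY_FASTQ = """\
@read_1
ACGTACGTAC
+
IIIIIIIIII
@read_2
ACGTACGAAC
+
@IIIHHHHGG
@read_3
ACGTACGTAC
+
IIIIIIIIHH
"""

ric = count_reads(io.StringIO(TOY_FASTQ))
# "About 20" is the rough rule of thumb from the post above, not a hard cutoff.
status = "above" if ric > 20 else "at or below"
print(f"{ric} reads, {status} the rough 20-read rule of thumb")
```

Note that the second toy read has a quality line starting with `@`, which is why the sketch steps through the file four lines at a time instead of counting header-looking lines.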

DNA Barcode ITS: The ITS sequence of the organism.

Type Status: Rarely relevant, but in this case, since this is a holotype specimen, it is noted as such.

Sequencing Technology: As I said before, we mostly use Nanopore. You may also see Sanger; other techs are a lot rarer for this sort of work (I certainly haven’t seen any Illumina sequences on iNat).

Herbarium Catalog Number: If it is accessioned somewhere this can be added; you can see the accession #s for the various herbaria here

Sequencer: Who sequenced it. We don’t always bother filling this one out, TBH. You’ll also see a field for ‘sequence validator’, but most people don’t bother with that either. I do for what I look at, though, just because I like being able to click it and see what I’ve looked at.

Genbank Accession Number: The accession # on GenBank.

Provisional Species Name: You’ll see temp codes in this field (we’re trying to adjust the protocol a bit so this might change), along with actual provisionals (basically, species that are in pre-publishing but have a recognized name; there are a few mushrooms like this) and occasionally strict species names to override a bad iNat identification (this helps with our workflow on Mycomap; again, we’re trying to adjust the protocol so we don’t have to do this as much).

Mushroom Observer URL: the duplicate observation on MO

Mycomap BLAST Results: this is a permanent link to the mycomap blast for the organism; Mycomap blasts against NCBI genbank along with local databases like iNat, Mushroom Observer, Mycoportal, along with a few others, plus it makes some adjustments for nucleotide differences that are not very significant, so it can be a lot easier to use than Genbank for our purposes.

Trace Files (Raw DNA Data): This is where you can find the FASTQ file, the sequence, and details about what primers were used, RiC, etc. It’s nice having access to the original files because, unlike with GenBank, we can double-check any sequence edits if necessary to make sure they were done correctly. Sometimes when you pull sequences off of GenBank they have been edited in ways that make them not very informative (e.g. a bunch of N nucleotides left in, which means ANY nucleotide can be at that location).

Citation: This is where this is published, since again, it’s a holotype

Genbank number (URL): The direct link to genbank
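On the point about N nucleotides in the Trace Files entry above: a quick way to see how uninformative a sequence might be is to check what fraction of it is N. This is only a sketch in Python; the function name `ambiguous_fraction` and the example sequence are made up, and what counts as "too many" is a judgment call that this code does not make for you.

```python
def ambiguous_fraction(sequence: str) -> float:
    """Fraction of positions in a sequence that are the ambiguity code N."""
    seq = sequence.strip().upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    return seq.count("N") / len(seq)


# Made-up fragment with several N positions left in.
example = "ACGTNNNNACGTACGTNNAC"
print(f"{ambiguous_fraction(example):.0%} of positions are N")
```

This only counts N itself; other ambiguity codes (like R or Y) would need to be added if you care about them.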

4 Likes

(I’m so sorry for the chain of replies y’all but some of these posts are long and I’m trying to keep it legible)

There is so much biodiversity in the Caribbean and it is seriously undersampled, so PLEASE DO. If you’re going to send in samples for sequencing, especially from such a biodiverse and undersampled region, I would encourage you to make the absolute best observations you can. Print out some voucher slips (you can generate and print them here https://mycomap.com/events/event-slips?event=147 for Mycota, though I just have a template that I print onto pads through Vistaprint), take pictures of the mushrooms in situ and from as many angles as possible, and record nearby plants, environment, and any details like odor, bruising, and taste (not always important, probably most relevant for Russula, some boletes, and some things in Clavariaceae; chew and spit, don’t swallow). If you feel like carrying around testing chemicals (KOH, ammonia, and FeSO4 are the most basic, though there are a few others), recording reactions can also help. Also, if you want to get REAL squirrelly, pick yourself up a 365nm UV light and record whether the fungus fluoresces and what color.

This isn’t to say you aren’t, of course, but the more information the better. (I’m kind of saying this for the crowd too).

ALSO: If you are shipping internationally, please just be aware of declaration requirements and what sort of organic material is allowed to be shipped. I know for sure that specimens need to be free of dirt, but there are probably other regulations that vary country to country. I would personally suggest only sending splits of the fruitbodies and not the entire fruits; that way, if something does get lost in the mail or confiscated, you haven’t lost the entire specimen. Also, don’t send full fruitbodies if you want them back; definitely send splits in that case.

Also, be aware that countries and indigenous groups have rights in terms of what is done with DNA from specimens collected in their territories (covered by the Nagoya Protocol). I don’t know anything about the Mycota project (I DNA barcode insects and other invertebrates), and if they have open calls for some countries, they may already have DNA export permits with those countries, but check.

2 Likes

Thanks for that! I couldn’t remember the name, it was on the tip of my tongue.

1 Like

Oh, btw, one more thing.

ITS doesn’t work as well with some groups. Most of Basidiomycota does okay (there are a few groups that need different gene regions), and a DECENT chunk of Agaricomycota has functional barcodes, but there are a lot of weird groups like various molds and plant pathogens that don’t work quite as well, especially the molds. You get an ITS back, but it isn’t useful in delineating some of these groups and you need different gene regions.

We’ve gotten good ITS sequences back from a few of the weirder fungal phyla, like Mucoromycota or Chytridiomycota, but these weirder groups just lack reference data in GENERAL, so I have no idea if ITS is useful in differentiating species or if it’s like some of the molds and it basically doesn’t work. So if you send in an Erynia or something, it’ll probably get a sequence, but it’s almost certainly going to be a temp code.

Just trying to set reasonable expectations.

BTW if anyone wants to do this and they NEED sequences back quickly, I would suggest doing a paid option - most of the community efforts are mostly volunteer work so everyone does their best, but paying for sequences and validating the data yourself is always going to be the fastest.

3 Likes

This is an impressive initiative that could definitely expand to other clades. There seem to be plenty of opportunities in various kingdoms/taxa where community and citizen science DNA barcoding / ITS / etc could enrich and refine what we know about speciation, relationships, subspecies radiations etc.

A lot of what is in books and taxonomic indexes is based originally on observations made as many as hundreds of years ago, by regional explorers and resident academics traditionally cataloging things who didn’t necessarily know how to understand and analyze phylogenies, speciation, radiations, hybridizations, diversifications etc in terms of explicit genetic flow and diversification.

Often in naturalist history a single person went somewhere with a specific goal of describing and naming new species for personal and professional credit–with positive bias to do so–and it would be interesting to add modern data to contextualize these findings in barcoded phylogenies.

Plants could be just as ripe for this, with the added benefit that, similar to fungi, it can be straightforward to get a genetic sample from plants (e.g. from leaves) without significantly damaging, killing or removing whole specimens from rare/threatened/dwindling/poached populations (as with exotic insects and some other taxa…)

Would be cool to know who else is doing this now and how it ties into iNat.

This also looks like something that could deserve an updated web-native (e.g. interactive SVG, d3.js, HTML5 etc) phylogenetic view context and widgets/explorers in the iNaturalist web and app interfaces that incorporates and displays the DNA findings.

Unfortunately, barcodes don’t work well for plants. You almost always need to do 2-3 genes just to get to genus, and there are very few cases where you can get to species.

People do do a lot of DNA barcoding for insects and that is generally helpful, although not a silver bullet. Among other things, the gene used is in the mitochondria, which are maternally inherited. So there are some species groups that have hybridization in their history where their DNA barcode doesn’t really match their whole genetic history.

If you’re curious, you can poke around the Barcode of Life Database website. They have both more general and very technical information there.

4 Likes