AI Suggestions in Fungal Observations

I stepped back last night so as not to turn this into a back and forth between me and lothlin (and probably get the thread put into slow mode). I think that I have sufficiently gathered my thoughts that I can tie this back into the original topic.

Okay. But I also could ask for understanding. When I see dismissive remarks like

If “just” Googling things includes, for example, looking on The Mushroom Expert, or, where applicable, using The Bolete Filter and taking the time to click all those checkboxes on its interactive key, well, then I have to ask: what would be acceptable sources to you – given the aforementioned constraint of being a non-specialist?

As to how old my mushroom guide is – well, rather than make a snide guess as to how recent is recent enough (and then get flagged), I’ll straight up ask instead: how recent is recent enough?

I’d sure like to know what would be an acceptable alternative.

I don’t know to what extent others here have tried following the advice and information provided by various myco folks, but I have, in an effort to improve the quality of my initial IDs as much as I can. Is it really so surprising that making these efforts results in frustration when one is told outright that they are never going to be good enough?

So, let’s bring this back to the original reason for this thread. @kallampero is upset about observers blindly trusting the CV. So I ask @kallampero: what behavior from observers would have prevented you from making this complaint? Because at this point it’s hard to know what the myco folks expect of us.

5 Likes

You have 198 fungi observations, and many of them are lichens or plant pathogens - groups that are notoriously difficult. I do not think your sample size is large enough to give a truly representative picture of how useful things like Mushroom Expert or the Bolete Filter are (sometimes they are very good, sometimes they are out of date; the only way to learn which is which is experience, as frustrating as that is to hear).

Well over 60% of my fungal observations are not research grade, and the ratio would be even worse if I weren’t doing sequencing work. It is, unfortunately, a reality a lot of us have to live with - and one that, frankly, needs to be accepted.

I do not get upset when Diptera identifiers tell me that my observation is not identifiable to species from the photos I took. I go ‘okay’, bump it up to genus, move on with my life, and try to take better photos next time (and let’s be real, the photos next time are not going to be good enough either, because I don’t care enough to catch specimens and dissect them).

8 Likes

Thanks for this information. I truly appreciate your efforts! I’m new to this. I found nothing from the AI that convinced me of the identification of what I think might be a fungus, so I did not attempt to post an identification. Would it be best to just leave the post until someone with expertise identifies it, or should I remove the observation? I like the purpose of this site and I do not want to add any more to the overly abundant misinformation in this world, especially misinformation related to science.

1 Like

I think this issue should be less about the fault of observers and more about how the system’s design promotes certain behavior. You really can’t fault people for not having in-depth knowledge of a certain taxon, and you can’t entirely fault people, especially new users, for putting trust in the CV. The CV really is an incredible piece of technology, and while not perfect, it is still remarkable and very useful for giving people an idea of what they may have observed.

If you know nothing about a taxon, picking what the CV suggests is very easy and the path of least resistance, and there really aren’t many incentives not to pick it. How are new users supposed to know when the CV is wrong, or how often it is wrong for a certain taxon? Unless you are an identifier, or have a lot of experience with a group, how can you know how accurate the CV is for that group? I don’t know how accurate it is for violets or scorpions.

There are a number of things that could be done to improve the system and address these issues:

- Improve the training data through identifying and correcting existing observations.
- List confidence scores on all platforms.
- Add notes on how a given taxon is identified.
- Add intermediate taxa, so that leaf taxa do not cause their parent taxa to be unlearned and create traps where difficult-to-identify or undescribed species struggle to get into the model, or never will. If so many undescribed morphotypes can exist in that small an area, how can the CV ever be trusted to get fungal suggestions mostly right when it can never really learn those morphotypes, even if they are identifiable to genus?
- Give the community control over certain variables, like how many photos and observations are needed for an individual taxon to be learned (see the sketch just below this list).
- Allow removal of taxa that can’t really be identified accurately from photos, or that are causing significant problems.
- Better onboarding.
- Include hybrids.
- Make the common-ancestor suggestion mandatory, so users can decline to pick a species-level ID they are not comfortable with, even when the confidence score is high (perhaps with an exception for monotypic taxa).
- Include subgenera, species complexes, and really all taxonomic levels in the common-ancestor suggestion listed above the top suggestions - the “we’re pretty sure it’s in X” message.
- Show warnings or notes when the suggested taxon has a very high inaccuracy rate or other known problems.
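
To make that community-control suggestion concrete, here is a minimal sketch of the kind of per-taxon rule I have in mind. The function name and the threshold numbers are invented for illustration; they are not iNaturalist’s actual training criteria.

```python
# Hypothetical per-taxon inclusion rule; the names and default numbers are
# illustrative only, not iNaturalist's real training pipeline.
def eligible_for_cv_training(photo_count: int, observation_count: int,
                             min_photos: int = 100,
                             min_observations: int = 50) -> bool:
    """Return True if a taxon has enough data to be included in the CV model.

    min_photos and min_observations would be the community-tunable variables.
    """
    return photo_count >= min_photos and observation_count >= min_observations

# Example: a taxon with 120 photos across only 40 observations would not yet
# qualify under these (made-up) defaults.
print(eligible_for_cv_training(photo_count=120, observation_count=40))  # False
```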

While there are some things individuals can do to help with some of these issues, hopefully future improvements to the model will address the growing problems we can’t solve on our own.

5 Likes

@eymrox It’s great that you’re trying to improve your data quality! Definitely don’t delete your observation just because you don’t know how to identify it. What you’re describing sounds like the kind of thing that is very difficult to identify. I recommend IDing it as Kingdom Fungi if you have some confidence that it’s a fungus, or just “Life” if you don’t know. Observations identified as Life are more likely than those with no ID at all to be noticed by people who can ID them further.

1 Like

I sympathise wholeheartedly. I don’t put up many fungi, and have no expertise in that group at all. I do, however, put up records of Australian vertebrates and some plants, and my plea to iNat is: please have your AI take its sticky little fingers out of the observation entry process. That includes coming up with rubbish ID suggestions, jumping ahead of the typing of IDs with ludicrous suggestions that replace what is being typed, reading the file name and guessing on that basis, and replacing the input name at the last minute (or possibly as the observation is uploaded) so that the person doing data entry does not notice the substitution.

1 Like

Are you able to share a screen recording of these?

Interesting - my guess for the reason behind the recent spate of CV discussion threads was the opposite. I guess it’s just a coincidental mix of causes.

Here is a similar request: https://forum.inaturalist.org/t/in-cases-where-the-cv-is-not-pretty-sure-of-anything-offer-a-suggestion-of-a-higher-taxa/14781

4 Likes

When the CV is certain above some specific threshold that a species-level ID is correct, it will make the species the top suggestion instead of the genus (which is what I complained about in that post).
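
Roughly - and this is just my mental model of the behaviour described above, not iNat’s actual code - it amounts to something like this; the threshold value and all the names are assumptions for illustration:

```python
# Sketch of the described behaviour: suggest the species only when the CV's top
# score clears a cutoff, otherwise fall back to the common-ancestor taxon.
# The 0.85 threshold and all names here are made up for illustration.
def top_suggestion(species_scores: dict[str, float], common_ancestor: str,
                   threshold: float = 0.85) -> str:
    best_species = max(species_scores, key=species_scores.get)
    if species_scores[best_species] >= threshold:
        return best_species       # CV is "pretty sure" at species level
    return common_ancestor        # otherwise offer the higher-level taxon

# Example with made-up scores: the species clears the cutoff, so it is suggested.
print(top_suggestion({"Amanita muscaria": 0.91, "Amanita pantherina": 0.05},
                     common_ancestor="Amanita"))  # -> Amanita muscaria
```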

But it seems like the improvement here is how the new app deals with lower-confidence CV suggestions. I haven’t really used the app recently, but it’s nice to see!

2 Likes

Hi Jerry,

We met many years ago, at Kew I think.

Anyway: re describing all those species. Don’t underestimate what AI will do when it gets better. (I know it’s always “next year”, but look how far we’ve come in the last few years).

If you’ve got sequences, photos and microscopy it’s just a matter of collating the info.

Perhaps an AI could reassemble my rough notes on specimens into a nice standard description, but I’m not convinced it could help much with the rest, and I wouldn’t trust it to anyway.

On average, describing a mushroom species takes me 3-4 days to do all the measurements, get the useful microscopy, and prepare the plates I need to show the relevant characters (and that’s without doing tedious and time-consuming drawings). Then the description has to be worked into a formal publication, with all that entails: finding a way of publishing that doesn’t swallow my operating budget or put the paper behind a paywall, and that still ticks the high-impact-factor box that satisfies managers. That’s before dealing with rejections due to lack of ‘novelty’ and with reviewer feedback. It is not a fast/simple/satisfying process, and much of it is generating the baseline data that no AI can help with - yet.

I think the publication pipeline could be streamlined and accelerated, and there have been some attempts to do that. Some are contentious because they side-step fully describing what you have - e.g. DNA-only descriptions, or one-liners that just satisfy the code - skipping critical comparison with similar species, ensuring type specimens have been studied…

But back to the issue at hand. It is a slow process to give massive numbers of undescribed fungal species formal names. Without those names they can’t ever be identified on iNat, and the CV can never learn about them.

3 Likes

Perhaps both comments are true in a sense.
It may be better at suggesting higher-level ancestor taxa (not sure), but it’s definitely also suggesting species-level IDs in a way the browser AI does not.

I imagined, like you, that this current spate of complaints about incorrect species-level IDs is indeed down to the phone app and how it’s impacting uploads.

For me, when I use the phone I feel encouraged to either:

  1. use a species-level autosuggest, or
  2. use an iconic taxon button so I don’t even have to wait for the autosuggest.

Both of which have their cost in identifier time down the line…

2 Likes