What is this - iNaturalist and generative AI?

There was an earlier post showing off the “capabilities” of ChatGPT that received a fair number of likes, so I thought I’d test it on a genus that I’m an expert in:

[Prompt]: You are a hyper intelligent AI with knowledge about how to identify plants, but you know your limits. When I ask you a question, answer concisely.

  1. [Prompt]: How do you tell Thaspium trifoliatum var. trifoliatum from Thaspium trifoliatum var. aureum?

"The main difference is in the leaf coloration:

Thaspium trifoliatum var. trifoliatum typically has green to yellow-green leaves.
Thaspium trifoliatum var. aureum has golden-yellow or more intensely yellow leaves.

Other than that, they’re very similar in structure."

Utter hallucinated nonsense. The leaves are identical. Next try.

  1. [Prompt]: How do you tell Thaspium chapmanii from Thaspium barbinode?

"To distinguish Thaspium chapmanii from Thaspium barbinode, focus on these key traits:

1. Leaf Shape:

    • T. chapmanii: Leaves are more broadly ovate with a more rounded or cordate base.
    • T. barbinode: Leaves are typically narrower and more lance-shaped with a more pointed base.

2. Hair Type:

    • T. chapmanii: Has shorter, less dense hairs on the stems and leaves.
    • T. barbinode: Has longer, more dense, and often more conspicuous hairs, especially on the petioles.

3. Flowering Habitat:

    • T. chapmanii: Prefers more moist, swampy, or wetland areas.
    • T. barbinode: Often found in drier, upland habitats.

These traits can help you differentiate between the two species."

Again, utter hallucinated nonsense. Leaf shape is identical, chapmanii has longer hairs (got it backwards), and chapmanii is found in drier habitats (backwards again).

  1. [Prompt]: How do you tell the difference between Thaspium pinnatifidum and Thaspium walteri?

"The key differences are:

    • Thaspium pinnatifidum: Has deeply lobed, pinnate leaves.
    • Thaspium walteri: Has more coarsely toothed or slightly lobed leaves, but not deeply pinnate.

Additionally, T. pinnatifidum tends to grow in more mesic habitats, while T. walteri prefers slightly drier, upland areas."

Thaspium pinnatifidum and Thaspium walteri are synonyms. They are identical.

  1. [Prompt]: How do you tell the difference between Thaspium grayi and Thaspium chapmanii?

"To distinguish Thaspium grayi from Thaspium chapmanii:

1. Leaf Shape:

    • T. grayi: Leaves are usually more deeply lobed and may have a more elongated shape.
    • T. chapmanii: Leaves are broader, more ovate, and less deeply lobed.

2. Habitat:

    • T. grayi: Found in drier, upland habitats.
    • T. chapmanii: Prefers wetter, more swampy areas.

3. Stem Hair:

    • T. grayi: Often has fewer, shorter hairs on the stems.
    • T. chapmanii: Has more dense, longer hairs on stems and petioles.

These are the most noticeable differences!"

“Thaspium grayi” is a name I just made up. Look how the generative AI is just oozing with confidence, despite my explicit instructions to “know your limits”.

I could go on, but I think you get my point. No one should be using ChatGPT (or any other generative AI) for this purpose.

30 Likes

See Point 1 in this comment, from someone who has written a lot of ID tips. I agree with them, being in a similar situation myself. But for it to work, of course, there would have to be significant savings of time and effort in reviewing and correcting machine-generated compilations vs. creating a wiki from scratch. That may be the experiment that this project ultimately creates.

I am very wary of setting a precedent and expectation like this. Part of the human culture I value most here is that we are all volunteers. Even the paid staff, without whom none of us would be here discussing anything, are iNat users too, and I know that many devote time to the platform well beyond what they are paid to do. It’s their passion, just like it is for so many of us.

Some commenters here (not saying you) seem to envision iNat leadership as some siloed impersonal corporate entity divorced from its user base. They are the (original) user base, and have been part of it all along. That gives me all the trust I need to know that they will ultimately do the right thing for iNat and the reasons that we are all here, even if there are bumps along the way.

19 Likes

Goodness, is this a record? So many comments in such a short space of time and with such passion! I have limited access to screens in this period, so can neither thoughtfully read all this thread, nor comment at length, but you know, I think it’s pretty safe to say that whether you/we/they want it or not, AI in one form or another is here to stay. And in that case, I firmly believe it is very much better to be part of the process, so in a position to influence/guide the outcome, rather than a victim of it. Of course there are issues to be resolved and, most importantly, we need to educate and inform people so they are aware of its intrinsic limits, but that being said, I personally am convinced that AI is an enormously powerful tool, if used to the best of its abilities and with the necessary safeguards and caveats. I feel for the iNat staff who must now figure out how to weather this storm and I await developments with silent anticipation.

9 Likes

There was an announcement here in the forum when they added it.

1 Like

Then just tag it at the genus level (or the last level at which you were confident when identifying) and let an expert in that area take it to species. At least, that’s what I always do when I’m not 100% certain of an ID.

8 Likes

For anyone interested, I have been keeping a tally of responses both here and in the comments on the blog post.

iNat users who have voiced clear opposition to any genAI use: 306, including 36 curators
Users who have voiced a clear support of genAI use: 12, including 3 curators

Users who are undecided, waiting for more info, or commented but did not express a clear opinion one way or another: 30

Since this whole thing is being framed as a way to assist with IDs, here are the ID breakdowns:
Total number of IDs made by pro-genAI users: 1,019,922
The IDs per user range from 0 at the low end to 396,271 on the high end. Below you can see the percentage of the total contributed by each individual.

Total number of IDs contributed by anti-genAI users: 7,752,051
These range from 0 at the low end to 614,922 at the high end. Below you can see the per-user breakdown.
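
For anyone who wants to check the proportions, here is a rough sketch of the arithmetic. The counts are copied from the tally above; the variable and function names are only illustrative, and the result describes the people counted in this tally, nothing more.

```python
# Counts copied from the tally above; names are illustrative only.
opinions = {"opposed": 306, "supportive": 12, "undecided": 30}
ids_made = {"supportive": 1_019_922, "opposed": 7_752_051}


def shares(counts):
    """Return each count as a percentage of the total, rounded to 1 decimal."""
    total = sum(counts.values())
    return {label: round(100 * n / total, 1) for label, n in counts.items()}


opinion_shares = shares(opinions)
id_shares = shares(ids_made)

for label, pct in opinion_shares.items():
    print(f"{label}: {pct}% of {sum(opinions.values())} commenters tallied")
for label, pct in id_shares.items():
    print(f"{label}: {pct}% of {sum(ids_made.values()):,} IDs")
```

Run as written, this puts the opposed group at roughly 88% of the 348 commenters tallied, and at roughly 88% of the 8,771,973 IDs made by the two groups combined.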

Users who have announced intentions to delete their content should genAI be implemented in any way on iNat: 38 as of my last count, including at least 4 curators. (Many others have discussed leaving the site but allowing their existing contributions to remain; I have not added these up at this time)

If the deletions currently threatened are followed through, this would lead to the loss of:

  • 223,879 observations
  • 777,714 identifications
  • 11,817 annotations
  • an untold number of comments, journal posts, and projects

43 Likes

I am in India for the first time at the moment and I want the diagnostic criteria for Eristalis cerealis. I tagged in an identifier (@matthewvosper) who confirmed something for females but didn’t mention males. In ChatGPT I prompted it twice: once asking for diagnostic criteria, then asking for a source. It immediately pointed me to a paper with the diagnostic criteria I was looking for.

How often do you use genAI yourself?
You seem to be making assumptions about people using it in a particularly naive way. This is akin to someone taking the word of a random user on iNaturalist as a solid truth. If I feel the need to dig deeper on something I will always ask an identifier for diagnostic criteria as well as a source to know where they are coming from. There is nothing to say that any implementation iNaturalist creates of genAI couldn’t be used similarly - or have sourced links inbuilt somehow.

The continued distorted notions about the environmental impacts of genAI being repeated on this thread and the blog post comments are a good example of how people will take information at face value regardless of where it’s from - be that random Instagram memes, genAI or random iNaturalist users.

Sorry, I’m confused - you think the existing use of CV is not AI?

Your use of the term “pattern recognition software” seems like an attempt to reframe the existing use of CV as not falling under AI and ML. This simply isn’t the case. Whilst CV can refer solely to image analysis without an AI/ML implementation, that’s not how it is being used on iNaturalist. Many people also like to talk about LLMs as pattern recognition software / text predictors, similarly underselling the actual processes in play.

Ultimately the comparison here is an error from an AI model misleading a user… vs… an error from an AI model misleading a user.

Sure - “hallucination” is often being used in regard to LLMs in a specific way. But…that’s just semantics here.
The core point remains.

How many of these mention as their reason claims about environmental impacts which are simply unfounded? How many talk about leaving because of AI… even though AI has been used on iNaturalist since forever?
It’s hard to take a lot of the comments seriously; they’re just so detached from reality.

3 Likes

I have never been misled by the CV. When the CV says “based on pattern recognition, this photo most closely matches photos from this species in our database”, it’s always a true statement.

Now, the species it points to isn’t always the same species that’s in my photo, but that’s fine. I was never lied to by Computer Vision. It never said any false statements. It’s just looking at my image and saying (correctly) that this photo seems to look like some other photos.

Compare and contrast with the cavalcade of false statements ChatGPT just told me in the exchange I posted earlier.

What’s the difference? Generative AI vs. predictive AI. One tells lies, and the other doesn’t.

They are not contradicting themselves, they are just using different, colloquial, terminology. “AI” is shorthand here for “generative AI”. You’re not going to persuade many people with the “gotcha” that the CV is technically also a form of AI.

You should not write people off as being “detached from reality” just because they aren’t using your more technical terminology.

Thank you for this.
If 306 to 12 isn’t enough to convince iNat to pull the plug on the project, what would be?
At some point “listening to the community” must mean understanding when the community has spoken.

33 Likes

Also worth considering that not everyone using the site is totally enthusiastic about CV (“AI”) either.
As a former identifier trying to have others learn things, it ranked pretty high on the list of annoyances.

21 Likes

No, they are not. Even if this particular version of generative AI happens to have low environmental impact (and that’s a big if), and even if we decide that its impact is outweighed by its benefits, greenwashing is still a serious concern. See, e.g., my comment here: https://forum.inaturalist.org/t/what-is-this-inaturalist-and-generative-ai/66140/322?u=bogsuckers

12 Likes

But if the expert has written a careful explanation on another obs … how could we find it? That is the gap this demo could close for us. We would like to mentor more / new identifiers. Out of the dreaming spires of academia.

Every few months there is a push-pull post on the Forum. The observer complains - you add an ID but never explain. The identifier retaliates with - I DID explain carefully, but the observers ignore that. Copypasta or text expander with every ID - not going to happen!

4 Likes

While I understand the point you’re making — that CV doesn’t technically lie — I think it’s a bit misleading to treat it as fundamentally different from generative models like ChatGPT. Both are built on neural networks and rely on prediction from large training datasets. Both operate as black boxes and can mislead users, even if they don’t “intend” to.

There are countless forum posts from users who’ve been misdirected by the CV autosuggestions, especially in regions or taxa with limited training data. The model’s accuracy varies widely — it performs very differently depending on the input context.

Saying “I’ve never been misled by the CV” might reflect careful usage, but it’s not representative of the broader community experience. Many newer or less experienced users take the CV suggestions as authoritative, and that has real consequences for data quality.

And crucially, this isn’t just about the model — it’s also about how iNaturalist implements the feature. The interface, wording, and visibility of confidence levels all affect how suggestions are perceived. If CV is framed too confidently or presented without nuance, it invites over-trust — especially from users who don’t understand the model’s limitations.

So yes — it may not fabricate in the way generative AI sometimes does, but it’s still a predictive system making guesses. The key difference isn’t about one “lying” and one “not lying”; it’s about how each system presents its output and how users interpret that output. Both require critical literacy to use responsibly.

8 Likes

Without going into details, it’s pretty terrible for my area. Chironomids are pretty niche, and honestly I’m not surprised that even the most widespread AI scrapers don’t get it. Much of the identification information is locked away in books and scientific reports, many of them older than the internet. How can we expect an AI model to get information from a book released at the end of WW2 that’s still behind a paywall?

8 Likes

I haven’t yet understood what the new AI thingy should do for iNat, but please let us stay one of the last places that is nearly completely real!!

  • I enjoy the community, learning step by step, encouraging people etc…
  • I wouldn’t profit from another overdose of too much information at one time.
  • If I would like to get deeper into a species, I can…

But on the other hand, sometimes AI can help and be a bit of fun, like SEEK…

  • But, as a friend of mine once said: “So often it suggests North American species and I am sure it is a European one! I know that! - And that cat is not a penguin… - Is there a way I can tell the AI why it is wrong?”

So… sigh… - Please, dear iNat staff, be very careful with your new AI project. Don’t destroy what we all love…

4 Likes

Once again, I ask of you all to stay on-topic:

iNaturalist, the grant from Google, and its specific implementations here.

It’s fine to disagree, and to cite examples, but I feel the conversation is wandering into the weeds a bit. Also, please beware of the trap of anecdotal data.

(editing to add: most of the mods are asleep at this time and I happen to be under a great deal of personal stress this morning. I’ll be keeping an eye here regardless but please make an effort to be kind and patient)

19 Likes

Very well said, and I agree.

And I think this is one of the crux issues.

8 Likes

Unfortunately, design issues and/or poor onboarding of newcomers result in members not fully (or not immediately) grasping how the site works. Namely, that (1) there are actual humans doing the ID/Curation work behind the pretty (magical) app/site/CV-Identifier/ID-Tips-Generator; and (2) that these humans are largely volunteers, laypeople, and plain users (not “pros”, not “paid”, not all “staff”); and (3) that human-human interaction can be welcome if done right (by e.g. having a profile page mentioning language/skills/willingness to interact).

Not sure if adding another layer of “machine” on top of the cake, without (or before) fixing the onboarding and explaining and human-centric features, will improve that aspect.

10 Likes

Do you have an iNat profile?
Do you use, or have you used, iNat to observe or ID?

Former identifier? So you have already left?

@graysquirrel of the nearly 8 million IDs against AI - how many are simply confirming what was already RG? (Who are, say, the top 5 names on each list? Busy busy or taxon specialists?) IDs from taxon specialists are not going to fly high on the list (unless bees or similar), but if ‘you’ are one of the very few who can ID that taxon, your lonely only ID has great value.

3 Likes

Yes, let’s please not continue this “survey” subtopic, as I feel it is promoting significant misinformation. We, you, iNat staff have no idea how many people are “for” or “against” any of this without doing a rigorous survey. The people most likely to comment on anything will be those with the strongest negative opinions.

Thank you.

24 Likes

Thanks for running that experiment on a different set of taxa. I was not at all advocating for using ChatGPT in its current state for ID advice, I was rather saying that a generative AI with hallucination mitigation techniques would work fairly well on iNaturalist.

When I put your prompts through ChatGPT, I got slightly better answers (it avoided mentioning leaf color in the first question, for instance), but it still failed to recognize the synonym pair and the made-up Thaspium grayi. I prompted it again, asking it to avoid hallucinating; it was less confident about how to tell Thaspium grayi apart from similar species and suggested I consult a local field guide, but it was still confident that Thaspium grayi was a real species. I then told it that I might be lying to it and that it should verify that the species names I use are real before answering. It was still very confident that Thaspium grayi was a real thing. So I asked it to find a source; it pulled up Wikipedia and decided Thaspium grayi wasn’t a real name after all. So I am pretty sure that, with a robust retrieval-augmented generation system set up, iNaturalist’s LLM could avoid most of the hallucinations, which seem pretty clearly to stem from ChatGPT drawing on its base training rather than on sources. Asking it to pull from sources ended the hallucination.
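
To make the "verify the names first" idea concrete, here is a minimal sketch in Python. It is not how iNaturalist or ChatGPT works. The reference data is a toy table built only from claims made earlier in this thread, and `KNOWN_NAMES`, `SYNONYM_GROUPS`, and `check_comparison` are hypothetical names I made up for illustration. A real system would query a maintained taxonomy source instead.

```python
# Toy reference data built only from claims made earlier in this thread.
# It is NOT an authoritative checklist; all names below are illustrative.
KNOWN_NAMES = {
    "Thaspium trifoliatum",
    "Thaspium chapmanii",
    "Thaspium barbinode",
    "Thaspium pinnatifidum",
    "Thaspium walteri",
}
# Groups of names the thread describes as synonyms of one another.
SYNONYM_GROUPS = [{"Thaspium pinnatifidum", "Thaspium walteri"}]


def check_comparison(name_a, name_b):
    """Decide whether a 'how do I tell A from B' question is answerable.

    Returns (ok, message). Text generation should proceed only if ok is True.
    """
    # Refuse if either name is missing from the reference list.
    for name in (name_a, name_b):
        if name not in KNOWN_NAMES:
            return False, f"'{name}' was not found in the reference list."
    # Refuse if the reference list treats the two names as the same taxon.
    for group in SYNONYM_GROUPS:
        if name_a in group and name_b in group:
            return False, f"'{name_a}' and '{name_b}' are listed as synonyms."
    return True, "Both names found; retrieve sources before answering."


print(check_comparison("Thaspium grayi", "Thaspium chapmanii"))
print(check_comparison("Thaspium pinnatifidum", "Thaspium walteri"))
print(check_comparison("Thaspium chapmanii", "Thaspium barbinode"))
```

The point of the design is that the lookup runs before any text is generated, so a made-up name or a synonym pair stops the answer instead of being confidently described.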

Edit: a pattern I noticed is that ChatGPT commits the etymological fallacy: it “translates” the scientific name to derive physical characteristics and the range of the species. iNaturalist’s LLM should have some kind of reinforcement training to teach it that such extrapolations are highly unreliable, because such hallucinations could potentially evade retrieval-augmented generation.

5 Likes