What is this - iNaturalist and generative AI?

It’s frustrating to me that people are exclusively testing these ideas with ChatGPT, as it’s only one of a number of LLM options, one which doesn’t typically live reference the internet to augment its training data, and which obviously wasn’t trained with obscure organism identification in mind.

When I ask an LLM for ID help I usually use Perplexity as it’s better suited to this task. It references internet resources and includes photos and links from those resources so you can double check its answers. It’s also less likely to make up information that it can’t find in the internet sources.

In this case, it assumed that T. grayi existed based on it being included in the question, but said it couldn’t find any information about it. I think that’s a fair response. In theory, some additional prompting advice on species ID could get it to state more consistently that a species may not exist at all if it’s not mentioned anywhere on the internet.
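As a rough sketch of what that extra prompting might look like: the function name and the instruction wording below are my own illustration, not anything Perplexity documents, and I haven't measured how reliably it changes the answer.

```python
def build_id_prompt(taxon_a: str, taxon_b: str) -> str:
    """Build a species-comparison prompt that asks the model to verify both names first.

    Hypothetical helper for illustration only; the wording is an assumption,
    not a tested or recommended prompt.
    """
    return (
        f"How do you tell the difference between {taxon_a} and {taxon_b}?\n"
        "Before answering, check whether each name appears in your sources.\n"
        "If a name cannot be found anywhere, say plainly that the species "
        "may not exist or may be a misspelling, and do not infer its traits "
        "from related species."
    )


# Build the prompt for the example question used in this post.
prompt = build_id_prompt("Thaspium grayi", "Thaspium chapmanii")
print(prompt)
```

The resulting text would simply be pasted in as the question; whether the model actually follows the added instructions is something you'd have to check case by case.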

Neither of these LLMs is representative of what anything on iNat might be like; my point is that there is a wide variety of LLM use cases out there, and any specific example likely won’t be a good comparison.

Full response in spoilers:

Perplexity AI: How do you tell the difference between Thaspium grayi and Thaspium chapmanii

To distinguish between Thaspium grayi and Thaspium chapmanii, it is important to note that detailed direct comparisons between these two species are not extensively documented in the search results. However, there is useful information on Thaspium chapmanii and its close relatives that can help infer differences.

Thaspium chapmanii is characterized by:

  • Taller, more branched plants with more divided leaves (cauline leaves 2–3 ternate).
  • Leaflets with coarsely serrate margins.
  • Leaves typically pubescent on the underside (abaxially) with hairs on the lamina and veins.
  • Pale yellow to creamy white corollas.
  • Peduncles and umbel rays that are scabrellous (bearing small, sharp projections) on the upper surface.
  • Smaller mature fruits (4–5 mm long), minutely spinulose to scabrellous between the wings.
  • Habitat preferences include glade edges, bluff tops, prairies, rocky banks, stream floodplains, and open woodlands.
  • Flowering peaks later (late May through June).

Thaspium barbinode, a close relative often compared with T. chapmanii, differs by having:

  • Less divided leaves (cauline leaves 1–2 ternate).
  • Golden yellow corollas.
  • Peduncles and umbel rays glabrous or papillose-roughened (with blunt projections).
  • Larger fruits (5–6 mm long), glabrous or rarely scabrellous.
  • More mesic habitats like wooded bluffs and springheads.
  • Earlier flowering (mid-May continuing into June).

Since Thaspium grayi is not directly described in the search results, it is likely less commonly referenced or may be confused with these species. Given the detailed morphological and habitat distinctions for T. chapmanii, one could differentiate T. chapmanii from other Thaspium species by looking for its combination of pubescent leaves, pale corolla color, scabrellous peduncles, and specific habitat.

For precise differentiation between Thaspium grayi and Thaspium chapmanii, consulting a regional flora or taxonomic key that includes both species is recommended, as the available data primarily contrasts T. chapmanii with T. barbinode but not directly with T. grayi.

In summary:

  • Look for leaf division, pubescence, flower color, peduncle texture, fruit size, and habitat.
  • Thaspium chapmanii has more divided, hairy leaves, pale flowers, and scabrellous peduncles.
  • Without direct data on Thaspium grayi, these features can help distinguish T. chapmanii from its close relatives, but further specialist resources are needed for T. grayi specifically.

Temporary link to response here: https://www.perplexity.ai/search/how-do-you-tell-the-difference-f2BvDp3vQh.mL7qk8XKK7w

9 Likes

yup. my understanding is that there are two main types of language models out there. without knowing what path iNat plans to take, folks that are testing should probably test using a diffusion-based model as well.

5 Likes

CV is not the same as generative AI. This has been discussed in many different places on the Forum (and possibly somewhere in iNat itself.)

5 Likes

Hmmm, I disagree. What @graysquirrel came up with is essentially a sentiment analysis – a perfectly valid way of gauging community response. What they did is of course not the only way to measure community response (as you suggest, a rigorous survey would be great), but I don’t see how it’s “misinformation.” It’s a representation of the sentiment of that subset of the community that is likely to respond on iNaturalist forums and blog posts, which is a perfectly valid group of people to consider (maybe they are even more likely to donate or have lots of contributions). So I don’t think these data can be considered “anecdotal;” it’s a small synthesis that does what it was intended to do and is presented factually. In fact, I bet that the iNaturalist team will do a similar analysis themselves! So I disagree that discussion of it should not be continued.

But I do agree that the iNat team doesn’t really know how many people are for or against this – or at least, didn’t have a sense of this before the announcement – which is an issue, and why so many have suggested listening to the community in this moment.

19 Likes

If you are determined to talk about various kinds of AI, and how they could theoretically be applied, and the differences between them as it relates to nature, please make a new thread or I will have to split out all those comments from here myself.

Ordinarily it’s fine when a thread meanders, but the side conversations add a lot of work to a thread that is already hard to follow and full of strong opinions.

We don’t know what the iNat team’s plan is, we don’t know what tech they will have access to, and they have promised a Q&A.

Feedback form HERE

9 Likes

I am disappointed that, despite recent messaging, iNat still has not clarified the all-important question:

Does iNat intend on using a genAI model to produce text descriptions explaining why a particular observation is identified as a particular taxon?

Until this clarification is made, I have to assume, based on the blog post, that this is still the plan.

As long as this is still the plan (or as long as plans remain sufficiently vague), I will continue to strongly condemn this project, this grant, and this decision, for all the structural, ethical and, more relevantly, practical reasons that I and others have already described. Such an implementation of genAI is, from the basic standpoint of technological capability, simply not able to achieve the admirable goal @loarie has of facilitating ID expertise. Using genAI to write natural-language ID explanations will not only make iNat a much poorer scientific resource but a much poorer social network.

Again, much confusion has stemmed from the fact that the iNat team has not done a good job with the messaging around this project. In particular, iNat’s messaging around how this genAI project is similar to or different from its existing use of machine learning tools (the CV, the geomodel, etc.) has been ambiguous at best, and deceptive at worst.

As @dhobern stated early on in an important blog post comment: "Please be careful not to fall into the trap that Google and others have laid of wanting us to treat AI as if it is a monolith. Image recognition (with suitable caveats) has always been a statistically sensible use case (large, well annotated training sets; fine-grained and testable predictions). LLM-based models, even when propped up with other code to avoid the worst errors, are not the same thing.

Even if I could ignore the fact that generative ML is a cynical bid by Silicon Valley to extract as much value as possible from all of society’s data, knowledge and labour and from the environment, I beg you to be very cautious before helping their latest black boxes to become the intermediaries between humans and knowledge. We are living in an age of scientific denial and rejection of evidence."

It is exactly this “trap” that many defenders of this project seem to be falling into - something that can be prevented if the iNat team makes explicit, in its messaging, the very clear distinction between the CV and this genAI project.

Also, I suggest that if iNat wants to get a precise sense of ongoing community opinion, what we need is an open, accessible, visible forum, similar in nature to the Comments section on the original blog post. Given the magnitude of the situation, current feedback processes aren’t cutting it. This forum thread is not very visible, the feedback form is a suggestion box rather than a community discussion, and plans for an eventual virtual Q&A do not substitute for discussion today. Perhaps a stickied forum thread or discussion post is in order.

One thing’s for sure. On all sides of the debate, we can all agree that the community response to this project has been overwhelmingly and unprecedentedly negative.

It’s time for community members to see this feedback not as a knee-jerk reaction by the heated, uninformed masses, but as what it is - a community that is in crisis over a deeply troubling project that nobody asked for.

16 Likes

Why is much of the discussion about existing systems taking place?

They already exist and a fair few people are familiar with them when the term “AI” is mentioned. Thus they form a frame of reference for the discussion.

Also, the information about the iNaturalist implementation is rather vague and woolly, so with that vacuum and pre-existing prejudices we get the explosion of heat that has taken place.

I’m personally not necessarily against it. What I do think is absolutely necessary is a clear pause in implementation and much more detailed technical information released on this. That gives time for simmering down, time for more considered analysis and time for the de facto hallucinations that have taken place in this thread by some posters (a real irony that) to be addressed and revealed and dealt with.

3 Likes

Here is a very simple poll for what it’s worth:

Is this a good idea?

  • YES! This is a great tool!
  • NO! iNat should stop this!

You should have added a couple other options for those leaning Neg/Pos but waiting to see.

13 Likes

Given the strong feelings on the topic, wouldn’t it be worthwhile for the iNat staff to do such a rigorous survey, do some community engagement, and listen before going ahead with those plans?

8 Likes

That’s why I said “very simple” - it was meant to be just a general black-and-white poll. But you are right and here’s another one:

Is this a good idea?

  • NO! iNat should stop this!
  • Leaning no
  • I don’t know
  • Leaning yes
  • YES! This is a great idea!
1 Like

I and many others have said this repeatedly, but here it is again. The REASONS do not matter. Whether they are logical or not does not matter. What matters is that this is supposed to be a community science platform, and the community is speaking up against this idea and being dismissed.

What matters is that this action will clearly have massive detrimental impacts through loss of users and deletion of data, and at BEST has very questionable utility.

I am very confused by why you keep mentioning “iNat has had AI forever”. The announcement specifically says this is generative AI, which is a completely different technology from visual recognition. It’s like saying “Why are you opposed to driving semi-trucks across this bridge? People have driven golf carts here for decades.” They are so far apart that the only real similarity is that both get mislabeled as “AI” (when, technically, neither is).

Please explain how it is “misinformation” to summarize the publicly declared sentiments in an open thread? Are the users’ comments misinformation too? Or are you suggesting I have falsified the count in order to promote an agenda? If it is the latter, I am happy to send you my spreadsheet so you may check every data point.

I never claimed this was a rigorous survey; I took pains to express exactly how I summarized this and from where. Obviously there are many people of all sentiments who have not spoken and whom I therefore did not count.

But YES, iNat staff need to do a rigorous survey of all users immediately. It should have been done long before we ever got to this point!!

20 Likes

I do think it might be useful - if temporarily annoying - to have a pop-up in the app and the website, for every single user, with a question asking how they feel about the proposed AI demo(s) and potential resulting AI-based tools. Something to consider. I’m sure most users would do what they could to simply dismiss the question and get to iNatting, but it would serve to paint a better picture of the opinion of the user community than this forum post (which, while extremely active and full of very active and vocal users, is still a tiny tiny fraction of the iNat userbase). I feel like a huge huge huge number of iNatters are blissfully unaware of all of this - and their opinions matter, too.

11 Likes

Poll:

  • Positive feelings about this
  • Mixed feelings
  • Negative feelings

Both iNat’s use of CV and genAI use deep learning/neural networks.
Both are very specifically under the umbrella of AI.

It seems like users need some disambiguation around these terms though, as there appears to be a lot of confusion. What makes you think that technically neither are?

4 Likes

You can fill out the typeform if you like. Unlike comments here or on the blog, they can be more easily collated and analyzed. And they are also private, so someone not comfortable voicing their opinion publicly would be more likely to reply, which I think is a good thing. It still would not be a rigorous randomized survey, though.

I do want to emphasize again that at this point we don’t know what this demo will look like. This would be a demonstration, and not out until the end of the year or so, which is about six months away. These are very early days and obviously a lot of thought will need to be put into this. At this point I’d consider what @loarie said (quoted below) and for now frame it as exploring a way to improve how expertise from the community is captured, organized, and shared.

So if you’re asking for clarity on exactly what this will look like, that’s not available at this time. I understand that’s not satisfying, but it’s the reality of where things stand and I’d ask you to please be patient. If you have concerns or thoughts about what kind of tool might work or not work, that would be much more constructive than assuming what this will be and reacting to that assumption.

26 Likes

FYI, this poll is not anonymous. that might change who votes and how they vote. (I would pick somewhere in between mixed and negative feelings, personally.)

3 Likes

Thank you. I appreciate the response.

I’m glad that here on the forums, the emphasis from iNat has been to keep the project open-ended.

However, the fact is that the official news about the project remains the blog post. The stance that the blog post takes is that, while it’s also a bit vague/open-ended, the current intention is to use grant money to train a genAI to write text descriptions explaining the reasons for a particular ID on a particular observation.

I would appreciate it immensely if the sentiments recently conveyed by the iNat team on this forum (e.g. we are going to keep things completely open-ended; we are exploring options that do not involve genAI production of natural language text; etc.) were conveyed in an official blog post, press release, or similar.

As it stands, the mixed messages continue to be sent, because the official blog post, as well as the Google blog, still indicate that iNat will use genAI to write descriptions explaining why an ID was made.

This continues to be unacceptable to a community that has extensively explained why this is a poor idea, for technical reasons, pragmatic reasons, social reasons, and broader, systemic reasons that remain unsatisfactorily addressed.

20 Likes

About what exactly, the whole AI situation or the pop-up? Or something else?

2 Likes

Most of the time, comments are relevant only when viewed as a conversation in the context of that observation. Any useful summary should reference or include the full conversation.
I would need strikethrough formatting, or I will be forced to delete my incorrect entries even if that would make parts of the conversation illogical.

If short generated comments appeared next to CV suggestions, that would provide some comic relief:

Rosulabryum billardierei
“Could be correct but most likely some other Bryaceae”

8 Likes