What would make AI on iNaturalist tolerable to you?

As I understand it, the new use case is to collate identification information to offer more contextual information to human identifiers to help them learn to ID stuff. It’s not to train an AI to ID stuff. iNaturalist already has that in the form of the computer vision.

Additional points that would be needed to make a future implementation tolerable to me are:

  • a robust discussion and testing of any new tool before full release. The community must be largely happy to see it implemented, which could be assessed through a poll, the tenor of forum discussions, or other means.

  • transparency about the energy use involved in training and providing the tool. I understand that this would likely be minimal, unlike other applications of genAI, but this needs to be explicit as it is an important factor in why I and many others have a deep antipathy towards genAI in general.

  • ability to immediately correct or disable erroneous information that might be generated.

I would not be happy to see a new implementation at present because a substantial part of the forum community is against it, although I think a large component of that is based on fears about what such a tool would look like (I prefer to reserve judgment), distrust and dislike of genAI in general (which I share), dismay about the role of big tech in the current active dismantling of crucial components of public life and scientific endeavour in the United States (which I am also horrified by), and the unfortunate tendency of social media to prompt people to express strong, outraged opinions before all the facts are in (which I try to check myself from doing).

6 Likes

while remembering that the ‘forum community’ is a rather small slice of iNatters.

7 Likes

No one is objecting to using AI on the site. What people are objecting to is using generative AI.

7 Likes

Howdy,
I remade an account here to comment on this.
If the generative AI tool was a completely separate app that I could outright refuse to be a part of, and if everyone including new signups were automatically opted-out of having their data used for it, then I might be comfortable downloading and using iNaturalist again, and recommending it to others.

The AI that’s been a staple of iNaturalist to help guide people towards likely species and genera is already the limit to what I will tolerate.
I want machines to think for me (and folks in the classes I host) as little as possible, and so I can’t support ideas like generative AI explanations for suggested IDs any more than I can support students in school using ChatGPT to form essay outlines. I personally think it’s doing folks a disservice, especially if they’re beginners.
I know not everyone will agree with this, but I wanted to share my thoughts.
(And this isn’t even considering problems with plagiarism and consent)

12 Likes

Well, I think that’s up to you and what makes you feel uncomfortable. Ultimately, the idea is that people feel uncomfortable about the new tool that is being created and I’m curious what specific changes would make people feel more comfortable with the situation. The question was meant to be general in nature. If you don’t feel comfortable with generative AI specifically, what specific features would make you feel more comfortable with it? Again, if the answer is nothing, you probably should look at the other post.

2 Likes

Please reread the original post. General discussion about this topic and complaints about AI (generative AI or otherwise) should be posted here: https://forum.inaturalist.org/t/what-is-this-inaturalist-and-generative-ai/66140
Not in this discussion post.

I have edited the original post to make things clearer. I also aggregated the current suggestions which hopefully makes clearer the purpose of this post.

@emmett35 @spiphany @dinofelis @radekwalkowiak @spookyaranhas @reedlindwurm @echomary Please especially take note.

3 Likes

Here are some of the suggestions so far. If yours didn’t make it, you might consider rereading the original post. :-)

  1. Attribution
  2. Disclaimer
  3. Source accessibility
  4. Correction/user input
  5. User choice
  6. AI quality
  7. User notifications
  8. Overall system function
  9. Transparency, communication, and community involvement

7 Likes

I think the suggestions you outlined above, @nathantaylor, are all very good, but I would like to add one caveat regarding how I could see this working as a system. I guess this pertains to item 8.

Currently, the entire idea behind the generative AI is that it is a way of aggregating the information from comments without it taking hundreds of thousands of man-hours. However, I still believe there’s a world where a human-based wiki exists. So, what if instead of making it one or the other, we do a little of both?

  1. Gen AI creates guides to species ID under the outlines people have written above.

  2. These outlines are put into a central wiki, and all articles written by AI are marked. These explanations of species ID can then be added for the computer ID, or however the team wants to use them. The important part is that they are stored permanently in a single spot. This means that they will not have to be regenerated every time, leading to less energy consumption overall. Still not great, but less.

  3. Users past a certain threshold of usage (likely IDs) are able to make edits that automatically override the AI generated ones within the central database.
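The three steps above can be sketched roughly as follows. This is only an illustration of the idea, not anything iNaturalist has described: the names `GuideEntry`, `GuideWiki`, and the cutoff `MIN_IDS_TO_EDIT` are all hypothetical, and the post does not say what the actual threshold would be.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical cutoff; the post only says "a certain threshold of usage (likely IDs)".
MIN_IDS_TO_EDIT = 1000


@dataclass
class GuideEntry:
    taxon: str
    ai_text: str                      # generated once and stored, not regenerated per request
    human_text: Optional[str] = None  # set when a qualified user edits the entry

    @property
    def text(self) -> str:
        # A human edit always overrides the AI draft.
        return self.human_text if self.human_text is not None else self.ai_text

    @property
    def ai_generated(self) -> bool:
        # Drives the "written by AI" marker shown next to the entry.
        return self.human_text is None


class GuideWiki:
    def __init__(self) -> None:
        self._entries: dict[str, GuideEntry] = {}

    def store_ai_draft(self, taxon: str, text: str) -> None:
        # Keep any existing entry so a later regeneration never clobbers human edits.
        entry = self._entries.get(taxon)
        if entry is None:
            self._entries[taxon] = GuideEntry(taxon, text)
        else:
            entry.ai_text = text

    def edit(self, taxon: str, text: str, editor_id_count: int) -> bool:
        # Only users past the (assumed) ID threshold may override an entry.
        entry = self._entries.get(taxon)
        if entry is None or editor_id_count < MIN_IDS_TO_EDIT:
            return False
        entry.human_text = text
        return True

    def get(self, taxon: str) -> Optional[GuideEntry]:
        return self._entries.get(taxon)
```

Keeping the AI draft alongside the human text, rather than overwriting it, is what would let staff compare the two later, as described in the next paragraph.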

To me, this helps both parties. For the iNaturalist team, it gives them cases where they can check how their AI performed in comparison to a real curated user wiki, and allows them to improve it over time, i.e. more good data leads to better future AI.

Meanwhile, the creation of a central database could mean we finally put some focus and energy into fixing the current mess of a system that is the guides tab. Then maybe, just maybe, we can start teaching some new identifiers to help our current identifiers, who are highly strained.

I think that it is valiant that iNaturalist is trying to find a way to aggregate user info, but until we make it user first and AI second, a lot of people who really care about this platform are going to feel put off.

9 Likes

These - all from natev - strike me as key:

  • Curators or even users should have the option to hide LLM responses until further review, not just flag them. (Natev)
  • Nice to have: some way to “promote” comments that the AI can then draw from (though at that point, a non-AI solution may be simpler). (Natev)
  • Some way for users to opt out of their comments being used this way, or an opt-in system (the latter would probably cripple the whole project though TBH) (Natev)

Expanding on the opt-in / opt-out:

→ I’d like to see opt-in, and I’d like to see it be granular enough to allow people to select which taxa they are comfortable having their words used for in this manner. (scarletskylight)
→ People who opt-in should be given an opportunity to review species and highlight the differences with some sort of pen / highlighter tool, which will help the machine learning system understand what to look for in other observations (scarletskylight)
→ Novice users should automatically be opted-out until they have contributed a certain number of identifications, and those identifications either meet a community threshold of accuracy or they are vetted by curators if making numerous disagreements - as sometimes disagreements indicate correcting data and not inaccuracies (scarletskylight)
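To make the opt-in rules above concrete, here is a minimal sketch of how the check might look. Everything in it is an assumption for illustration: the `Identifier` fields, the function `may_use_comment`, and both threshold values are hypothetical, since no numbers are proposed here.

```python
from dataclasses import dataclass, field

# Hypothetical thresholds; no actual numbers have been proposed.
MIN_IDENTIFICATIONS = 500
MIN_ACCURACY = 0.9


@dataclass
class Identifier:
    username: str
    identifications: int = 0
    accuracy: float = 0.0         # assumed: share of IDs that agree with the community taxon
    curator_vetted: bool = False  # curators can vouch for users who disagree often but correctly
    opted_in_taxa: set[str] = field(default_factory=set)  # empty means opted out everywhere


def may_use_comment(user: Identifier, taxon: str) -> bool:
    """Return True only if this user's comments on this taxon may feed the tool."""
    if taxon not in user.opted_in_taxa:
        return False  # opt-in is per taxon, and the default is "no"
    if user.identifications < MIN_IDENTIFICATIONS:
        return False  # novices stay opted out regardless of their setting
    return user.accuracy >= MIN_ACCURACY or user.curator_vetted
```

The curator check is an "or" so that someone who makes many correct disagreements is not excluded just because their raw agreement rate looks low.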

And here is my own request, regarding intellectual property:

  • A way to ensure copyrighted material is protected and not used without the copyright holder’s permission (for example, sometimes I reference FNA material directly, and when I do, I source it. The AI model should not use this data.) (scarletskylight)

And why I prefer opt-in over opt-out:

When I take a photo of something in nature, I don’t think of it as my property. I don’t hold dominion over living creatures. CV model = no problem.

When I use my words though, those are mine: my thoughts, my time and energy to learn. Let me opt in or opt out, and let me do it on the taxa where I know I’ve got the IDs down with confidence. Meaning, there are some species where I’ve taken the time to really learn the dichotomous keys and I’m quite fine with my words being used to train an LLM, but there are some species where I know I may have made mistakes or jumbled my words or don’t know the full taxa range well enough, and there I want to opt out for sure.

9 Likes

I think my biggest problem with this whole thing is not how the AI is implemented, as many are discussing. If the iNat people found a magic generative AI model that didn’t use up ridiculous amounts of energy for no reason, I might be fine with it. That’s not even mentioning the theft aspect. It just feels distasteful that a website dedicated to learning about and protecting the environment would try to implement a technology notorious for its inefficiency and negative environmental impact. At the end of the day, I can grumble about how I think gen AI is stealing all I want, but the real problem is that it is actually wasting valuable resources like water and energy, which fuels the destruction of the environment we’re trying to protect.

7 Likes

We have information gathered together at this project.
https://www.inaturalist.org/projects/observations-with-id-tips
I do add obs to it, as good comments come up when I ID.
But. I don’t ever go back to the project to attempt to FIND the good, useful, reliable info which was worth highlighting. We need that extra bit of ‘mentoring’ to surface the good info which is currently buried across iNat.

3 Likes

There still appear to be some misconceptions and assumptions about what an implementation of this proposal would look like, which I understand is the point of this topic to address – what assumptions would need to change for an implementation to be acceptable?

Regarding energy use:

iNat staff have already said that any content would not be generated on demand, but would more likely generate a stable resource, and we also know from other threads that the computer vision and geomodel are trained in-house with low energy demand. For example, Tony has already responded that:

9 Likes

It should make all pertinent comments it used to build its own response, including responses it considered and discounted, available in a searchable way to anyone interested. People would then have the opportunity to assess how valid the AI’s reasoning is in a transparent way, and it would give the IT nerds more insight into how it works.

4 Likes

If the AI input pool was opt-in, I probably wouldn’t concern myself with this test project anymore.

2 Likes

High customizability and flexibility to be manually changed by curators.

Whether that is editing text or the ability to add warning comments about the difficulty or impossibility of an ID, etc. Really just the flexibility to edit it for whatever situation or issue may arise.

Ideally even a way to prevent certain leaf taxa from being learned by the CV in the first place because they are too difficult to identify.

5 Likes

I’d want a way to be able to put for some species/groups: “not identifiable based on current knowledge without a microscope mount of genitalia; please don’t identify beyond genus” or similar.

5 Likes

There is / was a request for that.

Sorry - can only find this
https://forum.inaturalist.org/t/computer-vision-clean-up-wiki-2-0/40318
There was an attached project, but it is gone. And the thread is closed.

https://forum.inaturalist.org/t/computer-vision-clean-up-archive/7281

But a way to surface a ‘warning’ for difficult-to-ID taxa?

3 Likes

Here is one more:

  • Allow some taxa to be ‘pulled’ out of the LLM aspect of this project altogether, especially taxa where:
    – Subspecies or species-level IDs are very difficult to differentiate, or known to be undescribed (insects in particular seem to have a high number in this category)
    – Genera with species that are not in the CV model yet

Supporting example: I’ve delved into Galiums recently. G. anglicum is not uncommon in areas I regularly walk, but it’s not in the CV model yet. Other Galiums are being suggested with high confidence in its place. Generative AI, in this case, would do harm by giving people more confidence in the ‘why’ behind what can only be a misidentification until G. anglicum is recognized. This then becomes an echo chamber of incorrect diagnostic information. “Oh! This is G. divaricatum because the fruits are glabrous - I learned that on iNat!”

6 Likes

My answer is still going to be “nothing” and I’m going to say it regardless.

2 Likes

Similar idea here:
https://forum.inaturalist.org/t/what-is-this-inaturalist-and-generative-ai/66140/113