What is this - iNaturalist and generative AI?

A brief aside: I think you should clarify what “prior art” means in this instance, since “AI” and “art” in the same breath… well, I can already predict the reactions to this, lol. (I had to look it up myself! It’s not “AI art”! It’s to do with patent illustrations! Unless this is common knowledge, which I don’t think it is, but who knows! I can be weirdly out of the loop after all.)

2 Likes

@natev, you're making good arguments, as have many others. But what if you (and the ten or so people who like your comments) are the only ones who are right, and the other thousand commenters are wrong? At this point, I think iNat needs to look at its target audience and realize that this just isn't the time. Maybe one day; but if this is forced through to prove that we were all wrong about it, it becomes just another case of what many are complaining about: being force-fed genAI that was never asked for. The sheer number of first-time posters making their feelings known stands out to me as a powerful display of the temperature in the room. Your well-written responses aren't going to change enough minds.

14 Likes

I’ll just wait and see.

My practical worries based on experimenting with AI:

  1. I leave incorrect comments that someone has corrected unedited, out of respect for the replier, and agree in a separate comment.
    The AI will regurgitate my initial incorrect statement, which is embarrassing.
    Which comments or identifications will the AI take into account?

  2. I use language tailored to the audience ("apical" vs. "on the leaf tip").
    An AI summary will make me look inconsistent, if not dumb or condescending.
    Who is the target audience for the AI-generated text: experts or total novices?

On the flip side, if the AI comes up with links to useful or long discussions, that would be welcome.

10 Likes

Perhaps it is worth considering the possibility that many voices agree with Nate but are (wisely) choosing not to get dragged into the, as you euphemistically put it, "powerful display of the temperature in the room" coming from the other side here. I'm glad at least one person is standing up for the good judgment of the iNat team, who have surely earned more trust and generosity than has been shown to them here or on Twitter.

20 Likes

Collaboration with Google is the topic at hand. It is the “merit”. The genetic fallacy does not say that Google is irrelevant to the topic of collaborating with Google.

An ad hominem fallacy is a subset of the genetic fallacy. So yes, my example was demonstrating an ad hominem, but that's just a type of genetic fallacy.

Again: If these problems are so easily solvable, why is every AI out there still rampant with them? It all seems very hand-wavy. I don’t care if these nonsense-hallucinations have been “reduced”, I want them gone. Until they’re gone, it’s a problem.

6 Likes

In reality, your response and the notes from @natev are the only posts getting to the heart of the key scientific opinion I hear from academics: almost ALL of these grievances stem from a misunderstanding of how AI has been used in the past, how it can be used now, and how different most other AI actually is from the currently accessible consumer editions.

More exactly, too many people are projecting their ChatGPT experiences onto complex predictive models in general. It hurts science going forward that anything which can be labeled AI gets clouded with uninformed suspicion. There are myriad algorithms and models that could be labeled "AI" given their effectiveness at prediction, many of them developed and trained intensively through pipelines nothing like anything the average iNatter uses or could falsify themselves.

IMO, these battles simply come down to widespread ignorance of how AI works, further fueled by consumer-accessible models that are difficult to use correctly, which causes grief for any future mention of artificial intelligence.

Scientists will need to reckon with the unnecessary negative connotations of using predictive modelling; these came with the onslaught of letting anyone use it, so that anyone can form their own false assumptions about how it works.

8 Likes

Thank you so much for pointing that out; it is indeed a big shift if that is now what they want to do - connect people to nature through tech, rather than connect people to nature and advance biodiversity science.


I am very disappointed, as there are a lot of security/privacy and data-use questions that were brought up beforehand, and none were touched on in the blog post announcement. The fact that Google announced this first makes me think contracts have already been signed, and the silence on these issues from iNaturalist staff has me concerned they did not think about them. I hope that is not the case.

  1. What exactly will be fed from user data into Google GenAI?
    (We can debate whether it's just comments or everything, all we want, but only iNaturalist can answer this. Is our sensitive location data in these datasets? Our photos?)

  2. Will we have an opt-out option (or, better still, opt-in, with everyone assumed to be opted out by default)?

  3. Can we have a Terms of Use update to clarify?
    Currently, the way it reads, iNaturalist can use our observations and images even under full copyright - see the quote from the ToU below.

You grant iNaturalist a world-wide, royalty-free, and non-exclusive license to reproduce, modify, adapt, and publish the Content solely for the purpose of displaying, distributing, and promoting Your observations and journal via iNaturalist, and for the purpose of displaying or promoting the Content or iNaturalist itself in other venues, such as social media or software distribution platforms.

Additionally, AI training is forbidden, but only for commercial purposes. Google is commercial, but iNaturalist is not. Would sharing any of our data with Google be a violation of this?

Prohibited Use for Commercial AI Training. Users may not use any iNaturalist data for training artificial intelligence, machine learning models, large language models, or similar networks, algorithms, or systems for commercial purposes.

  4. Can we have a date for when this training will start? Users need to be able to make their own choices about how they wish to proceed. It would be a shame for people to delete their accounts prematurely, and iNaturalist should have public answers to these questions before the start date.

@carrieseltzer @tiwane or some other staff member, please clearly outline our privacy and data rights going forward. If this is a simple deal, such as only using comments, then why has there not been a statement to the effect of "We are only using comments from expert users who consent to having their text comments used, and Google will have no access to your data otherwise"? That is literally all that needs to be said, and every fear would be alleviated. The fact that nothing of the sort has been stated makes us fear the worst.

16 Likes

I think the reason some people are saying they’ll delete all of their good data is precisely because that data would be useful to iNat and the future AI training. If someone has an ethical objection to what iNat is doing, then they might want to send a message by revoking iNat’s access to their valuable contributions in protest. And threatening to do that in advance of doing it is a way to put pressure on iNat to change their plans before going ahead.

17 Likes

Look, I'm going to give you grace because this is obviously a very emotional issue for you (as it is for all of us), but you're resorting to a lot of "everyone else is ignorant and wrong" language, which is not evidence-based. You're not giving credence to the idea that LLMs in practice are a different animal than LLMs in theory, and ignoring people's feelings in the name of logic is a huge mistake when you're dealing with an app driven by people's observations.

Do some people not quite get what an LLM does? Sure. But a huge part of the app is designed so that amateurs can learn about (and care about) our environment, and the more you automate that and completely omit the community aspect of it, the fewer people are going to take the time.

Out of curiosity, why do you do iNaturalist? In your perfect world, does it matter if it’s just you on the app IDing or making your life list? Does it matter to you if other people are on there or would you be just as happy if a computer sent you observations and IDs? For me, getting laypeople excited about nature is huge. It’s not about just aggregating and honing down data because if everyday people don’t get invested in that data and actively take the time to learn for themselves, we’re monumentally screwed. But we all do the app for different reasons so maybe I’m barking up the wrong tree if that’s not your jam.

20 Likes

If a human reads and studies iNaturalist, synthesizes what they have learned, and begins offering informed opinions or otherwise using the knowledge gained: this is good. It's what iNaturalist is for. Sure, some of what was "learned" will be inaccurate, but that's life. Keep learning.

If an LLM does the same: this is terrible. This is theft. It's misinformation. It's displacing human experts who should be paid (or at least given credit!) to do the same thing. Smash the machines!

2 Likes

Good question! If that’s the case, then iNaturalist should continue as planned.

Note: 85 users have posted in this thread so far. They are, naturally, the ones with the strongest feelings about this issue.

Freedom of speech exists as a principle exactly for situations like this: people bringing up good arguments in the face of vocal opposition. It forces the vocal opponents to use good arguments too, instead of allowing them to shout down anyone who doesn’t agree. The millions of iNaturalist users who aren’t posting here won’t benefit from knee-jerk rejection of anything AI or anything “never asked for”. Nobody asked for iNaturalist.

Give solid arguments for your opinion. Pay attention to solid arguments against it. This is the way.

12 Likes

A couple more thoughts…

It would be good if the AI said:

iNaturalist users most frequently mention these features to identify Species X: white stripe on back; webbed hind feet; curved horns [link to the obs with these comments]

Rather than:

Species X has a white stripe on back, webbed hind feet and curved horns.

It just seems a little more honest: it doesn't claim the AI has expertise, and it shows its workings. And each ID feature could have a "thumbs up/down" emoji to vote on its helpfulness.

And it would be good, environmentally, if it only ran on request. I really don't need generative AI running to tell me why a European Robin, a 7-spot Ladybird, or a Peacock Butterfly are what they are, but someone else might.

14 Likes

A few points now that I’ve had some time for my thoughts to settle:

To the pro-AI posters, it does not matter if we are illogical and utterly wrong about how AI works. The stated goal of iNat is to connect human beings with nature. This has been repeated time and time again by staff in response to various feature suggestions that would improve data accuracy. All else is secondary to that goal.

The overwhelming majority of us have such a negative view of AI that we are considering leaving iNat if it is implemented. Will the best-case outcome you imagine from the AI integration lure in enough new users to create a net increase in the number of people ‘connecting with nature?’ I do not think it will.

One could make a perfectly valid argument for forcing everyone to exclusively view iNat in Comic Sans font - logically, it is the best font choice, because it is the easiest font for dyslexic people to read. Does that mean it’s a good choice? Of course not, because most people hate how that font looks and it would alienate the user base. Their dislike is completely “illogical” but it does not make it any less real.

Secondly, we still have NO information on exactly what the nature of this deal with Google is. Are they getting data in exchange? Will the content we create be fed into their systems? Will it be entirely self-contained on iNat? These are extremely important questions that many of us have asked, and it is concerning that we have not yet gotten any response.

34 Likes

I have very mixed feelings about this. I don’t have much to add regarding the broader ethical concerns about generative AI and Google (such as environmental damage, outputting bad info/hallucinations, unfair compensation to authors, loss of critical thinking, etc.) because I generally agree, and they are big problems. On the other hand, I’m not uniformly against generative AI and I wouldn’t reject a tool solely because it’s generative AI.

However, when I judge this proposal based on its merits… I'm not sure that it would be very effective. For example, when I identify species such as Commelina communis, it is often confused with the similar species Commelina erecta. The two can be distinguished by characteristics including the spathe (open in C. communis, closed in C. erecta), leaf bases (not auriculate in C. communis, auriculate in C. erecta), and the shape of the filaments (parallel to each other in C. communis, looping outwards then in like a question mark in C. erecta). The last bit of knowledge is particularly useful, because it's often visible in photos and isn't mentioned in keys! However, if you look at observations of either species, most people do not leave any comments including this valuable information.

We know that machine-learning algorithms need large amounts of training data, and my gut instinct is that for most species, there isn't enough training data to produce good output. In the blog post, they said "We will incorporate a feedback process for the AI-generated identification tips so that we can maintain high standards of accuracy" but if the output is poor, then the feedback process would create more work for identifiers, which would erase the intended benefit of easing the ID burden by training more identifiers.

14 Likes

No, ad hominem is not a subset of the genetic fallacy. I double checked this and found no sources making this claim, let alone reputable sources. No offense intended at all, but I think this is a great example of how hallucinations happen both with very smart, intelligent people and with LLMs. This is also an example of arguments to which I am tired of responding. But I have a problem, so, here we go.

You could answer your own question in two ways. First, read where I said, "when you tailor Google's LLM to a specific domain and then use one or more easy-to-implement mitigation techniques…" Most AIs (LLMs) with which you are familiar are designed for general use, or are chatbots, and are not tailored to a specific domain. Second, you could have read to the end of my post (which admittedly is too long), where I describe an LLM that I have used almost daily for a long time and that works well without hallucinations.

Ok, here’s the classic example of the genetic fallacy, which I learned in middle school and remember to this day: (quoted from Wikipedia, which quotes Edward Damer)

You’re not going to wear a wedding ring, are you? Don’t you know that the wedding ring originally symbolized ankle chains worn by women to prevent them from running away from their husbands? I would not have thought you would be a party to such a sexist practice.

It is the genetic fallacy to decry accepting money from Google to develop an LLM because Google’s LLM was originally developed unethically or because Google itself is unethical. Please note that you can still be absolutely right and use fallacious reasoning to justify your correct intuitions, so you don’t actually give up much ground by conceding here. However, if your argumentation was valid, it would lend itself to the opposite conclusion: iNaturalist has collaborated well with Google for a long time, which would be the most relevant piece of data to the subject of iNaturalist collaborating with Google.


@brynna, I would be less inclined to resort to such language if people would actually read what I write and not base their responses on false claims. I think there is plenty of evidence in this comment section that what I am saying is true: the dissenters are overwhelmingly misinformed. This is not a slight against them; it is just to say that if you're on the fence, please don't find their logic compelling, because it really isn't based on facts. Most of the things I believe probably aren't grounded in facts either, and you should certainly be cautious about believing me on most subjects.

… the more you automate that and completely omit the community aspect of it, the less people are going to take the time.

I think the goal of this project is to amplify the community's ability. Right now, there are 250,594,443 observations of 518,423 species observed by 3,724,782 people, with only 433,200 identifiers, 20% of whom are doing 80% of the work. Automation is a tradeoff, but it seems to be a necessary one. I am likewise concerned that fewer people are going to take the time, due to either genuine ethical reasons (which I can respect) or misguided logic/misinformation (which I'd like to push back against).

I use iNaturalist primarily to help other people figure out what they’re looking at and connect them to nature. Right now, it’s a Sisyphean task that makes me want to quit some days because iNaturalist historically has done relatively little to make things easier for identifiers. This has the potential to amplify the positive impact I can have in people’s lives, which is why I support it.

6 Likes

I agree this needs to be outlined ASAP, wherever anyone stands on the use of an AI model itself. Is this a case of Google using it to research a model, an actual no-strings charity grant (whether altruistic, greenwashing, or a tax write-off), or is data changing hands?

11 Likes

I assume you’re being sarcastic with this. But in fact, I sincerely agree. I actually do have different ethical standards for humans and machines. I do think people are more important than technology. I’m okay with acknowledging that as one of my personal principles.

21 Likes

Even beyond the concerns regarding unreliable information and environmental impacts, I personally find it insulting. Here we are, volunteering our time and expertise to help users learn, and then we get a sudden announcement saying that all of the text we've created is going to be scraped to create a genAI system that IDs things instead.

I've been here on iNat for a decade now, and the one thing I find most inspiring is the community. This decision doesn't feel like it was made with the community in mind. This feels like iNat jumping on the bandwagon of the current half-baked technology that big tech has been putting out year after year.

And for what? It won't help iNat be a better educational tool, particularly when there's no way to ensure that the information this AI thing is gonna be spitting out is actually true. If anything, I think it will make casual users interact with others even less than they already do. If I don't have to ask user GreenBeetleExpert how to ID this green beetle I saw, why would I ever talk to them? And if I don't even have to bother asking another member of the community, why would I ever put any time into actually learning about green beetles on my own?

iNat is probably the largest network of experts in biology/ecology, both professional and amateur, ever to exist in the history of science. This feels like the iNat team wants to throw that away in the name of getting to put a cool tech buzzword on the frontpage of the website. My disappointment is immeasurable.

22 Likes

That's the biggest issue for me. The only thing that has been addressed so far is what they are going to do, not under what conditions.

For me, this info is crucial to deciding whether I'm willing to keep building on this site or migrate to another one.

5 Likes

As one of the people considering leaving iNaturalist, I do not currently plan to delete my account entirely. I don't believe this is worth deleting years' worth of contributions. In fact, I would still try to log in now and then to check for comments and address them.

From what I've seen in the comments of others who are considering leaving and deleting their accounts in the process, the concern is precisely that the AI will be using their well-reasoned and accurate comments. I believe people feel this is a concern because they will presumably not be given any credit for their comments, any context for those comments will be lost, and the AI may potentially (and, personally, I think likely) botch those well-reasoned comments and reframe or hallucinate them into something entirely incorrect. I'm not sure that any amount of good data can fully prevent these hallucinations.

Even resolving some of these concerns may not be enough. Say we make sure that users get credit for their contributions - that would be a step in the right direction. However, it might cause issues if an AI summary is incorrect due to hallucination but gives credit to a user who had provided well-reasoned ID tips. At a quick glance, some may believe that the poor information came from the contributing author, not the AI, which would actually discredit the user who donated their time and effort to a well-reasoned comment.

My personal concerns on the topic of AI trend more towards environmental issues, lack of transparency, and the potential for proliferation of misinformation. But I do hope that I’ve accurately captured some potential concerns of those that are considering deleting their accounts.

9 Likes