What is this - iNaturalist and generative AI?

Aggregation is one thing. LLM vibing off text is another. LLMs can be ringfenced in various ways to get more focus, but they are a deeply inefficient way to produce genuinely focused and clear information outputs.

9 Likes

If iNat is planning on using generative AI to make these identification tips, that means the identification tips are often going to be wrong. (And yes, more often wrong than an expert making a comment. Every generative AI “hallucinates”. This one will not be special.)

The Computer Vision model also often makes wrong suggestions. But iNat also currently explicitly says not to use the CV as your basis for generating IDs, and makes no claims as to their accuracy. Its suggestions are just that: suggestions.

This is fundamentally different. This AI will generate incorrect text and place it on the iNat website. The CV may make wrong suggestions, but it does not state untrue things.

A website filled with untrue information is inherently untrustworthy. As an educator, not only would I tell any student of mine to ignore the AI text, I would actively discourage them from using the website to avoid absorbing false AI misinformation.

25 Likes

Most of what I wanted to say has already been said. I’m disappointed in iNat. Maybe some of the people in here are having a less informed reaction to it, and people love to claim to be on the side of logic when they’re defending it because they think it’s just a drop in the bucket with everything else, but here’s the macro issue: our ecosystems are collapsing. We are under existential threat. Just saying no to generative AI here won’t be enough, but we have to start hitting the brakes on this accelerationist, capitalist system somewhere. We’re environmentalists, for Christ’s sake. I don’t want our database to be so effective that we destroy the things we’re observing. I don’t care if our system isn’t as efficient as it could be. Nature is complex and inefficient and unalgorithmic and unscalable, and it’s meant to be that way.

29 Likes

I do think no info is better than potentially wrong AI scraped info. People looking to learn will read these pages and might not know the full context of how it was produced. There may not be a clear designation of which words are human- or AI-written (aka, the current state of the internet).

A blank page could lead to them continuing their search, asking for help, or leaving a more general ID. A page with wrong information could teach them the wrong skills, and lead to incorrect IDs (and if they later find out it was wrong, a distrust for the site, and maybe other users). We all try to be critical of what we read, but if a nature-observing website tells you “this is the best characteristic for identifying this plant”, it wouldn’t be crazy to accept it as a fact. People don’t cross-reference every statement they read with multiple sources.

If this wiki is created (and to be fair, none of us know what this would really look like), and every species in the CV gets an article generated, it will be a rush for identifiers to check and correct every word on every page. This does not sound fun. Maybe there could be a mark on the page that it had been verified by a human, but even then, people reading the page might disregard that.
A wiki with blank pages would be easier to manage and more reliable. Every page would be written by a human. Perhaps it wouldn’t cover every single species on the site, but it would still be a useful tool (and one that people have wanted (without AI) for years!)

25 Likes

My point exactly. Better to have no info than straight up incorrect or gibberish as info. It’s why I hate how both iNat and the CV push people to get observations down to the species level when that’s not always possible. It’s far better to be correct on genus or something than be incorrect on species. Gen AI is going to be even more than slightly incorrect, it’s likely to be flat out wrong on all accounts. And that’s infinitely worse than not having anything at all.

13 Likes

I posted this on the blog and I’ll post it here again:

I am completely opposed to the incorporation of generative AI in any aspect of iNat.

As this is a user based platform, let’s keep things user based and not AI based.

Using generative AI for creating species descriptions is an absolutely terrible idea given how badly this type of AI hallucinates and creates outright false information, even when making ‘summaries’.

To my mind this move is diametrically opposite of the entire point of iNat, which is to foster more human engagement with nature and learning about nature.


Also, there are some Reddit posts about this, with users there very unhappy about it:

Basically, the overwhelming majority of engaged users are vehemently opposed to this move.

27 Likes

Everything will depend on the implementation details, and I think this community could have a very constructive role in suggesting those details, which the announcement makes clear have not yet been developed.

For example, I could envision an implementation where a LLM run will initially populate an editable tab on the taxon pages with identification hints derived from any existing user-supplied comments for each taxon. Initially the raw generated text would be visible only on this tab, and be prominently marked with a disclaimer “raw unreviewed AI-generated content” or some such.

Before it could be used anywhere else on the site (or even be visible off-site?), it would need a vetting process similar to that for observations - at least two humans, comprising more than 2/3 of the total votes, up-voting the accuracy of the content on the taxon tab. If implemented well, hopefully there would be many cases where the text could be up-voted as-is. But otherwise, humans would review and edit the generated text, and each edit would notify those with existing votes that (further) edits had occurred. Edit history would be preserved and reviewable, and revertible by curators.

Taxa with blank tabs (no relevant information yet found by the LLM) could also be populated directly by humans without waiting for the LLM.

Taxa with existing human edits/votes would be unavailable for further updates from the LLM, but those whose text has no votes or edits (or is blank) would be replaced by subsequent periodic LLM update runs (and still only be visible on the taxon tab with the disclaimer).

Taxa with insufficient expertise and/or attention to garner the required threshold of upvotes for their ID text to become usable with CV or other areas of the platform would continue to have their text visible on the taxon page, but only there.

There could be a button on the tab to run the LLM on-demand for just that taxon, which would either replace existing raw or blank text, or display the results in a separate window to compare with existing edited/voted text.
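To make the voting and overwrite rules above concrete, here is a minimal Python sketch. It is only an illustration of the idea as I described it, not anything iNat has proposed or built; the class name `TaxonTip`, its fields, and both method names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class TaxonTip:
    """Hypothetical record for the ID-tip text on one taxon tab."""
    text: str = ""
    upvotes: int = 0      # human votes that the text is accurate
    downvotes: int = 0    # human votes that the text is inaccurate
    human_edits: int = 0  # number of edits made by humans

    def is_vetted(self) -> bool:
        """True when the text could be used beyond the taxon tab.

        Assumed rule: at least two up-votes, and up-votes are strictly
        more than 2/3 of all votes cast.
        """
        total = self.upvotes + self.downvotes
        # Integer comparison avoids floating-point trouble at exactly 2/3.
        return self.upvotes >= 2 and 3 * self.upvotes > 2 * total

    def llm_may_overwrite(self) -> bool:
        """True when a periodic LLM run may replace the text.

        Assumed rule: only text with no votes and no human edits
        (including blank text) is replaceable.
        """
        return self.upvotes + self.downvotes == 0 and self.human_edits == 0


# Two up-votes and one down-vote is exactly 2/3, so it does not pass.
print(TaxonTip("raw text", upvotes=2, downvotes=1).is_vetted())
# Three up-votes and one down-vote is 3/4, so it passes.
print(TaxonTip("raw text", upvotes=3, downvotes=1).is_vetted())
```

Under these assumptions, any vote or edit by a human locks the text against later LLM runs, which is the part that keeps the result user-based.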

That’s just off the top of my head - there are probably more elegant and/or useful implementations to be dreamed up. But it’s an example that keeps the results user-based, with AI used only as a tool to assist users in creating content helpful to the community.

In the end, having been here for over 10 years now, I trust that the original creators and founders of iNaturalist (who still lead the organization) know as well as any of us the unique and wonderful attributes of the platform that are valued by themselves and the whole community here, and that they would have no intention or tolerance for compromising those attributes. I think everyone involved in the heated reaction to this announcement needs to take a step back, seek some perspective, see what evolves, and look for constructive ways to help it enhance the things we value here.

15 Likes

$1.5 million is barely the cost of web services for half a year. Stupid to do this for chump change. And this is from 2023! Imagine how much that cost has skyrocketed. Is it really worth selling us out for an amount that won’t keep the lights running a whole year?


https://static.inaturalist.org/wiki_page_attachments/4009-original.pdf

12 Likes

Copied and edited from the blog post, which is more active with more voices and doesn’t have a 30 minute waiting period.

In the unofficial iNaturalist Discord, there has actually been quite a bit of confusion over the vagueness of this news. It really seems that people read the blog post and came away with different interpretations. Mine certainly differed from some others’, enough that I may delete my original first comment on the blog because I don’t exactly know what is happening any more.

Also concerningly, there are already reports of certain users deleting their accounts even when it seems there is much confusion over what is happening. Some are even encouraging others to delete their accounts.

Everyone needs to pause before hopping on the bandwagon, and staff VERY much need to clarify what is happening and what is planned. The blog post is vague and is creating miscommunication issues.

14 Likes

Chiming in to say that I seriously hope the iNat staff will take the voices of the evidently large and vocal percentage of users who oppose this change seriously.

The potential problems with this specific implementation of genAI aside, I think it’s worth noting that much of the disapproval and backlash here stems from the sorry state of the rest of the internet due to things exactly like this. Shareholder-driven enshittification and other types of shoehorning for the purposes of keeping the GenAI bubble expanding has caused an epidemic of inaccurate, lazy, societally hazardous, misinformation-spreading slop that users overwhelmingly dislike, but are not given the option to opt out of. I can’t possibly prove it for sure, and I sincerely hope that this is just incorrect speculation, but it does seem suspiciously like a sign that iNat has fallen down a similar path.

If this sort of thing is the eventual fate of every platform online, then part of me wonders what the point even is in ever trying to contribute to something as noble as a citizen science app in the first place. The goal of iNat still resonates with me strongly and I hope to stay, but this proposed change has already made me much more reluctant to use and recommend iNaturalist. If this goes ahead, I admit I will strongly consider leaving and deleting my account after all these years.

26 Likes

We can envision a lot of potential futures, and I won’t insult anyone’s intelligence by listing possibilities realistic or not, so let’s stick with what we actually know about the implementation of LLMs so far, shall we?

We know that LLMs flagrantly hallucinate data, even when they have access to correct data. Methods are being worked on to detect this (Nature article) but they are unreliable.

We know that LLMs don’t ‘think’ and that calls into question even something as simple as summarizing, as making a summary requires critical thinking skills, not just following a 3-ring binder style set of protocols.

iNat has been very vocal in the past about making sure that the CV system is not labeled as ‘AI’, and has heavily distanced itself from the AI label, despite iNat’s CV system being 100% AI, but specifically a very targeted and non-generalized one. This move by iNat goes directly against what iNat itself has specifically stated it is for.

If this is implemented, it is a major step in the scientific and popular discrediting of iNat and the collapse of the site. This is an example of short-sightedness and hubris leading to poor decisions that I know you (@jdmore), @kueda, @tiwane, and all the others who have given immense amounts of time and effort to building and making this platform the runaway success it is don’t want to be the future of the site.

This is not to say that AI has no place in the site, it has been present from the very beginning, but it’s been a very carefully targeted and directed AI that’s meant to stand in the background and facilitate the human and nature interactions that the site is founded on.

Taking this action is the equivalent of watching someone shoot themselves in the foot with a nail gun, pointing a nail gun at your own foot and saying, “Ah, it won’t happen to me.”

Listen to your user base. Without that user base iNat is nothing and all the enormous amount of work done to make this a useful and valuable resource is wasted.

This reminds me of a sort of inverse example of how Kodak was once king of accessible photography and invented digital cameras, but ignored them and is now essentially defunct as a company. iNat is making the same kind of mistake, except that it’s chasing fads instead of playing to its strengths.

28 Likes

Absolutely, no disagreement here, I and everyone else commenting here are painfully aware of this issue. To me, AI is short for Artificial Ignorance.

Yet so many commenters are jumping to the assumption that iNat intends to allow raw unvetted LLM-generated information to pervade the platform by aggregating and then displacing all of the true human intelligence here.

My questions to everyone here: Where is iNat suggesting in any way that this is their intention, or that they don’t understand what the consequences of that would be? Why would folks so easily think that the people who have very intentionally created this platform, and then dedicated their lives to it, would not also have the intelligence to understand and avoid exactly the issues that you and everyone else have been pointing out? Do we really think that we are telling them anything about LLM genAI that they don’t already know - probably in more depth and nuance than most of us? Why so little faith?

I have to ask, if what is implemented? To my knowledge, no implementation details have yet been developed. I think forecasts of doom are highly premature at this point.

17 Likes

I must say I was a bit taken aback by the extent of negative views on this. I read this thread and the blog post. The discussion around integration of ID tips that has been mentioned here as well comes to mind. There is an enormous corpus of knowledge sitting in IDs and comments on this platform that could provide a lot of value if a system is designed to make these easier to interface with. What form such a system takes is the open question at hand, and perhaps an LLM/AI approach could be useful. As many have noted, so much depends on the implementation - there is a big difference between improved search/recovery of information and AI agents running around adding text to every taxon page.

Certainly, any use of such tools comes with major caveats; many have already been mentioned. I also think it’s clear from the community response that certain modalities would not be accepted at all. These views should be respected as, at the end of the day, despite all the value it currently has, iNat only continues to exist because of its people.

I think one of the biggest challenges would be preventing the inadvertent feedback of generated content into the pipeline that trains the model. Even if the outputs were ringfenced from the inputs, it’s still possible that humans interfacing with both systems could proliferate errors. Somehow I think we tolerate human errors much more than machine errors - to some extent well founded given that automated processes can outpace human output very easily.

A related consideration is how the tools would practically be used, and in that context it is very important to note that (I guess) 99% of iNat users don’t comment on blog posts or on the forum. How would the average user likely use this tool? Would the reality of errors require the ability to upvote/downvote human responses and/or machine outputs somewhere, as hinted in the blog post? What effect might that have on the experience, for experts and the rest of us?

Another common issue with these kinds of approaches, that has been touched on above, is about the attribution of outputs. This would be both in terms of providing a reference to the original text(s) that are used but also in crediting the users who provided the information in the first place. I don’t think there would be acceptance of a tool which does not appropriately address this issue.

On a more philosophical note, I think any teacher or educator would probably be able to confirm the extent to which students are nowadays making use of AI-based tools… iNat doesn’t directly have a pedagogical focus but it is a platform that facilitates learning by providing information and the opportunity to discuss, collaborate etc. There is an interesting tension between making information easily available but in the process losing the value that comes from having to engage with the problem. But again this brings in the point about what the average user is hoping to get from using the platform.

Personally, I am cautiously optimistic about the whole thing, particularly given the stretched resources we have available to deal with identification where I am. I think the iNat team have overall shown themselves to be thoughtful and considerate of how changes affect the different cohorts of users. I will say that I would be wary of something which outputs generated text wantonly rather than judiciously. Where and how the text is displayed is important. Even the mockup shown in the blog post is perhaps less than ideal, as the text lacks any context or other details. Selfishly, I would prefer a tool that is actually a bit hidden, perhaps in a tab on a taxon page. I have previously noted that others have issues with the way the CV suggestions are emphasised for example and this might double down on the kind of problems experienced in some groups and regions.

20 Likes

true @pdwhugo

That’s the point though. This knowledge is already there, in sentences and paragraphs generated by people who hallucinate less than any generative AI. A tool that helps us find those sentences and paragraphs requires no generative AI. If staff had announced, “We’re going to build an algorithmic search tool to help you find user-written identification tips,” nobody would be objecting. The AI trying to synthesize, rephrase, etc. that carefully written content can only add errors. In this entire conversation, I have yet to see anyone suggest any plausible reason, except Google’s funding, to involve generative AI at all.

30 Likes

The blog post and other comments clearly state that the intention is to use generative AI to provide summaries of why a potential ID is suggested by the iNat CV system, at least to begin with. It’s absurdly disingenuous to ask “if what is implemented” when that’s been one of the stated objectives from the initial posts on this issue. It’s such a clear thing that this response has me a bit baffled and even more skeptical than I initially was.

The issue of “why so little faith” has already been discussed over and over again here, along with papers to support the skepticism. But to reiterate, some of the reasons for “so little faith” have to do with the well-documented unreliability of LLMs as a tool, the long-standing stance of iNat as a ‘people first’ platform, and the insistence that iNat does not use that particular type of AI. And let’s not forget all the discussions of iNaturalist’s carbon footprint and energy use where generative AI came up as a topic and iNat assured everyone that it was not using generative AI, as we all know the absurd amount of energy generative AI uses.

Quite literally on almost every level this proposal goes completely against both the stated ethics of iNaturalist and what iNaturalist has said in the past.

We have all been wading through the rampant enshittification of damned near everything online and every online platform, and have been watching the most popular search engines collapse into nothing more than bullshit aggregators as exactly this sort of AI implementation has been rammed down every user’s throat without any option for feedback or say.

Can you really, honestly, say that anyone is surprised that there is vehement backlash when the same is proposed for a platform that has consistently stated that human interaction and connection is a critical aspect from the start? Is there really anything startling about a lack of trust in this when the proposed roll-out was so badly botched it was reminiscent of Apple’s ‘our AI gets to use everything you produce and none of you get a say’ actions (and before anyone whinges about iNat not using the data in that manner, that’s missing the point, the point is about the user interactions)?

This is both a PR disaster and an implementation disaster for iNat, and if iNat ignores its user base it’s going to devolve into just one more purveyor of LLM bullshit, instead of one of the most respected and used online platforms for scientific and nature engagement on the entire internet.

24 Likes

I mean, the blogpost literally says: “We’d like to use GenAI to tell people not just which frog it is but why it’s that frog” and “We will incorporate a feedback process for the AI-generated identification tips…”

If iNat didn’t want us to think the text will be AI-generated, maybe it shouldn’t be saying it will be verbatim?

If the actual plan was to hide away the AI-generated text until it had been edited/verified by a human, maybe someone should have said that instead?

16 Likes

I don’t have any major points to add that other people haven’t already made. But I just want to weigh in to say that, like almost everyone else, I’m incredibly disappointed by this. The news itself is vague and worrying, and the way iNat has handled sharing it is pretty disastrous. I really hope they’ll reconsider, and if they don’t then I’ll certainly think about deleting my account.

13 Likes

One thing is clear to me – I see too little value addition to iNat (AI-generated taxon ID summaries that many users will like and many others won’t trust), in exchange for enormous value loss (many seasoned users assuredly quitting, and worse, deleting their past contributions).

I think everything else about this is secondary to that. More concerning to me than what is right or wrong is what’s at stake if this results in a fallout. If iNat is going ahead with implementing something like this, it needs to take users into confidence and consultation, however tedious or messy that is, and do it in a way that’s acceptable to most if not all.

18 Likes

I don’t have anything clever to add. I believe this is my first post on the forum.

There is a significant difference between what I might call “generative ai”, which is just a cut and copy and cut and copy telephone game a computer plays with itself, and traditional machine learning. I’m ignorant on the details, but I am an artist, who will never post their work online again. I’m a writer, won’t be posting that online ever again.

I also work in a school, while attending college myself. I successfully introduced inat to two new users yesterday alone- one homeless guy I was chatting with near the high school I work at, and a fellow insect enthusiast and classmate at my college. I’ve gotten kids at my high school to look closer at the “eeeeeee a bug!!!” rather than just running away. I’ve gotten to tell them about all the different trees on the property, and encouraged them to learn with the tools they have.

All that to say, inat is successful, I believe, because of the human connection. I am wildly ignorant about much, but I know I can trust certain users for certain ids. When a spider guy can’t confirm an id, I trust him. When the aphid lady corrects my thoughts, I trust her.

I will never trust generative ai, and it will always be bile in my throat. Each time I have to leave my home and see another gods damned data center that took over a field of milkweed I’d been excitedly watching, not even desperately needed apartments, another damned data center, my heart sinks low, and I weep. Each time someone tells me to my face students don’t need to learn art or writing, chatgpt or whatever other nonsense can do it for them, my tongue shrivels into my heart so I do not poison them with my words.

The more I see so-called-generative-ai (we do all know that is a misnomer), forced into everything online, and more and more frequently in person- can’t even go through a drive through to grab coffee without running into it, the more I remove myself from online spaces I have existed in for the better part of 30 years.

Dead internet theory has come wildly undead, and its hydra heads snap at the heels of us all.

(And yes, I did write this myself. I wrote it from my heart, yes ignorance and all, and nothing more than a love for the world around all of us- the real, tangible one, not the fake, immaterial, hollow one Google and other big tech hydras want us to live in.)

33 Likes