What is this - iNaturalist and generative AI?

This is an unfair and inaccurate characterisation

it takes a lot of money to keep iNat going, literally millions of dollars to keep the servers running. It is extremely disingenuous to characterise this as ‘money-grubbing’, as if iNat staff were somehow eager to line their pockets with cash

27 Likes

Somebody posted the expenses, but it's buried enough that I'm not looking for it in this thread.

I think it was like a total of $5 million a year? Could be me misremembering.

1 Like

As usual, PZ Myers of the blog Pharyngula has strong opinions on this: https://freethoughtblogs.com/pharyngula/2025/06/30/keep-your-ai-slop-out-of-my-scientific-tools/. Unfortunately, I agree with him. More unfortunately, I expect I’m going to have to learn to co-exist with this AI. Perhaps it will cause me to learn new expletives.

5 Likes

Currently, the identification experience is near perfect

wow. No wonder he is upset.

4 Likes

A lot of people are still referencing the environmental impacts of generative AI, so I feel like it’s worth reiterating some points on that which have already been made. Of course this is just one factor among the others people have mentioned, but it’s the one I understand best, and I don’t get statements like “as we all know” about how severe the environmental impacts are. It seems like nobody really knows, and they’re probably less than what most people think.

From the other thread on this subject:

From earlier in this thread:

Back-and-forth from the comments under the blog post:

Obviously LLMs in general are resulting in new buildings being constructed and large amounts of energy being used that weren’t before, but individual use of them seems negligible compared to everyday things like driving, watching TV, or running other electronic devices in your house.

My impression is that the most energy-intensive period is probably the initial training of the model, when it has to learn from everything on the internet, in scanned books, and whatever else can be found for it. If iNat uses an already-trained LLM and just updates it with information from iNat, like how Discourse’s summary AI here does, then that skips a lot of the environmental impact. If a brand new LLM is trained (I’m not sure exactly how they work, but that seems unlikely to me?), the impact would be larger, but still far less than training on the corpus used by mainstream LLMs.
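For what it’s worth, here’s a minimal sketch of what “updating an already trained model” can look like in practice, using the Hugging Face transformers and peft libraries. The base model name is an illustrative assumption, not anything iNat has announced; the point is just that adapter-style fine-tuning touches a tiny fraction of the weights, so the expensive pretraining phase is never repeated:

```python
# Illustrative sketch only: adapting a pretrained LLM rather than training
# from scratch. The model name below is an assumption for the example.
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

base = "meta-llama/Llama-3.2-1B"  # hypothetical base model, not iNat's choice
model = AutoModelForCausalLM.from_pretrained(base)

# LoRA inserts small trainable adapter matrices; the original pretrained
# weights stay frozen, so the costly pretraining step is not redone.
lora = LoraConfig(r=8, lora_alpha=16,
                  target_modules=["q_proj", "v_proj"],
                  task_type="CAUSAL_LM")
model = get_peft_model(model, lora)
model.print_trainable_parameters()  # typically well under 1% of all weights
```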

7 Likes

Ouch, that’s huge. Another good reason to keep one’s contributions to a bare minimum, to ease the financial burden. Especially limit “ID tips” comments, as they take up space and feed LLM training. It’s double the injury if iNat resigns itself to accepting Google AI money to pay the electricity and AWS bills. :sweat_smile:

2 Likes

as much as people say they are afraid of generative AI producing misinformation, taking things out of context, not properly citing sources, etc., it seems like the human who produced your article – as well as the authors of the other articles it links to – does exactly those things.

you don’t need generative AI to harm humans and destroy the world. humans will do that all on their own, and we’ll get to that destructive end state all the quicker when we perpetuate moral panic.

12 Likes

Alternatively, iNat could just hire and pay human beings to write it.

3 Likes

sure, pay people to make observations and to identify, too.

1 Like

It’s true that we humans can mess things up. The thing is, I don’t think we need the aid of AI, which can mess things up so much more quickly and thoroughly.

8 Likes

why do you think it’s inevitable that aiding humans with AI will result in messing things up? why is it not possible that it could make things better?

1 Like

It’s hard to estimate, yes, but I have not seen enough evidence to believe that it’s less than people think. In the thread above there is one Nature article looking at Meta’s Llama-3-70B and Google’s Gemma-2B-it models. The comparison to humans is questionable to me, but that doesn’t mean it isn’t as bad as we thought. There are some questionable decisions in their methodology imo, e.g. assuming a human will write 300 words/h while the LLM will only need a single prompt to arrive at “comparable” results. Note this comment:

We acknowledge that LLM performance may vary across different content types and that our analysis does not account for qualitative differences in output between LLMs and humans. Thus, our study aims to provide a quantitative comparison of resource utilization rather than a qualitative assessment of content.

The accounting of human resource consumption is rather unfair, too: e.g. water consumption is based on a human’s daily water consumption, prorated for the 1.67 h it takes them to write the benchmark task, plus the amount of water needed to generate the electricity used in that time – regardless of whether that water consumption can actually be attributed to such a task.
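To make the prorating concrete, here’s a back-of-the-envelope version of that accounting. The 300 words/h and 1.67 h figures are from the paper as quoted above (implying a roughly 500-word task); the daily water intake figure is my own illustrative assumption, not the paper’s number:

```python
# Back-of-the-envelope version of the paper's prorated human accounting.
words_per_hour = 300                    # paper's assumed human writing speed
task_words = 500                        # 500 / 300 ≈ 1.67 h per benchmark task
task_hours = task_words / words_per_hour

daily_water_liters = 3.0                # assumed total daily human intake
prorated = daily_water_liters * task_hours / 24
print(f"{task_hours:.2f} h of writing -> {prorated:.2f} L 'attributed' to the task")
```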

The results there are for specific types of models. Heed these warnings:

However, the growing model sizes driven in part by the scaling law (e.g., the recently released Llama-3.1-405B) will likely increase the energy consumption and the associated environmental impacts of LLMs substantially.

Despite the potential efficiency advantages of LLMs compared to human labor, we emphasize that our analysis is not intended to derail the ongoing efforts to curb LLMs’ own large environmental footprints.

What is more, by comparing the efficiency to human labour, the only sense in which LLMs are “efficient” (allegedly) is if you want to substitute them for humans. Yet we will continue to exist, even if the prompting user isn’t counted in the LLM’s energy consumption:

presenting a comparative assessment of the environmental impact of LLMs vs. human labor, examining their relative efficiency across energy consumption, carbon emissions, water usage, and cost.

In this article they go over the cost of training and inference of GPT-3 (Preprint), and they also mention this:

As acknowledged in Google’s sustainability report and the recent U.S. datacenter energy report, the expansion of AI products and services is a key driver of the rapid increase in datacenter water consumption

The Washington Post reported on GPT-4 (2024-09-18, so not the latest models) (Archive.org version) using the same methodology as the paper above. That’s the famous 0.5 liters of water and 0.14 kWh of electricity (about the same as running 14 LED light bulbs for 1 hour) per 100-word query.
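As a sanity check on that LED comparison (the 10 W per bulb is my own assumption; the article doesn’t state a wattage):

```python
# Sanity check on the "14 LED bulbs for 1 hour" equivalence.
# 10 W per LED bulb is an assumed figure, not from the article.
query_kwh = 0.14                        # electricity per 100-word query
bulb_watts = 10                         # assumed LED bulb draw
bulbs = query_kwh * 1000 / bulb_watts   # 140 Wh over 1 h / 10 W per bulb
print(bulbs)                            # 14.0, matching the comparison
```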

7 Likes

I don’t think this discussion is productive anymore. There probably isn’t anything new to be said. Especially at this point, where most of the comments are based on speculation.
Personally, I feel like staff have addressed all major concerns that can be addressed in these early stages, and unless what they have said was an outright lie (which I really do not believe), then there seems to be little to worry about.

I don’t know anyone on the iNat team personally, but over the years on iNat and the Forum, none of them have struck me as greedy people intent only on lining their pockets. Many of them are active iNat users as well. So I completely trust that they know what they are doing and won’t ruin iNat.

P.S.: I also think a lot of the negative attitude towards the technology of GenAI is largely misplaced. AI slop, copyright violations, etc. aren’t inherent problems of GenAI, but rather stem from corporate greed and the companies’ (Meta, Google, OpenAI, etc.) disregard for ethics. If anything, blaming the technology itself makes it easier for them to deny responsibility and continue their bad practices.

17 Likes

A forum specifically where people can post under species/genus pages about ID’ing would be an invaluable tool for both new and experienced users.

Yes! I think I wanted to suggest something like this as a feature request, but I was forwarded to some discussion about the generative AI collecting users’ notes… that wasn’t my intention, because my request was for human-made guides.

2 Likes

Yes, absolutely. People would be happy to do that. I know I would.

I like guides such as opening a taxon and reading: Bougainvilleas have spiky stems, pink flowers in X or Y shape, etc.

So I can read that and see if my plant has spikes, or if the flowers have that shape.

But on the taxon page, and written by humans of course.

1 Like

In parts of the developed world, budget-tight companies and administrations are already remarkably tolerant of (…or even explicitly after /grin) careless human work when it comes to enviro stuff, provided it’s cheap. And we’re mere months away from free machine-generated slop becoming “good enough” to substitute for that, thanks to the dedication of many.

Considering how a few students, interns, and pros alike have already relied discreetly on iNat (or rather, on unpaid minions called “identifiers”) to do the boring, tedious, tricky tasks in their stead… and how these same “identifiers” could soon be kindly asked to do the training and fixing for a new batch of AI – thus helping iNat remain in the tech race, popular, and funded… a minimal reward (beyond the warm fuzzy feeling and some impersonal ‘thanks to our great community’) is perhaps not such a ridiculous idea – even if not feasible in practice.

Thought everyone might like to know someone’s made a video about this, with over 6k views.

https://www.youtube.com/watch?v=vMc7sVrGKyU

“Google bribes iNaturalist to use generative AI”?

You don’t find it ironic that a video criticising generative AI with one of the concerns being accuracy/disinformation peddles blatant disinformation?

23 Likes

I think that sometimes AI will make things better, or at least will accomplish tasks faster. But is that worthwhile? Maybe, if people check the work. But isn’t the point of having AI do the work that it permits one to eliminate having humans do it?

One way I experience AI is by using Google Translate. It seems to do a good job most of the time. With highly technical terms, especially if the same word is used in ordinary writing, it makes laughable errors. (Believe me, if I can catch it, the error is laughable.) Especially interesting is when it translates scientific names.

I hear from friends who teach that essays generated with AI are pretty easy to detect (but essentially impossible to grade as zero because of university policy). I’m OK with scientific writing having a flat, uninteresting style, but the errors are a problem.

I’m OK with AI doing a secondary but potentially useful task, like compiling descriptions and identification hints. But for identification itself, I think the suggestions from CV are enough AI. And if they’re not, if you want to use AI for identifications, what’s the point of having humans in the loop?

Sorry I’m not more coherent. I’m tired.

10 Likes