Are my concerns about AI-generated plant crud overblown?

That’s a great (but horrifying) read. By now many of us have probably had the chance to be alarmed by the whole-cloth inventions of ChatGPT etc. when asked for scientific answers. But I really liked this author’s exasperated exploration of the comprehensive failings of “Groot” (and the great illustrations). Thanks for sharing that.

And thanks for sharing your thoughts on things like the lack of an “uncanny valley” for plant images, and the potential value of iNat identifications as a source of quality information amid the increasing crud. I agree with all of it.

[Edited to add some new twists…]

My employer just updated us to a new version of the Microsoft Edge browser, with Microsoft Copilot built in. While this obviously isn’t intended for botanical queries, I tried a few of the questions from the tradescantia.uk article and I was gratified to get better advice (e.g. “Don’t eat Lily of the Valley”).

I asked it “How should I care for my Sisyrinchium chaguaranicum houseplant?” and it told me “It appears that Sisyrinchium chaguaranicum is not typically known as a houseplant” and then proceeded to cite back to me a bunch of info I had provided in this forum thread, which simultaneously shocked and amused me.

I was starting to wonder if I had been too hasty in my concerns about generative AI and scientific knowledge. So I asked Microsoft Copilot about a different uncommon species, which I hadn’t previously mentioned in the forum: “How should I care for my Sisyrinchium valparadiseum houseplant?” I kind of expected to get the “not typically known” answer again, but no—Copilot gave me a bullet-pointed list of familiar care tips, complete with watering “around 0.8 cups every 9 days”, all of this sourced to another page on “Greg”!

5 Likes

I am disconcerted when I remember that the Forum posts are public.
Were your Copilot queries made in Incognito mode? Otherwise you get your own search history flung back at you (which is okay if that is what you wanted, but not so much if you wanted a ‘real’ answer).

Greg is an engineer and influencer, focused on plants. “Greg” comes from the company, which is Gregarious.

https://www.forbes.com/sites/petersuciu/2020/12/07/history-of-influencer-marketing-predates-social-media-by-centuries--but-is-there-enough-transparency-in-the-21st-century/

1 Like

He ate 4 destroying angels and lived to tell the tale :exploding_head: nothing short of miraculous!
He must have been a complete noob to mistake Destroying Angels for puffballs.
There was a local Amish family that got intoxicated from jack-o’-lanterns, thinking they were chicken of the woods, after reading an article in the paper about COW with a picture of jack-o’-lanterns 🤦🏽‍♀️

6 Likes

The Copilot queries were done in a newly-installed instance of Microsoft Edge that had not previously accessed iNat or the forum. (All my forum access was via Chrome.) So it does appear that Copilot had ingested this forum thread rather than relying on my browser history.

1 Like

Certain AIs like Copilot and Perplexity.ai just summarize the top search engine results for you. So if there are bad search results at the top of the list, then the AI summary will be poor quality.

Other AIs like ChatGPT and Claude.ai are trained on a huge amount of written material (including lots of internet content) but generally don’t have live access to the internet and have to produce a response based on their background knowledge.

If you’re looking for a useful response, you kind of need to judge which of those formats is more likely to be helpful. For example, asking the latter ones for recent news won’t work, and asking the former ones about abstract concepts is less likely to get an insightful response.
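To make that distinction concrete, here’s a toy sketch in Python. Every name and data point in it is hypothetical (it is not how Copilot, Perplexity, ChatGPT, or Claude are actually implemented); it only shows the general shape: a search-summarizing assistant can only be as good as the pages it pulls in, while a pure-generation assistant answers from whatever ended up in its training data.

```python
# Toy illustration of the two assistant styles described above.
# Nothing here is any vendor's real implementation or API; it just contrasts
# "summarize search results" with "generate from training data".

# Pretend web index, with an AI-generated "care guide" already in it.
FAKE_SEARCH_INDEX = {
    "Sisyrinchium valparadiseum care": "Water around 0.8 cups every 9 days.",
    "Convallaria majalis toxicity": "Lily of the valley is toxic; do not eat it.",
}

def summarizer_assistant(question: str) -> str:
    """Copilot/Perplexity pattern: fetch matching 'search results', then
    paraphrase them. If the top hits are AI-generated crud, so is the answer."""
    hits = [text for title, text in FAKE_SEARCH_INDEX.items()
            if any(word.lower() in question.lower() for word in title.split())]
    if hits:
        return "Based on web results: " + " ".join(hits)
    return "I couldn't find anything about that."

def pure_generation_assistant(question: str) -> str:
    """ChatGPT/Claude pattern (no live web access): the answer comes only from
    patterns learned in training, so it can't see today's web, but it can
    still confidently make things up."""
    return "From background knowledge: plausible-sounding care tips for " + question

if __name__ == "__main__":
    q = "How should I care for my Sisyrinchium valparadiseum houseplant?"
    print(summarizer_assistant(q))       # echoes whatever the fake 'web' says
    print(pure_generation_assistant(q))  # answers from 'training' alone
```

Either way the output reads fluently; the only difference is where the raw material comes from, which is why the same question can get you a sensible answer, a “not typically known” refusal, or confidently sourced crud depending on what the model can reach.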

Maybe we can do an experiment to find out where LLMs might derive raw material for generating that plausible-sounding text.

Aha, great experiment! The LLM did more than just generate plausible-sounding text. It mined some information (yours, in this case) from the web. The quality of the information it generates is tied, to some degree, to the quality of the information it finds. Through analysis of sometimes conflicting information, it may be able to make some informed choices. However, the process is imperfect.

You’re both right, and I guess I took my own discussion off-topic a little. My point for this thread was not so much that generative AI makes stuff up, but rather that people are polluting our corpus of knowledge with the stuff that AI makes up.

I think the unreliability of LLMs has been shown repeatedly for plants, and for pretty much any aspect of human knowledge (e.g. simple arithmetic). This is to be expected because Gen AI is really just coming up with plausible words to respond to a prompt. My concern is that these non-facts will seep into the information sources we use to the point that we’ll struggle to trust information published since the use of LLMs exploded in 2023.

Before, say, 1980, all our knowledge on biology existed only in some print or handwritten format (journals, books, bibliographies, catalogs, lab notebooks, field notes). After that point, an increasing amount of our knowledge was stored in some electronic form (online databases, and later various Internet resources). That transformation (along with the open access movement and the digitization of archives) greatly increased the accessibility of human knowledge. Not all of the information was reliable, but it was fairly clear what types of errors to consider, from incorrect transcriptions, to author biases, to misguided analyses, based on the context of the information.

At this point, faced with something new we all turn to our personal selection of Internet search techniques, whether that’s a general search engine or something specific to a particular field. Prior to LLMs, the probability of entirely fabricated information being returned in a search was very low. Certainly, there are cases of outright fraud, like Piltdown Man, but the effort required ensured that scientific fraudsters were largely limited to areas that might bring prestige or wealth.

The type of LLMs currently available can, for almost zero cost, generate vast amounts of good English text that in many cases will provide a competent summary of a topic and in a few cases will be utterly fabricated. The low cost is the main factor enabling the creation of huge numbers of LLM-generated web pages. This content is not going to go away, and we can see that it is already turning up in general search engine results. (If it didn’t turn up in searches, there would be little incentive to create it.)

Our favorite specialist search engines may be a little safer, but for how long? Certainly the same LLMs are being used to fabricate scientific papers, and those GPT-fabricated scientific papers are turning up in searches on sites such as Google Scholar.

If I searched online in 2023, I could expect to find pretty good coverage of many organisms, limited mostly by copyright, paywalls and the effort of digitization. Even when the actual content was unavailable online, I could see a citation and decide whether to seek it out in print. Some content might be outdated or simplistic, but outright fabrication would not be a worry.

I’m concerned that when I do the same exercise in the future, I’ll need to independently assess the veracity of every source published after 2023: how do I know that this web page or journal article was written by a human based on actual facts?

7 Likes

Not just that. There was a forum post a year or so ago that mentioned some “AI” field guides on Amazon that had backfilled fake histories for their fake authors, and I think even fake antedated publication dates. So just because something says it was published before 2023 won’t actually mean that’s when it was produced. Things are going to get very tricky in a decade or so.

4 Likes