Uploading garden plants as wild: The potential for false naturalization data to spread

Scientists interested in cultivated plants can readily download the data from iNaturalist, and I’d expect they’ll be grateful if the dataset is already accurately separated into cultivated and wild.


Here’s an observation worth considering. I recently shared my restored woodland garden as part of a garden tour here in Cincinnati. One of the visitors attending the tour made an iNat observation of Arisaema dracontium growing in my woodland garden. Understandably, the observation is marked “wild”. They didn’t realize I planted it as part of my backyard restoration.


I added a comment to this user’s observation letting them know that this observation is “cultivated”. I also added an “observation field” linking to the “parent colony” from which the genetic material was sourced.

So far, despite my comment and link to the parent colony, this observation is marked “wild”. This observation has also been “featured on 1 site”… gbif.

This is exactly the opposite of the outcome I would like in terms of software and data. It can only exist in gbif as something that it’s not. If it isn’t changed to cultivated, future visitors will continue to assume it’s wild… and the data will become increasingly polluted. If it is changed to cultivated, it might lead to errant future assumptions about the provenance of presumed wild descendants.

I was thrilled to have a visitor observe something I’m trying to re-establish on my lot. I saw this as a small acknowledgment that they appreciate one of the species I’ve chosen to try to re-establish after invasive removals.

Incidentally, the wild parent colony that I link to from the observation field for my cultivated plant is not featured on gbif for some reason… possibly cumulative IDs. This is incredibly ironic. The cultivated plant is in gbif and the wild one is not.


You can mark the observation as “Not Wild” in the DQA which seems like it would address a good part of the issue.


Not from my perspective.

But I did it anyway. I do prefer to remain part of the community.

  • also I didn’t realize I had this power.

Why? Doesn’t this contradict your statement in the other thread? That recording provenance can provide important information about whether the reestablishment of plant species was assisted vs. spontaneous?

The reason it might lead to errant future assumptions is that the data is rejected by gbif if it’s marked cultivated. For this reason, it doesn’t contradict what I said earlier. Either way, provenance data is erroneous or rejected altogether.

I am not ideologically against the distinction of “not wild”, “cultivated”, or even “casual”. But the suggestion that these observations, especially of cultivated local genotypes, have no value to scientists in particular seems silly to me. I’m not a scientist… so maybe that’s just my perspective.

Thanks for the goldenrod guidance; I’ll make use of it!

I think it’s mostly about focus. Information about cultivated local genotypes could be useful, but the focus of GBIF is more wild native & naturalized stuff, so people who are interested in the cultivated or semi-cultivated stuff aren’t looking there, and people who are looking there don’t want to see those records.

One nice thing about iNat observations is that you can add notes, so when in doubt about how to categorize, or when you have a casual record that may be useful, explain where it came from or why it’s cool. I definitely read notes like that when I’m looking for particular things for research.


The fact of the matter is that data on naturalized plants is really in its infancy. Most of these cases are overlooked as “it’s a garden plant, ignore it” or botanists simply not looking in urban areas or near urbanized places for plants. Citizen science has given spotlight to a lot of urbanized species that were not known previously to “jump” the fence or grow as a waif.

As such, I’ve always had the firm opinion that data should not be blindly accepted until it is proven to be credible. For instance, there are some species I’ve found actually do naturalize pretty widely, and the likelihood of iNat data being “wild” is quite high. But there are others that are firmly sterile or will not reproduce in an area, so you can always safely go through and mass-mark them as “captive”.

So in short, if your plant has no proven or common ability to naturalize, it needs manual review. That solves most all of these problems. What it doesn’t solve though are species with known naturalization tendencies that are actually planted and still uploaded as wild…that’s the real issue here.
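
To make that triage idea concrete, here is a rough Python sketch of how a reviewer might sort observations that were uploaded as wild. Everything in it is an assumption for illustration: the two species sets are hypothetical placeholders (not real lists), and `triage` is not an iNat feature, just a way of writing the rule down.

```python
# Hypothetical placeholder lists; a real reviewer would build these from
# regional floras and their own field experience.
KNOWN_NATURALIZERS = {"hypothetical spreading species"}
KNOWN_STERILE = {"hypothetical sterile cultivar"}


def triage(species: str, uploaded_as_wild: bool) -> str:
    """Suggest a review action for one observation (illustrative only)."""
    if not uploaded_as_wild:
        # Already marked captive/cultivated: nothing to correct.
        return "leave as captive"
    if species in KNOWN_STERILE:
        # Cannot reproduce locally, so a "wild" flag is almost certainly wrong.
        return "mass-mark as captive"
    if species in KNOWN_NATURALIZERS:
        # Plausibly wild, but a planted specimen uploaded as wild slips through here.
        return "accept as likely wild, spot-check for planted specimens"
    # No proven or common ability to naturalize: a person has to look at it.
    return "manual review"


print(triage("hypothetical sterile cultivar", True))
print(triage("some other species", True))
```

The third branch is the weak point described above: the rule cannot tell a planted specimen of a known naturalizer from a genuinely wild one.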


People using GBIF (rather than iNat) cannot draw any conclusions at all about whether or not there were previously non-wild specimens at a particular location because GBIF does not collect data about cultivated plants. On the other hand, if your planted specimen is not marked as non-wild, GBIF users will erroneously conclude that it is a wild specimen, which would seem to misrepresent the situation.

Is your concern that if one record is marked as cultivated, iNat users might assume that later observations of its wild descendants at the same site are also cultivated and mark them as such?

This might happen, or it might not. Context matters. There are lots of situations in which wild and cultivated specimens might exist in close proximity.

It would probably make sense to include a note about the regeneration project in cases like this. The notes field exists for precisely this reason – to provide additional relevant information to understand the observation. If you want people to know something rather than making assumptions, the best way is to tell them. (BTW, I believe that GBIF does include the user’s observation notes in the records it imports from iNat, so it is possible to ensure that information about provenance is available, even if there is no specifically dedicated field for this.)

Occasionally it happens that users don’t read notes, but you can always counter one “not wild” DQA vote with a “wild” vote of your own.

It seems to me that a series of iNat records – an initial planted observation followed by wild records of other specimens at different nearby locations in subsequent years – would allow future users of iNat to draw conclusions about the process by which the species was reestablished in the area. But the only way these future scientists could know that the process was human-assisted is if records of the planted specimen are marked as such. This is why I didn’t understand your concern that having the observation marked as cultivated might lead to wrong assumptions. On the contrary, it is leaving the observation marked as wild that could result in the erroneous conclusion that the plant reestablished itself on its own.


Spring Grove Cemetery and Arboretum is a lovely place that I’ve been lucky to visit on multiple occasions. There have been 1,625 total observations made within Spring Grove. 302 of the 1,625 are “casual”, mostly cultivated observations. Not surprising because it’s an arboretum.

Spring Grove hosts 24 champion trees which they define as “… a tree that measures out to the highest number of points for its taxon or specie.”

One of the cultivated champion trees growing there is an Amur Cork tree. Spring Grove goes out of its way to host this tree responsibly, carefully explaining that the species “is invasive but hasn’t spread at the rate of the Asian honeysuckles and Bradford pear. Dioicous flowering gives the chance of growing male cloned specimens but they may have the same reversion factor of other dioicous trees. The tree is noted for its deeply furrowed corky bark and was a highly recommended tree before invasiveness became an issue with ornamental horticulture, although the floral quality has little impact. This champion specimen has a circumference of 157-inches, 46-feet in height, 64-feet spread and 219-points. Finding this tree is easy along the road in Section 95.”

Buttercup Valley is a lovely place that I’ve been lucky to visit on multiple occasions. Buttercup is a nature preserve that shares a border with Spring Grove. There have been 21 observations of “wild” Amur Cork trees that have been identified and removed by volunteers and others at Buttercup.

If someone made a casual observation of the champion Amur Cork, I would be in favor of including that observation in the historical record. I also believe we shouldn’t be overly aggressive about filtering out the champion tree when ordinary people click “explore” and do a species search for Amur Cork in Southwest Ohio on iNat. Data filtration and exclusion can sometimes inadvertently become willful ignorance. All of the champion trees at Spring Grove are wonderful and I highly recommend a visit. Parts of Buttercup Valley have been untouched for 100s of years. Some of the wild trees growing there are over 200 years old.

The two places, neighboring each other, offer a unique opportunity for scientific exploration. Even by non-scientists. It would be a shame if people only visited one or the other but not both.

This is interesting info, but it seems tangential to the issue of captive/wild to me. The Cork tree is clearly cultivated and should be marked so.

If a scientist is looking for cultivated trees, they could search for that on iNat. Even without iNat, they would look for arboretums or, if they want champion trees, could find the same list you posted a link to.

And yes, I can confirm that GBIF does include user notes RE:

So adding info here is very useful. I’ve used info in this field when using GBIF exports of iNat data.


You are right…

This tangential issue I’m describing is “uploading garden plants as cultivated, the potential for false naturalization data to spread due to default data filtration and errors of omission rather than commission”.

It may seem redundant, but disallowing cultivated plants to ever reach research grade means they have less engagement due to filtration. If you want me to create a new thread, I will, but I’ll risk being perceived as redundant. If you think a new thread would be redundant, I won’t create one and anyone will be able to read the thoughts here. If you don’t like either of these options, just delete it… but you said it’s interesting so…

Let me know.

Uploading cultivated plants as cultivated does not create false data about naturalization. Cultivated plants are, by definition, not naturalized – naturalized plants are those which have escaped cultivation. Observations of cultivated plants do not tell us which species are in fact naturalizing. To determine what is naturalizing and what isn’t, observations of cultivated plants have to be marked correctly.

Nobody in this thread has been suggesting that cultivated plants are not interesting or worthy of attention in their own right – merely that data about cultivated plants and data about wild ones can tell us different things and that it is therefore useful to distinguish between them.

GBIF has made the decision to not include non-wild organisms in its dataset. This is not a denial of the importance of cultivated plants: it is a legitimate part of scientific research to decide what types of data fall within the scope of a particular dataset. This may be narrowly defined, or more broadly, but no researcher indiscriminately collects all data about anything without some selection process.

This choice does not prevent researchers from studying non-wild plants using other data sources.

Many of us support changes to the way iNat handles non-wild observations. None of us are in charge of the site; thus, we do not actually have the power to enact any changes. There are various open feature requests. It may make sense to bring up the concern there (and vote on the feature requests).


Your response has five hearts and my proposal of a new thread has zero. Using engagement as a means of anointing cultivated observations with research grade status might be an interesting idea to consider.

The bar that cultivated observations have to clear to reach research grade status could and should be set higher than wild observations. But I do think some cultivated observations merit research grade status. Not allowing this to happen, I believe, risks inadvertently omitting or obscuring valuable naturalization data which should include planted instances of currently naturalizing species. It’s ok if I’m the only one here who thinks this… unless you all decide it’s not.

I think you are being misled by the labels here. Nobody here has been claiming that cultivated observations cannot have scientific value.

“Research grade” has a specific, somewhat non-intuitive meaning on iNat. There has been a lot of debate about the labels, and whether “research grade” vs. “casual” reflect the intended meaning. You will find countless forum debates about this. If you have new ideas for clearer labels, I’m sure they would be welcome. But first you have to understand what the underlying categories mean.

“Research grade” is not a judgement about the scientific value of an observation. It merely indicates that it meets certain pre-defined criteria and will be shared with other databases like GBIF: 1) the data is complete and accurate (i.e., date, location, recent evidence of organism, etc.), 2) it has a community ID at species level or below (or below family level, if users have clicked the box “ID cannot be improved”) and 3) it is wild. The last criterion is connected with the preferences of GBIF.

Thus, by definition, “research grade” excludes non-wild observations. There is no “bar to clear”. It is not about “merit”.
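
If it helps, the three criteria can be written out as a small Python sketch. This is only a simplified model of the rules as described in this post, not iNat’s actual implementation: the `Observation` class, its field names, the rank lists, and the `"needs_id"` fallback label are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional

# Simplified, assumed rank groupings (the real taxonomy has more ranks).
SPECIES_OR_BELOW = {"species", "subspecies", "variety", "form"}
BELOW_FAMILY = SPECIES_OR_BELOW | {"subfamily", "tribe", "genus", "subgenus", "section"}


@dataclass
class Observation:
    has_date: bool
    has_location: bool
    has_recent_evidence: bool
    community_id_rank: Optional[str]  # rank of the community ID, if any
    id_cannot_be_improved: bool       # the "ID cannot be improved" box
    wild: bool


def quality_grade(obs: Observation) -> str:
    """Return 'research', 'needs_id' or 'casual' under the simplified rules."""
    complete = obs.has_date and obs.has_location and obs.has_recent_evidence
    # Criteria 1 and 3: incomplete data or a non-wild organism means "casual".
    if not complete or not obs.wild:
        return "casual"
    # Criterion 2: community ID at species level or below ...
    if obs.community_id_rank in SPECIES_OR_BELOW:
        return "research"
    # ... or below family level when the ID cannot be improved.
    if obs.id_cannot_be_improved and obs.community_id_rank in BELOW_FAMILY:
        return "research"
    return "needs_id"


planted = Observation(True, True, True, "species", False, wild=False)
print(quality_grade(planted))
```

Note that the planted example has complete data and a species-level ID, yet the function never reaches the ID check: the wild test is a gate, not a score.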

Scientists using “research grade” observations still need to check their data set for wrong IDs, anomalies, etc. – the label does not guarantee quality. It is meant as a way to select those observations which fulfill the minimum criteria needed for usable data.

There is nothing that prevents non-wild observations from being used by scientists. It is a bit more work to select the correct parameters to access them, but the data is there and available. Casual observations are not “excluded from the historical record”. Users are not forbidden from uploading non-wild observations.
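
As an illustration of “selecting the correct parameters”, here is a minimal Python sketch that only builds a search URL and does not contact any server. It assumes the public iNaturalist API observations endpoint accepts `taxon_name`, `captive` and `quality_grade` as query parameters; that is my understanding of the API, so please check the API documentation before relying on it. The helper name `observations_url` is made up for this example.

```python
from typing import Optional
from urllib.parse import urlencode

# Assumed base URL of the iNaturalist API observations endpoint.
BASE_URL = "https://api.inaturalist.org/v1/observations"


def observations_url(taxon_name: str,
                     captive: Optional[bool] = None,
                     quality_grade: Optional[str] = None) -> str:
    """Build a query URL; leave captive as None to apply no wild/captive filter."""
    params = {"taxon_name": taxon_name}
    if captive is not None:
        # Assumed parameter: true selects captive/cultivated, false selects wild.
        params["captive"] = "true" if captive else "false"
    if quality_grade is not None:
        params["quality_grade"] = quality_grade
    return f"{BASE_URL}?{urlencode(params)}"


# Cultivated Amur cork tree records, such as an arboretum specimen.
print(observations_url("Phellodendron amurense", captive=True))
```

Leaving `captive` unset would request both wild and non-wild records, which is the combined view some of the feature requests ask for as a default.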

The fact that iNat lumps non-wild observations under “casual” together with observations that are missing data (date/location) or have other issues is a problem. There have been many calls to change this. There have also been requests to allow users to set defaults that include both wild and non-wild organisms rather than just the default “wild only” as is the case at present.

The reason you are not getting resonance for your suggestion is not because users think that cultivated observations are meaningless. E.g., see this feature request full of posts by users asking for specific changes in the way that non-wild observations are handled. Perhaps some of these proposals would satisfy your desire to see non-wild observations made more visible on iNat.

The discussion is going in circles because your arguments seem to be based on an incorrect understanding of what the “non-wild” label on iNat does and what it means and because you seem to think we are all discriminating against non-wild plants when we are not.


I agree with you that cultivated observations can have scientific value.

I agree with you that there has been a lot of debate about labels. Labels, in the end, are about the modeling of data. How the data is presented to users. In the case of iNat, data about plants that are planted is never modeled with anything other than the “casual” label. This “casual” label triggers other data presentment issues to occur. When I click “explore” for a species, I am not presented with any cultivated (casual) observations regardless of the level of engagement they may have achieved.

This lack of presentment further prevents cultivated observations from getting the same opportunity for engagement as wild observations. I don’t agree with this. It would be better to offer them the same opportunity for engagement while at the same time setting a higher bar for “research” grade attainment.

Scientists interested only in “wild” observations could still search based on “wild” or “cultivated”. Providing a mechanism by which “research” grade is sometimes applied to cultivated observations would likely reduce the number of observers entering cultivated observations as “wild” in pursuit of engagement.

In either case, there is nothing that prevents non-wild (or wild) observations from being used by scientists. Casual observations could still be hidden from the historical record by gbif, but instead of filtering out these observations using the “casual” label, they’d need to filter out these observations using the “cultivated” designation.

I do not have an “incorrect” understanding of what the “casual” label on iNat does. I do not think you are all discriminating against non-wild plants. If anyone is worried about this, they could simply increase the number of times an observation of a non-wild plant becomes the “observation of the week”. I think iNat is filtering and displaying the data differently thereby limiting the potential for engagement for cultivated observations (even really interesting ones). I think gbif is choosing to filter those cultivated observations out of their historical record making it even more important for them to be stored and modeled correctly here.

Thanks for the link. I’ll read through and contribute on this other thread as well. What happens is… I get an email telling me the “new topics” on this message board since my last visit. I occasionally read through and comment, getting drawn into long conversations. I don’t spend a lot of time trying to figure out if there are already “feature requests” tangentially related to the “new topics” presented in the email. I’ll try to be better about this.

It’s true that people rarely respond to requests for information about whether a plant is wild or cultivated. Therefore, if I think it should be marked “cultivated” I do that. I also request information and write something like, “Please comment. If it’s wild I’ll change how I’ve marked the observation.” If the observer is long gone or ignores the request, it stays “cultivated.” Once in a while, somebody does respond, usually to say it is cultivated.


This summer my colleagues and I are trying to figure out the bamboos that are more-or-less wild in the PNW. We’re posting them on iNaturalist, both for communication among ourselves and to help us remember what traits each stand has. Some of these bamboos are clearly wild (spreading by fragmentation along streams). Some are clearly cultivated, and we mark those as such. A lot are in the gray zone, persisting and spreading locally in long-abandoned gardens or present in a currently active garden but spreading under the fence and going down the road. I’m marking the “gray zone” records wild. (It’s hard to label bamboo shoots coming up through asphalt at the edge of the road as “cultivated.”) One could argue about whether that’s the best way to treat them, but at least in this case calling them wild can’t do much harm (I think) because anybody studying these bamboos will understand this complexity. Or should.

The bigger problem is my beginner’s enthusiasm for identifying iNaturalist bamboo observations, which has messed up identifications in the genus Sasa at least. I have some error management to do.

