It doesn’t override the observer’s decision, since it only takes a single vote to return the observation to Wild status. If the observer or anyone else notices the system vote and thinks it incorrect, they can easily override it.
True, but if one person agrees with the machine, then the observer’s decision is overridden. My original decision, without the DQA, was that it was wild, and I expressed that in the record.
I’m frankly amazed at how long this discussion has gone on. I didn’t bother posting a reply when I first saw the topic because I thought it would quickly resolve itself. Since it has not, here’s my opinion: non-established observations are obviously wild, and should be marked as such. For one thing, it says so in the official guidelines, which should guide our behavior whether we agree with them or not. For another, as has been stated many times by people more knowledgeable than myself, such observations have the potential to be useful in detecting new populations that may become established. While it’s true that it is not iNat’s job to track potential invasives, that’s no reason why we should withhold the data we have. I certainly don’t agree with the point @ericroscoe seems to be making, that we should decide whether observations should be RG based on other research. What purpose would that serve? iNat data has the potential to provide helpful information on the spread of non-native organisms, but if we wait to make the data available until we have already decided whether the organisms are spreading, it will serve little purpose.
The purpose it would serve would be to allow for more thorough and comprehensive literature reviews and species assessments for any given state or other geographic location than what iNat can offer, whenever doing so may be warranted. This includes whether the organism could become invasive or established in similar climates or at similar latitudes, etc. Certainly, these types of independent and individual, one-off pet releases or escapes offer little to no discernible trends if we want to look at the “spread of non-native organisms”.
This is why we need to look at how populations of a species are existing rather than simply at transient or displaced individuals of that species, as stated by @ptexis. The question at hand in this thread is also how we propose to handle non-established records, not what the existing rules are, etc.
Reading through this thread I see a lot of the discussion rightfully revolves around some of the grey-area cases, such as organisms that are part of an emerging established population. This makes sense, but it obscures some of the far more open-and-shut cases that are currently being marked as wild when there is no chance of them surviving in the environment. One of these examples is the Bearded Dragon mentioned above, but one that I have come up against is escaped aviary birds. For example if you look at common waxbill species such as Gouldian Finches and Double-barred Finches, there are tens of records from Europe or North America that could not become an established population in that area…
These are problematic for a number of reasons. First, they pollute the distribution map in the species profile: the errant records from North America or Europe force the default map to zoom out, so people have to zoom in significantly on the native range to see the distribution of the species.
Secondly, a few people here have mentioned that it is the job of scientists to mask out these unhelpful records and not the responsibility of iNaturalist to determine. This is true to an extent, but having ‘dirty’ data will just make scientists look elsewhere for data and make iNaturalist less relevant for scientific research. Continuing the bird example, when there is so much errant and unhelpful data in the iNat database (which then trickles down to GBIF) that takes time and effort to filter out, scientists looking to research a species will simply turn to eBird, which avoids these bad data points.
Because of this, I think that although there are very valid discussions to have about the grey zone between escapee and establishing organisms, focusing on them lets a significant number of obvious cases fall through the cracks, which ends up polluting and sidelining iNaturalist as a scientific resource.
Perhaps expanding the ‘captive’ tag to include organisms that are clearly recent escapees would be a better fix than many of the potentially problematic solutions listed above. What is clear is that there is not an obvious consensus decision that can be reached on iNat that encompasses all possibilities. But making no changes detracts from the value of iNaturalist in more obvious cases, and I would argue that it would be worth finding a solution to some of these observations while the discussion for some of the trickier cases and populations continues.
I came to this thread looking for a way to flag some of these obviously recent escapees so that they don’t keep detracting from the species profiles and data exports. It has been slightly disheartening to see that the problem has been known for years without any solution because there are more ambiguous cases.
Hope this helps,
Nathan.
Records can’t be unhelpful. iNat is not a scientific platform; it records observations of organisms by humans, so no, that’s not dirty data. iNat has totally different aims and goals, so if scientists want to use eBird solely, that’s up to them, though there are tons of escaped birds there that are marked as OK, and the platform lacks experts everywhere outside an elite group of countries. So please don’t make it seem as if eBird is doing the same thing better; they just do a different thing.
Just from the point of view of utility: I know that there have been several times that I have used eBird as input data for modeling rather than GBIF (and the iNat data that feeds into it), because it is far simpler and quicker to turn into a product that can produce valid results.
iNaturalist has incredible utility for records because of the spatial resolution of each point - which is generally far better than eBird, but for people looking to use the data, this utility has to be balanced with the manual effort required to clean each export. Scientists are already overworked and underpaid, so I think there absolutely is some onus on iNaturalist as a platform to make its data as helpful and accessible as possible.
My point is that there should be steps taken to allow people to mark escapee records, to make it easier for scientists to filter them out, even while there are still valid and enduring discussions on how to approach some of the trickier cases.
Scientists have to check IDs anyway; they shouldn’t trust them without checking. And it’s not hard to filter out areas with no stable populations.
Sure, a third category is needed, but as escaped animals fit perfectly under the current definition of wild, I see no justification for moving them into the captive pile as you suggested.
You can easily use annotation fields or ask people to use tags to filter those out.
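As a rough illustration of how someone working with an export could then drop such records, here is a minimal Python sketch. The record layout, the `tags` key, and the tag name `escapee` are all made up for the example; they are not an existing iNat convention or the real export schema.

```python
# Hypothetical export rows: the "tags" key and the "escapee" tag are
# assumptions for illustration, not the actual iNaturalist export format.
ESCAPEE_TAG = "escapee"


def drop_tagged(records, tag=ESCAPEE_TAG):
    """Return only the records that do not carry the given tag (case-insensitive)."""
    wanted = tag.lower()
    return [
        r for r in records
        if wanted not in {t.strip().lower() for t in r.get("tags", [])}
    ]


records = [
    {"id": 1, "species": "Gouldian Finch", "tags": ["Escapee"]},
    {"id": 2, "species": "Gouldian Finch", "tags": []},
    {"id": 3, "species": "Double-barred Finch"},  # no tags key at all
]

kept = drop_tagged(records)
print([r["id"] for r in kept])
```

This only works as well as the tag is applied, so it depends on people agreeing on one tag and using it consistently.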
Tags or annotations would be fine, as long as there is a consistent and recommended approach that the community can apply to all the ‘problematic’ records. The point of Research-Grade records is that scientists don’t need to individually check records and can trust the export. This is partially negated when the export dataset contains records that would invalidate modeling or other research (only partially, because when filtering out these bad records you generally wouldn’t need to cycle through each one).
Any solution would be fine, but at present, there is no consistent approach that allows researchers to catch the lion’s share of all escapee records without just manually looking for ones that look out of place. I’m just suggesting that there be some site-wide guidance on how to treat these obvious escapee cases, even if a catch-all solution that takes into account the far more ambiguous and tricky taxa or observations is still being developed.
The point of RG is just to filter for those obs that currently have community consensus and are most probably OK to go to other platforms; it doesn’t mean they are all correctly IDed, and users of the data must check them. But to stay on topic, nothing prevents creating the field and using it. All you need is time to add it, and most users allow fields to be added, so it could all be done in a week. Site-wide use comes from how common it is; if there’s a user who can monitor that, others will follow.
As I noted above, I don’t think it’s accurate to call this data “dirty” or lesser quality or “detracting” from data exports. It’s just different. Trained scientists are used to filtering data and will always be doing data checks, cleaning, and quality assessments with whatever pipeline they utilize before using data for any kind of final product. Other data sources, including natural history collection (NHC) data from GBIF, have the same types of records of waifs that iNat does (see the case study I posted on Cuban treefrogs above). A scientist using GBIF data will have to take the same types of quality control steps to use these data regardless of whether iNat data is included in their GBIF export or not, and it doesn’t really add time or effort to the data processing. In fact (speaking from personal experience), it’s often significantly easier to clean/process iNat data than NHC because the source of the record is more accessible (photo as opposed to specimen you’d have to physically handle) and the location data is generally of higher quality than older NHC records that have been geo-referenced post-hoc. Most scientists will generally want to clean their own more complete dataset to meet their requirements rather than having a less transparent, less consistent process producing their dataset (which is what I think would result if iNat users tried to determine what was/wasn’t established in an ad hoc manner left open to user judgment).
I got a chuckle out of that, considering the amount of time the super-reviewers – volunteers all – spend on identifying records and trying to clean up problematic ones.
iNaturalist is a citizen science project. People who expect the data won’t reflect the diversity of citizen input are being unrealistic. Yes, we need to try to keep the records as clean as possible. However, those scientists who want the data collected according to the highest standards of accuracy should pay people to collect it that way, not expect that a citizen science, volunteer project will exactly meet their needs. iNaturalist provides lots and lots of (somewhat imperfect) data for free. That’s a very good thing!
And I will say, I enjoy coming across an observation where the notes reveal the person to be fairly new, and excited to be discovering nature. I remember one in particular where the observer commented on how fuzzy the plant was and that they couldn’t stop petting it. It gave me a smile.
@nrg800 @loarie: What if we expanded the “Location is Accurate” DQA point to include/also mean that the organism is either within its natural or native range and/or has to meet additional prongs as it pertains to the definition of “wild”, such as also being in a natural state or environment for the species as a whole? This would eliminate highly ambiguous places that are clearly not natural environments, such as greenhouses, etc.
This could be extended to just about any other native/non-native species which enters a house or building (e.g. see the butterfly example), in that we can determine whether that butterfly is still of a native or indigenous species. This, I would suspect, would help clean up at least a large share of these types of records, and would also remove a lot more of the ambiguity than if we had only one DQA point.
The problem I see with users that are new is that there are just as many who do not understand how to use iNat in the first place, or what it is for (i.e. not their entire trips to the zoo, class projects where the animals are clearly captive or do not match the locality data, photos with nothing in them, etc.) as there may be who are attempting to submit a legitimate finding. That’s why I like to view everything with a much more jaded and careful eye.
Why would we want to expand the use of that DQA point though? It’s already one of the most misused features on the entire site. And it simply isn’t correct to say that the greenhouse where you photographed an organism isn’t its true location.
I don’t see issues with the “true” location nearly as often as I do many other things, which was why I suggested it be expanded to make it a more useful and widely applicable DQA point in more cases. Site admin might have further insight into those statistics, although I would suspect those issues to otherwise be very tiny. Without that, or perhaps some other additional DQA point, some folks might be left to their own knowledge and discretion, which isn’t ideal.
That completely changes the meaning of the DQA point though; as currently defined, a ‘location is inaccurate’ vote flags a user-correctable problem with the observation, as do all the others with the partial exception of ‘recent evidence of organism’. And there is a very real practical difference between someone uploading a photo of a Thomson’s gazelle they took in Africa that was accidentally geotagged to their house in Arizona and someone actually seeing a feral Thomson’s gazelle in their backyard in Arizona.
When people use any of the DQA flags solely to express their general displeasure with an otherwise fine observation, it distorts the meaning and makes those observations even harder to find for people who like IDing captive observations. It can also discourage the observer by making it appear that there is a specific problem with their handling of the observation, when in fact there is no problem with the observation itself, other than that someone on the internet doesn’t like it.
Somehow though the “Explore” map for any given species should, at least ideally, reflect and represent that species’ natural and indigenous range as closely as possible to what other range maps I can find and compare with from elsewhere. I would tend to agree that having a bunch of irrelevant data points for a species elsewhere, where there is very little to no chance that it could survive or form evidence of an established representative population there, only contaminates the data map as a whole, and forces it to scale out, rather than serving as a true or very useful reflection of what that species’ range actually is, or what one may otherwise want to look at.
If, perhaps, there were also a way of requiring that an organism be part of a reproductively active or viable, self-sustaining and/or at least overwintering population or group of the same species in a given location for over one climatic year (not including organisms with a natural lifespan of one year or less), this, I would suspect, would rule out a lot of such one-offs, recent escapees, or other individually transient specimens, which might be even better.
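To make the idea concrete, a very crude version of the "over one climatic year" test could look something like the Python sketch below. It is only an assumption about how such a rule might be approximated: the function name and the 365-day threshold are made up, and it only checks that sightings of a species at one location span more than a year. It says nothing about reproduction or viability, and it does not handle the exception for short-lived organisms.

```python
from datetime import date


def persists_over_a_year(sighting_dates, min_days=365):
    """Crude proxy: True if the first and last sightings are more than min_days apart.

    A single sighting (a one-off) can never pass.
    """
    if len(sighting_dates) < 2:
        return False
    return (max(sighting_dates) - min(sighting_dates)).days > min_days


one_off = [date(2021, 7, 4)]
one_summer = [date(2021, 3, 1), date(2021, 9, 1)]
overwintered = [date(2020, 5, 1), date(2020, 11, 20), date(2021, 6, 15)]

print(persists_over_a_year(one_off))
print(persists_over_a_year(one_summer))
print(persists_over_a_year(overwintered))
```

Repeated sightings could of course also come from repeated releases, so a check like this would at best flag candidates for a closer look rather than settle anything.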
I work thru those out of range obs when I come across them. Can’t make much headway, but can at least help the ones that I stumble over.