iNaturalist is currently automatically providing a Computer Vision suggestion for every single observation. I don’t think that this is inherently an issue, except that the Computer Vision suggestion is masquerading as a human ID as soon as someone selects it, which happens very frequently. This means that the Computer Vision suggestion will:
- Count towards a Research Grade status
- Be used to further reinforce the computer algorithm, which is circular and can reinforce mistakes
- Cause other people to think that a human looked at the photo, knew what characteristics were required to identify it, and decided that it was that species.
Every subsequent iNaturalist user that chooses to ID the observation is then prompted to add the Computer Vision suggestion again as their own ID, which just magnifies the above problems.
My suggestion is to not present the Computer Vision suggestion as a person’s ID. Provide the Computer Vision suggestion (or short list of suggestions), but not as an ID from a person that appears in the “Activity” section. People are able to look at the Computer Vision suggestion and think about it, but it is not presented as an ID from a human, it does not count towards a Research Grade status, and it is not used for reinforcing the computer algorithm. The wording for this suggestion could be very clear in stating where it comes from, e.g. “The computer algorithm suggests that this might be …”
@cmcheatle
That’s fine for well documented species, and places in the world with good coverage. It’s going to create a nightmare of issues in taxa not well documented, not easy to evaluate with photos, or places with lower rates of coverage on the site. That’s going to just add to and likely create more problems as users insist ‘well, the computer says it is genus Setophaga, so I’m going to ignore feedback from humans who say it is Setophaga magnolia’. To say nothing of the CV insisting on identifying the flower when I want to document the bee on it, etc.
If every record shows on it what the CV thinks it is, the level of and complaints about people blindly following the CV will skyrocket.
That is currently the situation: the Computer Vision suggests an ID for every observation, and it creates the issues that you are describing. My suggestion would (at least partially) solve this issue by not presenting the Computer Vision suggestion as an ID from a knowledgeable user.
It could easily be set up so the Computer Vision suggestion could be deleted if the person posting the observation knew that it was completely off. But I personally don’t think that would be necessary; I would be interested to see how it performed.
@wildskyflower
Pollinators, galls, parasites, bird nests in trees, the case I saw a couple days ago of some Chives growing through Angelica leaves where the user didn’t know what either was, or even that they weren’t the same plant. In that case both are IDable, so if the CV had forcibly autosuggested anything more specific than angiosperms it could have become an RG obs of that, and the user might never have known what was going on… Yet in most of those cases it is relatively easy for a human to tell what is centered or what the more interesting obs would be. For those cases the list of CV suggestions is very helpful, but whatever the top one happens to be, on its own, is not.
This is good logic for why the Computer Vision suggestion should not count towards RG status. It currently does, and there is no way to prevent this because it appears as an ID from a person. But following my proposed solution, the Computer Vision suggestion is not appearing as an ID from a person, and would not be considered for RG status.
Yes, a list of Computer Vision suggestions could be provided instead of a single suggestion, if appropriate.
@DianaStuder
Asking people if they can ‘explain why it is this or did you just accept what the computer said’ might not help your issue, if they can click yes, next. Now - when a person accepts the computer suggestion, you have a human layer. Your version would be fully automated, without even the most basic
No - that rock - is NOT a seal.
My suggestion is that people would actually provide an ID if they knew what it was, and that this ID would remain a separate thing from the Computer Vision suggestion. The person may have no idea what it is, in which case there is no human layer. However, if the person is able to provide any information, it is clear what they are providing.
In the current interface, there might be a human layer, or there might not be. We have no way of telling if the person actually knew what they were looking at and agreed with the Computer Vision suggestion, or if they just clicked on it not having any idea what it was.
@fffffffff
Keep in mind, please, that not everyone can even type all the time. I upload mostly with one arm and don’t need to type in most cases; I get problems and pains from typing, so autofill is not the solution.
With this proposal you wouldn’t need to type at all to upload something. The suggestion would just be clearly identified as coming from the Computer Vision.
@anasacuta
Using CV when entering an observation is not just about auto-fill; it can be a good help for remembering the species’ name. I may perfectly recognise a species and yet not quite remember its name; or maybe I’ll remember the name but not exactly how to spell it (is it Turritellinella or Turritellinella; Mimachlamys or Mymachlamis?). If I can see the right species among those proposed by CV, it is much faster to pick it than to go and check the name elsewhere so that I can type it.
My proposal would have the Computer Vision suggestion(s) right there, so you would be looking at the name while you started to type it for the autofill. I think that should still be pretty easy.
@cmcheatle
I would add a 4th which is you can’t assume the lack of the CV badge means the user independently did the ID. It is very possible to run the CV and then type the name in.
iNaturalist has a huge number of users, and a very high percentage of them are always going to be people that are relatively unfamiliar with the platform. There are always going to be misuses of it. However, the interface can be set up to try to minimize this. Currently the way it is set up is encouraging people to select a Computer Vision suggestion and present it as their own ID. In fact, there is no other way to do it. In my opinion, this is resulting in the Computer Vision function being a detriment to iNaturalist instead of the huge advantage that it could be.
Looking over the responses to my proposal, I am only seeing one criticism of it, which is that in some cases it would take a couple seconds longer to enter an ID. I personally don’t see this as an issue at all; it is a very small amount of extra effort in order to confirm that the information we are getting is actually true (in this case, that a person actually ID’ed the observation). I am reminded of a conversation that I just had with one of my field techs, where they were trying to convince me that they could do their surveys more quickly if I didn’t make them collect any data. Which is true, but defeats the entire purpose of doing the survey.
However, if that bit of extra time is really that significant, it could be set up so that the Computer Vision suggestion had an “Agree” button, just like an ID proposed by a human does. Personally I think that would just make it too easy for someone to agree with the Computer Vision suggestion without actually stopping to think about it, but it would still be a lot better than what we currently have.
Making someone retype the computer vision ID as opposed to clicking the suggestion makes the problem you are describing significantly worse. If anything, you should want to actively encourage users who are going to use the CV anyway to click the suggestion from the list. The sparkling shield icon, while not perfect, is very valuable in gauging how seriously you should take the observer’s ID when reviewing observations, and whether it’s necessary to leave a detailed comment justifying a disagreement. If it’s a CV ID, a comment as simple as ‘flower shape is wrong’ with no elaboration is often enough to get them to withdraw their old ID if they’re still around and it’s a taxon they know nothing about and don’t care to learn about. If it’s a manual ID, a higher level of detail in why you are disagreeing may be appropriate.
The main mitigation for these problems is just to make it more aggressive about suggesting higher taxa than species, which I think is already going to happen. Many casual users probably don’t even know the difference between IDing to genus and to species; the average person is probably mostly satisfied if the CV can tell them whether the small mammal they saw in the grass was a prairie dog or a marmot or a weasel.
This thread has made it very clear to me that this is not at all the case. Apparently a lot of knowledgeable users use the Computer Vision suggestion just to speed up their ID entry. The sparkly shield icon is telling us nothing, even if we know what it means (which I imagine many people do not; I did not until I was told in this thread).
I can’t think of any reason why having users enter their proposed ID as something separate from the Computer Vision suggestion would make any of these issues worse, and as I outlined above, it would solve a lot of the issues.
The main mitigation for these problems is just to make it more aggressive about suggesting higher taxa than species, which I think is already going to happen. Many casual users probably don’t even know the difference between IDing to genus and to species; the average person is probably mostly satisfied if the CV can tell them whether the small mammal they saw in the grass was a prairie dog or a marmot or a weasel.
No argument from me here. The computer algorithm should be less confident in species level IDs. That would also help things a lot. However, I suspect that this might be more difficult than it would initially seem. It seems much better at identifying likely IDs than it is at deciding whether its ID is likely to be correct, and I think that will probably be a more difficult thing to program.
I’m not sure where you get the idea the computer vision is run on every observation automatically. I don’t think that is the case. It is only initiated by a user request. You can’t find a ‘cv id’ anywhere in the observation data that I am aware of.
The computer vision is run every time someone suggests an identification, which can be multiple times per observation. It actively encourages you to select the computer vision suggestion as your own with “We are pretty sure this is…”.
The lack of a “CV ID” in the observation data is a big part of the issue. Many of those identifications are CV IDs; we just have no way to tell which ones.
@arboretum_amy
Yes, this is something I think about quite a lot. I am not a programmer (and I’m not sure even the programmers know what features of a photo the computer vision “looks at”) but the computer certainly does not “know” which traits humans are using to identify a species. It’s using its own criteria, whatever those may be, and no doubt they’re unrelated to what the human brain does. It has no concept of what humans find difficult or easy. So theoretically even in cases where humans say “there is no way to tell these two species apart visually” the computer might actually be able to do it, if provided a good enough training data set.
Yes, I completely agree. This is a fascinating thing about the computer algorithm. It has no actual understanding of a species or what to look for to identify it, and there would be no way to extract any kind of ID criteria from it that a human could use. And yet it is successful a surprisingly high percentage of the time, and in some cases it will be able to successfully identify a species from a photo that would be impossible for any human. An amazingly powerful tool, but a different thing than an ID by a human. The computer algorithm is also always going to have failures, and I suspect that it is always going to have a lot of trouble recognizing when it doesn’t have enough data for a reliable ID. It isn’t capable of critical thought.
I still fail to see this as an issue when the majority, potentially a high majority, of computer vision suggestions are correct.
Between users who are fully aware of what a record is and use the CV for speed, and records that are generally skewed towards common taxa the CV does not struggle with, most suggestions are correct. Frankly, if you are properly identifying things, which means critically evaluating the evidence, the source of any other identification shouldn’t matter.
Yes, there are taxa and groups it struggles with. Yes, there are some cases where users blindly enter or agree with a recommendation, but showing the CV assessment on every record is going to increase those issues, not help alleviate them.
It is also important to recognize that any CV identification is what the current training model thinks something is. What the 2019 model thinks that Neotropical butterfly was may well be different from what the 2021 model does, and again different from what the 2023 model will think.
Unless you plan to somehow refresh the CV assessment on 100+ million observations each time the model is updated…
Are we sure that a majority of CV suggestions are correct? I don’t doubt that a high majority of common taxa are correctly identified by the AI, but for most taxa this is not the case - e.g. with the groups I mostly ID I would say that the accuracy is less than 25%, potentially much lower. Whether the rarity of these taxa balances out the proportion of incorrect suggestions I am not sure of, but I think it’s hard to determine exactly what the accuracy rate is without actually looking at the data.
I could be completely wrong, but I think it’s very easy to overestimate the success rate of the AI. Here’s a scenario that I come across often:
Australia has two species of Tenodera, T. australasiae and T. blanchardi. T. australasiae is more widespread and far more common than T. blanchardi (approximately 9:1 ratio of observations). The AI cannot distinguish between the two, and indeed most people can’t either. So the AI pretty much always chooses T. australasiae (or at least it used to; I haven’t checked recently). So about 90% of the time it’s correct, which is great. BUT it cannot distinguish between the two, so we get a false sense of how good it is. It’s right 90% of the time, so it seems like it can accurately distinguish Tenodera species 90% of the time, but in reality it cannot distinguish them at all. For a lot of other genera and even families that look the same but have more equal observation ratios, the AI has a huge amount of trouble identifying them and frequently gets it badly wrong.
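To put rough numbers on that effect, here is a toy sketch in Python with made-up counts matching the ~9:1 ratio above (not real iNaturalist data), showing what a “classifier” that simply always answers the common species would score:

```python
# Toy illustration of the base-rate effect described above (hypothetical counts).
# A "classifier" that always answers T. australasiae scores ~90% accuracy
# while having zero ability to actually distinguish the two species.
observations = ["T. australasiae"] * 90 + ["T. blanchardi"] * 10

def always_common(_photo):
    return "T. australasiae"  # never even considers T. blanchardi

correct = sum(always_common(obs) == obs for obs in observations)
print(f"overall accuracy: {correct / len(observations):.0%}")  # -> 90%

# The more telling number is accuracy on the rarer species alone:
rare = [obs for obs in observations if obs == "T. blanchardi"]
rare_correct = sum(always_common(obs) == obs for obs in rare)
print(f"accuracy on T. blanchardi: {rare_correct}/{len(rare)}")  # -> 0/10
```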
In very common taxa like birds or butterflies I am sure that the AI is very good, and for very rare/uncommonly sighted taxa like webspinners or pseudoscorpions, I don’t doubt that you would agree that the AI is pretty terrible. But for the ‘intermediate taxa’ it is also pretty terrible in my experience. E.g. I try to ‘monitor’ all sightings of Orthoptera in Australia, and I aim to review them all at some point. The AI is pretty good with some common species, and can probably reliably identify about 50 common species. But Australia has more than 5000 species of Orthoptera, the vast majority of which the AI is hopeless with. The common species that the AI can identify do make up a large proportion of the sightings (maybe about a quarter), but the majority are made up of all the other species that the AI really has trouble with.
The AI gets better and better with every training round, and I don’t doubt that one day it will probably be even better than we humans are. But it’s nowhere near that point yet, and the wording does not reflect this. My qualms are not really that the AI is inaccurate - I don’t expect it to be, and honestly I am very impressed with how good it already is! But the current wording of the CV suggestions vastly overstates its ability, and I strongly believe it should be changed to reflect its actual ability. It may be very accurate with common species, but it’s definitely not accurate with the majority of taxa, and we can’t expect a new user or the AI itself to be able to distinguish between a common and a rare species, so the wording should take a more cautious approach that more realistically reflects the AI’s ability.
My main worry is the possible use of CV-initiated IDs in the training set for future models. If that is the practice, then it’s a positive feedback loop that can lead future models astray. The people in charge of the CV training procedures should make sure that this is not happening. I would suggest that CV-initiated IDs be excluded from future training sets. Yes, there does have to be an agreeing ID, but those can be influenced by the first, CV-suggested one.
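As a minimal sketch of what that exclusion might look like, assuming the training pipeline can see a per-identification flag recording whether the CV was used (similar to the vision flag the iNaturalist API reports on identifications; the field and structure names here are assumptions, not the real pipeline):

```python
# Hypothetical pre-filter for building a training set: keep only observations
# that have at least one identification entered without the CV suggestion.
# Each observation is assumed to be a dict with an "identifications" list,
# where each identification carries a boolean "vision" flag.

def has_independent_id(obs):
    """True if at least one identification was made without using the CV."""
    return any(not ident.get("vision", False)
               for ident in obs.get("identifications", []))

def training_candidates(observations):
    return [obs for obs in observations if has_independent_id(obs)]

# Example with made-up records:
sample = [
    {"id": 1, "identifications": [{"vision": True}, {"vision": True}]},
    {"id": 2, "identifications": [{"vision": True}, {"vision": False}]},
]
print([obs["id"] for obs in training_candidates(sample)])  # -> [2]
```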
If you define a majority as 50 percent or higher, then I am absolutely confident, between use by people who know what something is and use the CV for speed, and an observation dataset that is skewed in distribution towards common, generally easily identified taxa.
There are groups and taxa where unquestionably it is not accurate the majority of the time, but across the entire dataset, when a representative dataset is used, then no question it is accurate a majority of the time.
Again, why, if the ID is accurate regardless of its source should it be excluded?
I have little doubt this user knows what this is https://www.inaturalist.org/observations/92661184
so why throw it out because they are using a more efficient method to do their entry?
If you start excluding IDs ‘made’ by the computer vision, whether for training or to exclude them from counting in the community ID, all you will do is encourage more people to use the computer vision and then just manually type in the name it comes up with so it looks like they did it without the CV. All that does is make it even harder to understand the CV issues and slow users’ entry time down.
Right now, with 72 observations, T. blanchardi is just shy of the threshold for inclusion in the computer vision model, which is 100. If you can ‘mine’ Australian insect observations to get at least 28 more observations of T. blanchardi ID’d to species, it can be included in the CV and we can find out how good it is capable of being for them.
On the first obs of T. blanchardi I pulled up, the CV is ‘pretty sure’ it is Tenodera, and of course offers T. australasiae as the top suggestion below that. So right now it is working about as well as theoretically possible without knowing about T. blanchardi.
There is a thread here: https://forum.inaturalist.org/t/identification-quality-on-inaturalist/7507 and the model is about 95% accurate when it is confident enough to say ‘we’re pretty sure’. As in the Tenodera example, it can really only achieve full accuracy when it knows about everything common in the genus or at least all reasonable lookalikes.
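Worth keeping in mind that a figure like that is conditional on the model being confident; a quick back-of-the-envelope calculation (all numbers here are hypothetical except the quoted 95%) shows why it is not the same as overall accuracy:

```python
# Hypothetical split: the 95% figure only covers observations where the model
# says "we're pretty sure"; accuracy on the rest of the observations is unknown
# and assumed lower here purely for illustration.
confident_share = 0.6      # assumed fraction of observations with a confident call
confident_accuracy = 0.95  # figure quoted from the linked thread
other_accuracy = 0.5       # assumed accuracy when it only offers a ranked list

overall = confident_share * confident_accuracy + (1 - confident_share) * other_accuracy
print(f"overall accuracy under these assumptions: {overall:.0%}")  # -> 77%
```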
I believe the original motive for tracking whether CV was used for an ID and creating the ‘CV used’ sparkling shield was partly to preserve the ability to do something like this if it becomes a problem in the future. Right now, it is not a problem for the most part, because for the vast majority of taxa the main limiting factor on CV accuracy is insufficient data, not mis-IDs, so excluding partially CV-derived IDs would be counterproductive. If in the future iNat has billions of good observations to choose from and CV accuracy is still a problem, it might make more sense to be pickier about which ones get added to the CV.
I’m new around here, but I’d guess the majority of the amateur or casual users use the apps much more than the web interface — if they visit the website at all. So I’d think the question of data quality — whether CV suggestions pollute the data — would arise more from the apps.
On the web that may work great, but in the iPhone app typing brings up a list of choices, not an “autofill.” And if I type “oposs”, for instance, the first eight entries are genus-level or above. I have to scroll down to choose the ninth, Virginia opossum — the only possum species found in Arkansas. This is substantially slower than picking the CV selection.
In the iPhone app, the only way to enter an ID is to tap “What did you see?” which automatically loads a list of suggestions. As I said above, I believe the overwhelming majority of observations from casual users will come through there.
So if you believe casual users will too often accept CV suggestions blindly, and experienced users should always type in their IDs to keep the data quality as high as possible — then what’s the point of having the CV suggestions at all?
I’m all in favor of improving the phrasing. I agree that “We’re pretty sure this is…” is too strongly-worded, and fails to indicate no human analysis was involved. The app should actively discourage casual users from using the CV suggestions if they aren’t knowledgeable themselves. Instead, they should be encouraged to wait for the community to identify their finds.
As I said earlier, I think presenting a stack of suggestions down the classification tree would best allow users to select the level of ID they’re comfortable with. That might also cut down on unnecessary “unknown” observations, where a user didn’t want to select a CV suggestion, but couldn’t be bothered to type “plants”.
I don’t know about the iOS app, but on both the Android app and the website, if you start typing something it stops the computer vision suggestions and simply starts an autofill process based on text matching. So if you know your plant is a Large White Trillium, you can, if you wish, simply start typing that and it will stop the CV. Why you would when it is a slower method, though, I’m not sure.
Records are almost equally submitted between the 3 platforms https://www.inaturalist.org/stats
The website actually generates the most, at around 40k on an average day, iOS does about 30k and Android about 25k
An interesting test is to find and add the Chrome extension that colour codes the CV suggestions based on the iNat matching score ( https://chrome.google.com/webstore/detail/inaturalist-enhancement-s/hdnjehcihcpjphgbkagjobenejgldnah ). While I have not done a search on every one of my records, I have yet to see one where ‘pretty sure this is’ is not used against a suggestion coded as green. It only says ‘we are pretty confident this is x’ when it actually is pretty confident it is x; otherwise it goes to the ‘we are not confident enough to make a suggestion’ message, like the example below. In some cases it is wrong, but when it says that, it is still pretty confident based on what it has been trained on.
The challenging cases to me are records like this one
https://www.inaturalist.org/observations/90898206
where it spits out that it is not confident enough to make a suggestion, but if you have the extension enabled, it shows the first species suggestion as green, suggesting a high level of confidence, when that suggestion is wrong (the species it actually is isn’t in the training set yet due to insufficient records).
The issue is that this is not actually an ID. To suggest an identification, a user needs to know what characteristics are used to reliably identify that species, and be able to confirm from the photo that the specimen has those characteristics.
To quote directly from the iNaturalist help:
An identification confirms that you can confidently identify it yourself compared to any possible lookalikes. Please do not simply “Agree” with an ID that someone else has made without confirming that you understand how to identify that taxon. If you agree with the ID without actually knowing the taxon, it may reach Research Grade erroneously.
Even if an observation was identified by the world expert in that taxon, you should not agree with it unless you can ID it yourself. Otherwise it defeats the purpose of a Research Grade status. If there is only one person with the required knowledge to ID an observation from the provided photo, it should never achieve Research Grade status.
This was always an issue even before Computer Vision, but with the way that Computer Vision is currently presented to the user, it is much worse. The interface is actively encouraging the user to ignore this rule, which in my opinion is a very important part of iNaturalist, and an essential part of both educating people about the organisms they are seeing and providing good data.
The interface is already showing the CV assessment every time someone decides to suggest an ID, and actively prompting them to present that ID as their own. Removing that would greatly improve the issue.
@opisska
If I don’t know much about the given type of organism and the AI tells me that it’s “pretty sure”, I pick it. Where exactly is the harm in that? The only problem is if someone blindly confirms it. But what is harmed by having a “needs ID” observation with a wrong ID? It says it in the label - it needs ID. It’s far better to pick something than to leave it as “animals” or “arthropods” or something this vague, because many IDers clearly watch only specific sets of taxa, and this way the observation has a far better chance of reaching the right person to have a look at it. The experts in those taxa will quickly be able to say “no, it’s not this” as well. There is no problem with CV here. The only problem is people who refuse to accept that “needs ID” observations aren’t reliably IDed.
With the way that Computer Vision is currently presented in the iNaturalist interface, your conclusion makes a lot of sense. This is the problem with that interface, because it is convincing even experienced iNaturalist users to misuse the identifications in a way that I think is quite harmful to iNaturalist. See my response above in this post for more details.
In general, regarding the accuracy of Computer Vision, it may be correct for the significant majority of observations, but only for a small minority of the taxa. I personally don’t find it particularly useful, as it only seems to be reliably correct for species that I could already easily ID, but generally don’t have to because others on iNaturalist quickly provide two IDs. For the observations that actually require some knowledge and thought, it doesn’t seem to do much other than suggest it is probably the most commonly photographed option. (Which is true, but not particularly useful input.) It does mean that users with no knowledge of the taxa are now providing very specific IDs even though they do not know how to ID them, which goes against the guidance provided by iNaturalist.
Casual users are very often accepting CV suggestions without being able to ID the observation themselves. I don’t think there is really any disagreement on this; it is currently the norm and not the exception. There needs to be a way to differentiate between a Computer Vision suggestion and an actual ID that follows the iNaturalist protocol for providing an ID (quoted above in this post). That is not possible in the current interface. Yes, modifying the interface to differentiate between these things may mean slightly more work per ID for some experienced users. But I don’t see that as an issue; it isn’t worth reducing the value of the data or misleading users about how ID’ing species works just in order to have a few more IDs of unknown value.
The value of the Computer Vision suggestion is that some people want to use iNaturalist to immediately identify something that they took a photo of. Computer Vision does provide this, and this user may not care that there is no way to confirm the accuracy without waiting for someone else to weigh in.
In order to suggest an ID, a user needs to know what characteristics are used to reliably identify that species, and be able to confirm from the photo that the specimen has those characteristics. The Computer Vision cannot tell you what characteristics to look at, or even what the other possible options are that look similar. So, it cannot help you with the ID, and should not be presented as such.
In what way is a user who knows what something is, but lets the system generate a list that includes that taxon so they can click on it because it is faster and more efficient than typing, ‘not an ID’?
If you can demonstrate some way to separate records where people do that versus blindly agreeing to a suggestion, great. But the solution to that is not to punish or penalize people or devalue their input because they are using the most effective tool to enter their ID. Telling them if they want their ID to count they need to use a less efficient method is a step backwards.
It is the ultimate negative reinforcement process to say that as the CV gets better and better we should be encouraging fewer uses of it.
Absolutely it does. The computer vision doesn’t suggest a single option. It presents what are 8 or 10 different ranked options.
How is the “Agree” button even connected with the CV? You should add an ID you are sure of, but the CV is here, so people will choose what it suggests and that can’t be stopped. If the suggestion is correct there’s no problem at all, and if someone agrees with it because that’s what they think it is, even if it’s not, that’s not so bad either, as it’s part of learning. Anyway, this topic is not about agree-clickers, but about the CV.
The majority of observations are of a minority of species, as with everything in life, so most obs don’t need “much thought and consideration”, as most “regular” people are more likely to be drawn to common, well-seen, well-documented species. The CV does a good job and is much better right now, suggesting winged insects a lot when it didn’t before, which is really cool, and with a good photo and a species already in the CV it can guess quite hard taxa. The main problem is that most arthropods, worms, etc. are underdocumented and we have no humans to ID them, so of course the CV will show wrong results, as it doesn’t know what it is. As you see, the problem is always in humans, not the tool.