Include computer vision model version in identification metadata

Platform(s), website

URLs (aka web addresses) of any pages, if relevant:

Feature request details: When an observation is identified using iNaturalist’s computer vision suggestion feature, a small badge is displayed in the identification box of the observation indicating that a CV model was used. However, as these models are regularly updated, suggestions made with older versions may be less accurate than those from newer versions…

Each identification timestamp shows when the identification was made, but it does not specify which version of the computer vision model was used. Given that different versions of the model are used concurrently and that some, like the Seek app’s model, have not been updated for over 450 days, it is challenging to assess the accuracy of a CV identification. Recording the model version used at the time of identification would facilitate the evaluation of each model’s performance over time and improve the validation of computer vision-based identifications.

Feature request details:
I propose adding the version number of the computer vision model used in the CV suggestion popup box.

The suggestions made by the CV should always be evaluated based on knowledge of the taxon in question and the evidence provided in the observation. It shouldn’t matter which version of the model was used for this. Some people use the CV as a shortcut because it is faster than typing in the taxon they want, so the symbol next to the ID says nothing about why the user chose to select that suggestion.

As far as I know, apart from adding a symbol next to the ID, the CV suggestions for a particular observation are not recorded and not used in training the model. It also does not indicate which suggestion was used – whether the user chose the first one or something lower down on the list.

9 Likes

I disagree. AI and CV are not infallible, and accuracy varies significantly between models. For my species group of interest, to name an extreme example, name suggestions made with the CV 6 years ago are almost always incorrect.

I’m a bit perplexed that this notion is so common. My experience with Gomphaceae is that the majority of people who use CV use it as a replacement for checking a textbook and then don’t check the name suggestion for plausibility. The most common reason, once again in my experience, why a “user chose to select that suggestion” is simply because it was the top hit. The statement that CV “says nothing about” the accuracy of an ID just isn’t true. Regardless, this is off-topic to this thread.

This is also a separate discussion.

3 Likes

that is your niche case.

It remains true that many of us use CV as a less RSI way to get to the name we know.

9 Likes

You completely missed my point. I will repeat it, since it is not clear to me what role you think the CV is supposed to play for IDers.

The ID on an observation (regardless of whether it was made by the CV or some other means) should always be evaluated based on knowledge of the taxon in question and the evidence provided in the observation. In other words, it should never be assumed to be correct.

I’m well aware of that. I mostly ID a taxon where the CV is egregiously wrong as often as it is right. If it is not reliable anyway, why does it matter which version of the CV was used?

As an IDer, I should not be using the CV to figure out what an organism is. If I am evaluating an observation, it doesn’t matter why the previous user suggested what they did. It doesn’t matter whether they used the most current version of the CV, an outdated version of the CV, an external website, their own expertise, etc.

What matters is what ID is supported by the evidence in the observation, not what the CV says it is. It would save me a lot of work if it were correct more often, but again, it doesn’t ultimately matter which version was used to get that ID. A wrong ID is wrong regardless.

It is not a separate discussion. You mentioned that it is important to record which model is being used because this would “facilitate the evaluation of each model’s performance over time”. The only way CV IDs on observations can be used to evaluate how well the CV does and whether it improves over time is if the specific CV suggestions are recorded in the first place. To my knowledge, they are not.

4 Likes

My opinion on the role of CV in making identifications is not the focus here. This thread is about adding another tool to help validate identifications, particularly for mass identifications and corrections. The current CV version is 2.13, which means there have been at least 13 different models with varying degrees of accuracy. Why oppose making this information easily accessible?

As I previously mentioned, each model version has its own level of accuracy, and older versions are, in some cases, abysmal. The CV badge becomes arbitrary if two very different CV systems are in concurrent use, with even more over time.

It makes a significant difference if the CV suggests the correct genus versus synonymizing the genus with an incorrect species. Manually correcting species-level IDs is incredibly time-consuming, especially when it scales to thousands or, in my case, more than 30,000 corrections. Confirming a correct genus-level ID takes about 1-2 seconds, while disagreeing with an incorrect species-level ID takes 15 seconds. Additionally, explaining why the ID is wrong can take anywhere from 30 to 90 seconds, as many users expect such clarification. Correcting incorrect species-level IDs is 15 to 100 times more time-consuming than agreeing with a correct genus-level ID.
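The arithmetic above can be checked with a rough Python sketch. It uses only the timings quoted in this post, which are personal estimates rather than measurements. Whether an explanation is added on top of the 15-second disagreement is treated as an option, and the function names are made up for illustration.

```python
# Timings quoted above, in seconds (rough personal estimates, not measurements).
CONFIRM_GENUS = (1, 2)   # agreeing with a correct genus-level ID
DISAGREE_SPECIES = 15    # disagreeing with an incorrect species-level ID
EXPLAIN = (30, 90)       # optional explanation of why the ID is wrong


def ratio_range(with_explanation: bool) -> tuple[float, float]:
    """Return (lowest, highest) ratio of correction time to confirmation time."""
    extra_min, extra_max = EXPLAIN if with_explanation else (0, 0)
    # Lowest ratio: fastest correction against slowest confirmation.
    lowest = (DISAGREE_SPECIES + extra_min) / max(CONFIRM_GENUS)
    # Highest ratio: slowest correction against fastest confirmation.
    highest = (DISAGREE_SPECIES + extra_max) / min(CONFIRM_GENUS)
    return lowest, highest


def total_hours(n_corrections: int, seconds_each: float) -> float:
    """Total time in hours for a batch of corrections."""
    return n_corrections * seconds_each / 3600


print(ratio_range(with_explanation=False))  # disagreement only
print(ratio_range(with_explanation=True))   # disagreement plus explanation
print(total_hours(30_000, DISAGREE_SPECIES))
```

Depending on which ends of the ranges are paired, the ratio runs from about 7.5 up to 105, so "15 to 100 times" is a fair round figure. At 15 seconds each, 30,000 corrections come to 125 hours before any explanations are written.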

Furthermore, incorrect Research Grade (RG) identifications propagated by hidden CV versions undermine the scientific credibility of iNaturalist data, which is piped into GBIF. Every scientific paper includes a Materials and Methods section that specifies which version of software was used. Since iNaturalist promotes its data as valuable to scientists, it should uphold the same standard. I’ll address this topic in a different thread, where I’ll share statistics showing that CV accuracy for species within Gomphaceae is close to zero, leading to thousands of misidentifications. If the CV works well for your use case, I’m glad for you. I’m not sure why that would be at odds with improving the CV even further.

Yes, this information is currently not readily available and this should be improved. The only way to currently gather this data is manually. I still maintain that this is a separate, though interesting, discussion for another thread. Adding the CV version number would be a great first step, especially for the sake of maintaining accurate historical records.

2 Likes

I’m glad that CV provides a less RSI-intensive way to assign names to species that interest you, which is certainly a valid use case. However, for the specific and diverse group I mentioned—where I tirelessly and selflessly work to mass identify others’ observations—CV, when not used with care, leads to widespread misidentifications. This is especially problematic for families like Gomphaceae, which contain very diverse genera. Rather than dismissing these concerns, mass identifications, or my interests as “niche,” we should focus on improving CV usage to better serve the entire community.

1 Like

Are two versions in concurrent use? My impression of web-based software is that every time an update goes live, the previous version is no longer available. If the current version is 2.13, that suggests that the previous 12 models are no longer in use.

Now, where I can see the version being relevant is where an identification was made before the latest version came out. Is that what you meant?

(At least) Two versions are in concurrent use. The latest version is being used on the main website and is promptly updated. The Seek app, however, uses an outdated CV model. See https://forum.inaturalist.org/t/the-seek-app-suggests-species-despite-being-removed-from-inaturalist-s-computer-vision-model/54577

As far as I can tell, the CV model used by the Seek app hasn’t been updated in over 451 days. For example, Seek still suggests Ramaria formosa even though it was removed from the main model at some point before 2023-06-14 (Internet Archive snapshot from 2023-06-14), indicating that the version used is at least 450 days old. As far as I can tell, this is not documented anywhere and is not apparent from the web GUI. Presumably other (and even older) versions are being used by people with outdated mobile applications.

Yes, I meant that as well. v2.1 is presumably less reliable than v2.13, but without an exact version noted down it’s hard to tell from just a timestamp how ‘modern’ a version was used to make an ID.

1 Like

it’s possible to find observations that were generated via Seek and include a Ramaria formosa identification: https://www.inaturalist.org/observations?ident_taxon_id=48770&place_id=any&oauth_application_id=333
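The link above is an ordinary observations search with three query parameters, so the same search can be built for other taxa. A minimal Python sketch follows, using only the parameter names and values that appear in the URL. That `oauth_application_id=333` corresponds to Seek is taken from the context of this thread, and the helper name `seek_ident_url` is made up for illustration.

```python
from urllib.parse import urlencode

BASE = "https://www.inaturalist.org/observations"


def seek_ident_url(ident_taxon_id: int, oauth_application_id: int = 333) -> str:
    """Build a search URL for observations that carry an ID of the given taxon
    and were created through the given application (333 in the link above)."""
    params = {
        "ident_taxon_id": ident_taxon_id,
        "place_id": "any",
        "oauth_application_id": oauth_application_id,
    }
    return f"{BASE}?{urlencode(params)}"


# 48770 is the taxon id used in the link above (Ramaria formosa).
url = seek_ident_url(48770)
print(url)
```

Swapping in a different taxon id gives the equivalent search for another species.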

3 Likes

What would this tool be? What exactly are you proposing?

CV suggestions are not automatically added to observations, nor are there plans for this to happen. (And creating a program to mass add IDs based on the CV would likely violate iNat’s policies on machine-generated content and the use of AI).

People add IDs, using the CV suggestions or not. I don’t know why knowing which model was used would be relevant after the ID has been added – it doesn’t change the ID on the observation. It will still be either right or wrong and in need of correction or not.

I’m aware of this, and of the significant performance problems for some taxa. Nowhere did I suggest that I do not wish to see the CV improved. Quite the contrary.

But I don’t see how listing which CV model was used in the badge beside an ID would change anything. It would not prevent wrong IDs in the first place. Moreover, it would not contribute to improving the CV, because past ID suggestions for an observation are not stored or used for training.

The CV badge is already arbitrary, because it doesn’t tell us which suggestion a user picked or why. It has absolutely no effect except that of adding a symbol beside the ID.

How would knowing that a RG observation was ID’d using a now-outdated CV model improve the credibility or usability of iNat data for other users?

The whole point of requiring at least two IDs is to improve the likelihood that observations get looked at by at least one person who has some expertise. In theory, observations should not become RG merely on the basis of a CV suggestion. If the two-person validation system isn’t acting as a barrier to prevent wrong IDs, this is the methodological issue that should concern data users.

I don’t oppose making this information accessible. I am questioning what purpose it would serve to do so. I see no benefit to something that plays no role in the CV training, would not reduce work for IDers, and tells us nothing about how or why an ID was actually selected.

5 Likes

I can understand @martinax’s frustration with mis-IDs, and I think the reasoning for including the model version is to add additional contextual information, going beyond the current situation where the Computer Vision symbol is useful contextual information, even while it is important to remember that it doesn’t show that an ID was added “automatically”.

If I see a few observations that are way out of the expected range of a species, uploaded by users without many observations and with no IDs for others, and that have the CV symbol on them, it’s pretty likely that those are simply mis-IDs. The Computer Vision symbol gives me an extra piece of evidence that these may have been IDs made by uncritically accepting one of the suggestions. I will take all that information into account alongside my own assessment of the evidence and knowledge of the taxon in question. As I understand it, the OP would like to see the model version as well, for additional contextual information. With time, as an identifier focused on one group, it might be possible to start seeing how well the Computer Vision is doing - “up to version 2.13 there were a lot of suggestions of Species X for observations of Species Y, but now it’s getting pretty good at correctly suggesting Species Y”.

I can understand why that would be useful extra context for an identifier. If this request is straightforward and low-cost to implement, I don’t see a major downside of doing it. If it would be complex or require a lot of reprogramming, then I could understand why the development team might focus on other issues where the benefit would be higher.

There may also be other more effective ways of addressing the problem that prompted this request - one would be to regularly check the Research Grade observations at @pisum’s link - for now there is just one.

6 Likes

While pisum’s approach is creative and interesting, virtually all Ramaria formosa observations are currently made with Seek and all of them are incorrect. I need the influx to stop but that’s a separate issue with the Seek app.

1 Like

without capturing a lot of other information, i don’t think computer vision version number alone provides anyone enough information to do anything meaningful.

that said, i still can’t really envision a use case where capturing all that complexity is really worth it either. see https://forum.inaturalist.org/t/log-computer-vision-taxon-guesses/44224/7.

the last paragraph of this may also be worth reading: https://forum.inaturalist.org/t/ive-noticed-some-unusually-innaccurate-suggestions-when-uploading-lately/54055/13.

3 Likes

In terms of stopping the influx, I have a question about the Seek app, which I have used very little. Does the observer have the opportunity to select a species, or do they have to accept the suggested species? If the latter, then we would indeed have a situation of automatic IDs being added, without observer input, which would be concerning.

Seek allows you to edit the identification before uploading to iNaturalist.

3 Likes

I think you can create an Excel sheet for yourself with three columns: CV version, date in service, date out of service.
You can fill in this sheet with the information published in the monthly overviews, for example:
https://forum.inaturalist.org/t/inaturalist-updates-for-august-2024/55068
The timestamp of when the observation was published will fall within one of the periods in the Excel sheet, and that tells you which CV version was used.
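The lookup described above can also be sketched in Python instead of a spreadsheet. The release table below is a placeholder: the dates and model names are invented for illustration and must be replaced with the real values from the monthly update posts. As noted elsewhere in this thread, this would only apply to IDs made on the website, since Seek and outdated mobile apps may use other models.

```python
from bisect import bisect_right
from datetime import date

# (date in service, model name), sorted by date.
# PLACEHOLDER rows for illustration only, not real release dates or versions.
RELEASES = [
    (date(2024, 1, 10), "model-A"),
    (date(2024, 3, 5), "model-B"),
    (date(2024, 6, 20), "model-C"),
]


def version_for(timestamp: date, releases=RELEASES):
    """Return the model that was in service on the given date, or None if the
    date is earlier than the first row of the table."""
    dates = [in_service for in_service, _ in releases]
    # Index of the first release that went live strictly after the timestamp.
    i = bisect_right(dates, timestamp)
    return releases[i - 1][1] if i else None


print(version_for(date(2024, 4, 1)))   # falls in the second placeholder period
print(version_for(date(2023, 12, 1)))  # before the table starts
```

Each model's "date out of service" is simply the next row's "date in service", so only one date per row needs to be recorded.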

It would be more interesting to know whether the observer was sure about the observation or not. Other websites have the possibility to add a question mark to the observation.


As other users noted above, the Seek model is separate. It is possible to determine which observations have been made with Seek, so you can make a separate Excel sheet for Seek with the same columns and maybe one or two rows (because the Seek CV model has not been updated for almost 450 days)…

1 Like

I agree this would be interesting information to see, but I don’t know that it would impact anything we do as identifiers.

If I see an observation labeled as species A, but I know it to be species B, I’m going to suggest species B as an ID. Whether the original ID was or wasn’t CV-suggested doesn’t change that, and certainly which version of CV was used doesn’t change that.

Frankly, in my experience, a more informative piece of information as it speaks to CV accuracy is whether the location was used to limit the CV to “nearby” sightings. The most egregious misidentifications I see the CV suggest are ones which are completely out of range. And when I click to enter the corrected ID on these observations, the CV suggestion I see is, more often than not, the correct ID (even for observations a few hours old, so this has nothing to do with different CV versions). I can only assume these users told the CV to identify their image without regard for location, which gives some really wild results.

Even in these cases though, how would it change what I’m doing as an identifier to know why the CV got it wrong? (older CV version, location not included, it was IDing a different image in the series and the user later altered the image order, image was of too poor quality, etc.)

I think the point @spiphany is making, which I agree with, is that there is only one way an identifier is meant to assess the accuracy of a CV-based ID, and that is to make your own ID from your own knowledge and suggest that, without regard for what the CV suggested. If I’m IDing someone else’s observation, I’m not going to pay much mind to what the CV says it is, whether it’s an old CV or a new CV; I’m just going to say what I think it is. If I’m uncertain enough in my ID suggestion that I’d be likely to change my mind if the newest and greatest CV version disagrees with me, I probably shouldn’t be putting that ID on someone else’s observation anyway.

Also, yes, this. I’d estimate about 85-90% of the observations I post that I know the ID for have an ID with the CV symbol next to it, because about 85-90% of the time one of the CV suggestions that pops up when I make an observation is exactly what I was going to call the organism anyway. The 10-15% of the time when the ID I think is correct isn’t listed by the CV, I’ll manually type it in. This is very common practice. Some of the world experts in the taxa I work with have the CV symbol next to their ID on nearly every observation they make; it’s a shortcut. So the “CV icon = user blindly accepted an automated ID” assumption that is sometimes made does not accurately reflect many/most cases.

I guess my take can be summed up as “I don’t even think the CV icon itself makes much of a difference in how I ID something, so I really don’t think a CV icon with more CV data would alter my behavior at all.”

8 Likes

But we can, and do, leave a comment if we want to express uncertainty.
Or - no ID - and only a comment query with a possible name.

2 Likes

I agree with others that adding the version number would not be very useful for two reasons:

  1. Interpreting the CV “icon” as an indication that a user actually used the CV is a pretty fraught and poor inference in most use cases. The CV is used so often as an autocomplete (or by people who know what their ID will be and just want to see the CV ID before adding it) that it’s mostly meaningless in most situations.

  2. I think this information is already more or less available if someone wants it: It’s possible to tell which observations have been made with Seek (though even then, there’s no guarantee the user has posted the Seek-provided ID). As @optilete noted, it’s possible to more or less figure out (within a day or so) which CV model was used to generate an ID based on timestamps.

Additionally, as noted above, it’s already possible to “eyeball” the CV model just based on looking at when the ID was posted (even without looking it up formally). I don’t see how knowing the exact CV model (as opposed to just the ballpark) on a given ID is going to allow a user to treat it differently - either the ID is correct or incorrect. The only real use case I can see for knowing the specific CV model is in an analysis of thousands of observations. In that case, the request should be to make the CV version available via API or similar, not as a single use piece of info available via a manual user click into a pop up.
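To illustrate the bulk-analysis use case, here is a minimal Python sketch of what such an analysis might look like if a model version were exposed per identification. This is purely hypothetical: no such field exists today as far as this thread establishes, and the input format (pairs of model version and whether the ID was later judged correct) is an assumption made for the example.

```python
from collections import defaultdict


def accuracy_by_version(records):
    """records: iterable of (cv_version, was_correct) pairs.
    Both fields are hypothetical; they stand in for data that would have to
    come from an export or API that included the model version."""
    totals = defaultdict(lambda: [0, 0])  # version -> [correct, total]
    for version, was_correct in records:
        totals[version][0] += int(was_correct)
        totals[version][1] += 1
    return {version: correct / total for version, (correct, total) in totals.items()}


# Made-up sample data, only to show the shape of the input and output.
sample = [("model-A", True), ("model-A", False), ("model-B", True)]
print(accuracy_by_version(sample))
```

An analysis like this only makes sense across thousands of identifications, which is why an API field would serve it better than a value shown in a popup.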

4 Likes