Automatic iNat suggestion for "unknown" observations that reach a certain age

The CV-based observations are already marked; I agree that an automatic CV annotation would help get observations seen by the appropriate experts. I try to go through and do this manually (tweaking CV suggestions based on location, my own knowledge, etc.), but it seems like an inefficient use of time.

Alternately – perhaps “unknown” observations could remain marked “unknown”, but show up in searches based on the cv best-guess when no other information is present?

1 Like

I think you would need to actually run and save the CV guess rather than this. On the fly running of the CV against all unknown observations every time someone ran a search is likely extremely server intensive and unlikely to run in any kind of acceptable time.

1 Like

Completely agree. It’s mostly a bookkeeping thing; keeping the automatic CV guess … discreet … until someone has manually approved it.

I just now notice that there’s sometimes a “placeholder” field that might be similar to what I’m describing. This is a subtlety I hadn’t picked up that looks like it might be important…

Would probably also need to be re-created if/when observation photos change, and whenever there are significant changes to the CV system (as recently happened).
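To make the bookkeeping concrete, here is a minimal sketch of how a saved CV guess could be checked for staleness. It assumes a stored record that remembers the model version and a fingerprint of the attached photos; `StoredGuess`, `fingerprint`, and `needs_refresh` are hypothetical names for illustration, not anything iNaturalist is known to use.

```python
from dataclasses import dataclass
import hashlib


@dataclass(frozen=True)
class StoredGuess:
    """Hypothetical record of a CV guess saved alongside an observation."""
    taxon: str
    model_version: str
    photo_fingerprint: str


def fingerprint(photo_ids):
    """Order-independent hash of the photo IDs attached to an observation."""
    joined = ",".join(sorted(str(p) for p in photo_ids))
    return hashlib.sha256(joined.encode()).hexdigest()


def needs_refresh(stored, photo_ids, current_model_version):
    """True if there is no saved guess, the photos changed, or the model changed."""
    if stored is None:
        return True
    return (
        stored.model_version != current_model_version
        or stored.photo_fingerprint != fingerprint(photo_ids)
    )
```

With something like this, a search would only read the saved guess, and the CV would be re-run only for observations where `needs_refresh` returns true.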

1 Like

I do like the idea of an obs getting auto-ID’d with a CV choice if it hasn’t been given any ID at all after a reasonable time, say 12 months. Usually by then someone has got around to putting it to Order or Family, but if the auto-ID was put at the Order or Family of the leading CV suggestion, then I think that would be at least comparable to a volunteer identifier doing so with taxa that they are not familiar with. Still marked with the CV symbol of course. It would cut down a lot of grunt work. They would still be in the Needs ID pool, but would become included in the filtered ID pools that many specialists limit themselves to.
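A rough sketch of that rule, under stated assumptions: the leading CV suggestion is available as a rank-to-name mapping, and "no ID at all after 12 months" is approximated as 365 days. The names `coarsen` and `eligible` are made up for illustration and are not part of any real iNaturalist API.

```python
from datetime import datetime, timedelta

# Ranks from coarsest to finest (simplified; real taxonomies have more levels).
RANKS = ["kingdom", "phylum", "class", "order", "family", "genus", "species"]


def coarsen(lineage, max_rank="family"):
    """Return (rank, name) for the finest rank in `lineage` that is no finer
    than `max_rank`, or None if the lineage has nothing that coarse."""
    limit = RANKS.index(max_rank)
    for rank in reversed(RANKS[: limit + 1]):
        if rank in lineage:
            return rank, lineage[rank]
    return None


def eligible(created_at, id_count, now, min_age=timedelta(days=365)):
    """An observation qualifies only if it has no IDs and is old enough."""
    return id_count == 0 and now - created_at >= min_age
```

For example, a leading suggestion of Danaus plexippus would be cut back to the family Nymphalidae, or to the order Lepidoptera if `max_rank="order"` is passed.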

4 Likes

I periodically go through the Unknown taxon to put obvious things where knowing eyes can see them. Some of them just can’t be IDed because of poor photos or because you just can’t see what they think is the organism. Others are observations that include photos of several organisms and need to be split, so they can’t be identified either. Many of them are from people who started years ago and then became inactive.

Would it work to have a category for that?

1 Like

@jbecky I think there is, sorta: There’s a checkbox at the bottom of the page where you can specify “this observation cannot be further improved”, or some such. I’m … not actually sure what this does, but I assume it takes it out of rotation from needs-ID pools.

I know if two people have marked it as life, it does change it to casual, so it’s out of the ID pool. Not sure in other cases. I try to name it down to the lowest common denominator that I know, note in a comment that the photos are of different subjects, and if I’m the second or later to do so, flag it as good as can be.

Wow - that actually seems kind of undesirable, since certain types of difficult identifications (among algae, or between fungi or slimes, or of microbes) are easiest to find because they are marked “life”.

2 Likes

They are marked as “unknown”, not “life”. An observation becomes “life” after someone marks it as algae, then someone else marks it as red algae. And while it is true someone could mark it as life, no one seems to do so. It is amazing how many obvious insects, plants, arthropods, etc. are lingering at “unknown”, let alone the things that can only be determined to be life.

2 Likes

There are a lot of observations with no ID out there. It takes a lot of human work to ID all of those to just kingdom or phylum so that the experts can find the observations and ID them further. I suggest a bot that IDs these unreviewed observations. It could be the same account that automatically marks some species (like Picea pungens) as not-wild. This bot would find observations that have gone one month or more without being IDed, and would use computer vision to give them a kingdom-level ID. I think this would really help with getting unreviewed observations to identifiers. If kingdom isn’t enough, it could go to phylum or class, but that gets risky. Also, so as not to cause problems with the computer vision training, an observation IDed only by this bot would not be used to train the computer vision, because that would be too recursive. Would this put too much load on the servers, or would it be too risky to use the computer vision that much?

2 Likes

@mws My apologies, just after approving your feature request, I realized there is already an existing feature request for essentially the same thing. So I have merged it here.

2 Likes

I think this would result in way more State of Matter Life observations than there already are, unless the bot could also read the observer’s description and/or any placeholder to know whether it is the plant or the butterfly or the cat in the photo that is the subject.

It could be made so that the bot’s ID is automatically retracted when a real person IDs the observation, so that computer vision errors that cause bad photos to be mis-IDed don’t result in State of Matter Life observations.
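As a toy illustration of that retraction rule (not how iNaturalist actually stores identifications), the sketch below withdraws any active bot ID as soon as a human adds one. `BOT_USER`, `Identification`, and `Observation` are hypothetical names invented for the example.

```python
from dataclasses import dataclass, field

BOT_USER = "cv-bot"  # hypothetical account name for the proposed bot


@dataclass
class Identification:
    user: str
    taxon: str
    active: bool = True


@dataclass
class Observation:
    identifications: list = field(default_factory=list)

    def add_identification(self, user, taxon):
        """Add an ID; a human ID withdraws any earlier bot IDs first."""
        if user != BOT_USER:
            for ident in self.identifications:
                if ident.user == BOT_USER:
                    ident.active = False
        self.identifications.append(Identification(user, taxon))

    def active_taxa(self):
        """Taxa of the IDs that still count toward the community taxon."""
        return [i.taxon for i in self.identifications if i.active]
```

Because the bot’s ID is withdrawn rather than left to disagree with the human’s, a wrong kingdom from the bot never gets the chance to drag the observation back to “life”.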

2 Likes

That could help keep it out of State of Matter Life. What would happen to any placeholder that had been on the observation when it was Unknown, though? Usually human observers will try to preserve them in a comment.

2 Likes

I would think, in lieu of a better system that displays placeholders, it would just skip observations with placeholder text.

2 Likes

Given the number of things in the “unknown” pile that are that way either because there are multiple photos attached of different organisms, or because it’s utterly unclear what the photographer was centered on, I think a fully automated approach is not a good way to go.

It also seems a bit perilous to add ML identification access to the Identify tool–it has the potential to massively exacerbate our recently discussed problems with careless ID. What if Identify had a button that let you add the ML identification, but only to, say, Kingdom or Phylum level? That would still go faster than manually typing it out, but would keep a human in the loop to skip over fundamentally ambiguous observations.

2 Likes

My problem with this is that it would still take probably thousands of human hours just to do that. It would be a quicker, but still fairly infeasible, way to clear out unknown observations.

2 Likes

Observations with multiple photos could have the AI go through each photo individually. Without consensus, it could then just not ID the observation, or ID as “stateofmatter life” to show that there’s an issue. It could also do something similar for a subject-less observation, where it doesn’t leave an ID if it can’t come to a strong conclusion on any kingdom.
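Here is one way the per-photo check might look, as a sketch only. It assumes the CV can return a score per kingdom for each photo, and the 0.8 threshold is an arbitrary placeholder; `consensus_kingdom` is a hypothetical helper, not an existing function.

```python
def consensus_kingdom(per_photo_scores, min_score=0.8):
    """Return the kingdom only if every photo's top-scoring kingdom is the
    same and each clears `min_score`; otherwise return None (leave no ID).

    `per_photo_scores` is a list with one dict per photo, mapping a kingdom
    name to a confidence score between 0 and 1.
    """
    if not per_photo_scores:
        return None
    winners = set()
    for scores in per_photo_scores:
        if not scores:
            return None
        kingdom, score = max(scores.items(), key=lambda kv: kv[1])
        if score < min_score:
            return None  # no strong conclusion for this photo
        winners.add(kingdom)
    return winners.pop() if len(winners) == 1 else None
```

A `None` result covers both cases described above: photos that disagree with each other, and photos with no clear subject. Whether that should mean "skip" or "flag for splitting" is a separate decision.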

2 Likes

And maybe skip observations with anything in the description section as well?

What is “ML identification,” please?

1 Like