What should represent a 'range' for a species in iNaturalist

Charlie, I don’t think the risk is in having to use it to identify the rare species, it is in having the rare species shown as an option to choose from.

I will use the example of the species I listed as really rare, but not a mega rarity, the Black-throated Grey Warbler. This is a species seen maybe once a year in Ontario. The odds of an inexperienced iNat user independently finding one are very low.

However, the odds of that same inexperienced user finding a Blackpoll Warbler or a Black-and-white Warbler, two common birds in the province that look virtually identical to it, are quite good.

The computer vision visual match for the 3 species will be very high.

The root of the question was this: depending on how you define the range of Black-throated Grey Warbler (is a bird seen less than once a year in range or not? should the atlas include the area or not?), or which tool you use to select proximity for the computer vision (checklist, nearby records, etc.), will this species appear in the options presented by the vision tool?

2 Likes

I’m not sure my grasp of the question is 100% accurate, but I’ve pondered this or a related topic a fair amount. My thoughts are in relation to how the “computer vision” (AI, auto identifier, not sure what the official name is) takes geographic location into consideration for its suggested ID and how the results of the analysis are displayed. And, perhaps this is all explained somewhere (?). My thought is that when the AI analyzes a photo (does it do the whole set of photos in an observation or only one photo?), it should have independent values for (1) how well a photo matches a particular species, and (2) a measure of commonness or likelihood. I am not versed in all the metrics used in an AI/computer vision similarity assessment, how they are weighted, etc., so can’t comment on that. When it comes to a measure of commonness or likelihood, this could be based on one or both of these things: (a) distance to a human-defined range (think range map in a field guide), and/or (b) a measure of reporting frequency within radius bands. My ideal would be that each suggested species includes–perhaps right after it–a measure for the computer vision match AND a measure for the range likelihood, perhaps 0-100 for each. So for example, in Chris’s example of a Black-throated Gray Warbler in Ontario for a particular date, the computer suggestion might be something like this:
Black-throated Gray Warbler 95/5 (meaning high score of visual match and low score of frequency of reports of that species for this location and/or time).
Black-and-White Warbler 85/76 (meaning moderately high scores for both visual and time/location)

I would like to see something like that when the computer vision gives suggestions. Should they be ordered by the highest visual analysis score or by the geography/time likelihood? I’m thinking sort by highest visual match, but then the likelihood or frequency measure gives some caution to the observer.
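To make the idea concrete, here is a minimal Python sketch of that display: sort by visual match, show both numbers, and flag anything with a low location/time score. The `Suggestion` class, the `caution_below` cutoff, and the wording of the flag are all made up for illustration; none of this is how iNaturalist actually stores or computes its scores.

```python
from dataclasses import dataclass

@dataclass
class Suggestion:
    name: str
    visual: int      # 0-100 visual match score (hypothetical scale)
    likelihood: int  # 0-100 location/time likelihood score (hypothetical scale)

def format_suggestions(suggestions, caution_below=20):
    """Sort by visual match (highest first) and show both scores.

    caution_below is an assumed cutoff: anything with a lower
    likelihood score gets a cautionary note for the observer.
    """
    ranked = sorted(suggestions, key=lambda s: s.visual, reverse=True)
    lines = []
    for s in ranked:
        flag = " (rarely reported here)" if s.likelihood < caution_below else ""
        lines.append(f"{s.name} {s.visual}/{s.likelihood}{flag}")
    return lines

# The two example scores from the post above.
suggestions = [
    Suggestion("Black-and-White Warbler", 85, 76),
    Suggestion("Black-throated Gray Warbler", 95, 5),
]
display_lines = format_suggestions(suggestions)
for line in display_lines:
    print(line)
```

With these numbers the Black-throated Gray Warbler still comes first, since it has the higher visual score, but it carries the caution flag.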

If someone has a link to some of the details on how this currently works, I’d be interested.

4 Likes

That’s effectively what it does now, right down to the scoring. The geo match is handled by the “seen nearby” indicator.

Right now the rank display is driven solely by visual match, though; the geo match is for information only, so things with a 0 score on geography still get suggested, even as the first option.

EDIT - to answer your other question, it only reviews the 1st photo in an observation.

3 Likes

Chris, I’m thinking to be most effective for use in informing Computer Vision rankings, it should be proximity to a combination of your #1 and #4 – where it is regular and expected including invaded areas.

The question then becomes, how do we define that range for the vast number of taxa in iNaturalist that have enough high-grade observations for Computer Vision to include (or that soon will)? I think the only practical solution to that is going to have to be driven by the geographic data already in iNaturalist for those same high-grade observations – even though that leads to some circularity. That doesn’t bother me too much because it is only a tool to suggest IDs, and should still deal with things way beyond previously reported ranges pretty well.

In cases where atlases or ranges already exist for taxa in iNaturalist, those should certainly be considered as part of the “geographic data already in iNaturalist” and probably be given higher weight in determining “range proximity” to help with the circularity issue above – although in the case of atlases in particular, they can potentially be either incomplete or overly broad in defining a taxon range (or both!).

My final thought has to do with all the other potential taxa that don’t yet have enough high-grade observations for Computer Vision to grab. I would like to see any implementation of geography-informed Computer Vision include, via a link next to each listed suggestion, any other taxa in the same genus (family?) with too few observations to be included in Computer Vision, but with range proximities at least as close as the suggestion being shown. This would (1) remind the user that Computer Vision is not an infallible or complete list of possibilities, and (2) give the user additional tools to actually find the right ID.
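A rough Python sketch of that "other taxa to consider" lookup follows. Everything in it is an assumption for illustration: the `MIN_OBS_FOR_CV` cutoff, the dictionary fields, the distance-to-range measure, and the placeholder taxon names are all invented, and the real Computer Vision inclusion threshold is not something I know.

```python
MIN_OBS_FOR_CV = 100  # hypothetical cutoff; the real threshold is not stated in this thread

def overlooked_relatives(suggestion, taxa, min_obs=MIN_OBS_FOR_CV):
    """Return same-genus taxa with too few observations for Computer Vision
    whose range is at least as close to the observation as the suggestion's."""
    return [
        t["name"]
        for t in taxa
        if t["genus"] == suggestion["genus"]
        and t["name"] != suggestion["name"]
        and t["observations"] < min_obs
        and t["range_distance_km"] <= suggestion["range_distance_km"]
    ]

# Placeholder taxa (invented names and numbers), not real records.
taxa = [
    {"name": "Exemplus communis", "genus": "Exemplus", "observations": 5000, "range_distance_km": 0},
    {"name": "Exemplus rarus", "genus": "Exemplus", "observations": 12, "range_distance_km": 0},
    {"name": "Exemplus remotus", "genus": "Exemplus", "observations": 8, "range_distance_km": 900},
    {"name": "Alius obscurus", "genus": "Alius", "observations": 3, "range_distance_km": 0},
]
suggestion = taxa[0]
also_consider = overlooked_relatives(suggestion, taxa)
print(also_consider)
```

Here only the poorly observed relative whose range is as close as the suggestion's would be linked next to the suggestion; the distant relative and the taxon from another genus are left out.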

Sorry if this last is veering too far into the weeds of implementation, but I think how we want to use taxon ranges needs to guide how we want to define taxon ranges in iNaturalist.

2 Likes

Housekeeping FYI for all, related discussion going on here.

A post was merged into an existing topic: Species Suggestions for the Wrong Continent

It may be unlikely that an inexperienced user will find rare animals, but it will happen. I saw an Ocola Skipper in rare my second year of doing butterfly checklists. Rick Cavasin says on his web site that he has never seen one in Ontario. I saw a Black Dash my third year and Max Larrivee emailed me to congratulate me as he has never seen one. I certainly wasn’t looking for either of them since I had never heard of them until I saw them. I think most new users make range mistakes - I know I did, but if I had only been using iNat when I saw these species, and the range had been set to eliminate them (especially the Ocola Skipper), it would have been frustrating to ID them and perhaps off-putting for me to continue using the platform. I don’t have an answer to the range problem other than the very nice and patient people who helped me understand how the site worked and pointed out my mistakes kindly.

4 Likes

I’ve had similar experiences with a couple insects, though I had ID’d them just to genus or order and more knowledgeable users later revealed them to be exciting finds. But with the question here mainly being in regard to refining the computer vision suggestions, I’m not sure the changes proposed here would have changed your experience. You IDed those butterflies, right, not the iNat suggester? I don’t think anyone would suggest that users should be prevented from manually IDing an observation outside of a given set range. The solutions suggested here wouldn’t eliminate those species from iNat suggester consideration outside of their designated range, just consider proximity of known range more heavily in ranking species suggestions.
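As a sketch of what "weigh proximity more heavily without eliminating anything" could look like, here is a small Python example. The blending formula, the `geo_weight` parameter, and the 0..1 scores (the warbler numbers from earlier in the thread, rescaled) are assumptions for illustration, not the actual iNaturalist ranking.

```python
def rerank(candidates, geo_weight=0.5):
    """Re-rank (name, visual, geo) tuples, with scores in 0..1, by a weighted blend.

    geo_weight=0.0 reproduces a purely visual ranking. No candidate is
    ever dropped; out-of-range species only move down the list.
    """
    def combined(candidate):
        _, visual, geo = candidate
        return (1 - geo_weight) * visual + geo_weight * geo
    return sorted(candidates, key=combined, reverse=True)

candidates = [
    ("Black-throated Gray Warbler", 0.95, 0.05),
    ("Black-and-white Warbler", 0.85, 0.76),
]
visual_only = rerank(candidates, geo_weight=0.0)
blended = rerank(candidates, geo_weight=0.5)
print([name for name, _, _ in visual_only])
print([name for name, _, _ in blended])
```

With the blend, the common species moves to the top, but the rarity is still on the list for anyone who really did find one.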

5 Likes

Actually, they were identified on e-Butterfly. If I wasn’t using that platform and trying to ID them only on iNaturalist, it would have been difficult for me if they were shown as out of range, especially when I was just starting to use iNaturalist. I made range mistakes a lot at the beginning so wouldn’t want to make that mistake again. Likely I would have ignored any species shown as out of range.

The range shouldn’t include IDs that are not community-based.

I just spent quite a bit of time correcting the reports of Scotophaeus blackwalli (aka “Mouse Spider”) reported on the map given for the species.

A few frustrated me by remaining on the map despite my having corrected them. A friend explained that some of these have opted out of community ID.

If they’ve opted out of community ID, they can’t be trusted, unless I specifically know the identifier and trust that person’s IDs. So I don’t want to see them on the map.

I also don’t see a way to exclude these opt-out observations from the filtered search. Am I missing something? (But it wouldn’t suffice just to provide a checkbox for getting rid of them – they really shouldn’t distract me on the main species page map.)

1 Like

If they don’t meet the criteria for research grade, they will display as not research grade, so you can filter them that way, but you may also lose evidence-less observations you might want.

1 Like

Oh right. But my purpose is to correct people’s IDs and improve the map. I want those that are not research grade, but not those I can’t help correct.

1 Like

Two things - the percentage of people who opt out is really quite low, so they are a relatively infrequent find. Secondly, just because someone has opted out does not mean you can’t help correct an ID of theirs. If the user is active, and sees your ID, they may choose to update their identifications. It is really only the cases where the user has opted out AND is inactive that are beyond help.

Also, if/when their incorrect identification is overwhelmingly outvoted by the community (becomes “maverick”), their observation then disappears from the range map. It’s not an ideal situation, but it’s how it currently works in the system as it’s set up, with “observation IDs” (what’s at the top of the page and what the search queries find) and community ID (the ID on the right side of each observation page).

1 Like

I see. They could still change their ID, and if they don’t it eventually goes off the map.

But how do I keep myself from repeatedly revisiting recalcitrant observations in my efforts to clean up the IDs?

Mind you, I’m already frustrated with a few observations that ARE community based but no one bothers to come back to revisit for my correction. I already deal with putting effort into observations that never get fixed. I’m not keen on having any more of that.

I think iNaturalist is great for posting observations and spreading love for biodiversity, but I have a love/hate relationship with it trying to provide IDs. I go through spurts of “I’ll get these cleaned up” and “Why bother, as there’s a limit to how clean they can get.”

2 Likes

I hear where you are coming from, that has been a learning curve for me too. The best I can do in a community-driven site is to do my best at making my case on each observation where I disagree, adding a comment that is informative (without being argumentative) about why my ID differs. It’s up to me to be persuasive enough to the rest of the users involved with the observation.

If successful in enough cases, then over time it will help bend the knowledge curve of other observers and identifiers for that particular taxon.

Failing that, I can still find and export observation data of interest for my own uses. It’s just less convenient than having it already clean and tidy within iNaturalist.

4 Likes

I assume you are finding things to look at from the taxon page map right now?

There you can’t limit sightings to things you have not reviewed, but you can do that on the explore page.

Go to the Explore page, enter the taxa you want to review, click the Filters button, then expand the More Filters section and set Reviewed to No. That way anything you have entered an ID on, or manually marked as reviewed, will be excluded. Then look at the results in map view.

For example, this query, which is built that way, shows all records of Common Loon in Canada that I have not reviewed:
https://www.inaturalist.org/observations?place_id=6712&reviewed=false&taxon_id=4626
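If you want to build links like that for other taxa or places, a few lines of Python will do it. This is only a sketch: the three parameter names and the two IDs are copied from the URL above, the helper name `unreviewed_url` is made up, and I am assuming "reviewed" is evaluated for whoever is logged in when the link is opened.

```python
from urllib.parse import urlencode

BASE_URL = "https://www.inaturalist.org/observations"

def unreviewed_url(taxon_id, place_id):
    """Build an Explore link for observations not yet marked as reviewed."""
    params = {
        "place_id": place_id,
        "reviewed": "false",
        "taxon_id": taxon_id,
    }
    return f"{BASE_URL}?{urlencode(params)}"

# Common Loon (taxon 4626) in Canada (place 6712), as in the link above.
url = unreviewed_url(taxon_id=4626, place_id=6712)
print(url)
```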

2 Likes

Or similarly, use the identify page and filter for not reviewed and click r for anything you don’t add an ID for but don’t want to see again.

Okay thank you. I forgot about marking things as reviewed.

I so much prefer the BugGuide approach. I trust their spider IDs. And when I find a problem and report it, we have a discussion, and the ID gets resolved to everyone’s satisfaction.

I like iNaturalist because it’s so easy for people to use that so many more photos get posted. I’m tracking a few groups that require revision and even an undescribed species I can ID from a photo. Under the iNaturalist approach, I’d rather be able to specify the people whose IDs I trust and opt to just see their IDs and no others, particularly in range maps.

Voting works when there are large numbers of experts, as there are with butterflies, birds, plants (I assume!), and even dragonflies. Reliable field guides don’t exist for spiders, and voting makes no sense for them. We aren’t even using the voting mechanism to “vote” per se. Non-experts typically throw out a guess and then just duplicate whatever we later post. That’s not voting. The experts get into a conversation and reach a consensus conclusion. That’s not voting either. Maybe on rare occasions experts will disagree vehemently enough to post two different IDs, and that may be voting. It’s just not a good fit, and I’d rather the range maps reflect MY confidence rather than nonsensical voting algorithms.

1 Like

Maybe a partial solution would be to let me define a named set of people and filter search results to just the IDs by those people.
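In Python terms, what I have in mind is roughly the following. This is only a sketch of the idea, not an existing iNaturalist feature; the record layout, the usernames, and the helper name `ids_from_trusted` are all invented for illustration.

```python
def ids_from_trusted(identifications, trusted_users):
    """Keep only identifications made by users in a named trusted set.

    Usernames are compared case-insensitively (an assumption).
    """
    trusted = {name.lower() for name in trusted_users}
    return [i for i in identifications if i["user"].lower() in trusted]

# A named set of people I trust for spiders (made-up usernames).
spider_experts = {"expert_a", "expert_b"}

# Made-up identification records.
identifications = [
    {"observation": 1, "user": "expert_a", "taxon": "Scotophaeus blackwalli"},
    {"observation": 2, "user": "newcomer_1", "taxon": "Scotophaeus blackwalli"},
    {"observation": 3, "user": "Expert_B", "taxon": "Gnaphosidae"},
]
trusted_ids = ids_from_trusted(identifications, spider_experts)
print([i["observation"] for i in trusted_ids])
```

A range map drawn only from `trusted_ids` would then reflect the IDs of the people in the set and nothing else.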