Is there a description of how Seen Nearby should work?

rymcdaniel · March 18, 2022, 6:16pm

Is there a description anywhere of how Seen Nearby in taxa suggestions is supposed to work? I spent some time last fall trying to clean up some range confusion on two species of Callirhoe, but it seems like I am still getting incorrect “Seen nearby” results on new observations. However, it is difficult to find a description of how the feature is supposed to work and how frequently it is updated, so I don’t know if it is actually operating correctly or not. Thanks.

thomaseverest · March 18, 2022, 6:24pm

There’s some helpful info here:
https://forum.inaturalist.org/t/range-covered-by-the-seen-nearby-feature/2849/5

rymcdaniel · March 18, 2022, 6:40pm

Thanks. Adding the link to the old forum from that link:
https://groups.google.com/g/inaturalist/c/-QJM52iQm10/m/-K2EZY8FBwAJ

The basic operation described in that link from Tony Iwane was:
““Seen Nearby” searches for Research Grade observations - excepting the observation in question - within 100km of the requested coordinates and 45 days before and after the date specified (if there is one).”

Based on that, it seems like I am getting incorrect suggestions (a species included in seen nearby that shouldn’t be), but I’ll wait for more feedback. I am wondering if perhaps the species ranges are updated infrequently, but I have not seen any description of when it is evaluated.

egordon88 · March 18, 2022, 7:04pm

Did you review all erroneous IDs, including Casual, Research Grade, and people who opt out of community ID (might have to use the “No” checkbox in DQA if they are unresponsive)?

thomaseverest · March 18, 2022, 7:21pm

Are you seeing this on your own observations or those from other people? The AI can still suggest things (with the suggested icon) that were not seen nearby.

I’m also not sure how long it takes for this to get refreshed. If you just reIDed things, it might not have updated yet.

rymcdaniel · March 18, 2022, 7:29pm

It is on other people’s observations. The observer identifies the observation as the incorrect species, presumably based on the computer vision/seen nearby suggestion. When I go to ID the observation, I can see that it is suggesting the out-of-range species not only as Visually Similar, but also as Seen Nearby.

I corrected all the local erroneous IDs at least a month ago and probably more. I assumed it would take some time to update, but I am wondering if the range only updates every few months or so, as that might explain what I am seeing.

rymcdaniel · March 18, 2022, 7:33pm

I am not sure exactly what you mean. Basically, I can see that the suggestion box is suggesting that a certain species is seen nearby when the closest research grade observations I can find are roughly 200km away. Maybe that is within spec. I don’t know, which is why I am looking for clarification on how it should work before opening a bug report.

bernhard_hiller · March 18, 2022, 7:48pm

Experienced that, too: https://www.inaturalist.org/observations/83386805
The nearest non-research grade observation of Podarcis liolepis is already farther than 100km away, and the nearest research grade ones are some 500 km away or more.

egordon88 · March 18, 2022, 8:31pm

I mean that you have to be really thorough to “cleanse” every wrong ID that might feed into “Seen Nearby”, and there is also probably a small waiting period for the system to reanalyze data. I went through this in the Austin area trying to stop the CV/users from picking Chamaesaracha coronopus vs the expected C. edwardsiana. There are still a few lingering incorrect observations where I have maverick IDs.

I don’t see any issues when I click in the species box for CV - only the Common Wall Lizard is suggested.

thomaseverest · March 18, 2022, 8:54pm

It only looks for RG observations.

Where do you see this?

tiwane · March 18, 2022, 9:18pm

The FAQ about Seen Nearby was no longer accurate, so I just updated it: https://www.inaturalist.org/pages/help#cv-seen-nearby

rymcdaniel · March 19, 2022, 2:50am

Thank you for the update. If I am interpreting “nine 1-degree grid cells in around the observation’s coordinates” correctly, then for an observation at Lat: 30.695944, Lon: -97.827117, an observation at Lat: 32.521448, Lon: -97.344051 would not qualify as seen nearby. Is that correct?

Also, can you elaborate on the nine 1-degree grid cells? Are they centered on coordinates of the observation or on the 1 degree grid cell that the observation falls in (by truncating the coordinates to degree or similar)?

If I have that correct and the above coordinates would not qualify as seen nearby, then I will have to try to track down whether some RG observations have changed (from RG of the species in question to some other status) in the last few months in a way that might be affecting this. I don’t see anything so far, and I am not sure it is possible to find, but I have a vague recollection of changing one sometime in the last 6 months (could have been 2 months, could have been 6).

Is it possible to find out when the seen nearby data was last updated for a certain species, say Callirhoe alcaeoides (taxon 159670)? :) Just thought I would take a shot.

Thanks again.

DianaStuder · March 19, 2022, 8:30pm

would be research grade if it wasn’t captive

An exotic garden plant (captive) will be suggested as seen nearby?

bernhard_hiller · March 20, 2022, 8:18pm

Today. Not when I submitted the observation.

bernhard_hiller · March 20, 2022, 8:24pm

Could make sense: imagine the wind blew the seeds away, and they germinated somewhere in the neighborhood.

rymcdaniel · April 5, 2022, 5:23pm

I think I have tracked down the unexpected behavior in seen nearby to how the grid of cells is defined. I spent a few hours over the weekend trying to find the relevant code, though I am not terribly familiar with JavaScript, Ruby, or the iNat codebase in general, so forgive me if this analysis is incorrect or the code is not in use. If I am correct, the relevant code is found in the following file:
https://github.com/inaturalist/iNaturalistAPI/blob/main/lib/controllers/v1/taxa_controller.js
in the following section:

TaxaController.nearbyTaxaRedisQuerySet = req => {
let swlat = Math.floor( req.query.lat - 0.5 );
let swlng = Math.floor( req.query.lng - 0.5 );

Later code defines the range of cells based on this southwest corner of the grid.
const baseKeys = [];
// search within a range of cells
_.each( _.range( swlat, swlat + 3 ), lat => {
  _.each( _.range( swlng, swlng + 3 ), lng => {
    baseKeys.push( `${lat}.${lng}` );
  } );
} );

The problem is the subtraction of only half a degree instead of a full degree from the original coordinates. For coordinates where the fractional part of the degree is 0.5 or greater, the base of the grid will be set to the same degree that the observation is in. For example, for a coordinate of 30.6, 25.6, this code would produce a base cell of 30,25, and it would set the cell that the observation is in as the southwest cell of the grid. This produces a search grid that is heavily biased to the northeast of the observation in this scenario. Basically, in many cases, this way of doing things puts the observation’s cell on the edge of the search grid instead of in the center. I don’t know if one can call this a bug, since there doesn’t appear to be a clear specification, but it definitely seems to set up an asymmetrical grid in many cases. That seems undesirable, but perhaps there could be a justification for it.
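To make the asymmetry concrete, here is a minimal sketch of the grid construction described above, rewritten without lodash so it runs on its own. The names searchGrid and ownCellOffset are hypothetical helpers for illustration and are not part of iNaturalistAPI; only the Math.floor( coordinate - 0.5 ) step and the three-cell range come from the quoted code.

```javascript
// Hypothetical helper: rebuilds the 3x3 list of cell keys from the quoted
// logic (southwest corner = floor( coordinate - 0.5 ), then three cells
// north and three cells east).
function searchGrid( lat, lng ) {
  const swlat = Math.floor( lat - 0.5 );
  const swlng = Math.floor( lng - 0.5 );
  const keys = [];
  for ( let la = swlat; la < swlat + 3; la += 1 ) {
    for ( let ln = swlng; ln < swlng + 3; ln += 1 ) {
      keys.push( `${la}.${ln}` );
    }
  }
  return keys;
}

// Hypothetical helper: where the observation's own cell sits along one axis
// of the grid. 0 means the southern (or western) edge, 1 means the center.
function ownCellOffset( coord ) {
  return Math.floor( coord ) - Math.floor( coord - 0.5 );
}

console.log( searchGrid( 30.6, 25.6 )[0] ); // "30.25", the observation's own cell
console.log( ownCellOffset( 30.6 ) ); // 0: own cell is on the edge of the grid
console.log( ownCellOffset( 30.4 ) ); // 1: own cell is in the center
```

In this sketch, any coordinate whose fractional part is 0.5 or more ends up with an offset of 0 on that axis, which is the skew described above.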

For a real world example, I have been working in the genus Callirhoe. Light colored specimens of Callirhoe involucrata var. lineariloba are often confused with Callirhoe alcaeoides, which is also light, even though the two typically do not occur within about 200km of each other.

For example, in the following observation of C. involucrata var. lineariloba, C. alcaeoides is suggested as seen nearby.
https://www.inaturalist.org/observations/110036378
Lat: 30.651008, Lon: -97.667386, Accuracy: 23m

The closest research grade C. alcaeoides is in the general area of the following observation, though a couple have been added recently that may be marginally closer.
https://www.inaturalist.org/observations/24411506
Lat: 32.528108, Lon: -97.35782, Accuracy: 86m

For the first coordinate, the grid code would set “swlat” to 30, and cells at latitudes 30, 31, and 32 would be included in the search grid, so the C. alcaeoides observation would fall inside the seen nearby grid. The original observation is placed in the southern row of the 9-cell grid, and the search grid is skewed to the north.

In contrast, in the following observation of C. involucrata var. lineariloba, which is just a few tenths of a degree to the south of the first one, C. alcaeoides is not suggested as seen nearby.
https://www.inaturalist.org/observations/42914447
Lat: 30.353528, Lon: -97.735595

Here “swlat” would be set to 29 instead of 30, and the latitudes searched would be 29, 30, and 31, so the C. alcaeoides observation above would not register as seen nearby. Here the observation sits in the center row north/south.
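The two cases can be checked with a short sketch using the coordinates quoted above. It assumes that an observation is indexed under the key floor(lat).floor(lng); I have not confirmed how the keys are actually written on the indexing side, so cellKey is a hypothetical helper, as is gridKeys.

```javascript
// Hypothetical helper: the nine cell keys searched for a given coordinate,
// following the quoted Math.floor( coordinate - 0.5 ) logic.
function gridKeys( lat, lng ) {
  const swlat = Math.floor( lat - 0.5 );
  const swlng = Math.floor( lng - 0.5 );
  const keys = [];
  for ( let la = swlat; la < swlat + 3; la += 1 ) {
    for ( let ln = swlng; ln < swlng + 3; ln += 1 ) {
      keys.push( `${la}.${ln}` );
    }
  }
  return keys;
}

// ASSUMPTION: an observation is stored under the cell containing it,
// keyed by the floor of each coordinate.
function cellKey( lat, lng ) {
  return `${Math.floor( lat )}.${Math.floor( lng )}`;
}

// Research grade C. alcaeoides (observation 24411506)
const alcaeoidesCell = cellKey( 32.528108, -97.35782 ); // "32.-98"

// Observation 110036378: latitude rows 30, 31, 32
const northernGrid = gridKeys( 30.651008, -97.667386 );
// Observation 42914447: latitude rows 29, 30, 31
const southernGrid = gridKeys( 30.353528, -97.735595 );

console.log( northernGrid.includes( alcaeoidesCell ) ); // true
console.log( southernGrid.includes( alcaeoidesCell ) ); // false
```

Under that assumption, the only difference between the two results is which side of the half-degree mark the latitude falls on.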

I have not evaluated all the cases, and I don’t fully understand what area negative coordinates cover, but I suspect one would want to subtract a full degree from the original coordinates to achieve a more balanced grid.
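As a sketch of that suggestion (not a tested patch against iNaturalistAPI), subtracting a full degree amounts to flooring the coordinate and then stepping back one cell. The name centeredGrid is hypothetical. On the negative coordinates: Math.floor rounds toward negative infinity, so a longitude of -97.67 floors to -98, and cell -98 spans -98 to -97.

```javascript
// Hypothetical alternative: start the grid one full cell southwest of the
// cell that contains the coordinate, so that cell is always the center.
function centeredGrid( lat, lng ) {
  const swlat = Math.floor( lat ) - 1;
  const swlng = Math.floor( lng ) - 1;
  const keys = [];
  for ( let la = swlat; la < swlat + 3; la += 1 ) {
    for ( let ln = swlng; ln < swlng + 3; ln += 1 ) {
      keys.push( `${la}.${ln}` );
    }
  }
  return keys;
}

// Observation 110036378 from the example above
const centered = centeredGrid( 30.651008, -97.667386 );

// Keys are pushed row by row, so index 4 is the middle of the 3x3 grid.
console.log( centered[4] ); // "30.-98", the observation's own cell
console.log( centered.includes( "32.-98" ) ); // false: latitude row 32 is no longer searched
```

With this variant the rows searched for that observation would be 29, 30, and 31, regardless of where in the cell the observation falls.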

keirmorse · April 21, 2022, 2:11am

I’ve definitely seen a slight increase in misidentifications lately, which appears to be due to suggestions that are “seen nearby” even though there are no research grade or casual observations of that species nearby. Looking at one species that has been popping up in suggestions a lot lately, well outside of its range: for an example observation with the suggestion, the closest observation of the suggested species is ~130 km away, and that one is a garden planting. The actual range of this species is only about 25 km further away though, so ~155 km.

The wide range for “seen nearby” seems to be possibly problematic for those regions that have a lot of observations like California. That said, it could arguably be too small for those regions with few observations. I’m guessing the chosen size is to try to find a balance between each situation. It would be awesome if it could be scaled to the number of observations in an area.

rymcdaniel · April 22, 2022, 4:56pm

@tiwane Just wondering if you had any comments as to my findings about the setup of the search grid. I see on the announcement of the new Computer Vision model that you all are working on improvements to suggestions, so perhaps this methodology will change anyway?

tiwane · April 22, 2022, 9:38pm

Yeah. Not much more I can say about it at this point, but I think it should replace the current method so we’re unlikely to make changes so we can focus resources on other things.

system · June 21, 2022, 9:38pm

This topic was automatically closed 60 days after the last reply. New replies are no longer allowed.
