Is there a description of how Seen Nearby should work?

Is there a description anywhere of how Seen Nearby in taxa suggestions is supposed to work? I spent some time last fall trying to clean up some range confusion on two species of Callirhoe, but it seems like I am still getting incorrect “Seen nearby” results on new observations. However, it is difficult to find a description of how the feature is supposed to work and how frequently it is updated, so I don’t know if it is actually operating correctly or not. Thanks.


There’s some helpful info here:

Thanks. Adding the link to the old forum from that link:

The basic operation described in that link from Tony Iwane was:
““Seen Nearby” searches for Research Grade observations - excepting the observation in question - within 100km of the requested coordinates and 45 days before and after the date specified (if there is one).”
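
As a rough illustration of the rule quoted above, here is a minimal sketch that filters a list of observations by quality grade, distance, and date window. The record shape (`id`, `lat`, `lng`, `date`, `qualityGrade`), the function names, and the use of the haversine great-circle distance are all assumptions for illustration; the quote does not say how the 100 km is measured, or whether the 45-day window applies to calendar dates or to the time of year across all years. This is not iNaturalist's code.

```javascript
// Hypothetical sketch of the quoted rule: Research Grade observations within
// 100 km and +/- 45 days of the request, excluding the observation itself.
const EARTH_RADIUS_KM = 6371;
const MS_PER_DAY = 24 * 60 * 60 * 1000;

// Great-circle (haversine) distance in km between two lat/lng points.
function haversineKm( lat1, lng1, lat2, lng2 ) {
  const toRad = deg => ( deg * Math.PI ) / 180;
  const dLat = toRad( lat2 - lat1 );
  const dLng = toRad( lng2 - lng1 );
  const a = Math.sin( dLat / 2 ) ** 2
    + Math.cos( toRad( lat1 ) ) * Math.cos( toRad( lat2 ) ) * Math.sin( dLng / 2 ) ** 2;
  return 2 * EARTH_RADIUS_KM * Math.asin( Math.sqrt( a ) );
}

// query: { lat, lng, date?, excludeId? }
// observations: [{ id, lat, lng, date, qualityGrade }]
function seenNearby( query, observations, radiusKm = 100, windowDays = 45 ) {
  return observations.filter( obs => {
    if ( obs.id === query.excludeId ) return false; // skip the observation in question
    if ( obs.qualityGrade !== "research" ) return false;
    if ( haversineKm( query.lat, query.lng, obs.lat, obs.lng ) > radiusKm ) return false;
    if ( !query.date ) return true; // no date given: no date window applied
    // Assumption: compares calendar dates, not day-of-year across years.
    const days = Math.abs( new Date( obs.date ) - new Date( query.date ) ) / MS_PER_DAY;
    return days <= windowDays;
  } );
}
```

Under this reading, a Research Grade record roughly 200 km away would never be returned, which is why a suggestion at that distance looked out of spec.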

Based on that, it seems like I am getting incorrect suggestions (a species included in seen nearby that shouldn’t be), but I’ll wait for more feedback. I am wondering if perhaps the species ranges are updated infrequently, but I have not seen any description of when it is evaluated.

Did you review all erroneous IDs, including Casual, Research Grade, and people who opt out of community ID (might have to use the “No” checkbox in DQA if they are unresponsive)?

Are you seeing this on your own observations or those from other people? The AI can still suggest things (with the suggested icon) that were not seen nearby.

I’m also not sure how long it takes for this to get refreshed. If you just reIDed things, it might not have updated yet.

It is on other people’s observations. The observer identifies the observation as the incorrect species, presumably based on the computer vision/seen nearby suggestion. When I go to ID the observation, I can see that it is suggesting the out-of-range species not only as Visually Similar, but also as Seen Nearby.

I corrected all the local erroneous IDs at least a month ago and probably more. I assumed it would take some time to update, but I am wondering if the range only updates every few months or so, as that might explain what I am seeing.

I am not sure exactly what you mean. Basically, I can see that the suggestion box is suggesting that a certain species is seen nearby when the closest research grade observations I can find are roughly 200km away. Maybe that is within spec. I don’t know, which is why I am looking for clarification on how it should work before opening a bug report.

Experienced that, too:
The nearest non-research grade observation of Podarcis liolepis is already farther than 100km, research grade some 500 km upwards.

I mean that you have to be really thorough to “cleanse” every wrong ID that might feed into “Seen Nearby”, and there is also probably a small waiting period for the system to reanalyze data. I went through this in the Austin area trying to stop the CV/users from picking Chamaesaracha coronopus vs the expected C. edwardsiana. There are still a few lingering incorrect observations where I have maverick IDs.

I don’t see any issues when I click in the species box for CV - only the Common Wall Lizard is suggested.

It only looks for RG observations.

Where do you see this?

The FAQ about Seen Nearby was no longer accurate, so I just updated it:


Thank you for the update. If I am interpreting that correctly, based on the “nine 1-degree grid cells in around the observation’s coordinates”, then for an observation at Lat: 30.695944 Lon: -97.827117, an observation at Lat: 32.521448 Lon: -97.344051 would not qualify as seen nearby. Is that correct?

Also, can you elaborate on the nine 1-degree grid cells? Are they centered on coordinates of the observation or on the 1 degree grid cell that the observation falls in (by truncating the coordinates to degree or similar)?

If I have that correct, and the above coordinates would not qualify as seen nearby, then I will have to try to track down whether some RG observations have changed (from RG of the species in question to some other status) in the last few months in a way that might be affecting this. I don’t see anything so far, and I am not sure it is possible to find, but I have a vague recollection of changing one sometime in the last 6 months (could have been 2 months, could have been 6).

Is it possible to find out when the seen nearby data was last updated for a certain species, say Callirhoe alcaeoides (taxon 159670)? :) Just thought I would take a shot.

Thanks again.

would be research grade if it wasn’t captive

An exotic garden plant (captive) will be suggested as seen nearby?

Today. Not when I submitted the observation.

Could make sense: imagine the wind blew the seeds away, and they germinated somewhere in the neighborhood.

I think I have tracked down the unexpected behavior in seen nearby to how the grid of cells is defined. I spent a few hours over the weekend trying to track down the relevant code, though I am not terribly familiar with JavaScript, Ruby, or the iNat codebase in general, so forgive me if this analysis is incorrect or the code is not in use. If I am correct, the relevant code is found in the following file:
in the following section:

TaxaController.nearbyTaxaRedisQuerySet = req => {
  let swlat = Math.floor( req.query.lat - 0.5 );
  let swlng = Math.floor( req.query.lng - 0.5 );

Later code defines the range of cells based on this southwest corner of the grid.
  const baseKeys = [];
  // search within a range of cells
  _.each( _.range( swlat, swlat + 3 ), lat => {
    _.each( _.range( swlng, swlng + 3 ), lng => {
      baseKeys.push( `${lat}.${lng}` );
    } );
  } );

The problem is the subtraction of only half a degree, instead of a full degree, from the original coordinates. For coordinates where the fractional part of the degree is greater than 0.5, the base of the grid will be set to the same degree that the observation is in. For example, for a coordinate of 30.6, 25.6, this code would produce a base cell of 30,25, making the cell that the observation is in the southwest cell of the grid. This produces a search grid that is heavily biased to the northeast of the observation in this scenario. Basically, in many cases, this way of doing things puts the observation’s cell on the edge of the search grid instead of in the center. I don’t know if one can call this a bug, since there doesn’t appear to be a clear specification, but it definitely seems to set up an asymmetrical grid in many cases. That seems undesirable, but perhaps there could be a justification for it.
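
To make the skew concrete, here is a small re-creation of the cell selection described above, written with plain loops instead of lodash. The function name `gridCells` and the `offset` parameter are hypothetical; only the `Math.floor( coord - 0.5 )` base and the three-cell range come from the excerpt.

```javascript
// Re-creation of the cell selection in the excerpt, without lodash.
// gridCells and its offset parameter are hypothetical names for illustration.
function gridCells( lat, lng, offset = 0.5 ) {
  const swlat = Math.floor( lat - offset );
  const swlng = Math.floor( lng - offset );
  const keys = [];
  for ( let la = swlat; la < swlat + 3; la += 1 ) {
    for ( let ln = swlng; ln < swlng + 3; ln += 1 ) {
      keys.push( `${la}.${ln}` );
    }
  }
  return keys;
}

// Fractional parts above 0.5: the observation's own cell (30.25) is the
// first key, i.e. the southwest corner, so the grid extends only north and east.
console.log( gridCells( 30.6, 25.6 ) );

// Fractional parts below 0.5: the observation's own cell (30.25) is the
// fifth key, i.e. the center of the 3x3 grid.
console.log( gridCells( 30.4, 25.4 ) );
```

Two observations in the same 1-degree cell can therefore be given different search grids, depending only on which half of the cell they fall in.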

For a real world example, I have been working in the genus Callirhoe. Light colored specimens of Callirhoe involucrata var. lineariloba are often confused with Callirhoe alcaeoides, which is also light, despite the fact that they typically are not any closer than about 200km.

For example, in the following observation of C. involucrata var. lineariloba, C. alcaeoides is suggested as seen nearby.
Lat: 30.651008 Lon: -97.667386 Accuracy: 23m

The closest research grade C. alcaeoides is in the general area of the following observation, though a couple have been added recently that may be marginally closer.
Lat: 32.528108 Lon: -97.35782 Accuracy: 86m

For the first coordinate, the grid code would set “swlat” to 30, and cells at latitudes 31 and 32 would be included in the search grid, so the C. alcaeoides observation would fall within the seen nearby grid. The original observation is placed on the southern row of the 9-cell grid, and the search grid is skewed to the north.

In contrast, in the following observation of C. involucrata var. lineariloba, which is just a few tenths of a degree to the south of the first one, C. alcaeoides is not suggested as seen nearby.
Lat: 30.353528Lon: -97.735595

Here “swlat” would be set to 29 instead of 30, and the latitudes searched would be 29, 30, and 31, so the C. alcaeoides observation above would not register as seen nearby. Here the observation sits in the center row north/south.
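
The two cases above can be checked with a short sketch. The helper name `gridContains` is hypothetical, and it assumes that an observation belongs to the 1-degree cell named by `Math.floor` of its coordinates, which I have not confirmed in the codebase.

```javascript
// Hypothetical helper: does the 3x3 grid built for (lat, lng) contain the
// 1-degree cell of (targetLat, targetLng)?
// Assumption: a point belongs to the cell named by Math.floor of its coordinates.
function gridContains( lat, lng, targetLat, targetLng ) {
  const swlat = Math.floor( lat - 0.5 );
  const swlng = Math.floor( lng - 0.5 );
  const tLat = Math.floor( targetLat );
  const tLng = Math.floor( targetLng );
  return tLat >= swlat && tLat <= swlat + 2
    && tLng >= swlng && tLng <= swlng + 2;
}

// Closest research grade C. alcaeoides mentioned above.
const alcaeoides = { lat: 32.528108, lng: -97.35782 };

// First observation: latitude rows 30, 31, 32, so the target row 32 is included.
console.log( gridContains( 30.651008, -97.667386, alcaeoides.lat, alcaeoides.lng ) ); // true

// Second observation: latitude rows 29, 30, 31, so the target row 32 is excluded.
console.log( gridContains( 30.353528, -97.735595, alcaeoides.lat, alcaeoides.lng ) ); // false
```

In both cases the longitude columns come out as -99, -98, and -97, so it is the latitude rows alone that decide the result here.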

I have not evaluated all the cases, and I don’t fully understand what area negative coordinates cover, but I suspect one would want to subtract a full degree from the original coordinates to achieve a more balanced grid.
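
Here is a minimal sketch of that suggestion, for one axis at a time. The name `centeredRange` is hypothetical, and this is only my proposed alternative, not anything in the iNat code. On negative coordinates: `Math.floor` rounds toward negative infinity, so if cells are keyed by the floor of the coordinate (an assumption), cell -98 would cover longitudes from -98 up to, but not including, -97.

```javascript
// Sketch of the suggested change: subtract a full degree so that the
// observation's own cell is always the middle of the three cells.
// centeredRange is a hypothetical name; this is not the existing iNat code.
function centeredRange( coord ) {
  const sw = Math.floor( coord - 1 );
  return [sw, sw + 1, sw + 2];
}

// Both example latitudes now get the same rows, with row 30 in the middle.
console.log( centeredRange( 30.651008 ) ); // [ 29, 30, 31 ]
console.log( centeredRange( 30.353528 ) ); // [ 29, 30, 31 ]

// Negative longitude: -97.667386 floors to cell -98, which is the middle column.
console.log( centeredRange( -97.667386 ) ); // [ -99, -98, -97 ]
```

With this offset the grid depends only on which cell the observation is in, not on where it sits within that cell.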


I’ve definitely seen a slight increase in misidentifications lately, which appears to be due to species being suggested as “seen nearby” even though there are no research grade or casual observations of them nearby. One example is a species that has been popping up in suggestions a lot lately, well outside of its range: from an example observation carrying the suggestion, the closest observation of that species is ~130 km away, and it is a garden planting. The actual range of the species begins only about 25 km farther out, at ~155 km.

The wide range for “seen nearby” seems to be possibly problematic for those regions that have a lot of observations like California. That said, it could arguably be too small for those regions with few observations. I’m guessing the chosen size is to try to find a balance between each situation. It would be awesome if it could be scaled to the number of observations in an area.

@tiwane Just wondering if you had any comments as to my findings about the setup of the search grid. I see on the announcement of the new Computer Vision model that you all are working on improvements to suggestions, so perhaps this methodology will change anyway?

Yeah. Not much more I can say about it at this point, but I think it should replace the current method, so we’re unlikely to make changes to the current one and will focus resources on other things.
