Geomodel Hexagons Still Lacking Basic Error Checking

This closed topic touched on when geomodel predictions got horribly bad and then improved again somewhat, but they still have some ridiculous and easily fixable errors. One great example is Malacothamnus suggestions in Sequoia National Park. There are many observations of Malacothamnus fremontii in the park and no observations of Malacothamnus orbiculatus. However, CV suggestions (presumably based on the geomodel’s “expected nearby” hexagons) are suggesting the species that does not occur in the park and not suggesting the one that does. Below are the “expected nearby” maps for this area for both species with the problem hexagon highlighted in yellow.

It seems very problematic that a hexagon with several observations is excluded from the Malacothamnus fremontii “expected nearby” polygons when an adjacent hexagon that also has several observations is coded as “expected nearby”. It seems like there should be at least some very basic error checking of these “expected nearby” hexagons that would just recode them when there are such simple and blatant errors. How hard would it be to just add an extra step that recodes all hexagons as “expected nearby” if there is X number of observations in that hexagon? If you want to be conservative, it could only recode those that are adjacent to another hexagon that is already coded as “expected nearby”, though many of those are wrong as well.
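A minimal sketch of what such an extra step could look like, assuming the per-taxon hexagons are available as a set of cell IDs plus a table of observation counts per cell. The function name, the `min_obs` cutoff of 5, and the toy cell IDs are all hypothetical; this is not how iNat's pipeline is actually structured.

```python
from typing import Dict, Set


def recode_expected(
    expected: Set[str],
    obs_counts: Dict[str, int],
    neighbors: Dict[str, Set[str]],
    min_obs: int = 5,
    require_adjacent: bool = False,
) -> Set[str]:
    """Return a copy of `expected` with well-observed cells added.

    expected:         cells the geomodel already codes as "expected nearby"
    obs_counts:       observations of the taxon per cell (hypothetical input)
    neighbors:        adjacent cells for each cell (hypothetical input)
    min_obs:          the "X number of observations" cutoff
    require_adjacent: conservative mode; only recode cells that touch a
                      cell the model itself already coded as expected
    """
    recoded = set(expected)
    for cell, count in obs_counts.items():
        if count < min_obs:
            continue
        # Adjacency is checked against the model's original output only,
        # so recoded cells do not chain outward from each other.
        if require_adjacent and not (neighbors.get(cell, set()) & expected):
            continue
        recoded.add(cell)
    return recoded


# Toy data: cell "A" has 12 observations but was left out by the model.
expected = {"B"}
obs_counts = {"A": 12, "B": 9, "C": 1, "D": 7}
neighbors = {"A": {"B"}, "B": {"A", "C"}, "C": {"B"}, "D": set()}

print(sorted(recode_expected(expected, obs_counts, neighbors)))
print(sorted(recode_expected(expected, obs_counts, neighbors, require_adjacent=True)))
```

In the conservative mode, the isolated cell "D" stays excluded even though it has 7 observations, which is the tradeoff that option makes.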

The “expected nearby” hexagon without Malacothamnus orbiculatus observations is more acceptable as one could argue that maybe there is some Malacothamnus orbiculatus somewhere within that hexagon, even if unlikely. It’s a lot harder to refute absence that doesn’t have evidence than refute presence that does have evidence. So, while this Malacothamnus orbiculatus “expected nearby” hexagon is likely wrong and is leading to erroneous CV suggestions, the big problem here is that there are Malacothamnus fremontii in that hexagon and CV does not offer it as a suggestion even though there should be an easy way to recode hexagons that do have a species to be “expected nearby”. This is just very basic error checking. Is it present in the hexagon? Yes or no?

Good post, Keir! Your idea of basic error checking is something that should be implemented no matter what Geomodel is being used. But a larger question is why the Geomodel produces such nonsensical results. It is bizarre that M. orbiculatus, which lives at the eastern base of the Sierra Nevada, gets predicted for a hexagon on the western side of the mountains.

Possibly related is that recently the iNat CV has gone south at San Jacinto Mountain again, with expected species that don’t include the correct species. See, for example, this obs:

https://www.inaturalist.org/observations/360244285

Your proposed error checking would have at least re-introduced the correct species in the suggestions.

Yes, agreed.
The most recent one I have come across is Polystichum californicum.


The major population in BC lies outside of the geomodel hexagons, and there are hexagons in WA where there is definitely no suitable habitat.
One of many taxa with a geomodel that needs work.
I wish there was a way to flag taxa for this so they could be individually repaired.

I don’t think any sort of manual/individual repair solution is really feasible given the scale of implementation.

But I do like @keirmorse’s idea of some kind of conservative programmatic solution that would take out the low-hanging fruit of easy errors. I’d be interested to know if anyone can see any obvious issues with something like

buckle up, this is a long answer. I am gonna try explaining what happened here to the best of my knowledge.

I don’t think the model has improved as a whole for every observation. Specifically, SINR, which was causing its own issues, was scrapped in the interim by reverting to the previous grid approach (since model 2.24). But what you found here is not a limitation of SINR; it is the discretization of the grid approach itself, which was noted when “Seen nearby” was replaced by the geomodel’s “Expected nearby” (see this old thread, same artifact as this thread). When discretized cells are used, any geomodel is going to get stuck with such discretization artifacts, which hamper learning at boundaries.

Your suggestion that there should be code that, at bare minimum, counts these first to reduce the “False Negative Rate” in your _nevada_ example here is valid. But there is an important subtlety: it is a tradeoff between precision and recall. Saying “ok, there are 10 nevada observations in this cell, so I want the geomodel to always say it occurs in this cell” is antithetical to all that geomodel training in the first place, because the model is co-learning other things too, namely elevation and the joint modelling of species distributions.

so here is the reality, if we overlay those nevada species observations onto elevation:

I believe you can see the cause of your issue now: the observations you found at the cell border are at a much lower elevation (1000 m ish) than the cell’s centroid (3000 m ish). Notice this line, “Prior to 2.22, we trained and predicted at cell centroids”, and here is the code line for the single elevation per cell used by the model while learning.

so what is happening is that the hexagon cell is dominated by its high-elevation centroid, and the small cluster of low-elevation observations recorded on iNat at the cell edges conflicts with the cell info (which the model has decided is high elevation) that it is learning from. so the final geomodel oracle, which learns predictions on such discretized cells as a whole, says: “Nope. I won’t agree that this low-altitude species belongs to this cell with a high-altitude centroid.”

now lets go to orbiculatus:

see the pattern? the geomodel is saying “so others have seen it in high elevation cells around that absent cell, so now let me apply that logic and predict this middle cell is also highly likely cell even if no one has seen anything in that entire cell yet” - this is the hallmark of geomodel per se, we just dont want to learn only where it is recorded, but also on where else it could be. That inductive bias is what is badly needed in any model.

The big caveat with this assumption and predictions? maybe that high elevated area middle cell has some conditions that is never gonna be supportive to that orbiculatus species which it is currently assuming on priors of spatial and species correlations (again note precision-recall tradeoff), but learning such ecological correct correlations takes time and more diverse data and even subtle push by domain experts, that is definitively beyond the data that is fed to geomodel as of today, and obviously takes dev time and effort and scientific progress on finding such balanced practical algorithms first.

to reiterate, the current geomodel is not doing “seen nearby” (I would love that toggle, although it can be coded with the iNat API now), i.e. “hey, I saw this here at X elevation; if someone else saw something like this (CV prior) say 1 km away at a similar elevation, recall that species via a direct range query on this cell’s counts and suggest it for what I saw”. Rather, it is learning a function for “expected nearby”: “hey, I saw it here at this elevation; now, given these numbers and the CV prior, can you answer directly in one shot, without recalling the real observations of possible species?” That avoids range queries, disk I/O on the true data, … all the things we are trying to get rid of in the first place by training such an oracle geomodel.

And so with this comes the caveat of real world engineering versus expectations, if we want a learned model that is able to answer it perfectly, we can add everything in it but the computational power to train, update, maintain, payoffs, avoiding overfitting and precision-recall tradeoffs is the reality.

and so those h3 hexagon cells were picked as a trick to enter this practicality realm, where our one-shot model learns discretized cell-level predictions instead of continuous granular curves as distributions over the entire world. (note SINR is supposed to model this better by avoiding such discretization effects if it gets tweaked properly someday iirc)

and so the reality? because we collapse the cells and aggregate geo predictions for a taxon at the cell level, those low-altitude observations, even though they are in the same cell, failed to meet the static taxon-level geo threshold in a cell that is dominated by other terrain, which is what the model is learning in the first place. aka the current geomodel cannot learn that there are also 5 low-altitude observations in this high-altitude cell.

And I have to look at code, but since this is joint correlated species learning, maybe it being overconfident on "orbiculatus" for that specific cell you are highlighting could have created some second order impact on it also reducing its confidence to “nevada” prediction during training stage maybe.

finally, can we do better?

as long as we discretize into cells and predict on that rigid lattice while collapsing information, these artifacts will be present to some level (just like the counterpart effect of unlucky homes with certain workflows being uniquely fingerprintable even when using obscure, because obscure discretizes onto static square cells too). probably a continuous-field geographic model that learns around the manifolds to bypass rigid cells would help, but it is neither easy nor simple nor practical to get a stable model covering millions of taxa in that engineering realm, a case in point being the SINR revert by iNat now. Simply increasing the cell resolution is also not practical, although it definitely reduces this issue, as it directly impacts both training and inference resources everywhere, including all the places where these edge cases are a minority. but multi-scale modelling can be done, and some community-flagged taxa could then be pushed to higher-resolution scales, or at least in cells where there are such elevation conflicts. on the other side, simply increasing resolution forever has a negative effect: it biases the model toward human sampling behaviour and misses the regularization effect that is gained at higher resolutions.


here is @loarie comment back then so hopefully it will come someday: (but also note that this is different system than “Reduce computer vision errors” proposal because these errors are never easy to catch from purely accuracy-precision tradeoffs and such aggregate metrics automated pipelines, as in this example we definitely need external community corrective feedback if we are going to rely on such models always)

not to get too much ahead of ourselves, but it would be neat to be able to capture the community’s expertise on absences and feed that into the model. For example you’re saying ‘I know Mellagenic Horned Owl doesn’t occur here’, that’s useful information and it would be neat to try to capture that from the community in order to better teach the model - maybe a bit like how atlases are being used on the site currently

I think this is a fair look at the geomodel process itself. But I would also add that the geomodel process isn’t an end in and of itself - it’s primarily a tool to improve CV suggestions. The main point isn’t necessarily to research/build/optimize a modeling process for biodiversity, but to make a tool as useful as possible. So if there’s a post-processing/additional step outside/in addition to the model that could easily be incorporated which increases the usefulness of the tool by increasing the accuracy of the CV suggestions, I think this should very much be considered.

sorry, the point I am making above is there is no post-processing or simple additional steps because of such model itself - like what is called in this thread as basic error check by simply count - because the model is primarily training here as substitute to avoid such range queries in first place at end of pipelines.

so the point of the model here is not “definitive counts of seen nearby” but “probabilistic expected nearby”, and without overhauling the design itself to account for what we see as a missed cell, there are no easy post steps like “just count as error check”, as they are antithetical to what we are trying to avoid by designing a computationally amortized model.

let me explain by example:

if we rely on such count-and-check as a post-processing step “at cell level”, it is obviously a wrong decision when viewed through the lens of “precision”: all those red bird regions that sit at high elevation, where the yellow birds can never occur, now start screaming “hey, it’s yellow bird there too” when it’s not. this is what I was trying to get at with the precision-recall tradeoff. currently the model’s threshold tradeoffs and the current cell design end up acting as a penalty on the recall of those yellow birds, which do not dominate at this aggregate cell level (with respect to the cell’s centroid elevation and the joint species it is learning, as here)

and the purple curve is what we need to model and predict as decision boundary, if we have to be pedantic to avoid this scenario ultimately as in SDM, but as such a process is costly and as we all want a practical tool that is cheap and efficient in compute, the tradeoff of cell level predictions is what we settled for.

so without overhaul of model itself and in particular the discretization as h3 cells lattice onto which we are predicting, we cant avoid this with post processing without hitting on fundamental precision-recall itself.

aka should users now get “both yellow birds and red birds suggestions now” in that entire cell points observations just because an yellow bird is recorded at cell edge even though most of the region elevations and cell as whole could predominantly be “red species”?

and the broad decision question, “is this cell dominated by red species, or is yellow bird likely here too?”, cannot be answered without manual tuning from the community on every such edge-case region. a blunt post-processing decision like “let’s always include yellow birds in that cell, even when the user made the observation at high altitude” (which is what count-and-error-check post-processing is) can end up directly hurting precision if that yellow bird now ends up as the first prediction (say there are two cryptic species that can only be separated with altitude info)
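To put toy numbers on the red bird / yellow bird example, here is a rough sketch that scores the suggestion list at two points in the same cell, before and after a count-based patch. The taxa, the two points, and the suggestion sets are made up for illustration; no real iNat metrics are implied.

```python
from typing import Set, Tuple


def precision_recall(suggested: Set[str], truly_present: Set[str]) -> Tuple[float, float]:
    """Precision and recall of a suggestion set against what truly occurs at a point."""
    true_positives = len(suggested & truly_present)
    precision = true_positives / len(suggested) if suggested else 0.0
    recall = true_positives / len(truly_present) if truly_present else 0.0
    return precision, recall


# Hypothetical cell: high centroid (red bird habitat), low edge (yellow bird habitat).
before_patch = {"red"}            # geomodel gates out yellow for the whole cell
after_patch = {"red", "yellow"}   # count-based patch adds yellow for the whole cell

high_point_truth = {"red"}        # observation made near the centroid
low_point_truth = {"yellow"}      # observation made at the low cell edge

for label, truth in [("high point", high_point_truth), ("low point", low_point_truth)]:
    print(label, "before:", precision_recall(before_patch, truth))
    print(label, "after: ", precision_recall(after_patch, truth))
```

In this toy case the patch lifts recall at the low point from 0 to 1 while cutting precision at the high point from 1 to 0.5. Which of those matters more across all of iNat is exactly the open question in this thread.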

The answer is absolutely yes. If the red and yellow birds are both in the cell, CV should offer them both as options for the cell and, if they are cryptic and in the same genus or family, it should offer the genus or family as the primary suggestion with the species shown after. By not offering suggestions of taxa that are actually in a cell (sometimes abundantly so), it creates a system that encourages misidentifications and a feedback loop that will make things worse over time if identifiers don’t hold back the misidentifications. Having taxa known from a cell not shown as options while showing other options that may not even occur in that cell creates a sense of certainty for less experienced users (likely the majority of users) that the one reasonable option shown is likely correct. We need to take away that certainty and show the real world. They should see that there are multiple options for that area and that it is confusing and that they will have to put in effort to learn the differences rather than relying on the implied confidence of erroneous model results.

Models will never be 100% correct and often will be nowhere near 100%, therefore you should not rely entirely on the model. The model is a great starting point but, when there are such blatant errors that can be easily corrected in post processing, post processing to reduce those errors should be implemented.

Step 1: Run the geomodel.
Step 2: Code the cells based on the results.
Step 3: Accept that the model is imperfect and has a bunch of errors.
Step 4: Run a basic sanity check and recode all cells that erroneously suggest a taxon should not be expected there when it is known from the cell.
Step 5: Accept that there are still a bunch of errors but embrace the fact that there are far fewer errors than there were before.

I have also run into several annoying examples of this geomodel issue lately, especially in the Siskiyous and the mountain ranges of southern AZ. Cells with highly variable elevation do seem to be the sources of error. These same cells tend to have high biodiversity, compounding the issue. In the previous iteration of the geomodel, most observations of Horkelia congesta fell outside the geomodel’s predicted distribution!

Has anyone suggested recording elevation using two variables, i.e. the max and min elevations within a cell? I imagine the geomodel could still learn about elevational distributions by comparing the elevational ranges of cells where the taxon is present in much the same way it currently learns elevational distributions by comparing elevations of centroids of cells where the taxon is present. This could address the problematic cases we are seeing.
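As a rough illustration of the min/max idea, the sketch below compares a single centroid elevation against an elevation range for one cell. The sample elevations are invented, loosely following the ~1000 m observations and ~3000 m centroid mentioned earlier in the thread, and the helper names are hypothetical.

```python
from typing import Iterable, Tuple


def cell_elevation_range(sample_elevs_m: Iterable[float]) -> Tuple[float, float]:
    """Min and max elevation over terrain samples taken inside one cell."""
    elevs = list(sample_elevs_m)
    return min(elevs), max(elevs)


def within_range(obs_elev_m: float, elev_range: Tuple[float, float]) -> bool:
    """True if an observation's elevation falls inside the cell's range."""
    low, high = elev_range
    return low <= obs_elev_m <= high


# Invented terrain samples for a cell with steep relief.
cell_samples_m = [900.0, 1400.0, 2600.0, 3000.0, 3300.0]
centroid_elev_m = 3000.0   # the single number a centroid-only feature would see
obs_elev_m = 1000.0        # a low-elevation observation at the cell edge

elev_range = cell_elevation_range(cell_samples_m)
gap_m = abs(centroid_elev_m - obs_elev_m)

print("centroid vs observation gap (m):", gap_m)
print("cell range (m):", elev_range)
print("observation inside cell range:", within_range(obs_elev_m, elev_range))
```

With only the centroid, the observation looks 2000 m out of place; with the range, it sits comfortably inside the cell. Whether a model trained on two such features would actually learn better distributions is untested here.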

Apologies to @einsum if you feel I’m oversimplifying, but…

The benefit of including the geomodel is to broaden CV suggestions beyond just those visually similar species seen nearby and include additional other visually similar species that we might expect given the elevation. Because elevation is only a partial guide, this will likely mean that the list of suggestions sometimes includes more incorrect options, and so long as that’s not too frequent the trade-off is worth it.

But… there would also seem to be no disadvantage to automatically tweaking the hexagon map for a taxon to add in any cell with RG observations.

A model that pays attention to elevation range while always suggesting species that occur nearby seems to be better suited to real-world use by iNat observers and identifiers than one that ignores nearby records in favor of only trusting the elevation model it has built.

Elevation variation may be one source of error, but it’s very easy to find examples where the model seems to extend a species’ range in one direction and completely ignore its clear presence in a different direction, in a way that I don’t think can be to do with elevation?

Just a couple of examples: Daviesia brevifolia has a decent population down near Geelong, but the model ignores that but happily expects it a bit further north where it has no records. Similarly in the north-west of its range it’s expected on one peninsula where it’s not found but largely ignored on the one where it is found.

Then there’s Asplenium flaccidum, which is mostly found in NZ but has a bit over 100 records scattered across eastern Australia. The geomodel expects it on a few islands south of NZ where it’s never been recorded, but completely ignores Australia. That doesn’t seem very helpful to me. (Going to the other extreme, Leucopogon parviflorus is only found in Australia plus Chatham Island but is expected on much of the south coast of NZ - not to mention in many ocean-only hexagons off the coast of Victoria and South Australia??)

The end result is that I’ve tried using the geomodel anomaly to hunt for out-of-range observations before but found it so clogged up with things that clearly are present that it’s hard to find anything that doesn’t belong - and as far as relying on what’s ‘expected nearby’? So many false positives and false negatives leave me somewhat leery. All that to say: I love it as a concept, but haven’t been altogether convinced that it’s actually better than just using ‘seen nearby’.

oof. what I said is lost I guess. You just committed an Ecological fallacy.

Let me try again. Firstly, I absolutely agree that the geomodel is making type 2 errors, as you have already highlighted here, and most users’ qualms with the geomodel will largely come down to this same kind of bug. ofc any fix that reduces such type 2 errors is beneficial to geomodel users, and it could reduce some identification issues from users unknowingly committing to such wrong geomodel-gated results, …

But what I am trying to show is that the current geomodel predicts at the cell level, using the cell’s centroid elevation (as it’s cheap to design in the first place, but again I agree with you all that it should not be the final version we settle for). when a model does that, you are hit with a fundamental limitation in statistics, the “Modifiable areal unit problem”, and so what you are assuming as:

will end up being ecological fallacy as “assuming that yellow bird exists in an arbitrary location in cell in first place, the conditions for yellow bird must be uniform in that cell” and the alternative that current geomodel does is ALSO ecological fallacy “assuming that yellow bird exists in an arbitrary location in cell in first place, and the high altitude cell centroid not supportive of that (probably because its learning it as low altitude) yellow bird, the yellow bird must be discarded by geomodel”. You are missing elevation thing altogether by hardcoded uniformly reassigning probability.

what we want is neither but rather “if I observed something at X altitude in A cell I should see predictions correlating with that X altitude and also predictions correlating to A cell location within my top-k that are ranked on CV prior candidates”. That is the true real world. It is never about absolute entire cell level total recall and that is the whole point of designing Geomodel (although it falls seriously short as this case)

and so if we collapse the altitude info onto a single cell-centroid number and learn from that, we are obviously gonna hit the MAUP I am pointing at, and any fix that looks basic is never going to work, because it will be bound to commit type-1 errors on another set of predictions. Take a minute. It is definitely gonna increase type-1 errors, as the data is filled with large sets of homogeneous regions across cells, and the type-2 errors highlighted here are probably a minority compared with the type-1 error reduction the model is already achieving overall. (no one has real statistical numbers on all taxa and geomodel predictions, and seeing cells in /explain is not the same as being able to compare two model versions mathematically or to debug them. yes, I blame iNat for seriously underreporting these real model metrics and stats, as reporting them is extremely easy and would bring real progress on these “model bugs”. but anyway, if I have to pull an old number, see the recall in the geomodel post: I believe it is not seriously undercommitting on the whole, but when it does, it does.)

They are a minority in the sense that, even if these effects are highly pronounced for cells with large altitude variation and for some taxa, across a large set of iNat observations they are a minority, because the recall difference (real taxon shown in top-k with CV alone minus real taxon shown in top-k with CV + geomodel, on randomly sampled iNat observations) is close to zero.
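For anyone who wants to pin down that recall comparison, here is a small sketch of how it could be computed if per-observation ranked suggestions were available. The data below is fabricated purely to exercise the function; no such per-observation dump is published as far as this thread establishes.

```python
from typing import List, Sequence


def recall_at_k(ranked_lists: Sequence[List[str]], true_taxa: Sequence[str], k: int) -> float:
    """Fraction of observations whose true taxon appears in the top-k suggestions."""
    hits = sum(1 for ranked, truth in zip(ranked_lists, true_taxa) if truth in ranked[:k])
    return hits / len(true_taxa)


# Four made-up observations with their true taxa.
true_taxa = ["a", "e", "i", "q"]

# Suggestions from CV alone (hypothetical).
cv_only = [["a", "b", "c"], ["d", "e", "f"], ["g", "h", "i"], ["x", "y", "z"]]

# Suggestions after geomodel gating (hypothetical); "e" has been gated out.
cv_plus_geo = [["a", "b"], ["d", "f"], ["i", "g"], ["x"]]

k = 3
recall_cv = recall_at_k(cv_only, true_taxa, k)
recall_geo = recall_at_k(cv_plus_geo, true_taxa, k)
print("recall@3 CV alone:      ", recall_cv)
print("recall@3 CV + geomodel: ", recall_geo)
print("recall lost to gating:  ", recall_cv - recall_geo)
```

The same function could be run per taxon or per cell rather than over a random global sample, which is where effects concentrated in a few high-relief cells would show up.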

so neither you nor any of us, except staff, can prove that any kind of patch is good, because nobody knows how those h3 cells at resolution 4 are behaving across all geo-trained taxa (my point above on iNat underreporting). There are almost no stats except the final predicted-cells page on a per-taxon basis.

and so the patch you are proposing, which takes the current P(yellow bird | cell, cell centroid) and hard-recodes a subset of those to 1 based on P(yellow bird | cell presence is true), introduces type-1 errors for all those red bird predictions (which could previously have been predicted correctly as top-1) as soon as this seemingly simple patch is pushed. (again, I agree the model is also committing type-2 errors; I am only disagreeing that this is the right fix)

and so I bet someone will come again and create a new forum thread with scenario like “why is my red bird seen at high altitude here showing up with yellow bird as top-1 prediction in new model, when it was red bird at top-1 under all last models?”

tldr: well I was trying to simplify but ofc there is CV prior onto which geomodel is acting but the point is seemingly simple post hacks are the bane (in the class of GOFAI - such expert rule systems are largely non existent in today’s models for rightly so) when viewed as whole unless the geomodel is overhauled entirely in first place as whole or we design a new reranking model that is learnable as post processing on top of geomodel to compensate geomodel issues as is (see next reply)

Bingo, but I would personally say it is better not to mess with the geomodel per se, but to add a new model on top for learning these corrections, aka learn a new function that can dealias the cell aggregates the frozen geomodel has learned, by trying to find these correlations across cells where the taxon is already predicted and using their centroids (and more).

so such a new function could ultimately run inference on something like Pgeocorrector(lat, long, elevation at the point of observation, main geoprior, current cell’s statistical moments, taxon cells’ statistical moments).

Note that the “elevation at point of observation” is absolutely essential for such a corrector to avoid type errors. I think maybe a gradient-boosted decision tree on that tabular data, or a 1/2-hop graph neural network, would work wonders in learning that corrector function, which ultimately tries to approximate, to some level, the purple curve I showed earlier on top of the coarse-grained cell predictor that comes with spatial aliasing as it is now.
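A sketch of what one tabular input row for such a corrector might look like, assuming "statistical moments" means mean and standard deviation of elevation. Every field name here is hypothetical, and no model is trained; this only shows how the point elevation can be set against the cell-level and taxon-level aggregates.

```python
from statistics import mean, pstdev
from typing import Dict, Sequence


def corrector_features(
    lat: float,
    lon: float,
    obs_elev_m: float,
    geoprior: float,
    cell_elevs_m: Sequence[float],
    taxon_cell_elevs_m: Sequence[float],
) -> Dict[str, float]:
    """Build one feature row for a hypothetical post-hoc corrector model.

    cell_elevs_m:        elevation samples inside the observation's cell
    taxon_cell_elevs_m:  centroid elevations of cells where the taxon is predicted
    geoprior:            the frozen geomodel's score for this taxon and cell
    """
    cell_mean = mean(cell_elevs_m)
    taxon_mean = mean(taxon_cell_elevs_m)
    return {
        "lat": lat,
        "lon": lon,
        "obs_elev_m": obs_elev_m,
        "geoprior": geoprior,
        "cell_elev_mean": cell_mean,
        "cell_elev_sd": pstdev(cell_elevs_m),
        "taxon_elev_mean": taxon_mean,
        "taxon_elev_sd": pstdev(taxon_cell_elevs_m),
        # The two offsets are what let a learner see that a point can sit
        # far from its own cell's aggregate yet close to the taxon's.
        "obs_minus_cell_mean": obs_elev_m - cell_mean,
        "obs_minus_taxon_mean": obs_elev_m - taxon_mean,
    }


# Invented numbers: a low observation in a cell with steep relief.
row = corrector_features(
    lat=36.5, lon=-118.7, obs_elev_m=1000.0, geoprior=0.05,
    cell_elevs_m=[1000.0, 3000.0], taxon_cell_elevs_m=[800.0, 1200.0],
)
print(row)
```

Rows like this could feed any tabular learner, but the limit described in the next paragraphs still applies: the aggregates are per cell, so intra-cell distinctions are only recoverable through the point elevation.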

But again we are making another hidden assumption here - any such methods cannot recover intra cell level distinctions suddenly, that is the fundamental limit we have to let go because of per cell aggregations we are learning for and MAUP.

to see this assumption as example, if there are two cells where two species occur and those two cells have similar cell level aggregates on elevations or such, we are never gonna decouple whether these two species are fundamentally different altitude species in any of these schemes. Aka our learned taxon elevations preferences are already coarsely granularised to cell levels anchors be it single centroid elevation or bunch of min-max elevations to cells and unless the iNat goes around first and learns the true reality of species by sampling true elevations at finest resolutions it has to live in this assumption and it has to live on approximating P(taxon | altitude) with P(taxon | coarse cell altitude statistics) and these two are never going to be same and will again always cause some type errors.

not exactly. the geomodel has another equally important job - it has to gatekeep the seemingly perfect CV suggestion from another continent from popping up suddenly as top-1

so under this precision-recall tradeoff, and with elevation now also trained for in the new geomodel (not perfectly, but decently overall), what the system is currently doing is “Visually similar AND expected nearby”, not “Visually similar OR expected nearby” (maybe one of the current blatant misrepresentations on iNat, another being the branch disagreement code versus the reality of the UI, pending for years). I guess the wording got stuck when the label changed from “Seen nearby” to “Expected nearby” as this geomodel was pushed. I am not a longterm user on iNat, so you veterans have to correct me, but if this @kueda presentation is the original source of that sentence, then you can see how it was OR back then, aligning with what you said, but it’s AND now (see the nearby injection in the slide, where chrysolepis is injected without a CV prior, which is the reason for OR)
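The AND versus OR distinction is easy to state in code. The sketch below is a simplification with hypothetical taxon names; the real system ranks by combined scores rather than plain set membership.

```python
from typing import List, Set


def suggest_and(visually_similar: List[str], expected_nearby: Set[str]) -> List[str]:
    """AND: keep only CV candidates that the geomodel also expects nearby."""
    return [taxon for taxon in visually_similar if taxon in expected_nearby]


def suggest_or(visually_similar: List[str], expected_nearby: Set[str]) -> List[str]:
    """OR: keep every CV candidate, then inject nearby taxa the CV did not rank."""
    injected = sorted(expected_nearby - set(visually_similar))
    return list(visually_similar) + injected


visually_similar = ["taxon_a", "taxon_b"]   # CV ranking, best first (made up)
expected_nearby = {"taxon_b", "taxon_c"}    # geomodel cells for this location (made up)

print("AND:", suggest_and(visually_similar, expected_nearby))
print("OR: ", suggest_or(visually_similar, expected_nearby))
```

Under AND, a taxon the geomodel wrongly excludes from a cell can never be suggested there, no matter how confident the CV is, which is why missing hexagons matter so much in this thread.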

yes, the tweak should be done I agree but it cannot be done until we overhaul the geomodel assumptions in first place by possibly learning a new function and what counts as “automatic simple tweaking” is not so automatic when we look at real details of precision-recall tradeoffs overall on platform.

so yes we need a new learnable function and more importantly we badly need some more published model statistics (instead of one highly aggregated curves which to me is useless in these discussions) from inat to realise these benefits by tweaks across model versions to be caught by anyone.

and again, we don’t know which tradeoffs are worth it. you are arguing only from the missed recall, the suggestion the geomodel currently gatekeeps, and presenting perfect recall as good. but my point in the new reply above is that we cannot say off the top of our heads, and assume it applies equally across all of iNat, whether that gatekept suggestion is better or worse when viewed through the lens of true precision, and we lack real numbers to prove such an argument.

I think the problem is too much reliance on a model and not enough reliance on simple realities. Even really complicated modelling of potential habitat of a species with a huge amount of data is often very questionable because it is still working off of limited data. Those models are useful for showing where something could possibly occur given the limited data and can lead to great finds, but still often suggest large areas as possible suitable habitat where a taxon does not occur for a myriad of reasons.

There are two things we want to know regarding these hexagons: (1) Is this species known to occur in this hexagon? and (2) If not known there, how probable is it that it does or does not occur in that hexagon? The first and more important question, assuming the data is correct, is already known. It does not need to be modeled. If this data is not used for determining what is “expected nearby” or expected in that hexagon, the process is already broken. The model itself is only useful in determining how likely it is that a species would occur in a hexagon it is not known in. There are a lot of possible ways to approach a fix (pre-model decisions, within-model decisions, or post-model decisions) but the most important thing is that a hexagon with a species known to occur in it (with some accuracy filters) should always be classified as “expected nearby” for that species. That should be incredibly simple and, if it is not, the system is incredibly flawed. If the decision-making process can’t allow something as basic as coding a hexagon with a species present as “expected nearby”, there is something incredibly wrong with it.

I generally agree with this line of reasoning, but I would note that it can be useful to check for cells where there are observations of a taxon but the model predicts that there should not be, see:

https://www.inaturalist.org/blog/99727-using-the-

This can surface misidentifications or otherwise very interesting observations. I think this is one reason why the model hasn’t been designed to automatically predict in every hexagon where there is an observation - we don’t want one wrongly IDed observation to drive additional misIDs by saying that the mistaken ID should be expected.

That said, I think there’s some reasonable cutoff (5?, 10?) where we can say that, if there are that many observations, it’s more likely that the species is truly present in a cell than mistakenly present. Adding those cells to the prediction would seem to be a low hanging fruit.

one is same issue with elevation again, here is your brevifolia:

see. same way. the cell centroid is trying to conflict with the true elevation of observations there as it is seemingly trying to learn and approximate for high elevation correlation for that taxon.

it’s not that the geomodel is hogwash as a whole, it’s just unlucky in our true reality, because we are forcing it to fit a far-off cell with a low centroid when the cells where the species is dominant, just above this cell, have high-elevation centroids.

here is the unthresholded map. see, it tried its best to learn that corner under these model constraints and faulty assumptions, but just unluckily failed to meet the static final threshold for that taxon (which is auto-decided by the final model unless staff edit it)
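To make the thresholding step concrete, here is a toy sketch of how a single static cutoff turns continuous scores into “expected nearby” cells. The scores, cell names, and threshold values are invented; the real per-taxon thresholds are not given anywhere in this thread.

```python
from typing import Dict, Set


def expected_cells(scores: Dict[str, float], threshold: float) -> Set[str]:
    """Cells whose unthresholded geomodel score meets the taxon's static threshold."""
    return {cell for cell, score in scores.items() if score >= threshold}


# Invented unthresholded scores for one taxon.
scores = {"core": 0.62, "corner": 0.18, "far": 0.02}

auto_threshold = 0.20      # hypothetical auto-decided cutoff
retuned_threshold = 0.15   # hypothetical manual retune by staff

print("auto:   ", sorted(expected_cells(scores, auto_threshold)))
print("retuned:", sorted(expected_cells(scores, retuned_threshold)))
```

The corner cell flips from excluded to included with a small change in the cutoff, even though its score never changed. A retune like this applies to every cell of the taxon at once, so it can also admit cells elsewhere that were correctly excluded.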

From what I’ve noticed on the forum, the staff’s usual approach has been to let users report these problematic taxa, and then they retune the thresholds for them. The caveat is that when they retrain the model the new model’s thresholds can change again, so it’s not really a scalable solution. It’s not perfect either, as retuning comes with new assumptions that can upset what the model had already gotten right in other cells in terms of type-1 error reduction :person_shrugging:

The first is a type-1 error it made from what it learned, and any “predictive” model will make those to some extent. The second probably has to do with centroid elevation again, so it’s a type-2 error.

This one is interesting. The immediate fix staff offer is to ask us to let them know so they can readjust the thresholds (which can fix this to some extent).

If I had to guess from this unthresholded map, it probably has to do with too few training samples per cell from the Australian observations compared with the NZ observations, so there wasn’t enough to push the model to commit to higher unthresholded scores for the cells on the left side during training. (Note that not all RG observations are trained on. Only some are randomly sampled in the first place, so only that subset goes into training.)

See the elevation thing again, and MAUP (which I mentioned above). Some samples from Australia, even when they look valid to us, will end up conflicting for the model because of the centroid-aliasing design.

Or who knows, some other competing species (if one exists) may have taken away this one’s predictive power in some cells of Australia, as the predictions are learned jointly across taxa. But yes, I honestly can’t say exactly until we see more numerical stats for the model.

:) You said it. Maybe it worked for some, but I am mostly with you on this one, as I had the same issues when I tried.

you seem to be assuming there’s some sort of symmetry between the CV refusing to suggest a species that should occur somewhere and suggesting a species that shouldn’t. obviously it would be preferable if neither ever happened, but it’s much less destructive when the CV sometimes offers the right answer alongside added incorrect answers than when the CV only offers incorrect answers. a patch which forces the CV to always include a cell with occurrences in its geomodel would be a clear improvement over the status quo even if it meant the occasional high elevation species being suggested at a low elevation or vice versa.

for example, if the model started to include mountain goats in some coastal area’s geomodel this would have a fairly small impact, as it would be rare for the image recognition to think any organism there could be a mountain goat and rare for an observer to choose mountain goat in those cases. on the other hand, if the geomodel stopped suggesting mallards in that coastal area it could create a massive headache for identifiers.

I am not assuming there should be symmetry. I only gave those contrasting examples to show that there are at least two sides of the coin when any system works at a coarse cell level, and one must not ignore the other side in the discussion. The end result is always a single ranking that has to carefully balance both. So the question of which side to relax is, first of all, about breaking the geomodel’s semantics as they exist now. It is primarily a request to get the older “Seen nearby” feature back as a perfect-recall system (which the geomodel is not, although that looks like a desirable feature for any suggestions in the first place) in place of the “Expected nearby” that the model currently produces under its own semantics (in a somewhat faulty way).

So, all I am saying is that such a patch will definitely influence the top-1 precision of the final rankings for many other taxa and observations across the site as a whole. In other words, there is a tradeoff between precision and recall in any predictive or retrieval-ranking system.

Secondly, there are two systems, and it’s important to talk about them separately. The CV is not refusing suggestions anywhere. The CV is a different system from the geomodel, and the suggestions one sees come in two types: CV&geo or CV only. Under “not expected nearby”, i.e. CV only, when we silence the geomodel, all the original CV suggestions show up. Nothing in that list is blocked by geomodel gating, and the patch we are talking about here would not change the final rankings in that list.

I hope everyone in this thread agrees that the CV&geo list is what we are talking about here, even when people refer to it as “CV” or “suggestions”.

That is the entire point. Unless the numbers are run to show how top-1 and top-k results on carefully sampled site data change with and without such a patch, one can never call any patch a “clear improvement” at the whole-data level, because precision is what is at stake in your “even if it meant …” cases on the other side of the predictions.
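For anyone wondering what “running the numbers” could look like, here is a minimal sketch of a top-k comparison. The three observations and their suggestion lists are toy data invented for the example, not results from the site, and the function name is hypothetical.

```python
def top_k_accuracy(ranked_lists, truths, k):
    """Fraction of observations whose true taxon appears in the
    first k entries of its ranked suggestion list."""
    hits = sum(truth in ranked[:k] for ranked, truth in zip(ranked_lists, truths))
    return hits / len(truths)

# Toy data: three observations with their true taxa.
truths = ["fremontii", "mallard", "gull"]

# Suggestion lists without the patch (the first one misses the true taxon).
baseline = [["orbiculatus", "fremontodendron"],
            ["mallard", "gadwall"],
            ["gull", "tern"]]

# Suggestion lists with the patch: the first list gains the true taxon,
# the third gains a wrong taxon that now outranks the right one.
patched = [["orbiculatus", "fremontii", "fremontodendron"],
           ["mallard", "gadwall"],
           ["goat", "gull", "tern"]]

for name, lists in (("baseline", baseline), ("patched", patched)):
    print(name, top_k_accuracy(lists, truths, 1), top_k_accuracy(lists, truths, 3))
```

In this toy case the patch raises top-3 and lowers top-1, which is the kind of tradeoff that only shows up when both metrics are measured on a real sample.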

The problem is that “seen nearby” is not good for some observations and “expected nearby” is not good for some observations. Each has its own benefit, the first in showing what is actually known and the second in showing what might be found. To not be using them both together is the problem. The careful balance is missing.

Let’s take a closer look at the CV suggestions of one of the Malacothamnus fremontii observations I was referring to at the beginning of the post.

Below are the CV suggestions using the geomodel. This gives us one suggestion that does not occur there that looks very similar to Malacothamnus fremontii (Malacothamnus orbiculatus), one suggestion in the same family that does occur there but looks pretty different (Fremontodendron californicum), and another suggestion from the same family that looks a little bit like some Malacothamnus species, but not this one (Sphaeralcea ambigua). So, it does suggest the right genus and all other suggestions are in the correct family but only one of those suggestions actually occurs there and it looks different from the correct species Malacothamnus fremontii, which is documented to occur there on iNat and is not suggested.

And here are the CV suggestions excluding the geomodel. These are the same except that it adds five more Malacothamnus species with the correct one being in the number two position, which makes perfect sense as the first two sometimes look pretty similar.

Now, if “seen nearby” and “expected nearby” were used in combination as they should be, we’d probably have the following order of suggestions:
Malacothamnus - Correct genus ID.
Malacothamnus orbiculatus - Looks very similar but doesn’t occur there.
Malacothamnus fremontii - Correct ID.
Fremontodendron californicum - Same family and occurs there but looks pretty different.
Sphaeralcea ambigua - Same family and somewhat similar but does not occur there.

So, by combining “seen nearby” and “expected nearby” with no removal of questionable “expected nearby” species, we get the exact same suggestions except that the correct suggestion is now present, as the number two species suggestion.
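The combination described above can be sketched in a few lines. This assumes the geomodel acts as a strict filter on the CV ranking, which is a simplification on my part. The function name is hypothetical, the species sets are taken from the example in this post, and the other Malacothamnus species from the CV-only list are left out for brevity.

```python
def geo_filtered(cv_ranked, expected_nearby, seen_nearby=frozenset()):
    """Keep CV suggestions, in CV order, whose taxon is either
    "expected nearby" (geomodel) or "seen nearby" (has observations)."""
    allowed = set(expected_nearby) | set(seen_nearby)
    return [taxon for taxon in cv_ranked if taxon in allowed]

# CV order for the species named in this post (assumed for illustration).
cv_ranked = ["Malacothamnus orbiculatus", "Malacothamnus fremontii",
             "Fremontodendron californicum", "Sphaeralcea ambigua"]

expected = {"Malacothamnus orbiculatus", "Fremontodendron californicum",
            "Sphaeralcea ambigua"}
seen = {"Malacothamnus fremontii", "Fremontodendron californicum"}

current_list = geo_filtered(cv_ranked, expected)         # "expected nearby" only
combined_list = geo_filtered(cv_ranked, expected, seen)  # both sources together
```

Under these assumptions, `current_list` drops Malacothamnus fremontii, while `combined_list` keeps every current suggestion and puts Malacothamnus fremontii back as the second species.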

And, of course, the use of the term “expected nearby” is a problem too as “expected nearby” should be “seen nearby” plus geomodel extensions. The current “expected nearby” often does not show what actually is expected nearby simply because it excludes “seen nearby”.