Amount of "Unknown" records is decreasing

jeanphilippeb · December 9, 2020, 6:33pm

Thanks a lot! This is encouraging!

I have now included an option to load into the drop-down box a taxonomy built from the 10 taxa suggested by the AI, as shown in the 2nd screenshot below.

The application also adds a few taxonomic ranks worth displaying in the drop-down list. This is tuned to keep the list short, even when widely dispersed AI suggestions would otherwise push more taxa into the list.

I also worked on how to find and highlight the most relevant taxon/taxa among the whole set of the AI suggestions and the additional taxa displayed. This ends up with something similar to the “We’re pretty sure this is in the …” displayed in the Observation page.

Displaying a selection of higher-rank taxa is something important that the web application does not do. It responds to something I often do manually when identifying: looking for the ‘common denominator’ of the AI suggestions, that is, a higher-rank taxon that gives an adequate identification, neither too generic (at too high a taxonomic rank) nor too risky (at too low a rank).

However, the taxon/taxa highlighted by this application are not always exactly the same as the one proposed by the “We’re pretty sure this is in the …” in the Observation page. In some cases, the application suggests a taxon at a higher rank (more prudently) than the “We’re…”. In other cases, the application suggests a taxon at a lower rank (for instance a “Section”, more specific) than the “We’re…” (for instance a “Genus”). For instance, it is useful to quickly find the “Felis catus” entry highlighted in the list when the Observation is a photo of a cat.
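To make the ‘common denominator’ idea concrete, here is a minimal sketch of one way it could be computed. It is not the code of my application nor of iNaturalist: the function name `common_denominator`, the `threshold` parameter, the input format and the sample scores are all assumptions made up for illustration. The idea is to add up the scores of the suggestions under each ancestor and keep the lowest-rank ancestor that still gathers most of the total score.

```python
from collections import defaultdict

def common_denominator(suggestions, threshold=0.8):
    """Return the lowest-rank taxon backed by at least `threshold` of the total score.

    `suggestions` is a list of (score, lineage) pairs (hypothetical format),
    where `lineage` lists taxa from the highest rank down to the suggested taxon.
    """
    total = sum(score for score, _ in suggestions)
    support = defaultdict(float)  # lineage prefix -> summed score of suggestions below it
    for score, lineage in suggestions:
        for depth in range(1, len(lineage) + 1):
            support[tuple(lineage[:depth])] += score
    # Keep the prefixes with enough support, then take the deepest one.
    candidates = [p for p, s in support.items() if s >= threshold * total]
    if not candidates:
        return None
    best = max(candidates, key=lambda p: (len(p), support[p]))
    return best[-1]

# Made-up scores, only to show the behaviour.
SUGGESTIONS = [
    (40, ["Fungi", "Agaricales", "Amanita", "Amanita muscaria"]),
    (30, ["Fungi", "Agaricales", "Mycena", "Mycena pura"]),
    (20, ["Fungi", "Agaricales", "Amanita", "Amanita pantherina"]),
    (10, ["Fungi", "Boletales", "Boletus", "Boletus edulis"]),
]

print(common_denominator(SUGGESTIONS))        # Agaricales
print(common_denominator(SUGGESTIONS, 0.95))  # Fungi
print(common_denominator(SUGGESTIONS, 0.5))   # Amanita
```

A higher `threshold` gives a more prudent (higher-rank) result, a lower one a more specific (lower-rank) result, which is the trade-off between "too generic" and "too risky" described above.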

The example below corresponds to the observation:
https://www.inaturalist.org/observations/64033887

In the screenshot above, the key point is to highlight and preselect an ID as relevant as possible, so that it can be submitted immediately by clicking the “Submit this ID” button.

Now let’s check all the suggestions, displayed in a treeview:

The numbers in brackets are the Computer Vision scores (between 0 and 100).

Note that the Observation page in the web application displays 8 suggestions, while the API provides 10 suggestions. That’s why here the application displays 10 suggestions with 10 scores.

Now let’s compare this with the display of the AI suggestions in the Observation page:

In this case, there is no “We’re pretty sure this is in the …” in the web application. I find it useful that the application displays and preselects “Order Agaricales”. But if we prefer “Kingdom Fungi”, it is also selectable in one click.

See also this feature request:
https://forum.inaturalist.org/t/identify-page-observation-page-dropdown-list-of-upper-taxonomic-levels/12319

The treeview is never ambiguous. For instance, we can deduce from the precise position of “Veronica persica” in the treeview below that this species does not belong to a “Section”:

Rationale:

“Section Hebe” is displayed because it is the closest common ancestor of 7 AI suggestions, and because it is then desirable to let us select this ancestor, in case we hesitate only between these 7 “Species” suggested by AI.
Because “Veronica persica” is in the same “Genus” as the “Section Hebe” being displayed, another “Section” containing “Veronica persica” would also have to be displayed, only to ensure the consistency of the display, not because we might want to select that other “Section”. (As we are supposed to ID at the lowest possible rank, it would be pointless to ID at the “Section” rank if we agree with the “Species” suggested.)
As a consequence, if that other “Section” containing “Veronica persica” is not displayed, it is only because it does not exist at all. This display is actually intended to tell us that.
Conversely, positioning “Veronica persica” by default at the bottom (of the “Genus”) would make us think that it is also in the “Section Hebe”, which is wrong. As a consequence, the only way not to be wrong is to display “Veronica persica” at the top of the “Genus”, before the “Section Hebe”.
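The ordering rule in this rationale can be sketched as follows. This is only an illustration, not the code of my application: it assumes that each line of the treeview is indented according to its rank (not its depth in the tree), and that a reader attaches each line to the nearest previous line with a smaller indent. The names `RANK_INDENT`, `render` and `parents` are hypothetical, and the two "Hebe species" entries are placeholders.

```python
RANK_INDENT = {"genus": 0, "section": 1, "species": 2}  # assumed rank-based indentation

def render(node):
    """node = (rank, label, children). Children of the lowest rank are listed first."""
    rank, label, children = node
    lines = ["  " * RANK_INDENT[rank] + label]
    # A species attached directly to the genus must come before any section,
    # otherwise it would be read as a member of the section listed above it.
    for child in sorted(children, key=lambda c: -RANK_INDENT[c[0]]):
        lines.extend(render(child))
    return lines

def parents(lines):
    """Read the list as a user would: each line belongs to the nearest
    previous line that has a smaller indent."""
    result, stack = {}, []  # stack holds (indent, label) of the open ancestors
    for line in lines:
        indent = len(line) - len(line.lstrip())
        while stack and stack[-1][0] >= indent:
            stack.pop()
        result[line.strip()] = stack[-1][1] if stack else None
        stack.append((indent, line.strip()))
    return result

GENUS = ("genus", "Genus Veronica", [
    ("section", "Section Hebe", [
        ("species", "Hebe species 1", []),  # placeholder names
        ("species", "Hebe species 2", []),
    ]),
    ("species", "Veronica persica", []),
])

lines = render(GENUS)
print("\n".join(lines))
print(parents(lines)["Veronica persica"])  # Genus Veronica

# Same lines with "Veronica persica" moved to the bottom: it is now misread.
wrong = [l for l in lines if l.strip() != "Veronica persica"] + ["    Veronica persica"]
print(parents(wrong)["Veronica persica"])  # Section Hebe
```

With the species placed at the top of the genus, the reading rule recovers the right parent. With the species placed at the bottom, the same rule attaches it to “Section Hebe”, which is the error described above.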

(Making a treeview without connecting lines unambiguous reminds me of Polish notation, which makes it possible to specify a complex calculation unambiguously without parentheses. In both cases, the order of items encodes the required extra information, which can be retrieved provided we know the rules.)
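For readers not familiar with Polish (prefix) notation, here is a small sketch of an evaluator, only to illustrate the analogy. The function name `eval_prefix` is mine and the evaluator is limited to the four binary operators. The same numbers and operators give different results depending only on their order, with no parentheses needed.

```python
import operator

OPS = {"+": operator.add, "-": operator.sub,
       "*": operator.mul, "/": operator.truediv}

def eval_prefix(expression):
    """Evaluate a space-separated prefix expression with binary operators."""
    tokens = expression.split()

    def parse(i):
        # Returns (value, index of the next unread token).
        token = tokens[i]
        if token in OPS:
            left, i = parse(i + 1)
            right, i = parse(i)
            return OPS[token](left, right), i
        return float(token), i + 1

    value, end = parse(0)
    if end != len(tokens):
        raise ValueError("unexpected extra tokens")
    return value

print(eval_prefix("* + 1 2 3"))  # (1 + 2) * 3 = 9.0
print(eval_prefix("+ 1 * 2 3"))  # 1 + (2 * 3) = 7.0
```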

In short, the treeview displayed is the result of a tuning taking into account:

The limited amount of space in the drop down list. (Some intermediate taxa have to be hidden).
The display of all intermediate taxa that we may actually desire to select as an ID.
The requirement to display only unambiguous information.
The possibility to learn a little bit more of taxonomy while reviewing observations.

Other general considerations:

This tool is primarily designed for identifying the “Unknown” observations. I have not yet thought about possibly using it for reviewing observations already identified. These tasks are clearly very different.

I made every effort possible to make the browsing/reviewing/identification as fast as possible. The point is not spamming/gaming, it is to treat as many “Unknown” observations as possible, so that they can be found and can get better IDs from other people.

But “I need to stop” at some point, because the obvious next step would be to choose and submit identifications entirely automatically. I assume that this is NOT what the community desires (otherwise the web application would do it already?).

Because we (apparently) do NOT want to fully automate the first ID, I would propose another feature request for the web application: a search option to include in the results the “Unknown” observations whose taxon according to the “We’re pretty sure this is in the …” matches the taxon we wish to review. For identifiers, this feature would have the same effect as having a robot automatically identify all “Unknown” observations, without the (possible) drawbacks of an automated first ID. User story: as a person interested in the Subfamily Caesalpinioideae, I wish to review all “Unknown” observations that are likely related to the Subfamily Caesalpinioideae. Such a feature would allow me to identify, directly at a low taxonomic rank, observations that I would otherwise have no way to find.
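The search option I have in mind could behave like the sketch below. Everything here is hypothetical: the `Observation` class, its fields and the sample data are invented for illustration and do not correspond to the iNaturalist API or database. The point is only that the predicted taxon is used for searching, without ever being submitted as an ID.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

@dataclass
class Observation:
    id: int
    identified_taxon: Optional[str]      # None means the observation is "Unknown"
    predicted_lineage: Tuple[str, ...]   # hypothetical: the "We're pretty sure" taxon and its ancestors

def unknown_matching(observations, taxon):
    """Return the "Unknown" observations whose predicted lineage contains `taxon`."""
    return [o for o in observations
            if o.identified_taxon is None and taxon in o.predicted_lineage]

# Invented sample data.
OBSERVATIONS = [
    Observation(1, None, ("Plantae", "Fabales", "Fabaceae", "Caesalpinioideae")),
    Observation(2, None, ("Fungi", "Agaricales")),
    Observation(3, "Fabaceae", ("Plantae", "Fabales", "Fabaceae", "Caesalpinioideae")),
]

for obs in unknown_matching(OBSERVATIONS, "Caesalpinioideae"):
    print(obs.id)  # prints 1 only: observation 3 already has an ID
```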
