Amount of "Unknown" records is decreasing

Thanks a lot! This is encouraging!

I have now added an option to load into the drop-down box a taxonomy built from the 10 taxa suggested by the AI, as shown in the second screenshot below.

The application also adds a few higher-rank taxa worth displaying in the drop-down list. This is tuned to keep the list short, even when widely dispersed AI suggestions would otherwise push more taxa into the list.

I also worked on how to find and highlight the most relevant taxon or taxa among the AI suggestions and the additional taxa displayed. The result is similar to the “We’re pretty sure this is in the …” message displayed on the Observation page.

Displaying a selection of higher-rank taxa is something important that the web application does not do. It answers something I often do manually when identifying: looking for the ‘common denominator’ of the AI suggestions, that is, a higher-rank taxon that gives an adequate identification, neither too generic (at too high a taxonomic rank) nor too risky (at too low a rank).
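For illustration, the ‘common denominator’ idea can be sketched in Python. This is not the application's actual algorithm, only one plausible reading of it: take the lowest-rank taxon that still covers a chosen share of the total score. The function name, the `min_share` threshold, and the sample lineages and scores are all assumptions made up for the example.

```python
from collections import defaultdict


def common_denominator(suggestions, min_share=0.8):
    """Return the lowest-rank taxon covering at least min_share of the total score.

    suggestions: list of (lineage, score) pairs, where lineage lists taxon
    names from the highest rank down to the suggested taxon.
    """
    total = sum(score for _, score in suggestions)
    if total <= 0:
        return None
    # Accumulate, for every ancestor path, the scores of the suggestions below it.
    share = defaultdict(float)
    for lineage, score in suggestions:
        for depth in range(1, len(lineage) + 1):
            share[tuple(lineage[:depth])] += score
    # Keep the paths that pass the threshold and pick the deepest one.
    passing = [(len(path), s, path) for path, s in share.items()
               if s / total >= min_share]
    if not passing:
        return None
    return max(passing)[2][-1]


# Made-up scores, for illustration only.
suggestions = [
    (["Fungi", "Agaricales", "Amanita", "Amanita muscaria"], 40),
    (["Fungi", "Agaricales", "Amanita", "Amanita pantherina"], 25),
    (["Fungi", "Agaricales", "Mycena", "Mycena pura"], 25),
    (["Fungi", "Boletales", "Boletus", "Boletus edulis"], 10),
]
print(common_denominator(suggestions))                  # Agaricales
print(common_denominator(suggestions, min_share=0.6))   # Amanita
print(common_denominator(suggestions, min_share=0.95))  # Fungi
```

Raising `min_share` makes the choice more cautious (higher rank), and lowering it makes the choice more specific but riskier.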

However, the taxon or taxa highlighted by this application are not always exactly the same as the one proposed as “We’re pretty sure this is in the …” on the Observation page. In some cases, the application suggests a taxon at a higher rank (more cautiously) than the “We’re…” suggestion. In other cases, the application suggests a taxon at a lower rank (for instance a “Section”, more specific) than the “We’re…” suggestion (for instance a “Genus”). For example, it is useful to quickly find the “Felis catus” entry highlighted in the list when the observation is a photo of a cat.

The example below corresponds to the observation:
https://www.inaturalist.org/observations/64033887

In the screenshot above, the key point is to highlight and preselect an ID that is as relevant as possible, so that it can be submitted immediately with the “Submit this ID” button.

Now let’s check all the suggestions, displayed in a treeview:

The numbers in brackets are the Computer Vision scores (between 0 and 100).

Note that the Observation page in the web application displays 8 suggestions, while the API provides 10 suggestions. That’s why here the application displays 10 suggestions with 10 scores.

Now let’s compare this with the display of the AI suggestions in the Observation page:

In this case, there is no “We’re pretty sure this is in the …” in the web application. I find it useful that the application displays and preselects “Order Agaricales”. But if we prefer “Kingdom Fungi”, it can also be selected in one click.

See also this feature request:
https://forum.inaturalist.org/t/identify-page-observation-page-dropdown-list-of-upper-taxonomic-levels/12319

The treeview is never ambiguous. For instance, we can deduce from the precise position of “Veronica persica” in the treeview below that this species does not belong to a “Section”:

image

Rationale:

  1. “Section Hebe” is displayed because it is the closest common ancestor of 7 AI suggestions, and it is therefore desirable to let us select this ancestor in case we hesitate only between these 7 “Species” suggested by the AI.
  2. Because “Veronica persica” is in the same “Genus” as the displayed “Section Hebe”, another “Section” containing “Veronica persica” would also have to be displayed, only to keep the display consistent and not because we might want to select that other “Section”. (As we are supposed to ID at the lowest possible rank, it would be pointless to ID at the “Section” rank if we agree with the “Species” suggested.)
  3. Consequently, if that other “Section” containing “Veronica persica” is not displayed, it is only because it does not exist at all. The display is actually intended to tell us that.
  4. Conversely, positioning “Veronica persica” by default at the bottom of the “Genus” would make us think that it is also in “Section Hebe”, which is wrong. The only way not to be misleading is therefore to display “Veronica persica” at the top of the “Genus”, before “Section Hebe”.
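The ordering rule in the rationale above can be sketched as follows. This is only an illustration, not the application's code. It assumes that the treeview indents each entry by its rank rather than by its depth in the tree, which is why a species sitting directly under the genus would look like a member of the section if it came after it. The node layout and the two placeholder species are hypothetical.

```python
# Assumed layout: indentation depends on the rank, not on the tree depth.
RANK_INDENT = {"Genus": 0, "Section": 1, "Species": 2}


def render(node):
    """Return display lines for a (rank, name, children) node, childless taxa first."""
    rank, name, children = node
    label = name if rank == "Species" else f"{rank} {name}"
    lines = ["  " * RANK_INDENT[rank] + label]
    # Stable sort: children without descendants come before those with a subtree,
    # so a species directly under the genus never follows a section's species.
    for child in sorted(children, key=lambda c: bool(c[2])):
        lines.extend(render(child))
    return lines


genus = ("Genus", "Veronica", [
    ("Section", "Hebe", [
        ("Species", "(Hebe species A)", []),  # placeholder, not a real name
        ("Species", "(Hebe species B)", []),  # placeholder, not a real name
    ]),
    ("Species", "Veronica persica", []),
])
print("\n".join(render(genus)))
```

With this rule, “Veronica persica” is printed right under “Genus Veronica” and before “Section Hebe”, so its position alone tells the reader that it belongs to no displayed section.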

(Making a treeview unambiguous without connecting lines reminds me of Polish notation, which makes it possible to specify a complex calculation unambiguously without parentheses. In both cases, the order of the items encodes the extra information required, which can be retrieved provided we know the rules.)

In short, the treeview displayed is the result of a tuning taking into account:

  • The limited amount of space in the drop down list. (Some intermediate taxa have to be hidden).
  • The display of all intermediate taxa that we may actually desire to select as an ID.
  • The requirement to display only unambiguous information.
  • The possibility to learn a little more about taxonomy while reviewing observations.

Other general considerations:

This tool is primarily designed for identifying “Unknown” observations. I have not yet thought about using it to review observations that are already identified. These tasks are clearly very different.

I made every effort possible to make the browsing/reviewing/identification as fast as possible. The point is not spamming/gaming, it is to treat as many “Unknown” observations as possible, so that they can be found and can get better IDs from other people.

But “I need to stop” at some point, because the obvious next step would be to choose and submit identifications entirely automatically. I assume that this is NOT what the community desires (otherwise the web application would do it already?).

Because we (apparently) do NOT want to fully automate the first ID, I would propose another feature request for the web application: a search option that includes in the results the “Unknown” observations whose taxon according to “We’re pretty sure this is in the …” matches the taxon we wish to review. For identifiers, this feature would have the same effect as having a robot automatically identify all “Unknown” observations, without the (possible) drawbacks of an automated first ID. User story: as a person interested in the Subfamily Caesalpinioideae, I wish to review all “Unknown” observations that are likely related to the Subfamily Caesalpinioideae. Such a feature would allow me to identify, directly at a low taxonomic rank, observations that I would otherwise have no way to find.
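To make the proposed search option concrete, here is a minimal sketch in Python working on local data. Nothing here is an existing iNaturalist feature or API call: the record layout, the `cv_lineage` field (the ancestry of the “We’re pretty sure…” taxon), and the function name are hypothetical.

```python
def unknowns_likely_in(observations, wanted_taxon):
    """Return the unidentified observations whose suggested lineage contains wanted_taxon."""
    return [
        obs for obs in observations
        if obs["taxon"] is None and wanted_taxon in obs["cv_lineage"]
    ]


# Made-up records, for illustration only.
observations = [
    {"id": 1, "taxon": None, "cv_lineage": ["Plantae", "Fabaceae", "Caesalpinioideae"]},
    {"id": 2, "taxon": None, "cv_lineage": ["Plantae", "Fabaceae", "Faboideae"]},
    {"id": 3, "taxon": "Plantae", "cv_lineage": ["Plantae", "Fabaceae", "Caesalpinioideae"]},
    {"id": 4, "taxon": None, "cv_lineage": []},
]
matches = unknowns_likely_in(observations, "Caesalpinioideae")
print([obs["id"] for obs in matches])  # [1]
```

Observation 3 is left out because it already has an ID, and observation 4 because there is no confident suggestion for it. No identification is submitted automatically; the filter only helps identifiers find the observations.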


There are some users in favor of this. Also, it is by no means true that the web application would already do it if the community desired it. Every aspect of the site’s functionality is constrained by the organization’s limited staffing and funding.


One use of this could be for monotypic taxa: for example, here I found a bunch of observations of Ginkgo biloba that are only down to genus level.


I tried two - but - they have opted out of Community ID - so pointless, sadly.


Wow! This is a very helpful interface, especially for IDing Unknowns. Potentially it could also be used to work through observations already IDed to a high level (e.g. Plantae, Angiospermae). I think all your optimizations make a lot of sense.


Now that I have a solution to an issue related to my profile settings and the API, the application can also display all available common names (in the language selected in my profile):
https://forum.inaturalist.org/t/api-v1-taxa-common-name-missing/18654


Because of an issue occurring when reaching the end of the set of “Genus Ginkgo” observations, I could catch only 4 Ginkgo biloba after you identified all the others at the “Species” rank! (Reaching the end had never happened before, when I was working on the infinite set of “Unknown” observations.)

For this “Genus Ginkgo” use case, I planned to use a Custom Taxonomy containing a single entry, “Ginkgo biloba”, so that this taxon always remains selected, and to use “Skip IDs submitted”, so that a single ENTER keystroke is enough to identify an observation and move to the next one. Provided I let the application load all observations before starting, it becomes very fast: one round to “Skip” the observations that I cannot identify, then another fast round to identify the remaining ones with the ENTER key (keeping the application open as long as the status bar shows that identifications are still ongoing).

image


Although some people are really fond of web application development, my opinion remains that web development (say, JavaScript) is the “issue” in the first place (for me at least, maybe because I was never properly trained in this technology). Just consider the historical origin of the technology: a script over a static page. A patch is not a good start. Moreover, this technology is also at the origin of security issues. For instance, phishing would not happen in a client/server architecture: you just need to ensure that you got an authentic client application (directly from its publisher), trust it once and for all (unless there are too many updates), and that’s it.

Don’t be surprised that one developer can develop such a UI and such features in a few days. One reason is that this is not web development. Another is that it is easier to start developing an application from scratch with only a few features than to maintain a complex existing application with many features. Complexity is not proportional to the number of features; it grows faster, because of the many interactions between features and the unexpected side effects.

(This is absolutely not a criticism against the iNaturalist web application. I am so impressed by iNaturalist that I still see it as if I were dreaming, but it’s real!).


I’m not a developer, but I’ve seen pros and cons of client/server and web-based applications over the years. Certainly, rapid application development is a lot more feasible with an installed client, but maintenance of dedicated client software for each application can become a huge task (and poses its own security risks). It would also be a major limitation to public adoption.

Interestingly, mobile app development has largely taken the dedicated app route, facilitated by the app store concept, which has addressed many of the challenges inherent in installing and maintaining a dedicated client.


For Ginkgo you can take your search all the way up to class Ginkgoopsida since the species is alone in its class.

There’s plenty of other monotypic genera to work on, like Ricinus which is Ricinus communis, and a nice distinctive plant, easily identified. Or Heteromeles, which is Heteromeles arbutifolia, but you do have to know how to tell it apart from Pyracantha and Cotoneaster.

Or sometimes you can work on something where there’s only one species in a certain range; for example all Larrea in North America are Larrea tridentata.


Thanks! I did initially start the search all the way up at class level, but I guess I missed a couple. I decided to focus my efforts on Ginkgo biloba, since it’s very distinctive.

I know I could try some other genera, but this kind of work is quite tedious, so I might need to take a break before working on it. For the record, after first stumbling upon the Ginkgo observations there were about 27 pages or about 800 observations. Definitely this could be better handled automatically, somehow.


I was trying to use “identify” to look at North America unknowns thinking I could get stuff to kingdom level at least, but there are pages and pages of stuff that actually have some form of ID given by either the observer or someone else; mostly microscopic organisms (Cyanobacteria, viruses, etc). I’ve got filter set to unknown, date set to ascending…why are these showing up? There’s not really any appreciable number of larger organisms. There’s also a lot of “state of matter life” observations.

can you share your url and/or screenshot?

https://www.inaturalist.org/observations/identify?page=13&iconic_taxa=unknown&order=asc&place_id=97394

Life, bacteria, and viruses all fall under the “Unknown” filter on the Identify page. I think there’s a URL parameter to exclude those.

I think you have a bug report here because when I follow the url I see the obs in descending order…

I use the Identify page, and filter it down as you say, but then I use the keyboard shortcuts to move through quickly.

You can use the arrows to go back and forth through the records, type i to add an identification or c to add a comment, type r to mark it as reviewed. Tab to move to next field, down arrow to go to the suggestion you want in a drop-down.

I can’t remember what the shortcut is to mark it as captive/cultivated, but you get the idea. You can move through records much faster that way than with the mouse.

To exclude those, put &identified=false on the end of your search URL.

You might also want to start at a more recent date, for example, observations about a year old.
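For anyone editing these URLs often, the `&identified=false` tip can be applied with a few lines of Python using only the standard library. The helper name `with_param` is made up for this sketch; the URL is the one shared earlier in the thread.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def with_param(url, key, value):
    """Return url with the query parameter key set to value (replacing any old value)."""
    parts = urlsplit(url)
    # Keep the existing parameters in order, dropping any previous value of key.
    query = [(k, v) for k, v in parse_qsl(parts.query, keep_blank_values=True)
             if k != key]
    query.append((key, value))
    return urlunsplit(parts._replace(query=urlencode(query)))


url = ("https://www.inaturalist.org/observations/identify"
       "?page=13&iconic_taxa=unknown&order=asc&place_id=97394")
print(with_param(url, "identified", "false"))
```

The printed URL is the original one with `&identified=false` appended at the end.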

On another thread, someone said that well-meaning volunteers ripped out the only specimen of (a name I forget), thinking it was Ricinus communis.

Ah well it does have a few leaf-lookalikes which people less commonly observe on iNat, like the castor aralia, Kalopanax septemlobus. But the fruits are very distinctive.
