CV making strange suggestions

Please fill out the following sections to the best of your ability; it will help us investigate bugs if we have this information at the outset. Screenshots are especially helpful, so please provide those if you can.

Platform (Android, iOS, Website): Website

App version number, if a mobile app issue (shown under Settings or About):

Browser, if a website issue (Firefox, Chrome, etc) : Chrome

URLs (aka web addresses) of any relevant observations or pages:

Screenshots of what you are seeing (instructions for taking a screenshot on computers and mobile devices: https://www.take-a-screenshot.org/):

Description of problem (please provide a set of steps we can use to replicate the issue, and make as many as you need.):

Step 1: Go to any observation requiring an I.D.

Step 2: Suggest an I.D. with a single numeric value e.g. 7

Step 3: The CV returns a suggestion of Limpkin - Aramus guarauna
Previously this input always returned 7-spot Ladybird. Experimenting with numeric input has resulted in the following:
Input 0 - No returned suggestion
Input 1 - Suggestion returned = Animals
Input 2 - Suggestion returned = Chordata - Phylum Chordates
Input 3 - Suggestion returned = Birds - Class Aves
Input 4 - Suggestion returned = Cranes, Rails, and Allies - Order Gruiformes
This continues through to Input = 7 when the returned suggestion = Limpkin
Input of 8 or 9 returns no suggestions.
It seems that the suggestions returned are primarily being derived from the level of taxonomic status of the numeric input, although I haven’t figured out the next stage of the interpretation.

Seems to be related to the taxon ID.

https://www.inaturalist.org/taxa/1-Animalia
https://www.inaturalist.org/taxa/2-Chordata
https://www.inaturalist.org/taxa/3-Aves
https://www.inaturalist.org/taxa/4-Gruiformes
https://www.inaturalist.org/taxa/7-Aramus-guarauna

It isn’t the CV making the suggestions once you have an input. It is the system looking for something that matches your input.

Regardless of which specific aspect of the system is performing the search, the fact remains that prior to 22 Mar, whenever I entered the single digit 7, the system returned a single suggestion of ‘7-Spot Ladybird’, which it no longer does!

The exact order of suggestions changes over time; I’ve seen it happen a lot. It’s particularly annoying for me when I have scientific names set first, type in part of a scientific name, and then it suggests either a common name that matches or a now-synonymized name before what I’m looking for. Regardless, I don’t think it’s a bug. I imagine it has something to do with which taxa are most observed/identified, but I couldn’t say. Your situation might also have to do with common name priority, as someone changed it around on March 22 (though how that interacts with the search may be unintended: I only see taxon #7, rather than taxon #7 followed by the ladybug). For now, I would suggest you type in “7-”, which is just one extra keystroke and gives you what you want.

Finally. Search by taxon ID. It’s a welcome change. I find it annoying that it wasn’t possible before. Taxon IDs can be easier to type in than names.

https://www.inaturalist.org/taxa/7 is the limpkin.

At the same time, I feel like “7” should continue to show taxa with “7” in their names, like 7-spot ladybird, in addition to the limpkin.

Taxon IDs reflect the order taxa were added to iNat. It appears that Gruiformes and its ancestors (but not Life or Vertebrata) were added first, though it was Gruiformes in the traditional sense, which is polyphyletic.

Taxon IDs are used in deciding which order to show taxa in a list, and which order to show species in observation searches. The funny thing is that the order of 1, 2 and 10 is 1, 10, 2 rather than 1, 2, 10; it’s actually alphabetical, not numerical order.
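
To illustrate the 1, 10, 2 ordering, here is a minimal Python sketch. It only shows the general difference between sorting IDs as text and sorting them as numbers; it is not iNat's actual code, and the variable names are made up for the example.

```python
# Taxon IDs from the example above.
taxon_ids = [1, 2, 10]

# Sorting the IDs as text compares them character by character,
# so "10" lands before "2".
as_text = sorted(str(t) for t in taxon_ids)

# Sorting the IDs as integers gives the order most people expect.
as_numbers = sorted(taxon_ids)

print(as_text)     # ['1', '10', '2']
print(as_numbers)  # [1, 2, 10]
```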

It’s not the first step though. The first step is the number of observations of each species in the search (doesn’t apply to lists). The second step is the ancestry of the taxa. So basically, it’s a taxonomic sequence, but it’s only based on taxa that are in iNat’s taxonomy. The third step is taxon IDs. For example, while a proper taxonomic sequence will put palaeognaths first within Aves, iNat will probably put Nyctibiiformes (potoos) first, as their taxon ID, 1583752, is the first in alphabetical order. Gruiformes would be first if it were numerical order.
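
The three steps described above could be sketched as a single sort key in Python. This is only a guess at the shape of the logic, not how iNat implements it: the `Row` type, the observation counts, the placeholder ancestry string, and the assumption that more-observed taxa come first are all made up for illustration.

```python
from typing import NamedTuple


class Row(NamedTuple):
    name: str
    observations: int  # made-up count, for illustration only
    ancestry: str      # placeholder for the ancestor path (hypothetical format)
    taxon_id: int


def sort_key(row: Row):
    # Step 1: number of observations (descending order is an assumption).
    # Step 2: ancestry, so related taxa stay together.
    # Step 3: taxon ID compared as text, not as a number.
    return (-row.observations, row.ancestry, str(row.taxon_id))


rows = [
    Row("Gruiformes", 100, "Aves", 4),
    Row("Nyctibiiformes", 100, "Aves", 1583752),
]

ordered = [r.name for r in sorted(rows, key=sort_key)]
print(ordered)  # ['Nyctibiiformes', 'Gruiformes']
```

With the first two steps tied, "1583752" sorts before "4" as text, which matches the potoos-first result. Swapping `str(row.taxon_id)` for `row.taxon_id` would put Gruiformes first.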

I’ve found though, that this rule doesn’t apply when you have taxa outside of Life. Life comes after all the taxa outside of it. Taxa outside of Life will be put in an “Unclassified” section, and they will not be ordered alphabetically by taxon ID in this section.

This is due to some recent deliberate changes (see here and here) - so it’s not really a bug. I agree that the new behaviour is a little frustrating, but I can also see that searching by taxon ID is a desirable feature. I used to use that 7 shortcut quite a lot myself, but there is a slightly clunky work-around available: you can either add a hyphen (as previously suggested), or even type in any punctuation character (other than space or hyphen) before the number, and the auto-complete will work as expected - so e.g. 7-, #7, =7, \7, etc will all get “7-spot Ladybird” as the first hit.

This works because the taxon ID search is strictly numbers only. Personally, I would have preferred a special syntax for searching by taxon ID, as this will be far less commonly used. Something like #1234 would have been a reasonable choice, since I don’t think a character like # is normally allowed in either scientific or common names.
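
As a rough sketch of why the punctuation trick works, assume the autocomplete first checks whether the input consists of digits only, and falls back to a name search otherwise. The function name `classify_query` and the whitespace trimming are my own assumptions, not iNat's real code, and the sketch does not try to model how a leading hyphen is handled.

```python
import re


def classify_query(query: str) -> str:
    """Return 'taxon_id' for digit-only input, otherwise 'name'."""
    # Assumption: surrounding whitespace is ignored before the check.
    trimmed = query.strip()
    # Only ASCII digits from start to end count as a taxon ID.
    if re.fullmatch(r"[0-9]+", trimmed):
        return "taxon_id"
    return "name"


for text in ["7", "7-", "#7", "=7", "\\7"]:
    print(repr(text), "->", classify_query(text))
```

Under this assumption, only the bare "7" is treated as a taxon ID; any of the other inputs goes through the name search, which is where "7-spot Ladybird" would be matched.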

I changed this to “General” instead of Bug Reports, as I believe that it is intended behavior related to the addition of the requested ability to search by taxon ID.

On a side note, the common name of the species in question here should probably have the “7” written out as “Seven”, due to both grammatical rules and the typical way that this common name is written out in most sources.

It’s not an isolated case - there are many ladybirds with similar common names (e.g. “14-spot”, “22-spot”, “18-spot”), and these are very widely used elsewhere. It would be most unfortunate if the grammar police spoiled this highly convenient shorthand for referring to these species, some of which are amongst the most commonly recorded insects in many regions. There’s no general grammatical rule prohibiting the use of numerals in names - it’s purely a matter of style (and/or practicality). It’s a shame that the convenience of these shortcuts wasn’t considered before implementing the new taxon ID search, because it has negatively affected some existing functionality which is likely much more widely used than the new functionality ever will be.

Although the recent changes themselves aren’t a bug, it could be argued that they have introduced a regression relative to some of the old behaviour. The general point was touched upon in this issue-tracker comment (and the following one by kueda), but the particular application of it reported here wasn’t explicitly mentioned. It seems questionable that the ability to search by taxon ID should be given the highest priority.

Sounds like an edge case to me. What percentage of identifiers even know the taxon IDs?

I never suggested that any other name was problematic. 14, 22, and 18 would all be written as numerals based on English guidelines for grammar in scientific publications and they are widely used as common names because of this.