Parsing species names to "Section"

Please fill out the following sections to the best of your ability; it will help us investigate bugs if we have this information at the outset. Screenshots are especially helpful, so please provide those if you can.

Platform (Android, iOS, Website): Website

App version number, if a mobile app issue (shown under Settings or About):

Browser, if a website issue (Firefox, Chrome, etc) : Safari 14.1.2

URLs (aka web addresses) of any relevant observations or pages:
https://www.inaturalist.org/observations/114807635
https://www.inaturalist.org/observations/114807574
https://www.inaturalist.org/observations/114805043
etc.

Screenshots of what you are seeing (instructions for taking a screenshot on computers and mobile devices: https://www.take-a-screenshot.org/): Original filenames for the photos in this observation were of the form “Pelochrista matutina_6013”, etc.

Description of problem (please provide a set of steps we can use to replicate the issue, and make as many as you need.):

Step 1: Create files with the species name as part of filename such as “Pelochrista matutina_1234”

Step 2: Use uploader page to drag and drop files for uploading. Allow the uploader to parse the filename for the taxon, rather than using a dropdown menu.

Step 3: For species which are part of a “Section” of the same name, the uploader parses the filename and fills in “Section Pelochrista matutina” instead of the species name, and gives no indication before upload that it has made that switch.

With more and more “Section” groupings being created in the iNat taxonomy (which I think is a good thing overall), this glitch in the filename parsing is creating many incorrect placements and extra after-the-upload work to correct taxon names to the species level.

Needed fix: Filename parsing should avoid “Section”-level taxa unless that rank is explicit in the filename.
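
To make the requested behavior concrete, here is a minimal sketch in Python. It is not the iNaturalist uploader code: the `Taxon` class, the `parse_filename` and `pick_taxon` helpers, and the idea of opting in with a leading or trailing rank word are all assumptions made for illustration.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Taxon:
    name: str
    rank: str  # e.g. "species", "section", "complex"


# Hypothetical rank words a user could add to a filename to opt in to that rank.
EXPLICIT_RANKS = {"section", "complex"}


def parse_filename(filename):
    """Return (taxon_name, explicit_rank_or_None) from a photo filename."""
    stem = re.sub(r"\.[A-Za-z0-9]+$", "", filename)  # drop a file extension, if any
    stem = re.sub(r"[_\s-]*\d+$", "", stem)          # drop a trailing counter like _6013
    words = stem.replace("_", " ").split()
    explicit_rank = None
    if words and words[0].lower() in EXPLICIT_RANKS:
        explicit_rank = words.pop(0).lower()
    elif words and words[-1].lower() in EXPLICIT_RANKS:
        explicit_rank = words.pop().lower()
    return " ".join(words), explicit_rank


def pick_taxon(filename, candidates):
    """Prefer the species unless the filename names another rank explicitly."""
    name, explicit_rank = parse_filename(filename)
    matches = [t for t in candidates if t.name.lower() == name.lower()]
    wanted = explicit_rank or "species"
    for taxon in matches:
        if taxon.rank == wanted:
            return taxon
    # Fall back to any same-named taxon if the wanted rank does not exist.
    return matches[0] if matches else None


taxa = [
    Taxon("Pelochrista matutina", "section"),
    Taxon("Pelochrista matutina", "species"),
]
print(pick_taxon("Pelochrista matutina_6013.jpg", taxa))
print(pick_taxon("Section Pelochrista matutina_6013.jpg", taxa))
```

With this rule, a plain binomial in the filename resolves to the species, and the Section is chosen only when the filename says so.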

The same unexpected and undesirable uploader behavior seems to be happening with filenames being parsed to “Complex Yyyyyy Zzzzzz” when only a genus-species combination is in the filename. Here’s an example where I didn’t catch the “Complex” substitution before upload:
https://www.inaturalist.org/observations/114805034

Yet another example, this time with a “Section” of the genus Ethmia:
https://www.inaturalist.org/observations/116729248
With the addition of a great many Sections and Complexes to the iNat taxonomy, this has become a major annoyance at upload.

The uploader makes the following incorrect photo tag->ID taxon associations.

The species tag “Pero honestaria” is mapped to Section Pero honestaria. It should instead be mapped to the species Pero honestaria.

The complex tag “Pero honestaria complex” is also mapped to Section Pero honestaria. It should instead be mapped to the complex Pero honestaria.

And the group tag “Pero honestaria group” is also mapped to Section Pero honestaria. This is actually correct. Once the sorely needed species group rank is added to the taxonomy and this taxon’s rank changed to it, this tag should be mapped to the species group Pero honestaria.

@treichard Tim, am I correct that your filenames include the Genus-species name in some form? If so, as in the cases I describe above, this is an issue with the part of the upload process that parses the filename to get a taxon ID. I want to help staff understand the specific point(s) in the upload process where this glitch is introduced.

The incorrect mapping of these taxon names to taxa occurs in both cases: the taxon name in the photo filename, or the taxon name in a photo tag.

This is not a bug; the uploader intentionally chooses the coarsest taxon if multiple taxa have the same name. See https://github.com/inaturalist/inaturalist/commit/0229e1ab2a95ce2a25e124747f88e161e533267c
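
As a rough illustration of that rule (a sketch only, not the code in the linked commit), the selection can be thought of as taking the highest-ranked taxon among same-named matches. The `RANK_LEVEL` numbers and the `coarsest_match` helper below are assumptions chosen for the example; only their relative order matters.

```python
# Illustrative rank ordering (assumed values): a larger number means a coarser rank.
RANK_LEVEL = {"genus": 20, "subgenus": 15, "section": 13, "complex": 11, "species": 10}


def coarsest_match(name, taxa):
    """Among taxa whose name equals `name`, return the one with the coarsest rank."""
    matches = [t for t in taxa if t["name"] == name]
    return max(matches, key=lambda t: RANK_LEVEL[t["rank"]], default=None)


taxa = [
    {"name": "Pero honestaria", "rank": "species"},
    {"name": "Pero honestaria", "rank": "complex"},
    {"name": "Pero honestaria", "rank": "section"},
]
print(coarsest_match("Pero honestaria", taxa))
```

Under this rule the species can never win while a same-named Section or Complex exists, which matches the behavior reported above.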

Then it’s partly a logic error. The coarsest taxon rule may have some sense for genus vs. subgenus but not for species vs. species complex. The rule eliminates the capability to have certain tag/filename taxa interpreted as the intended ID taxon. Perhaps it could be refined to better handle these lower ranks.

And the way the uploader displays the interpreted ID taxon does not allow the user to see which same-named taxon was chosen, or that the wrong taxon was chosen. It is generally quite unexpected to tag/name a photo file with a species name, have it appear in the uploader that the species was correctly read, and then discover later that a different rank was chosen.

I totally agree with @treichard’s latest remarks.
Let’s get the source coder into this conversation: @kueda Ken-ichi, do you see how the “coarsest taxon” rule for parsing names seems inappropriate at the species/species complex level? It’s causing great headaches.

I honestly don’t know. I don’t think sections or subgenera should exist at all, so it pisses me off every time the uploader chooses either. My preferred solution would be to ban them both. If there’s some significant number of people who actually think they provide value BUT don’t want the uploader choosing either automatically based on tags and filenames, I guess we could carve out an exception for sections. I thought I had a nice solution to the subgenus problem, but as with everything on iNat these days, solving one problem just caused other problems. #zugzwanglife.

Try IDing taxa like Tipula: subgenera are very helpful, distinctive, and often the best ID for a photo.

Another example yesterday. This has to change.

Since section-rank taxa should never exactly match a species name, I don’t think any coding exceptions for name matching need to be made for that.

As for species complexes, it seems far better to auto-match the coarser rank than the potentially incorrect species rank.

Cassi, I’m not sure how you are thinking about or applying the taxonomic concept of “Section” but in most of the above examples, the “section” that the uploader is filling in as a substitute for my species name is identical to the genus-species binomial. ??? From my moth references, some of the above examples are actual formal “Sections” and others are casual species complexes (as I think that Pyrausta laticlavia-tyralis grouping seems to be). I may be wrong about that.

@kueda Could the rank be displayed in the uploader page’s observations for ranks between genus and species, so they are shown as part of the taxon name? Then the user could at least see if the ID taxon that was interpreted from the photo filename/tag was the one they intended, and change it there if not. The rank is displayed in so many other instances across the web site, and there is no ambiguity there.

For example, here is how Pero honestaria complex.jpg, Pero honestaria group.jpg, and Pero honestaria.jpg show up in the uploader. All appear to have read the filename taxon as a species, but it’s actually the Section that was read in all three cases. This is too misleading. Instead of displaying “Pero honestaria”, display it as “Section Pero honestaria”.
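
A minimal sketch of the requested display rule, in Python, might look like the following. The `INTERMEDIATE_RANKS` set and the `display_name` helper are hypothetical names for illustration and do not describe how the uploader is actually built.

```python
# Ranks between genus and species whose name can collide with a species binomial
# (an assumed list for illustration).
INTERMEDIATE_RANKS = {"subgenus", "section", "subsection", "complex"}


def display_name(name, rank):
    """Prefix the rank for intermediate ranks so same-named taxa are distinguishable."""
    if rank in INTERMEDIATE_RANKS:
        return f"{rank.capitalize()} {name}"
    return name


print(display_name("Pero honestaria", "section"))
print(display_name("Pero honestaria", "species"))
```

The species label stays unchanged, so only the ambiguous ranks gain a prefix in the uploader.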

In short: I see absolutely no need for “sections” whatsoever. They are just a nuisance!