In taxa autocomplete search, why different results with capitalized personal name?

benarmstrong · October 12, 2019, 7:53pm

We’ve discovered some quirky results when giving personal names to the taxa search which can be easily demonstrated on the web. If you capitalize a personal name, some names return a different “best match” result in a taxa autocomplete search from when the name is not capitalized. There’s no discernible pattern to the results, so I’m not sure what to say back to the users who discovered this while helping me test the code for the Discord bot we’re developing.

Try it yourself. Go to:

https://www.inaturalist.org/taxa

Select a personal name from the list below. First type it in lowercase and note the top result, then type it capitalized and note the top result again: it will be different. At least these pairs match different results depending on whether you capitalize the name or not:

andrew/Andrew
bruce/Bruce
hannah/Hannah
joseph/Joseph
mary/Mary

Other common names return the same result when capitalized. I won’t bother listing those here.
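For anyone who would rather script the comparison than type each name into the web form, here is a minimal sketch. It assumes the `taxa/autocomplete` endpoint quoted later in this thread, and it assumes the JSON response has a `results` list whose items carry `rank`, `name`, and `preferred_common_name` fields. The function names (`fetch_results`, `top_match`, `case_differences`) are made up for this example.

```python
import json
import urllib.parse
import urllib.request

# Endpoint as quoted in this thread; the response layout below is an assumption.
API = "https://api.inaturalist.org/v1/taxa/autocomplete"


def fetch_results(query):
    """Fetch the autocomplete results for one query string (live network call)."""
    url = API + "?" + urllib.parse.urlencode({"q": query})
    with urllib.request.urlopen(url, timeout=10) as resp:
        return json.load(resp).get("results", [])


def top_match(query, fetch=fetch_results):
    """Return (rank, scientific name, common name) of the first result, or None."""
    results = fetch(query)
    if not results:
        return None
    first = results[0]
    return (first.get("rank"), first.get("name"), first.get("preferred_common_name"))


def case_differences(names, fetch=fetch_results):
    """Map each name to (lowercase top match, capitalized top match) when they differ."""
    diffs = {}
    for name in names:
        lower = top_match(name.lower(), fetch)
        capital = top_match(name.capitalize(), fetch)
        if lower != capital:
            diffs[name.lower()] = (lower, capital)
    return diffs
```

Calling `case_differences(["andrew", "bruce", "hannah", "joseph", "mary"])` with the default fetcher queries the live API; passing your own `fetch` function lets you test the logic offline.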

Thoughts? It isn’t causing real problems, except for a bit of initial confusion & wondering if there were a bug in the new code we rolled out recently (clearly not the case, since it’s reproducible on the web) but it certainly is mysterious!

Thanks,
Ben

benarmstrong · October 12, 2019, 7:56pm

Here’s andrew/Andrew, for example:

andrew: [screenshot of results not preserved]

Andrew: [screenshot of results not preserved]

JeremyHussell · October 13, 2019, 12:09am

The results from “andrew” all have “andrew” in the scientific name. The results from “Andrew” include things that don’t, but have “Andrew” in the common name.

benarmstrong · October 13, 2019, 12:21am

It doesn’t seem that this holds as a pattern for all of those names … hannah/Hannah?

JeremyHussell · October 13, 2019, 12:44am

There seems to be some sort of complicated prioritization of near-complete matches and whole-word matches over partial matches. E.g. for “gra” the first match is “Grasses”, for “gras” it’s “Grayish Saltator (GRAS)”, for “grass” and “grasse” it’s “Grasses, Sedges, and Allies”, and for “grasses” it’s “Grasses” again.

But that doesn’t explain your example. It just shows that the algorithm is complicated. Your best bet may be to read the source code to find out what the algorithm is.
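Short of reading the source, the prefix experiment above ("gra", "gras", "grass", ...) can be repeated for any term with a small helper. This is only a sketch: `prefixes` and `first_match_by_prefix` are made-up names, and the `fetch` argument is assumed to be any function that takes a query string and returns a list of result dicts with `name` and `preferred_common_name` keys.

```python
def prefixes(term, start=3):
    """Return successive prefixes of term, from `start` characters up to the full term."""
    return [term[:n] for n in range(start, len(term) + 1)]


def first_match_by_prefix(term, fetch, start=3):
    """Return (prefix, label of first result) pairs, one per prefix of term.

    The label is the common name when present, else the scientific name,
    else None when the query returned nothing.
    """
    rows = []
    for prefix in prefixes(term, start):
        results = fetch(prefix)
        label = None
        if results:
            top = results[0]
            label = top.get("preferred_common_name") or top.get("name")
        rows.append((prefix, label))
    return rows
```

Running it once with a lowercase term and once with the capitalized term shows at which prefix length the first match changes.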

bazwal · October 13, 2019, 12:53am

It seems that a slightly different algorithm is used for capitalised and non-capitalised search terms. But the list is limited to a maximum of ten items, so you are unlikely to get exactly the same result for both (although there will usually be some overlap).
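To put a number on that overlap, the two result lists can be compared directly. A rough sketch, assuming each result dict carries a unique `id` field (not confirmed in this thread); `compare_result_lists` is a made-up name.

```python
def compare_result_lists(a, b):
    """Compare two result lists by taxon id.

    Returns the ids found in both lists (in a's order), the ids unique to
    each list, and whether the shared ids appear in the same relative order.
    """
    ids_a = [r["id"] for r in a]
    ids_b = [r["id"] for r in b]
    set_a, set_b = set(ids_a), set(ids_b)
    shared_in_a_order = [i for i in ids_a if i in set_b]
    shared_in_b_order = [i for i in ids_b if i in set_a]
    return {
        "shared": shared_in_a_order,
        "only_a": [i for i in ids_a if i not in set_b],
        "only_b": [i for i in ids_b if i not in set_a],
        "same_order": shared_in_a_order == shared_in_b_order,
    }
```

If `only_a` and `only_b` are both empty but `same_order` is false, the two queries returned the same taxa and only the ranking differs.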

pisum · October 13, 2019, 2:21am

i suspect it’s not just capitalization. it looks to me like simple strings (no mixed casing, no special characters, etc.) are matched in a slightly different way than more complex strings are. the results and the scores assigned to them probably don’t change, but the results probably come back in a slightly different order. so if there’s no explicit order defined, then things probably just show up in the order they are returned.

you might be able to force everything to take the same path by putting a “+” in front of your match string when you send it to the API. for example, q=+hannah should probably give you the same results, in the same order, as q=+Hannah or q=+hannaH.
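One thing worth keeping in mind when experimenting with this: under standard form decoding, a bare "+" in a query string is read as a space, so `q=+hannah` may reach the server as " hannah" rather than "+hannah". Whether the iNaturalist API decodes it that way is an assumption, not something confirmed in this thread. The sketch below only shows what Python's standard library does with the two forms; `decoded_q` and `encoded_q` are made-up helper names.

```python
from urllib.parse import parse_qs, urlencode


def decoded_q(raw_query):
    """Decode a raw query string the way standard form decoding does and return q."""
    return parse_qs(raw_query)["q"][0]


def encoded_q(value):
    """Encode a q value so that a literal plus sign survives form decoding."""
    return urlencode({"q": value})


print(repr(decoded_q("q=+hannah")))    # ' hannah' (the plus became a space)
print(encoded_q("+hannah"))            # q=%2Bhannah
print(repr(decoded_q("q=%2Bhannah")))  # '+hannah'
```

So to send a literal plus sign, use `%2B`; comparing `q=+hannah` against `q=%2Bhannah` would show which of the two the server is actually reacting to.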

benarmstrong · October 13, 2019, 5:17pm

Ah, nope. I had to revert the initial “+” in the query because of unintended consequences. It makes some sections match higher than genus which is almost certainly not what is normally desired. Example:

http://api.inaturalist.org/v1/taxa/autocomplete?q=+rudbeckia

Rudbeckia sect. Rudbeckia matches 1st!

http://api.inaturalist.org/v1/taxa/autocomplete?q=rudbeckia

Genus Rudbeckia matches first, as desired.

system · December 12, 2019, 5:17pm

This topic was automatically closed 60 days after the last reply. New replies are no longer allowed.
