Add taxon_id filter to /v1/taxa/autocomplete

benarmstrong · October 10, 2019, 12:58am

As most recently discussed in https://forum.inaturalist.org/t/prefix-matches-on-snow-better-match-than-the-aou-code-snow-for-snowy-owl/7061/14, the /v1/taxa/autocomplete interface appears to be more reliable than /v1/taxa at picking the “best” result for the terms typed.

If the /v1/taxa/autocomplete interface had a taxon_id filter, as /v1/taxa does, users could search subtrees of the iNat taxonomy and get more accurate results. The filter would weed out irrelevant results, while the superior scoring system from autocomplete would help them zero in more quickly on the expected match. Ideally, this would make their expected match the topmost result, or at least close to the top.

The same outcome can’t practically be achieved with either interface as it stands, for reasons I elaborated on in the discussion linked above. On the one hand, /v1/taxa/autocomplete doesn’t return enough results (maximum 30) to cover the case where 30 or more records match the terms but none of them are in the desired taxonomy subtree. On the other hand, the /v1/taxa results don’t contain a complete enough set of fields for downstream code to impose a better ranking on the results, one that more closely matches users’ implicit expectations (i.e. that the results should be similar to what they get with the /v1/taxa/autocomplete interface, which I use for all other calls except those with taxon_id, since autocomplete doesn’t support it).

pisum · October 10, 2019, 1:03am

wouldn’t it be better to ask for an option to sort by best match or most observations or whatever on the other endpoint?

benarmstrong · October 10, 2019, 10:31am

I’m only following up on @pleary’s suggestion that I ask for more filters on /v1/taxa/autocomplete, so maybe they could jump in here with their opinion on this alternative. I cannot comment on which one would be more work and/or more consistent with the goals & design of the two different endpoints.

If the end result of supporting a sort_by or something similar on /v1/taxa is that the score could be made the same as autocomplete, and matched_term from this call is therefore consistent with the autocomplete endpoint, then yes, I’d be satisfied with that. Should I file a new feature request (possibly superseding this one)? Or could this one simply be retitled/reworded to request that feature instead?

Ben

pleary · October 10, 2019, 5:33pm

I think having taxon_id as a parameter for the autocomplete endpoint is a fine addition, and I’ve just added it along with a few other parameters: rank, rank_level, and all_names, which returns all taxon names in the response. I do not know the requirements of what you’re building, so I don’t have an opinion on whether this parameter is what you need.
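For anyone following along, here is a rough sketch of how a request using the new parameters might be assembled. The parameter names taxon_id, rank, and all_names come from the post above; the `q` parameter, the api.inaturalist.org host, the helper name `autocomplete_url`, and the taxon id 12345 are assumptions or placeholders for illustration, so check them against the API documentation before relying on them.

```python
from urllib.parse import urlencode

# Assumed public host for the v1 API; adjust if yours differs.
API_BASE = "https://api.inaturalist.org/v1/taxa/autocomplete"


def autocomplete_url(query, taxon_id=None, rank=None, all_names=False):
    """Build an autocomplete URL, adding only the filters that were given.

    `autocomplete_url` is a hypothetical helper, not part of any iNaturalist client.
    """
    params = {"q": query}  # "q" is assumed to be the search-term parameter
    if taxon_id is not None:
        params["taxon_id"] = taxon_id  # restrict matches to this taxon's subtree
    if rank is not None:
        params["rank"] = rank  # e.g. "species"
    if all_names:
        params["all_names"] = "true"  # ask for every name of each taxon
    return f"{API_BASE}?{urlencode(params)}"


# 12345 is a placeholder id, not a real taxon.
print(autocomplete_url("snow", taxon_id=12345, rank="species"))
```

The URL can then be fetched with whatever HTTP client the bot already uses.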

Please keep in mind that we still cannot guarantee for either the search or autocomplete endpoints that matched_term will be the term you are expecting. The logic for choosing the matched term is the same for each endpoint, but the queries we use are different, so it’s up to Elasticsearch what the best match is. But maybe with the addition of a way to get all names you can pick the match you feel is best.

If you’d like to see more things added to the API, it would be helpful to have separate posts for each request.

benarmstrong · October 10, 2019, 5:42pm

Thanks. I’m eager to try out the changes and see if Elasticsearch gives results that are consistent both with and without the new filters. I’m not looking for guarantees so much as consistency. If users want something special ~~(like a lookaside at a table of names imported from AOU’s list)~~ I could do that, too, but I’m hoping this will make that unnecessary. Oh my goodness! I didn’t notice all_names on my first read. If that’s what it sounds like, that would make it trivial to rescore matching AOU codes if necessary. Thanks so much!
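A minimal sketch of the kind of rescoring this would allow, assuming each result returned with all_names carries a `names` list of objects with a `name` field (that response shape is an assumption, as are the helper name and the sample records below). Results that have a name exactly equal to the query, such as an AOU code, are moved to the front, and everything else keeps the order the API returned.

```python
def promote_exact_name_matches(results, query):
    """Return results with exact (case-insensitive) name matches first.

    Hypothetical helper; assumes each result may have a "names" list of
    dicts with a "name" key, as all_names is described as returning.
    """
    wanted = query.casefold()

    def has_exact_name(taxon):
        return any(
            entry.get("name", "").casefold() == wanted
            for entry in taxon.get("names", [])
        )

    # sorted() is stable, so ties keep the API's original ordering.
    # False sorts before True, hence the negation.
    return sorted(results, key=lambda taxon: not has_exact_name(taxon))


# Made-up sample records for illustration only.
sample = [
    {"id": 1, "names": [{"name": "Snow Goose"}]},
    {"id": 2, "names": [{"name": "Snowy Owl"}, {"name": "SNOW"}]},
    {"id": 3, "names": [{"name": "Snow Bunting"}]},
]
reordered = promote_exact_name_matches(sample, "snow")
print([taxon["id"] for taxon in reordered])
```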

benarmstrong · October 10, 2019, 6:36pm

After coding the changes in my bot to use rank & taxon_id in the request, my problem is now solved. Thanks again.

bouteloua · October 11, 2019, 3:01pm

3 posts were split to a new topic: Issues with all_names on /v1/taxa/autocomplete
