Happy new year!
I am still working on the tool I presented above. I went through several rounds of rework, which is why it is taking so long. I hope to release it soon, but I would like to test it further, to make it as robust as possible against unexpected events.
Here I would like to show you a view of the data collected with this tool.
So far, the tool has downloaded the data of 178383 “Unknown” observations from these places:
- Central America, South America,
- Mexico,
- Alabama, Arizona, Arkansas, California, Florida, Georgia, Louisiana, Mississippi, New Mexico, North Carolina, Oklahoma, South Carolina, Tennessee, Texas,
- Hawaii.
The tool generates a “best ID” (similar to the “We are pretty sure this is in…”) for every observation, and I was curious to generate a report showing the taxonomic distribution of these many “Unknown” observations.
This report shows the (guessed) distribution of these 178383 “Unknown” observations down to the rank Infraclass:
Every row shows 3 numbers. For instance, the row Animalia means that, among these 178383 observations, 2187 are guessed to be animals, and another 25748 are guessed to be animals with a best ID at a rank between Infraclass (included) and Kingdom (excluded). (Note that these 2187 may have a best ID at a lower rank, for instance a Family grafted directly onto the Kingdom, without a Class, if such a case exists.) 27935 is simply the total of the two.
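To make the three numbers concrete, here is a minimal Python sketch of how such a row could be computed. It is only an illustration of the counting rule described above, not the actual code of the tool: the function `report_rows`, the abbreviated list `REPORT_RANKS`, the lineage format and the family name `Exampleidae` are all assumptions made up for this example.

```python
from collections import Counter

# Ranks displayed in the report (hypothetical, abbreviated list).
REPORT_RANKS = {"kingdom", "phylum", "class", "infraclass"}


def report_rows(observations):
    """Return {taxon: (direct, below, total)} for every taxon shown in the report.

    Each observation is the lineage of its best ID: a list of (rank, name)
    pairs ordered from Kingdom down to the best ID itself.
    Assumption: taxon names are unique across the report.
    """
    direct = Counter()  # observations attached to this row and to no lower row
    total = Counter()   # all observations whose lineage passes through this row
    for lineage in observations:
        # Keep only the ancestors that have a row in the report.
        shown = [name for rank, name in lineage if rank in REPORT_RANKS]
        if not shown:
            continue
        direct[shown[-1]] += 1
        for name in shown:
            total[name] += 1
    return {name: (direct[name], total[name] - direct[name], total[name])
            for name in total}


# Hypothetical sample data, not taken from the real report.
observations = [
    [("kingdom", "Animalia")],
    # A Family grafted directly onto the Kingdom counts on the Kingdom row.
    [("kingdom", "Animalia"), ("family", "Exampleidae")],
    [("kingdom", "Animalia"), ("phylum", "Arthropoda"), ("class", "Insecta")],
    [("kingdom", "Plantae"), ("phylum", "Tracheophyta")],
]

rows = report_rows(observations)
print(rows["Animalia"])  # (2, 1, 3)
```

With the real figures, the Animalia row would read (2187, 25748, 27935), the third number being the sum of the first two.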
This is part of a more detailed report showing the distribution of these 178383 observations down to the rank Family:
This is part of the most detailed report, showing all the 145 best IDs in the Family Urticaceae:
This means that if you are interested in identifying the Family Urticaceae, you may define this Family as a filter in this tool and then review and identify these 145 observations.