'Needs ID' pile, and identifications

Good points! I have no idea personally about how the student use of iNat changed in the last two years. I guess we’ll see what happens in the coming third year of the pandemic.

1 Like

It leads to frustration in most people. Every time I accidentally check some US observations, there’s that Pyrus planted near roads with hundreds or thousands of “wild” RG observations, or even hundreds of magnolias on lawns, and nobody cares to mark them, neither observers nor IDers. That’s the reason I can’t do unknowns: they’re full of cultivated plants.

1 Like

No, they won’t. I just went through a random page of 130 Prunus cerasifera (a highly invasive species in many parts of the world) and 120 of them were just planted trees (and often misID’ed).

2 Likes

Comparing the 2017 Year in Review to stats from a current query of observations from 2017:
Research Grade went from 1,937,915 to 3,027,907
Needs ID went from 1,331,312 to 1,287,178
Casual went from 358,772 to 541,715
Total went from 3,627,999 to 4,856,800 (note the lower number on the Year in Review page excludes casual observations)

That was searching by date observed. So, it looks like the Year in Review stats actually are a snapshot in time. It also shows that over a million 2017 observations were added after 2017 and that there are now over a million more research grade observations for 2017 than there were at the end of 2017.

Tightening the filter to observations both taken in and added in 2017:
Research Grade went from 1,937,915 to 2,215,306
Needs ID went from 1,331,312 to 947,291
Casual went from 358,772 to 389,452
Total went from 3,627,999 to 3,552,049 (Doesn’t quite add up)
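
For anyone who wants to pull counts like these, here is a rough Python sketch of how the two filters might be written as iNaturalist API v1 queries. It only builds the URLs and sends nothing. The parameter names (`year`, `created_d1`, `created_d2`, `quality_grade`, `per_page`) are taken from a reading of the public API docs, `count_url` is a made-up helper, and this is not necessarily how the numbers above were pulled, so check it against the docs before relying on it.

```python
from urllib.parse import urlencode

# Public API v1 observations endpoint (assumed to be the right one for counts).
API = "https://api.inaturalist.org/v1/observations"


def count_url(quality_grade, added_in_2017_only=False):
    """Hypothetical helper: build a count query for 2017 observations.

    The count itself would be read from the "total_results" field of the
    JSON response; no request is made here.
    """
    params = {
        "year": 2017,                    # date observed falls in 2017
        "quality_grade": quality_grade,  # "research", "needs_id" or "casual"
        "per_page": 0,                   # ask for the total only, no records
    }
    if added_in_2017_only:
        # Assumption: created_d1/created_d2 bound the upload date;
        # how the end-of-year boundary is handled is not verified here.
        params["created_d1"] = "2017-01-01"
        params["created_d2"] = "2017-12-31"
    return f"{API}?{urlencode(params)}"


for grade in ("research", "needs_id", "casual"):
    print(count_url(grade))                           # by date observed
    print(count_url(grade, added_in_2017_only=True))  # observed and added in 2017
```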

So, despite over a million extra 2017 observations being added after 2017, the 2017 needs ID pile that was present at the end of 2017 has gone down by 384,021 observations (29%) in the past 4 years.
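
The arithmetic behind those statements can be checked with a few lines of Python. The counts are copied from the two lists above; the variable names are just labels chosen for this sketch.

```python
# Counts copied from the two comparisons above.
year_in_review = {"research": 1_937_915, "needs_id": 1_331_312, "casual": 358_772}
observed_2017 = {"research": 3_027_907, "needs_id": 1_287_178, "casual": 541_715}
observed_and_added_2017 = {"research": 2_215_306, "needs_id": 947_291, "casual": 389_452}


def total(counts):
    """Sum research grade, needs ID and casual counts."""
    return sum(counts.values())


# 2017 observations uploaded after the Year in Review snapshot.
added_later = total(observed_2017) - total(year_in_review)
# Extra research grade observations for 2017 since the snapshot.
extra_rg = observed_2017["research"] - year_in_review["research"]
# Shrinkage of the Needs ID pile that existed at the end of 2017.
needs_id_drop = year_in_review["needs_id"] - observed_and_added_2017["needs_id"]
drop_pct = 100 * needs_id_drop / year_in_review["needs_id"]
# One reading of "doesn't quite add up": the gap between the two totals.
shortfall = total(year_in_review) - total(observed_and_added_2017)

print(f"2017 observations added after 2017: {added_later:,}")
print(f"Extra research grade observations:  {extra_rg:,}")
print(f"Needs ID drop: {needs_id_drop:,} ({drop_pct:.0f}%)")
print(f"Gap between snapshot total and tightened total: {shortfall:,}")
```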

These are the kinds of stats I’d like to see in the Year in Review and/or a special identification progress report. It’s cool to see what has actually been accomplished.

6 Likes

Great, that was a smart comparison!

not sure. there may be some Covid effect, but there are probably many factors. i suspect the biggest thing is just that there are only so many additional new people you can expect to reach at some point. you can’t have geometric growth in users forever.

it’s not user counts, but i think the observation growth in this old post shows this more clearly: https://forum.inaturalist.org/t/welcome-to-the-club-new-zealand/21711/21. note that early-adoption places like the United States, California, Texas still have huge absolute growth, but their relative growth is lower than in other places.

you can also compare growth of observations over different time periods using different variants of this map:

… and here’s a high-level summary of the above:

when i zoom in on the interactive maps, looking at the absolute change layer, it looks to me like – at least in (Southern) California and Texas – folks might be starting to observe less farther away from home (or possibly that usage is dropping in poorer areas, rising in wealthy areas?). so something like that could be a Covid effect. not sure…

i’m not sure how much school projects contribute to the overall numbers of observers and identifiers (compared to, say, the numbers of folks using iNaturalist just once to identify a random plant). i wouldn’t be surprised if there was a relatively high number of small-volume users, including students being directed by their instructors. i also am not surprised that it would be easier to grow the observer base than the identifier base because it probably takes a higher level of general (naturalist) interest to make an identification than to make an observation. also, there are probably technical things that contribute to lower identifier growth (ex. differences in mobile-to-desktop usage/availability among different groups / locations, expansion of the iNaturalist product line to include Seek, etc.)

with these counts accumulating over time, i’m not surprised that identifiers:observers would trend down over time.

at the end of the day, i think you have to revisit the question that underlies your search for statistics: are there enough identifiers to handle the load? just based on comparison of https://jumear.github.io/stirfry/iNat_obs_counts_by_iconic_taxa.html against previous snapshots, it looks like the metrics on this page are getting better ever so gradually. so to me, it does seem like the identifier community is able to handle the load (at least keeping things status quo).

to me, then the question becomes: can we still do better? and the answer to something like that will always be yes. but i suspect the desired results are going to require more (ongoing) grassroots knowledge-sharing and recruiting efforts, rather than high-level data analysis driving top-down / centralized action. (i’m not pooh-poohing number-crunching, but just trying to tie it to a specific goal.)

5 Likes

I really appreciate your thoughts and data-crunching on all this. I agree with you that the underlying question is are there enough identifiers to handle the load? Specifically, I think, are there enough identifiers so they don’t feel overwhelmed by the ever-growing Needs ID pile? Frankly, the data in that last table you linked to seems very encouraging to me, with none of the vertebrates having less than 78% of verifiable observations at Research Grade. All of the other taxa, including Plants, where I think I do most of my IDs, aren’t anywhere close to that, but for understandable reasons.

As for identifiers feeling overwhelmed by the task and irritated by lousy photos, cultivated/captive organisms, no ID whatsoever, and all that - well, some days I’m right there in the overwhelmed and irritated category. But most days I feel like I’m contributing to people’s understanding and appreciation for the natural world, and maybe even contributing a bit to science. (IDing a pile of poop as North American Porcupine gave me a particular thrill recently!)

5 Likes

Last year the blog post about reaching a million observers talked quite a lot about identifier load:
“One stat where we didn’t break records was the number of identifiers (people who added an identification to someone else’s observation). How is it that last month under 23,000 identifiers working with over 2.7 million observations from over 177,000 observers were able to add enough identifications to tally over 89,000 distinct species? I thought I’d spend this post exploring this in more detail.”

5 Likes

I read that - and then completely forgot about it, obviously! Thanks for digging that up. It’s quite sobering to see how few people do so many identifications.

This is why I usually go with either ascending or random order. I mean, in the checkout line at the supermarket, the cashier doesn’t start with people at the back of the line who just got there; they start with the front of the line who were there the longest (ascending). But on iNaturalist, we don’t want to discourage people by making them wait in line behind 100K other observations, even if those 100K did get there first, so random gives me a good mix of observations from years ago and from today.
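
As a rough illustration, the two orderings could be bookmarked as Identify links along these lines. This is only a sketch: `identify_url` is a made-up helper, and the `order_by` and `order` names and values come from the API docs for observation searches, on the assumption that the Identify page accepts the same ones.

```python
from urllib.parse import urlencode

# Identify page address; the query parameters below are assumptions
# borrowed from the observation search API, not confirmed for this page.
IDENTIFY = "https://www.inaturalist.org/observations/identify"


def identify_url(order_by, order=None):
    """Hypothetical helper: build an Identify link for Needs ID observations."""
    params = {"quality_grade": "needs_id", "order_by": order_by}
    if order:
        params["order"] = order  # "asc" puts the oldest first
    return f"{IDENTIFY}?{urlencode(params)}"


oldest_first = identify_url("created_at", "asc")  # front of the line first
random_mix = identify_url("random")               # mix of old and new

print(oldest_first)
print(random_mix)
```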

9 Likes

We can create one

  1. More ids from each member
  • agitation of observers to try iding, masterclasses on the iding process
  • agitation for broad ids
  • attraction of professionals to iNat as new core team members

  2. Clearing out impossible to id observations
  • agitation for use of “can’t be improved” by those who are sure (knowledgeable) it’s a correct statement
  • changing system so Family level observations can get RG status → more use in hopeless cases like Poaceae or Sciaridae

  3. Agitation for upload of clear photos
  • guides to which photos are needed for each group and which photos won’t be ided for sure

  4. More work on observations with disagreements

  5. More work on creating or supporting existing projects of narrow groups with already existing experts on iNat, e.g. spending time going through Arthropods and Insecta, adding observations to Galls of NA and Leafminers of NA, where experts are looking through observations in projects and will (likely) miss those outside of them

8 Likes

iNat doesn’t have to send them to GBIF; I don’t see any reason they couldn’t be filtered out. I don’t know what you mean by finer id. RG at family level means someone marked it “as good as it can be”, so it’s not casual but gets RG. It doesn’t happen with just two ids. Just go and look at how many pages of Needs ID there are at Poaceae where all that is shown is leaf blades, experts ided it at family level years ago, and it’s impossible to go any further.
You can use your own words, but those are mine; agitation has many meanings, and that’s one of them: political agitation, etc.

2 Likes

I think if it’s implemented, it can be discussed whether GBIF needs those records or not; iNat can code it as they wish. Right now they can’t be RG, but there are many cases where family is the best possible level, and only a few people mark them so or agree with them being marked. I don’t want them to be casual: that means they’re not shown on the map and don’t exist as real observations. They’re not even captive, just impossible to id further from the photos presented. They’re perfectly fine observations, same as genus-level ones, and if, as you say, they can be ided further, then they should be marked “as good as it can be”.

4 Likes

It is a great list and good words. I tripped a little over the use of agitation as well - it is a fantastic word, but it is not commonly used in this context and sometimes has negative connotations in different cultures. Possibly someone else could substitute promote or encourage if they were getting sidetracked when reading agitate.
https://books.google.com/ngrams/graph?content=agitate%2Cpromote%2Cencourage&year_start=1800&year_end=2019&corpus=26&smoothing=3
&
https://thesaurus.plus/related/agitate/promote

8 Likes

I thought the same thing as far as English is concerned.

1 Like

So… now that we have the list of countermeasures against the Needs ID problem… what do we do with it?

2 Likes

Good question! Here are a couple of things I’m going to do:

  • Prod my real-life iNat friends to start IDing, including going so far as to Zoom with them, share my screen, and walk them through how I do IDs one step at a time.
  • In late February, I’m going to hold an ID-a-thon for New England plants, with the dual aims of getting plants IDed and getting new identifiers hooked.
  • I’ve been a gardener for over half a century, so I know a lot of common garden plants, at least to genus. I’m going through Needs ID plants and Unknowns and being ruthless in marking them Not Wild, once I’ve given them a decent enough ID.
  • I’m doing the same for Unknowns that are photos of people, pets, rocks, etc. I want them out of the Needs ID pile sooner rather than later.
  • Some day soon, I’ll start going through Unknowns with multiple photos where the photos are of different species, and I’ll leave a comment asking the observer to split the photos into different observations (with links to helpful tips on doing that). I’ll give those observers at least a month to respond; if they don’t, I’ll give the observations the lowest ID that applies to all of the photos and mark the DQA to say the ID can’t be improved.
  • I’d like to get to the point where I can confidently mark high-level IDs as cannot be improved - say, blurry photos of some conifer at a distance, for example.
  • As I think of examples, I’ll write journal posts about what kinds of photos are helpful for which groups of organisms, where I know what to say.

I’m mulling over whether it’s a good idea to give feedback to new observers about iNat being primarily for wild organisms, one species per observation, etc. - the usual things new observers often miss when they skim the how-to documents.

6 Likes

I don’t think you need to wait a month if you leave a comment saying that you have marked the DQA and asking them to leave a comment once they have separated the photos.

4 Likes

I made up a trick for this recently - it’s clunky but it works. You might remember some time ago I wrote a tutorial on how to split up photos; several identifiers like to link that tutorial when asking the observer to split photos. So if I do a sitewide comments search, putting the URL of the tutorial as the search term, I can get back a list of observations that probably still need splitting. (Rarely the observer successfully splits the photos, but most of the time they don’t try. A few people say they tried and failed.) Then I can look at the list, see the ones still at Needs ID, and vote those as cannot be improved.

I have the search bookmarked on my computer, but am not at home now. I’ll share it here when I get back to it.

5 Likes

Do you know if there’s any way to search for Needs ID observations that already have multiple identifications? Most of the time it’s just 1 wrong ID holding up the RG status, so they’re pretty quick to clear out when I can find them.

2 Likes