I have created a collection project selecting the countries and territories from the Year In Review 2021 that each account for less than 1% of the total growth by country, with the US removed. No one needs to join it, because it is a collection project, unless joining helps you bookmark it. (Apologies if I missed anyone.)
I’m curious to know how other people feel about this. After spending some time coming back to IDing this holiday weekend, I found that the vast majority of needs IDs for plants in an area I know very well fell into two groups: Tough Groups that I don’t feel fully comfortable sorting through right now, and observations from a few power users who submit a lot of observations of dubious quality that will likely never be IDed to species, such as senesced stems of Geum or out-of-focus roadside trees. Problematic IDs from new users are a very small issue in comparison. I wonder if the ID burden would be more effectively reduced by trying to convince power users to be a little more selective with the observations they upload, or to contribute more IDs than observations.
Just this weekend I needed to bring the needs ID pile I was working through (plants in one county) down from ~14,000 to fewer than 10,000 so I could work through it in reverse, and I could only get there by removing the observations from one user (more than 5,000 needs IDs in that search alone).
I don’t know how well that translates to other people’s IDing experiences, but it takes a lot of new users who haven’t been onboarded right to make 5,000 observations.
Rather than try to modify the actions of other users, especially those that are “power users” who use the site to document their experience in the natural world their way, I think it is easier to modify my identifying experience.
Two modifications to the parameters for identifying that might help with your experience are to add
It sounds like you may have done this. Or limit to new users
modify as needed.
Just add the user names of those users to your notepad and sprinkle them into your Identify URL query when needed. You can also use these parameters together in the same query. For instance, if it is two new power users.
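As a rough illustration of combining these filters, here is a minimal Python sketch that assembles an Identify URL. It assumes the Identify page accepts the `not_user_id`, `user_id` and `user_after` query parameters with comma-separated logins and durations such as `52w`; the helper name `identify_url` and the logins `example_user_a` and `example_user_b` are hypothetical placeholders, not real accounts.

```python
from urllib.parse import urlencode

IDENTIFY_URL = "https://www.inaturalist.org/observations/identify"


def identify_url(exclude_users=(), only_users=(), account_newer_than=None, **extra):
    """Build an Identify URL from user-related filters (hypothetical helper)."""
    params = {"quality_grade": "needs_id"}
    if exclude_users:
        # Assumption: not_user_id takes a comma-separated list of logins.
        params["not_user_id"] = ",".join(exclude_users)
    if only_users:
        params["user_id"] = ",".join(only_users)
    if account_newer_than:
        # Assumption: user_after takes a duration such as "52w" or "1w".
        params["user_after"] = account_newer_than
    params.update(extra)
    # Keep commas readable instead of percent-encoding them.
    return IDENTIFY_URL + "?" + urlencode(params, safe=",")


# Hide two prolific observers from the pile you are working through.
print(identify_url(exclude_users=["example_user_a", "example_user_b"]))
```

Other filters you already use (place, taxon, sort order) can be passed as extra keyword arguments, so the same notepad of user names can be reused across searches.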
I don’t like marking things as reviewed when they are IDable but I’m not IDing them right now. So instead of skipping to the next page and missing a few observations, or marking all of the observations as reviewed (when I feel like I haven’t reviewed them), I can reverse my filters and skip to the end. I think I originally read the concept in this thread, and it works for how I’m using it.
I agree with keeping my experience on the site how I like to use it, and I do not mean to cast judgement on anyone. Every time the needs ID pile is mentioned, the problem is talked about as if it were predominantly new users, people who post some observations but no IDs, and as if the solution would be better onboarding or a few more IDers. I don’t remember really seeing discussion asking whether the growth in needs IDs is coming not from new-user growth but from usage growth. The first query I really started whittling away at may be very unrepresentative, but it has raised the question for me.
This wiki is so important to how I use the site that it’s on my profile so I can’t forget the link.
Because it’s true, no power user can be compared with a bunch of students, and maybe you see someone who posts blurry photos, but generally power users make both more observations and better quality ones.
Perhaps it depends on your geographic location of choice? In my area there definitely are more Needs ID observations from students and new users that quit after 1-5 observations than there are from power users. But I could see how an area with less of a user base could be dominated by one zealous observer.
The only problem with this representation is that it only splits observations by whether the account was created more or less than 52 weeks before the current date. That means all the accounts that were created, used for a few days or a month, then abandoned, won’t be counted as new unless they were created within the past year. Therefore it can’t provide very useful data on those accounts.
I think even the number of observations would be enough as a first look at the “problem”.
Though I think it’s more complicated, as a power user can simply generate more needs IDs without generating more unidentifiable needs IDs, and that is hard to find out and calculate.
I wanted to compare the needs IDs created by the most active users with those of the site as a whole, to get a sense of how they affect the identification load for identifiers. I used the 2021 leaderboard as a list of 10 of the most active users, and using that list is not in any way meant to imply that I disagree with how they use this site. In fact, I think these users are really the most valuable members of the community, and many of them have contributed far more IDs than they have observations.
That being said, and truly meant, 10 of the most active users (the 5 highest observers and the 5 highest species observers for 2021) have observed a significant percentage of all the active needs ID observations, both in the USA and globally, at least relative to the size of this minuscule group of 0.0006% of users. (Disclaimer: I used page counts to get approximate numbers, so your numbers may vary slightly. I’d rather not post links, because I had to filter by user name to generate these lists and I don’t want to look like I’m calling people out, which is not at all something I want to do. This is just the most dramatic slice of established users, which I will arbitrarily define as users with 1k observations, to acknowledge that I am part of this growing pile of needs IDs.)
| | Top 10 Observers Needs ID | Total Needs ID | % Contributed by top 10 |
|---|---|---|---|
| Needs IDs Globally | | | |
| Needs IDs in USA | | | |
I do not know how to figure this out, but it raises a question for me: what is the contribution to the needs ID pile of the 100 or 1000 most active observers? I would be surprised if people with more than 1000 observations did not account for a lot (a majority?) of active needs IDs. Those users are not the people who would be touched by improved onboarding, and if the needs ID pile is going to be significantly reduced, these users have to be considered in any strategy to reduce it.
It seems to me that these higher-output active users would be easier to get on board with IDing, both because they engage with the site and because they have some knowledge of identifying and of the natural world. They also may represent much more bang for the buck than thousands of students who use the app once and then walk away.
I’m sure most active users also suffer from a lack of IDers. My piles of needs IDs at species level, genus level, and higher than genus level are pretty close in size. Either there are no people at all who ID the groups I observe, or they exist on iNat but are so busy that they can’t keep up with the number of incoming observations. Some observations are just stuck with no IDs; I see on mine that 1/4 is left each year:
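The arithmetic behind the "% contributed" figure is simple enough to sketch. The Python below assumes counts are estimated from page counts, as in the disclaimer above; the function names are hypothetical, and the numbers in the usage lines are placeholders rather than real iNaturalist figures.

```python
def approx_count_from_pages(page_count, per_page):
    """Estimate a result count from a page count.

    This is an upper bound, because the last page may be only partly full.
    """
    return page_count * per_page


def percent_share(part, whole):
    """Return `part` as a percentage of `whole`."""
    if whole <= 0:
        raise ValueError("whole must be positive")
    return 100.0 * part / whole


# Placeholder numbers only: 12 pages of 30 results against a pile of 1,440.
subset = approx_count_from_pages(12, 30)
print(f"{percent_share(subset, 1440):.1f}% of the pile")
```

The same two steps would apply to the top 100 or top 1000 observers, given a list of their user names to filter on.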
I have the feeling I’ve seen this statistic before, but I can’t remember where: as a percentage of total observations, how has the number of Needs ID changed over the years?
Right now, 39% of the verifiable observations worldwide are Needs ID. What was the percentage five years ago? Or every year since iNat started?
Similarly, how has the percentage of identifiers out of all observers/users changed? Currently, if we look only at verifiable observations, about 11.6% of the number of observers are identifiers. (Caveat: There are some people who are identifiers who have never made an observation, but I’m ignoring that here.)
I am wondering about this because, as an active identifier, I am sometimes discouraged by the sheer number of Needs ID observations out there. However, if the percentage of Needs ID observations has not been going up over time, I would be much less discouraged.
Thank you, but unfortunately I am so unskilled at using the API that I cannot figure out how to reach into the past, so to speak, and use it to figure out what the ratio of Needs ID observations to Verifiable was for a date in the past. I can use the Explore tab in iNat proper to figure out the current ratio for observations made before a certain date (and the ratio is less than 39%, for the couple of dates I checked), but I would expect that ratio to decline over time anyway, as identifiers get around to older observations.
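For anyone who wants to try the per-year version of this check through the API, here is a minimal Python sketch. It assumes the v1 `observations` endpoint accepts `created_d1`, `created_d2`, `quality_grade`, `verifiable` and `per_page=0`, and that the JSON response carries a `total_results` field; the function names are hypothetical. It has the same limitation described above: it reports the current status of observations created in a given year, not what the ratio was at that time.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

API_URL = "https://api.inaturalist.org/v1/observations"


def count_url(year, needs_id_only):
    """Build a count-only query for observations created in `year`."""
    params = {
        "created_d1": f"{year}-01-01",
        "created_d2": f"{year}-12-31",
        "per_page": 0,  # only the total is needed, not the records
    }
    if needs_id_only:
        params["quality_grade"] = "needs_id"
    else:
        params["verifiable"] = "true"
    return API_URL + "?" + urlencode(params)


def fetch_total(url):
    """Fetch one query and return its total result count (network call)."""
    with urlopen(url) as response:
        return json.load(response)["total_results"]


def needs_id_percent(year):
    """Current Needs ID share of verifiable observations created in `year`."""
    needs_id = fetch_total(count_url(year, needs_id_only=True))
    verifiable = fetch_total(count_url(year, needs_id_only=False))
    return 100.0 * needs_id / verifiable if verifiable else 0.0
```

Calling `needs_id_percent(2019)` makes two requests, so looping over many years should be paced politely.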
Maybe we should ask @pisum to contribute their considerable expertise with the API to this question?