Maverick first or Maverick last?

Is there a way to locate observations where the most recent ID given is the Maverick?

The idea is that this situation is worth looking more closely at because it usually reflects an expert going through and noticing something that others missed. It could be as simple as correcting a mistake made by earlier identifiers, or it could reflect a subtlety that might warrant further research, such as a new subspecies or species.

The situation of a Maverick ID being first is less interesting, because probably those are wrong IDs that were later corrected.


No. :confused:

We would need a filter value “maverick” here in the API:


and then, to answer your question, software that automatically reviews the Mavericks could populate a traditional project with the observations where the most recent ID given is the Maverick. That would be affordable if the project were updated only once a year, or a few times a year.

This URL filter won’t help: &identifications=some_agree
There are presently about 300,000,000 observations and this filter only reduces the set to 166,000,000 observations (including all the Mavericks):
https://www.inaturalist.org/observations?identifications=some_agree
For instance, this URL provides 3 observations without the filter, only 2 observations with the filter, and 1 of these is a Maverick:
https://www.inaturalist.org/observations?lat=10.71308&lng=-85.03824&on=2023-04-06&radius=5&subview=map&identifications=some_agree

This other URL won’t help much, because there is no corresponding API request that software could use efficiently to check the Mavericks. Requesting this URL directly returns the HTML code of the displayed page, with all the data already formatted in the HTML:
https://www.inaturalist.org/identifications?category=maverick


probably the closest thing to what you’re looking for is to find identifications that are both maverick and disagreements: https://jumear.github.io/stirfry/iNatAPIv1_identifications?category=maverick&disagreement=true.

using the page above, you can also search for a specific taxon_id (the taxon of the identification) and / or observation_taxon_id, and you can also look for a particular user_id (identifier) and place_id, among other parameters.

you can open a specific observation returned for each identification on the page, or you can open the set of observations associated with all the identifications on each page in either the iNat Explore or Identify screens.

you can also exclude disagreement=true to get all mavericks, or you can exclude category=maverick to get all disagreements. personally, i like to look for all disagreements because it finds observations that have disagreements even if the disagreement has not triggered the creation of a maverick ID on the observation yet.
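The filter combinations described above can be sketched as a small URL builder. This is only an illustration: the helper name `identifications_url` is made up, the endpoint is the API URL quoted later in this thread, and the only parameters used are the ones named in these posts.

```python
from urllib.parse import urlencode

# API endpoint as quoted in this thread.
API_URL = "https://api.inaturalist.org/v1/identifications"


def identifications_url(maverick=True, disagreement=True, **filters):
    """Build an identifications search URL (hypothetical helper).

    Extra keyword arguments are passed through as query parameters,
    e.g. taxon_id, observation_taxon_id, user_id, place_id.
    """
    params = {}
    if maverick:
        params["category"] = "maverick"
    if disagreement:
        params["disagreement"] = "true"
    for key, value in filters.items():
        if value is not None:
            params[key] = value
    if not params:
        return API_URL
    return f"{API_URL}?{urlencode(params)}"


if __name__ == "__main__":
    print(identifications_url())                    # mavericks that disagree
    print(identifications_url(disagreement=False))  # all mavericks
    print(identifications_url(maverick=False))      # all disagreements
```

The three calls at the bottom correspond to the three searches described in the post: both filters, mavericks only, and disagreements only.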


I’m still looking at this, but the combination of “category=maverick” and “disagreement=true” looks like it does what I want.
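To make the "Maverick last" condition from the original question concrete, here is a rough sketch of the check itself. It assumes each identification is a dict with `id`, `category` and `current` fields (an assumption about the response shape, not something confirmed in this thread) and that a higher identification `id` means a more recent ID.

```python
def latest_id_is_maverick(identifications):
    """Return True if the most recent active ID on an observation is a maverick.

    Assumes dicts with "id", "category" and "current" keys; withdrawn IDs
    (current == False) are ignored. Field names are assumptions.
    """
    active = [ident for ident in identifications if ident.get("current", True)]
    if not active:
        return False
    # Higher identification id is taken to mean "added later".
    latest = max(active, key=lambda ident: ident["id"])
    return latest.get("category") == "maverick"


if __name__ == "__main__":
    maverick_last = [
        {"id": 1, "category": "improving", "current": True},
        {"id": 2, "category": "supporting", "current": True},
        {"id": 3, "category": "maverick", "current": True},
    ]
    print(latest_id_is_maverick(maverick_last))
```

A "Maverick first" observation, where the maverick ID has the lowest `id`, would return False under this check.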

The next step would be to do more what @jeanphilippeb was suggesting, to create a project of these and then maybe update it occasionally, the way it was done for the Pre-maverick project.

is that really necessary? that kind of process is not super efficient – neither comprehensive, nor fast. wouldn’t you rather get all the data in real-time?

in my opinion, it’s much easier for me to use my page to find disagreements in real-time than to wait until something maybe, eventually hits one of the pre-maverick projects.

(There is only one Pre-Maverick project.
You are thinking of the Phylogenetic projects?)

@gordonh, I agree with @pisum, as long as you can filter observations as you need.

I missed the trick with the filter disagreement=true
https://api.inaturalist.org/v1/identifications?category=maverick&disagreement=true

This is the only drawback. You need to go to another website and to copy an iNat URL that provides a subset of all target observations, then repeat with the next page to get another subset, and so on.

every workflow has pros and cons. i think a major benefit of using my workflow is that you get to see identifications by date they were made, as opposed to the submit or observed date of the observation.

to mitigate the drawback that you noted, i would just pre-open in new tabs however many sets you want to work on that day. you can also add per_page=200 to the filter criteria to get as many observations as possible in each set. so for example, if you wanted to do up to 1000 IDs that day, you could open up 5 pages worth of 200 observations in advance into Identify screens in other tabs, and then when you were done with those, you would know that you had potentially done up to 1000 IDs.
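The batch arithmetic above (1000 IDs at 200 per page is 5 pages) can be sketched like this. The `page` parameter and the `batch_urls` helper are assumptions for illustration; `per_page=200` and the other filters come from the posts above.

```python
import math
from urllib.parse import urlencode

# API endpoint as quoted in this thread.
API_URL = "https://api.inaturalist.org/v1/identifications"


def batch_urls(target_ids, per_page=200, base_url=API_URL):
    """Return one URL per page needed to cover target_ids identifications."""
    pages = math.ceil(target_ids / per_page)
    urls = []
    for page in range(1, pages + 1):
        params = {
            "category": "maverick",
            "disagreement": "true",
            "per_page": per_page,
            "page": page,  # assumed pagination parameter
        }
        urls.append(f"{base_url}?{urlencode(params)}")
    return urls


if __name__ == "__main__":
    for url in batch_urls(1000):
        print(url)
```

Each printed URL is one set to pre-open in its own tab.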


I can work with pisum’s system just fine, but I’m not just thinking about me. A project is visible within iNat itself and easily understood by anyone who visits the project home page. There is no need for advanced knowledge and manual URL construction.

It sounds like the project update algorithm makes it hard to keep a project continuously updated. How feasible is it to have an automatically running process that updates projects in real time?

That’s what I do for the project Placeholder backup, it is updated every 5 or 10 minutes (and we got an alert on the forum when nobody was running the software that updates the project). In this case, the trigger is new incoming observations (observations with ID higher than the highest ID previously checked by the software).

Beware that a project populated in ~real time has absolute priority over all the other projects I manage, and inevitably slows down populating them. So, I populate a project in ~real time only if it is technically mandatory. In the case of the project Placeholder backup, the constraint is to pick up the species_guess data field (containing the identification placeholder) before it is overwritten by the common name (in whatever language) of an identification added by the observer themselves. And I have no way to check if the species_guess content is a placeholder from the observer or a common name originating from an ID (ID possibly deleted meanwhile), so the only solution to get a true placeholder from the species_guess is to retrieve it in ~real time.

We don’t have such a constraint for populating a project related to Mavericks, or to Pre-Mavericks.

In the case of Mavericks, the trigger for populating a project in ~real time would be new incoming identifications. That’s possible: there is a filter id_above in the API for searching identifications, so software could repeatedly check the new incoming identifications, only those whose ID is higher than the highest ID it previously checked.
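A minimal sketch of that polling idea, assuming each result in the response carries an integer `id`. The helper names are hypothetical, and this is not the Windows application mentioned in this thread; only `id_above` and `category=maverick` come from the posts.

```python
from urllib.parse import urlencode

# API endpoint as quoted in this thread.
API_URL = "https://api.inaturalist.org/v1/identifications"


def next_poll_url(highest_seen_id):
    """URL for maverick identifications newer than the last one checked."""
    params = {"category": "maverick", "id_above": highest_seen_id}
    return f"{API_URL}?{urlencode(params)}"


def advance_watermark(highest_seen_id, results):
    """Return the new highest identification id after processing a batch."""
    return max([highest_seen_id] + [result["id"] for result in results])


if __name__ == "__main__":
    watermark = 100
    print(next_poll_url(watermark))
    # Pretend the request returned two new identifications.
    watermark = advance_watermark(watermark, [{"id": 105}, {"id": 101}])
    print(next_poll_url(watermark))
```

One caveat, as far as I understand it: an ID's category can change as later IDs arrive, so a poll like this would only catch IDs that are already mavericks at the time they are checked.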

At present, only 2 of us are running the software day and night to populate the projects I manage, so it’s not possible to add another ~real-time process to the software unless a 3rd or 4th person joins the effort to run it day and night. The software we run is already very busy and the workload (new observations in a day) is always increasing, but every person running the software is limited to 10,000 requests to the iNat API per day. The software is a Windows desktop application.
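For a rough sense of scale, the request budget can be worked out as below. The 10,000 requests per day and the 5 or 10 minute cadence come from the posts above; `requests_per_poll` is a made-up illustrative number, since the real cost of each check is not stated here.

```python
DAILY_REQUEST_LIMIT = 10_000   # per person running the software, per the post
SECONDS_PER_DAY = 24 * 60 * 60


def polls_per_day(interval_minutes):
    """Number of polling cycles that fit in one day at a given interval."""
    return SECONDS_PER_DAY // (interval_minutes * 60)


def budget_share(interval_minutes, requests_per_poll=1):
    """Fraction of one person's daily request limit used by the polling."""
    return polls_per_day(interval_minutes) * requests_per_poll / DAILY_REQUEST_LIMIT


if __name__ == "__main__":
    for minutes in (5, 10):
        # requests_per_poll=10 is a hypothetical figure for illustration.
        share = budget_share(minutes, requests_per_poll=10)
        print(f"every {minutes} min: {polls_per_day(minutes)} polls/day, {share:.1%} of budget")
```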


people have to know that a particular project exists, too. i don’t think they’re going to just stumble upon it by accident.

regarding manual URL construction, i actually think that’s a good skill for folks to develop, but even for those who can’t or won’t learn how to do it, someone else can help them construct an appropriate URL, and then they can just bookmark it for future use.

i do agree that the ideal solution would exist inside the system itself, and i do wish that iNat’s standard identification page had the same sort of functionality as exists in my page (so that my page would not be necessary). and i’ve argued in the past for a screen in the mobile app that would help folks find disagreements easily. but in the absence of those improvements, i think my identifications page is the next best thing. there are actually many things you can do with the page besides finding mavericks. so i think it’s a useful tool for power users to be aware of anyway.


Right, the project wouldn’t advertise itself, there would have to be some discussion of it on the forum, but at least it’s easy to promote with a simple link without having to explain anything more technical.

I don’t have capacity to implement this now, so I guess we’ll leave the idea for the moment. Maybe later I could imagine a cloud service of some sort running agents that do the actual work, but that would have to be financed and developed, so it’s not realistic in the short term, unless iNat provides an agent hosting service, which of course would be pretty far down their list of priorities, and there are many technical questions on how to make such a thing happen.

if you’re going to ask iNat to put effort into something that’s not one of the things i already mentioned, you should just ask them for a new observation filter that would be something like includes_maverick_id=true/false/disagreement/any.


I can implement it and give it to you (I already have the technical basis; it would “just” be a different project to populate, with different inputs and rules), and you would run it day and night on your PC.

And, once a day (or better, twice), you would have to get a new API token from https://www.inaturalist.org/users/api_token and give it to the software. This takes only 5 seconds, but you have to be there to do it, otherwise the software will stop populating the project(s) after 24 hours (and that wouldn’t be an issue, it would just delay the update of the project(s)).

So, it cannot be a service that you set up, launch and forget, unless someone can explain how to automate this step too, but I would then ask whether that is compliant with iNat’s policy.
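The 24-hour token constraint could be tracked with something like the following. It is only a sketch: it does not fetch a token (that stays a manual step at the URL above), and the one-hour safety `margin` and the function name are assumptions.

```python
from datetime import datetime, timedelta, timezone

TOKEN_LIFETIME = timedelta(hours=24)  # per the post: updates stop after 24 hours


def token_needs_refresh(obtained_at, now, margin=timedelta(hours=1)):
    """True once the token is within `margin` of the end of its lifetime."""
    return now - obtained_at >= TOKEN_LIFETIME - margin


if __name__ == "__main__":
    obtained = datetime(2024, 1, 1, 8, 0, tzinfo=timezone.utc)
    evening = datetime(2024, 1, 1, 20, 0, tzinfo=timezone.utc)
    next_morning = datetime(2024, 1, 2, 7, 30, tzinfo=timezone.utc)
    print(token_needs_refresh(obtained, evening))
    print(token_needs_refresh(obtained, next_morning))
```

A check like this could prompt the person running the software to paste in a fresh token before the old one lapses, which fits the "once a day (or better, twice)" routine.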


Thanks, right now I probably shouldn’t commit to keeping it running. However, if you decide to implement it, or something pretty close, let me know. I eventually would like to get into this type of work.

I went through pisum’s page with filters set to plants and then went through looking for any that were about taxa that I knew something about. I did find several observations where the disagreeing maverick was right and everyone else was wrong (in my judgment). I added some IDs there and others who had previously identified it reconsidered in a couple of cases. So, this effort is worthwhile. It’s also interesting in its own right to see what some of the difficulties are. There are some individuals who enter dozens of maverick IDs on their specialty areas. There are also people who do things like ID some other target organism than what everyone else is ID’ing–genuinely maverick behavior. And sometimes, it’s the original observer who “corrects” what they intended everyone to focus on, and they then become their own maverick.


I usually ID by Date Ascending - so I get ‘today’s batch together’.
There is a glitch where one person will have everything IDed to something weird and wrong all the way - fish. Or bats. Or snowflakes.

Because I recognise the glitch - I am sometimes that maverick who says - no - the beetle is here, this one is for the host plant.
Not a fish.

I’ve had similar experiences where, after seeing a Maverick ID, I went back and realized there was a deeper nuance in the identification that others hadn’t noticed.

It’s also interesting how, like you mentioned, Maverick behavior can come from different places – it could be an expert catching something subtle or even the original observer trying to correct their own work. I’ve definitely encountered those situations where people are adamant about their own observations, and in some cases, they end up being the ones to correct the record.

I think keeping an eye on Maverick IDs, even if it’s not always easy to track them systematically, is worthwhile. It’s also a good way to learn and stay sharp on the details of a given taxon. Have you found any patterns in the types of observations where Maverick IDs are more likely to show up? I wonder if certain types of data or species groups tend to have more “Maverick moments.”


there are the endless homonyms which end in Kingdom Disagreement which can be found among these 17 K.


@Thincomme, in your quotation of my post, I think you made a formatting error that makes it look like I said that last paragraph “I’m a developer, …”. Just want the record to be clear that those are not my words. Just to clarify, I have some dev skills, but I’m not a professional software developer, I’m a technical writer on software tools and so I write software documentation and code examples, but not software itself.

One pattern I observed involves taxon splits. All the organisms were at one time in one species, and after the split, existing iNat observations that really belong to two distinct species were not separated. Some identifiers then go through existing RG observations and mark them with the (hopefully) correct new taxon. This is almost always a Maverick ID and would remain so unless one or more other identifiers support the revision.