So one option would be to comment placeholder text along with adding a broad taxonomic ID.
I’m an admin for a project collecting angiosperms, and we have a corresponding collection project collecting unknowns in the target countries. I’ve written a macro program which can add the placeholder ID as a comment and then look at the visually similar suggestions for the observation, pulling down the top 10. It then applies taxonomic filters (‘Animalia’, ‘Fungi’, ‘Plantae’, ‘Angiospermae’) in turn and pulls down the top 10 visually similar suggestions for each in a loop. If all of the top 10 under a filter match the top 10 with no taxonomic filter, it assumes that filter is correct. E.g. if the animal results match, then the observation is an animal; if the plant results match, it checks angiosperms, and if those also match then it is an angiosperm, if not then it is a plant. It can then add that ID (the broad taxonomic filter which was correct) with a comment saying it is a broad taxonomic ID based on iNaturalist suggestions.
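To make the matching rule concrete, here is a minimal sketch of the decision logic only. It does not touch iNaturalist at all: `fetch_top` is a hypothetical callable standing in for however the top 10 suggestions are obtained, and comparing the results as sets (ignoring order) is an assumption about what "match" means.

```python
from typing import Callable, Optional, Sequence

# Hypothetical stand-in: given a taxon filter (or None for no filter),
# return the names of the top visually similar suggestions.
FetchTop = Callable[[Optional[str]], Sequence[str]]


def infer_broad_taxon(fetch_top: FetchTop) -> Optional[str]:
    """Return the broadest filter whose results match the unfiltered ones."""
    baseline = set(fetch_top(None))
    if not baseline:
        return None  # nothing to compare against

    def matches(taxon_filter: str) -> bool:
        # Assumption: "match" means the same set of suggestions, in any order.
        return set(fetch_top(taxon_filter)) == baseline

    for kingdom in ("Animalia", "Fungi", "Plantae"):
        if matches(kingdom):
            # Only plants get the extra, narrower angiosperm check.
            if kingdom == "Plantae" and matches("Angiospermae"):
                return "Angiospermae"
            return kingdom
    return None  # no filter reproduced the unfiltered suggestions
```

Returning `None` when nothing matches leaves the observation alone, which seems the safer default for mixed or ambiguous photos.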
Philosophically this feels alright to me - we’re retaining placeholder ID as a comment and are increasing the likelihood that a user gets their observation ID’d by increasing the chance that someone will see it. It’s not ID by AI as much as increased chance of human ID thanks to AI.
Concerns about this approach:
If multiple taxa are present in a photo it doesn’t consider which the user may want ID’d (but then the observer can always add their own ID e.g. to plant if I’ve ID’d it as an animal).
If there are issues with the observation (e.g. observer has submitted multiple photos of completely different organisms), it would rely on this having been detected by different taxonomic suggestions. Suggested fix: only suggest IDs for observations lacking any comments (under the assumption that a kind user will have commented asking the observer to delete the observation and repost as multiple separate observations).
I should stress that I have not run this at all yet, and will not do so until I have some feedback from the community as to (a) whether you think this is a good idea, and (b) whether there are any further checks you’d like to see implemented before it is run. When I do run it, I would be testing on older observations in the unknown collection project I administer.
Welcome to the forum @matthewlewis896! The concerns you listed about
…wouldn’t worry me so much if the suggestion farther above could be implemented to make it…
This addresses both your concerns, and any concerns about potential errors even on high-level IDs. If the automated suggestion is wrong or inappropriate on a problematic observation, at least this will be more likely to be discovered and corrected by a human identifier. If that then retracts the automatic ID at the same time, the observation is on its way through the ID process like any other newly non-Unknown, and the auto-ID is prevented from contributing to any subsequent “State of Matter Life” limbo.
Am I correct that you would be doing this via the iNat API? If so, I would encourage you to contact help@inaturalist.org first and ask them to make sure that what you have in mind is a permissible API use. One question I would have is, to whom does an automated ID get attributed? In your scenario I would have to assume to your own user account?
We should probably also continue discussion on what an appropriate “age” would be before an Unknown observation would be subject to an automated ID. There has been a suggestion to measure against date last updated instead of date posted, so that if there is any ongoing exchange of comments or other activity on the Unknown, that wouldn’t get interrupted by an automated ID.
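As a rough illustration of the "date last updated" idea, combined with the earlier suggestion to skip observations that already have comments, an eligibility check could look like the sketch below. The function name, its parameters, and the 30-day default are all placeholders for illustration; no threshold has been agreed here.

```python
from datetime import datetime, timedelta, timezone


def is_eligible(
    last_updated: datetime,
    comment_count: int,
    now: datetime,
    min_idle: timedelta = timedelta(days=30),  # placeholder, not an agreed value
) -> bool:
    """True if an Unknown has been idle long enough and has no comments."""
    # Measuring from the last update (not the posting date) avoids
    # interrupting an ongoing exchange on the observation.
    idle_long_enough = (now - last_updated) >= min_idle
    return comment_count == 0 and idle_long_enough
```

For example, `is_eligible(datetime(2023, 12, 1, tzinfo=timezone.utc), 0, datetime(2024, 1, 31, tzinfo=timezone.utc))` passes the check, while the same observation with any comments on it does not.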
As for taxonomic level, my opinion would be to not go any lower than Phylum, to maximize likelihood of a correct ID. Subphylum might work well enough in the case of Angiosperms, but maybe not so well across other Kingdoms and Phyla.
i don’t disagree that what you’re suggesting might be useful in some cases on a very small scale, but i think on a larger scale, it probably needs to be done by changing the system itself rather than as a third party side project. that’s not to say that you couldn’t offer your services to change the core source code, since it is open source after all, i believe.
i am curious about your implementation. i knew that there was an API endpoint for computer vision suggestions, but i didn’t think that regular guys like us could access that outside of the standard iNat UIs. what are you doing to get visually similar suggestions?
i think the way i would approach updates is to provide a little button to manually trigger a refresh of a given observation’s CV ID. i think i would just stamp the CV ID with a date and maybe a CV version number so that someone could see if it was recent or not. i would bet that background mass updates of the CV IDs would not provide a lot of additional value.
so that the bot’s ID is automatically retracted when a real person IDs the observation
This is possible - it’s a bit heavier computationally and so would rely on periodic checks as to whether observations have had updated IDs. Do you think it’s better to keep the initial ID if the downstream ID agrees (e.g. ID to animal → downstream ID is gecko) or to blanket remove it whenever any new ID is submitted?
I’m also unsure that your suggestion counters this problem:
If an observation has multiple organisms and another user comments instead of IDing then the bot ID would still stand, which would be problematic. Unless bot ID is removed when any new comments are added, which could work, but would be a little trigger happy.
Actually not - looking at the API reference I couldn’t see capacity to access visually similar suggestions on the API, so I carried on without it. Let me know if you know of this functionality existing! The program essentially mimics how a person would access iNaturalist. It goes through the front door and clicks on things. It’s slow and inefficient compared with using the API, but it works.
Currently - yes. Am thinking of dissociating the two and creating a separate account though. The program will work with any account signed in on a browser and attribute IDs to that account.
So I agree with you on this, I think going beneath subphylum is a bad idea, and generally phylum/kingdom is a good bet. Going to test extensively how well we can do subphylum for angiosperms and see how it goes.
Completely agree with you here! I don’t intend this to be broad scale - I don’t have the time or free computers for one thing. Just gauging response to me trying this for my project
Unfortunately not possible with my approach - but agree it would be useful for future approaches. Best I can do is add a comment alongside the ID giving the date the CV ID was added.
The intent is to have any automated ID carry zero weight in the eventual community ID, since it did not come from an actual member of the community. Its sole purpose is to get the observation out of the Unknown pile and in front of the community, to encourage human IDs. Once that happens its purpose has been fulfilled, but not before then, so it should remain when other things like comments are added, otherwise it would revert back to the Unknown pile.
To be clear, if a human adds an “automated ID” (Computer Vision suggestion), that would count as part of the community ID and wouldn’t be removed. By automated ID I mean one added directly by the system (or a system) based on Computer Vision.
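The retraction rule described above can be sketched as a small predicate: the system-added ID goes away once a human ID exists, and comments alone never trigger it. The `Event` type and its fields are hypothetical, purely for illustration, and a human picking a Computer Vision suggestion counts as a human ID here.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class Event:
    """Hypothetical record of activity on an observation."""
    kind: str              # "comment" or "identification"
    by_bot: bool = False   # True only for IDs added directly by the system


def should_retract_bot_id(events: Iterable[Event]) -> bool:
    """Retract the automated ID once any human identification exists."""
    # Comments are ignored on purpose: retracting on a comment would send
    # the observation straight back to the Unknown pile.
    return any(e.kind == "identification" and not e.by_bot for e in events)
```

This is the blanket version of the rule. Keeping the automated ID when the human ID agrees with it would need taxonomic ancestry information, which this sketch leaves out.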
@matthewlewis896 Thank you for asking the community first, it is much appreciated. However, iNat’s ToS explicitly forbid the use of machine generated content, which this would fall under, so please don’t run this macro.