I have a pretty specific question about the computer vision (CV) system. I apologize if it’s been dealt with before, but I can’t find a precise answer, and there seem to be a lot of assumptions about the CV that aren’t true.
The genus Sphagnum (peat mosses) is found worldwide and has a lot of species and a LOT of iNat observations. There are a handful of Sphagnum experts who can reliably identify some species in the field, but the ONLY “official” (i.e. from keys and original descriptions) way to ID Sphagnum species is with microscopy, often with stained sections. In addition, habitat is very important for Sphagnum. For this reason, images of microscope slides and habitat shots are essential for good Sphagnum IDs, although nearly all iNat observations do begin with a field closeup, as for most other organisms.
I’ve seen it said that the CV gets confused by habitat shots and micro slides, but in this case providing good information for a human ID seems more important than pleasing the CV. This wouldn’t be a problem anyway if the CV only used the first image for training. I have noticed that the CV has gotten better at IDing Sphagnum over time, and there should be some snowball effect in the future, but it can only improve if the IDs provided are correct, and they can only be judged correct with those micro images.
Should I continue as I have done, adding the habitat and micro images, or is there some compromise that would be better? And are there other taxa that have a similar issue?
I don’t think that there’s an issue with a “habitat shot” as long as it includes the organism itself. So even if the moss on a rock/bark/ground is a small part of the picture, that’s fine. The CV may sometimes use background info to identify, even with shots that contain an organism reasonably close up - I see this with shots of trees/bark for which the CV suggests bark anoles (even though one isn’t present). Though I’ll note my brain does this too while walking around in anole habitat - I think “Oooh, that’s a good spot for an anole!” - maybe the CV is effectively doing the same thing! We don’t know exactly what “line of reasoning” the CV will pick up on in any given set of photos. Posting habitat shots that do not include the focal organism, however, has been discouraged by staff.
For microscopy, I have also heard that these could “confuse” the CV model training. However, I think that they do meet the criteria for inclusion on iNat, in that they contain the organism itself. To my mind, since they meet the guideline that photos must include the organism, and they have serious value, I would include them. Including them definitely helps identifiers and is an important learning route for other users. Training the CV is important, but not the end goal of iNat. But I’m open to hearing other thoughts on that.
Why would microscopy photos “confuse” the CV? I don’t think they will.
That’s just a technical issue that has already been solved. Where there are two or three photo sets for a particular species (a micro-photo set, a general-view set, a habitat set, etc.), the CV can be trained on each set separately, all pointing to the same Sphagnum.
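A toy sketch of the point above: a standard classifier has no trouble mapping several visually distinct photo sets to the same label, as long as each set is consistently labelled. The 2-D “feature vectors” and the nearest-neighbour rule here are made-up stand-ins for a real neural network, not iNat’s actual pipeline:

```python
# Toy illustration (not iNat's pipeline): one taxon label can cover
# several visually distinct "photo sets" (field shots, micrographs, ...).
from math import dist

# Hypothetical 2-D image features: two well-separated clusters,
# both labelled with the same taxon.
training = [
    ((0.1, 0.2), "Sphagnum"),    # field close-up set
    ((0.2, 0.1), "Sphagnum"),
    ((9.0, 9.1), "Sphagnum"),    # micrograph set
    ((9.2, 8.9), "Sphagnum"),
    ((5.0, 0.1), "Polytrichum"),
]

def classify(x):
    # 1-nearest-neighbour: return the label of the closest training point
    return min(training, key=lambda t: dist(x, t[0]))[1]

print(classify((0.15, 0.15)))  # field-style photo -> Sphagnum
print(classify((9.1, 9.0)))    # micrograph-style photo -> Sphagnum
```

Either cluster resolves to the same taxon, which is the sense in which multiple photo sets per species are “just a technical issue”.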
We have this situation everywhere. For example, consider what moose photos can look like:
To be honest, it’s not an easy task to create and train a computer vision system (meaning a neural network) to deal with many photo sets pointing to the same result, especially for sets this big. But that’s just a technical issue… so let’s give the iNaturalist data science crew an interesting task?
What would really be awesome, @janetwright, is if we accumulated enough microscopy-verified IDs of Sphagnum and other organisms like this that the computer vision could reliably pick up on the subtle differences of color/texture in the macroscopic images that are difficult to put into human-readable words, but probably do occur between species. Akin to when the experts ID something “by gestalt”.
I don’t know if this is achievable, but it would be cool.
It seems theoretically possible, @ddenism! I have wondered whether, if we trained the CV on field photos of Sphagnum that had been identified microscopically behind the scenes, it could get better than a human at identifying the field photos. We can dream!
I think it is unfortunate that habitat photos are discouraged. As well as being useful for identification, it is simply very interesting to see where things live. And saying habitat photos are OK so long as they include the organism doesn’t really work if you caught an invertebrate with a pond net.
Is a photo really necessary for this? The host and habitat can easily be recorded in the description and/or observation fields, which would seem sufficient in most cases. It’s also possible to zoom in on the map in satellite view to see the surrounding environment (assuming the location isn’t obscured). The Details popup also provides a link to a full Google Map (which may offer a 360° ground-level view), and Macrostrat (which shows the local geology).
My understanding is that CV training is not limited to the first image from each observation. For a particular taxon, CV uses up to 1,000 images that I believe are randomly selected from RG observations of that taxon. So if you include microscope slides or habitat shots, those should have an equal chance of being part of the training material.
When it comes to providing CV suggestions, in some circumstances these are based only on the first photo in the observation (e.g. most places where suggestions are offered in the web interface). At other times, CV can be asked to suggest IDs for a particular image (e.g. in the Android app).
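For what it’s worth, the selection scheme described above (a uniform random draw of up to 1,000 photos per taxon) could be sketched like this; both the cap and the purely random rule are taken from this post, not from iNat’s actual code, and the following post disputes the RG requirement:

```python
import random

def sample_training_photos(photos, cap=1000, seed=None):
    """Pick training photos for one taxon: a uniform random draw of up
    to `cap` photos. The 1,000 cap and the uniform rule are the figures
    quoted in the post above, not confirmed implementation details."""
    rng = random.Random(seed)
    if len(photos) <= cap:
        return list(photos)
    return rng.sample(photos, cap)

# Under a uniform draw, a micrograph has exactly the same chance of
# being selected as any field close-up.
photos = [f"field_{i}" for i in range(4000)] + [f"micro_{i}" for i in range(1000)]
picked = sample_training_photos(photos, seed=42)
print(len(picked))  # 1000
```

The point of the sketch: if selection really is uniform, including micro slides and habitat shots gives them an equal chance of entering the training set.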
That’s not the way it has worked in the past. Research Grade is not a requirement for training. Two IDs are required for a Community ID, but that’s not a requirement for training either. Even Casual (not wild) observations are eligible for training. At least this is the way it used to be, I don’t know if it has changed recently.
Conversely, it would be really helpful if the observer had simply included a habitat photo, rather than hoping the observer was able to describe the host and habitat and actually did so, or trying to work out the habitat from a blurred aerial photo or Google Map (which requires the observer to have put the marker in just the right place, and the imagery to have been taken at the same time of year as the observation).
The ability to add tags and annotations to individual photographs might be a good way to improve CV and add useful information for people. For example a photo that has a “habitat” tag could be excluded from the CV training model, or one that is tagged “micrograph” could be placed in a separate model run (assuming there were enough to make that worthwhile).
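The tag-based routing suggested above could be sketched as below. The “habitat” and “micrograph” tag names, and the routing itself, are hypothetical; iNat has no such per-photo mechanism today:

```python
def split_for_training(photos):
    """Route each photo by a hypothetical per-photo tag: plain photos
    train the main model, micrographs could feed a separate model run,
    and habitat-tagged shots are excluded from training (while still
    remaining visible to human identifiers)."""
    main, micro, excluded = [], [], []
    for photo in photos:
        tags = photo.get("tags", ())
        if "habitat" in tags:
            excluded.append(photo)
        elif "micrograph" in tags:
            micro.append(photo)
        else:
            main.append(photo)
    return main, micro, excluded

obs_photos = [
    {"id": 1, "tags": ()},               # field close-up
    {"id": 2, "tags": ("micrograph",)},  # stained section
    {"id": 3, "tags": ("habitat",)},     # bog-wide shot
]
main, micro, excluded = split_for_training(obs_photos)
print([p["id"] for p in main], [p["id"] for p in micro], [p["id"] for p in excluded])
# [1] [2] [3]
```

This would let observers keep uploading everything useful to humans while giving the training pipeline a clean way to route or skip non-standard photos.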
I am pretty sure this has been the topic of previous discussion and maybe a previous feature request, although I did not find it in a quick search.
And for what it is worth, I am in the camp of thinking it is better to provide photos that give useful information to human viewers and not to worry about how it will affect the CV. I like to think that iNat is still the dog and the CV is just the tail.
Ohh, non-RG pictures being considered by the CV could explain a mysterious CV experience I had a few days ago. I’ve been taking a lot of lichen pictures the past few months (barely any are RG though). And with foliose lichens I often include a macro closeup of a tiny piece of a lobe upside down on my hand like this: https://www.inaturalist.org/photos/257806483
Using the Android app to ID each picture, those shots would always be identified as random things like insects or plants, and I didn’t think much about it. Then a few days ago I noticed that it suddenly, correctly identified a small speck of black on my hand as “Phaeophyscia adiastola”, and I couldn’t really explain it, since there isn’t a way to identify it from that (all the photo shows is that the underside is black, making it a Phaeophyscia instead of a Physcia). I had several explanations: maybe it just re-learned the RG pictures in a different way, or now takes location or user into account differently. But if it took in some of my non-RG pictures, I may have inadvertently taught it that a speck of black on a hand is Phaeophyscia adiastola.
I still don’t see why an observation photo is needed for this (which would be just as subject to the hypothetical issues you describe as any other source of information). iNat observations should always record one taxon, so adding photos of additional taxa (such as the surrounding vegetation) runs contrary to the guidance.
The question here isn’t about whether information about the host/habitat can be helpful - it’s about where such information should ideally go. iNat already makes some provision for this (e.g. the observation fields and description), but if some people feel that’s not enough, it would be best to try to improve those rather than encouraging users to risk diluting the effectiveness of the observation photos (from a CV point of view).
PS: another problem with using observation photos to record host/habitat is that the information won’t then form part of the metadata, so it cannot be looked up or used in a search filter. If the host/habitat is worth recording, it should be done explicitly, using a field dedicated to that purpose.