City Nature Challenge - issues & suggestions for improvement

Is there currently a way to easily flag a user who spams “different” observations of the same organism, or of every bird in a flock? I cannot flag a duplicate, and no one can do anything with duplicates, not even curators. I recently asked one user to merge two observations of an identical scene and he stopped answering. What should I do, manually track down all of his duplicates so I can flag the account with a list of them? What if there are a thousand users like this? If there is no anti-duplicate and anti-spam policy, there should be no incentive to push observations to RG. It’s tedious enough to exclude “power spam users” from the Identify page as it is; if they get applauded for pushing duplicates to RG, iNat would be better removed from all serious datasets.

Duplicates are among the problems least regulated by the iNat guidelines, as the CNC cheating clearly shows.

1 Like

Very true. I, of course, have no idea how much trouble the relevant programming would be, which makes it easy for me to suggest anything.

I fought for the DQA for Single Subject.

There are open requests for duplicate handling, going back years.
You can make a feature request. A checksum-based check has been suggested.

https://forum.inaturalist.org/t/curator-guide-policy-on-duplicate-flags-lets-change-it/62066

https://forum.inaturalist.org/t/duplicate-prevention-notify-observers-if-their-image-checksums-match-others-on-the-site/258
That one dates from Feb 2019 and has 103 votes. We Have A Problem: we have an issue with duplicates.
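For what it’s worth, the checksum idea in that request boils down to something like this: hash each incoming image file and warn the uploader when an identical file already exists on the site. Here is a minimal Python sketch, assuming a local folder of photos stands in for the uploads and an in-memory dictionary stands in for the site’s photo store (both are placeholders, not anything iNat actually has):

```python
import hashlib
from pathlib import Path

def file_checksum(path: Path) -> str:
    """MD5 digest of a file's bytes; identical files share a digest."""
    return hashlib.md5(path.read_bytes()).hexdigest()

# Placeholder for "photos already on the site": checksum -> first file seen.
seen: dict[str, Path] = {}

for photo in sorted(Path("photos_to_upload").glob("*.jpg")):
    digest = file_checksum(photo)
    if digest in seen:
        # Byte-for-byte duplicate of an earlier upload: warn instead of accepting.
        print(f"{photo.name} is identical to {seen[digest].name}")
    else:
        seen[digest] = photo
```

The obvious limitation, which comes up again later in this thread, is that a checksum only catches byte-identical files, not different photos of the same organism.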

Some identifiers are happy to ID duplicates. Not me. A new Duplicates DQA that pushed them to Casual would work for me. iNat is too big to keep saying that this, and that, and that, are not a problem.

4 Likes

I don’t spend much time on the forum, so it is good to know I’m not the only one struggling with some of the issues mentioned in this thread. As an active Identifier, I agree with @mftasp that the competitions (CNC and GSB in my case) do reward bad behavior and add a lot of ‘noise’ to the data - my pet peeve being duplicates.

My thoughts on three of the issues raised above:

Duplicates: My understanding is that once the URL of the original observation is added to the ‘Duplicate Observation’ field of a duplicate, that duplicate will not be uploaded to GBIF. If that is the case, would it not be simple to include a feature that automatically marks an observation as ‘Casual’ whenever a URL is added to the ‘Duplicate Observation’ field? It is tedious, but I plod away at it over time.

Falsified dates: Changing the ‘Observed Date’ to fit within a competition’s time window. I’m not sure how to explain this clearly, as I’m not very clued up with coding (or language!), but if iNat automatically picks up the observed date from the EXIF data, would it not be possible to offer the EXIF date as an option when exporting observations? That date could then be compared against the ‘Observation Date’ in Excel for a quick look at the prevalence of falsified dates (a rough sketch of the idea follows below these points). One would still have to go and mark the date as inaccurate under the DQA for each one, but it would at least give an idea of who the culprits are, which could then be addressed by the organizers/mentors of the particular project in which the user participates.

Cultivated plants: I do think they should be excluded from the tallies, though I realise there are good reasons on both sides of this argument.
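To make the falsified-dates idea above concrete: assuming an export that did include both dates (the column names observed_on, exif_date and user_login below are assumptions, and the current export does not actually offer an EXIF date column), the Excel comparison could also be done in a few lines of Python:

```python
import pandas as pd

# Hypothetical export: 'observed_on' is the date shown on iNat,
# 'exif_date' is the date embedded in the photo (not exportable today).
obs = pd.read_csv("cnc_export.csv", parse_dates=["observed_on", "exif_date"])

# Rows where the claimed observation date differs from the photo's EXIF date.
mismatch = obs[obs["observed_on"].dt.date != obs["exif_date"].dt.date]

# A per-user tally gives a quick idea of who the likely culprits are.
print(mismatch.groupby("user_login").size().sort_values(ascending=False))
```

A mismatch would only point to users worth checking, not prove cheating, for the camera-clock reasons raised below.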

7 Likes

Wrong dates: Finding them automatically seems like a good idea, but it should be done with an awareness that a camera whose date was never set properly can assign wrong dates to photos. (This happened to me when I borrowed a camera and didn’t realize the date hadn’t been set.) Incongruous dates don’t always indicate intentional cheating.

4 Likes

Yeah, if I recall correctly, I have at least one season’s worth of pre-iNat photos where the file dates are all wonky, because my system of date organization was to offload the photos into a date-labeled folder on my computer every night, and I just hadn’t bothered to fix the camera date/time.

I assume cheaters would be uploading photos with widely varying dates, while honest people with funky camera settings would be uploading photos with sequential date-times within a two-day range, even if it’s the wrong two days. So I doubt it can be totally automated, but it should be pretty easy to detect the difference, unless a cheater has a batch of saved-up photos from another two-day period that they use.
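A rough sketch of that heuristic, assuming the per-user photo timestamps have already been pulled out of the EXIF data (the two-day threshold simply mirrors the CNC window mentioned above):

```python
from datetime import datetime, timedelta

def classify_timestamps(timestamps: list[datetime]) -> str:
    """Crude triage of one user's photo timestamps."""
    span = max(timestamps) - min(timestamps)
    if span <= timedelta(days=2):
        # Everything fits in a two-day window; if that window sits outside the
        # event dates, a never-set camera clock is the likely explanation.
        return "sequential: possibly just a wrong camera clock"
    # Dates scattered over weeks or months suggest saved-up or backdated photos.
    return "widely varying: worth a manual look"

sample = [datetime(2023, 3, 14, 9, 5), datetime(2023, 3, 14, 11, 20),
          datetime(2023, 3, 15, 16, 45)]
print(classify_timestamps(sample))  # sequential: possibly just a wrong camera clock
```

And as noted, a cheater sitting on a batch from a single other two-day period would still slip through.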

4 Likes

That is a workaround, but not in iNat guidelines.
I no longer use that - I leave a comment instead. That is also more obvious to future identifiers.
A suitable DQA would be simple and effective.

So many smart and thoughtful comments…

Observation/report: I was fortunate enough to be able to attend a CNC event in the Presidio with my son, visiting down from Canada, and I came away very impressed at just how well organized it was. There were about twenty or so attendees, some from the neighborhood, some from other parts of the city, lots of kids (all ages), and a group of women who came up from the Mission representing some organization like “Latino Naturalist Community”–I forget the exact name–plus a group leader, a couple of interns, a couple of native plant people, and the odd entomologist or two. It was a fun, slow saunter along the old Sutro choo-choo train route. I think everyone had a good time and everyone learned something (I found out about marsh flies). Not a shred of useful scientific data was collected, but lots of qualitative value was added. Of course, cheating is bad bad bad, but at the end of the day the real value resides in enabling community–nature connections, especially in an urban context. iNat is a wonderful tool with its open, participatory structure and invitation to further exploration. I hope the CNC keeps on going…

Another point, hardly worth mentioning but not raised so far, is that the overall CNC concept/movement has been a tremendous success! If the initial challenge was “How do we talk about nature/biodiversity in an urban context?”, that challenge has been overcome and is indeed taken for granted at this point. This was not self-evident even a decade ago.

As I see it, the bioblitz-type approach–go forth and find as much diversity as you can in any direction–was always something of a “conceit,” a way of thinking about local biodiversity, rather than something intended to provide much actual useful data, even if it were possible to weed out all the cheaters, ban casual observations, set up disciplinary tribunals, and expel the Paraguayans, among some of the remedies suggested. Rather than thinking about the contest as “which city has the most biodiversity?” (SF beats Wichita hands down!), another way to think about the “challenge” would be as a “frame,” or interactive lens, for reflecting on the connections between diverse communities of observers and local biodiversity, with real-time natural-history data gathered along nodes of site-specific or neighborhood-specific community–nature interaction. The iNat framework is somewhat unorthodox in that it foregrounds the role of the observer (rather than trying to eliminate it, as traditional approaches do), but the idea–as far as I understand it–is to conjoin resilient nature with resilient communities through feedback loops in ongoing processes of regeneration and co-evolution. Without getting too theoretical, rhetorical or (heaven forbid!) eschatological, that convergence seems to me the great task of the Anthropocene era, for which actively engaged communities of observers will be an essential component of achieving any sort of reversal of the downward trend in biodiversity–even if some cheating is to be expected.

3 Likes

What about punting the CNC over to Seek and turning off iNaturalist auto-uploads?

6 Likes

The country of Paraguay has not been mentioned anywhere that I have seen. Could you point me to where this was discussed?

5 Likes

I’d like to see it from both the local and the global organizers. The global organizers also have not participated at all in the clean-up efforts; at a minimum it’d be nice to see them spending some time in the trenches with the rest of us. (I checked: among the 9 people listed on the CNC page, they have made a total of two copyright flags. Ever. Both by the same person, in 2018 and 2019 respectively.)

Your number is a bit too kind to them - they currently have over 1100 observations with copyright flags - and I’ve barely even had time to look at that project, so there’s probably thousands more that still need flagging.

And that doesn’t even count the ones that don’t make it into the project - just now I flagged 6 copyrighted photos, then checked the user’s observations and found 95 more that were also stolen, but didn’t make it into the project because the user forgot to add a date to them.

I have encountered literally thousands of these this CNC… there’s one user in particular with over 3k uploads and I swear every single one is copyright infringement but I just cannot find where any of them come from.

And in the time it can take to track down a single photo, the user can already have uploaded 70 more.

Even a lot of the blurry houseplant photos are actually downloaded from facebook groups, which is bizarre!

I agree it is a very small number. Plus I have run across at least 5 or 6 accounts that didn’t get suspended last year despite a bunch of copyright violations, and then reactivated this CNC to upload MORE copyright violations! Not exactly the kind of user persistence we’re after!

Has it, though?

How would this be quantified?

Those I know who care about nature at all have never struggled to understand it in an urban context. And those who don’t care are the ones uploading a million stolen photos and AI-generated images - they are certainly not learning anything about nature in the process.

7 Likes

@graysquirrel Sadly, it appears new users upload photos that are copyright infringements at a rate of over 1% of all observations by new users. I imagine this number holds for the CNC and is probably higher given its competitive nature. Really sad, because the flagging process is time-consuming, and cheaters hurt dedicated observers who care about contributing to science, especially during the CNC.

4 Likes

I agree that this issue needs to be approached with care, and not with a punitive mindset. While I don’t ordinarily look for observations with wrong dates, the ones I did find were very clearly an attempt to push numbers, as evidenced by the observer’s inordinate number of duplicates!

3 Likes

The main reason I use the ‘Duplicate Observation’ field is to keep the observation out of GBIF, even if it does get to RG. My thought is that it would be useful if these same observations were bumped to ‘Casual’, since I usually notice when a supporting ID doesn’t take an observation to RG. I found one the other day where an observer had accidentally clicked one of the DQA fields.

1 Like

Checksums are useless against blatant spamming or duplicating. What I described was different photos of one organism, or 7 photos of bullfinches (which fly in flocks) from one place at one time. The latter is not against the rules, btw, but it should at least be mentioned as an undesirable practice.

Of course I already voted on those requests; my vote was the 102nd. The features are not here now, though, and as far as I know there are no anti-spam rules in the guidelines. So the CNC should not have an incentive to push X% of observations to RG; it just would not work as desired. Maybe it would if new users were blocked from IDing altogether during the CNC.

As was mentioned in those topics, a DQA is not a good solution, as different people could mark different observations, rendering all the duplicates Casual, including the original. And the user can just upvote their observations back and continue the behaviour. I think it should be a separate flag, reviewed once the user has at least 3 of them, for example. Next to “mark as inappropriate” there could be “excessive observations of the same species at the same time”.

It’s good to remember annotations, though, as some people create 2-3 separate observations for a male, a female and a juvenile, and that is valuable when browsing.

1 Like

This is not the case. Observation fields should have no effect on whether an observation is added to GBIF.

3 Likes

I wonder if it was confusion about La Paz (Bolivia).

1 Like

I’ve just come across an observer who has uploaded a bunch of photos which I am sure are not their own, but I don’t know how to prove it or what to do. The photos are all taken during the day, but the observation times start just after 10:00 PM for the first and run to 10:50 PM for the last. So they created 100 observations at a rate of 2 per minute, using what I am sure are photos taken by other CNC participant/s.

As an identifier, how should these observations be treated? They are nearly all casual anyway as they are mainly garden plants. But it is frustrating to see this sort of blatant behaviour.
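For anyone wanting to triage a case like this in bulk rather than eyeballing it, here is a rough sketch of the two signals described above, the creation rate and the late-night timestamps on daytime photos. The CSV file and its column names are assumptions about what an export of that user’s observations might contain:

```python
import pandas as pd

# Hypothetical export of the suspect user's observations.
obs = pd.read_csv("suspect_user.csv", parse_dates=["time_observed_at"])
obs = obs.sort_values("time_observed_at")

# Signal 1: how fast the observations were logged. Here the recorded times
# appear to track upload time, so 100 observations in ~50 minutes is ~2/minute.
span_min = (obs["time_observed_at"].iloc[-1]
            - obs["time_observed_at"].iloc[0]).total_seconds() / 60
print(f"{len(obs)} observations spanning {span_min:.0f} minutes")

# Signal 2: daytime photos carrying timestamps late at night.
after_ten = obs[obs["time_observed_at"].dt.hour >= 22]
print(f"{len(after_ten)} observations timestamped after 10 PM")
```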

If you want to private message me their username, I’d be happy to take a look. You can also always flag one of the observations using the “other” category and write something like “suspect copyright infringement by this user but not sure, can someone help take a look?” - there are quite a few curators who are very good at tracking those things down.

3 Likes