Estimating species populations from number of users

i think this is only partly true. some users record way more observations than others, some users record more tiny things than others, some users record more arthropods than others, etc.

i would think you would get a better relative measure of ticks by comparing the number of tick observations against the number of all observations at a given time and place.

you could do this comparison at a county/parish level, since iNat’s “standard” places go to that level. (they also loaded town-level places, but only in certain states in the US.) GET /observations (observation counts) and GET /observations/observers (observer counts) could both be filtered using place.
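here's a minimal sketch of that place-level comparison in Python. assumptions on my part: the v1 API base URL, the `taxon_id` / `place_id` / `d1` / `d2` filter parameters, reading the count from `total_results`, and using `per_page=0` to skip the records themselves. the ids in the usage note are placeholders, not real tick or county ids.

```python
import json
import urllib.parse
import urllib.request

API = "https://api.inaturalist.org/v1"  # assumed base URL


def build_url(endpoint, place_id, taxon_id=None, d1=None, d2=None):
    """Build a count-only query URL for one place (and optionally one taxon)."""
    params = {"place_id": place_id, "per_page": 0}  # per_page=0: counts only (assumption)
    if taxon_id is not None:
        params["taxon_id"] = taxon_id
    if d1 is not None:
        params["d1"] = d1  # start of observed-on date range
    if d2 is not None:
        params["d2"] = d2  # end of observed-on date range
    return f"{API}/{endpoint}?{urllib.parse.urlencode(params)}"


def total_results(url):
    """Fetch a URL and return its total_results count (makes a network call)."""
    with urllib.request.urlopen(url) as resp:
        return json.load(resp)["total_results"]


def share(part, whole):
    """Fraction of observations that are ticks; 0.0 when the place has no observations."""
    return part / whole if whole else 0.0
```

usage would look something like `share(total_results(build_url("observations", PLACE, taxon_id=TICKS)), total_results(build_url("observations", PLACE)))`, where `PLACE` and `TICKS` are ids you look up yourself. swapping `"observations"` for `"observations/observers"` should give you the observer-based version of the same ratio.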

an unusual alternative could be to use UTFGrids (GET /grid/{zoom}/{x}/{y}.grid.json) to get observation counts within an approximate grid. the downsides with this approach are that the grid is not totally uniform in coverage, and that you can get only observation counts (not observer counts). here’s an example of that UTFGrid approach: https://forum.inaturalist.org/t/looking-for-inaturalist-observation-map-visualisation-suggestions/7322/22. (EDIT: i’m thinking about this more, and rather than UTFGrids, it might be better to get the actual coordinates from the iNat export, the GBIF export, or the AWS Open Data set, depending on what you’d like to do with the data. then aggregate / cluster the data yourself.)
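if you go the aggregate-it-yourself route, one simple option is to snap each coordinate to a lat/lon grid cell and compare tick counts against all observations per cell. this is just a sketch: the function names and the 0.5 degree cell size are made up for illustration, and it assumes you've already pulled (latitude, longitude) pairs out of whichever export you use.

```python
import math
from collections import Counter


def grid_cell(lat, lon, cell_deg=0.5):
    """Snap a coordinate to the index of the grid cell that contains it."""
    return (math.floor(lat / cell_deg), math.floor(lon / cell_deg))


def count_by_cell(points, cell_deg=0.5):
    """Count (lat, lon) points per grid cell."""
    return Counter(grid_cell(lat, lon, cell_deg) for lat, lon in points)


def tick_share_by_cell(tick_points, all_points, cell_deg=0.5, min_total=1):
    """Share of observations in each cell that are ticks.

    Cells with fewer than min_total observations overall are skipped,
    since a ratio built on a handful of records is mostly noise.
    """
    ticks = count_by_cell(tick_points, cell_deg)
    totals = count_by_cell(all_points, cell_deg)
    return {cell: ticks[cell] / n for cell, n in totals.items() if n >= min_total}
```

note that cells defined in degrees are not equal in area (they get narrower toward the poles), so for anything beyond a rough map you might prefer a projected or equal-area grid.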

generally the location of an observation should be where the organism was observed. however, this could be especially tricky for a subject like ticks because there could be a lot of cases where, say, someone observed the tick back at home after hiking all day at a large park. i don’t know how you resolve the difference in this kind of data, except to assume that the location will generally represent the original source/home of the tick, not the home of the observer. there are also cases where the coordinates might be obscured or have large positional error – so you may or may not want to deal with that.
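if you do decide to deal with obscured or imprecise coordinates, a simple filter before aggregating might look like the sketch below. the keys `coordinates_obscured` and `positional_accuracy` are my assumption about what the export columns are called (check your actual file), and the 1000 m cutoff is arbitrary.

```python
def usable(record, max_error_m=1000, keep_unknown=False):
    """Decide whether a record's coordinates are precise enough to aggregate.

    record is a dict for one observation. The key names are assumptions
    about the export format and may need adjusting.
    """
    if record.get("coordinates_obscured"):
        return False  # obscured coordinates are randomized within a larger area
    accuracy = record.get("positional_accuracy")
    if accuracy is None:
        return keep_unknown  # no stated accuracy: caller decides whether to trust it
    return accuracy <= max_error_m


def filter_usable(records, **kwargs):
    """Keep only records that pass the usable() check."""
    return [r for r in records if usable(r, **kwargs)]
```

this does nothing about the "found the tick at home" problem, since those records can have perfectly precise coordinates for the wrong place.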

that said, since this is for an undergrad (i assume?) CS (=computer science?) degree, not a biology or ecology degree, i don’t know if these kinds of considerations really matter. (i would think demonstrating your ability to retrieve, transform, and visualize data is probably more important than getting all the statistics and science exactly right.)

you might also try other sources like GBIF, which aggregates data from multiple sources, including iNaturalist. that might give you more data to work with in your sample set, and you might find some sources that have superior data for this purpose there.
