I am thinking about a side project for augmenting automatic species separation. It would be nice to marry observation data for a genus with added variables such as population density, elevation, rainfall, temperature, etc. I think elevation and population would not be that hard. I live in Oregon, United States, and the Census Bureau has good data sets for population by census tract. The USGS has a very high resolution elevation data set at: https://catalog.data.gov/dataset/usgs-national-elevation-dataset-ned.
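To make the "marry" step concrete, here is a minimal sketch of attaching an elevation value to each observation by finding the grid cell it falls in. Everything in it is an assumption for illustration: the tiny 3x3 grid, its origin and cell size, the observation records, and the function name `elevation_at` are all made up, and a real version would read the USGS raster with a proper GIS library instead.

```python
import math

# Hypothetical toy elevation grid in meters. Row 0 is the northernmost row,
# column 0 is the westernmost column. Values are invented for illustration.
GRID_NORTH_EDGE_LAT = 46.0
GRID_WEST_EDGE_LON = -124.0
CELL_SIZE_DEG = 0.5
ELEVATION_M = [
    [10, 150, 900],
    [5, 120, 1400],
    [20, 300, 1100],
]

def elevation_at(lat, lon):
    """Return the elevation of the grid cell containing (lat, lon), or None if outside the grid."""
    row = math.floor((GRID_NORTH_EDGE_LAT - lat) / CELL_SIZE_DEG)
    col = math.floor((lon - GRID_WEST_EDGE_LON) / CELL_SIZE_DEG)
    if 0 <= row < len(ELEVATION_M) and 0 <= col < len(ELEVATION_M[0]):
        return ELEVATION_M[row][col]
    return None

# Hypothetical observation records (id, latitude, longitude).
observations = [
    {"id": 1, "lat": 45.8, "lon": -123.9},
    {"id": 2, "lat": 44.9, "lon": -122.8},
    {"id": 3, "lat": 47.0, "lon": -123.9},  # falls outside the toy grid
]

# Attach the covariate to each observation.
features = [{**obs, "elevation_m": elevation_at(obs["lat"], obs["lon"])} for obs in observations]
```

Population by census tract would need a point-in-polygon lookup rather than a grid index, but the shape of the join is the same: one covariate value per observation.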
Climate data, on the other hand, is a lot spottier. Official NOAA stations are usually at airports in cities. Oregon has lots of federal and state lands where there are not even houses. Interpolated data is most likely useless there, as there are no weather stations in the mountains, and no community weather stations either. The highest resolution from NOAA seems to be from this page: https://psl.noaa.gov/data/gridded/ with some of the data at 0.25 degree resolution (cells roughly 17 miles on a side north to south). That leaves a lot to be desired.
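For a sense of what a 0.25 degree grid means on the ground, here is a rough back-of-the-envelope calculation. It assumes about 69 miles per degree of latitude and shrinks the east-west extent by the cosine of the latitude; the 44 degrees used below is my stand-in for "somewhere in the middle of Oregon", and the function name is made up.

```python
import math

MILES_PER_DEGREE_LAT = 69.0  # approximation; the true value varies slightly with latitude

def grid_cell_size_miles(resolution_deg, latitude_deg):
    """Approximate north-south extent, east-west extent, and area of one grid cell."""
    north_south = resolution_deg * MILES_PER_DEGREE_LAT
    # Lines of longitude converge toward the poles, so east-west distance shrinks with cos(latitude).
    east_west = north_south * math.cos(math.radians(latitude_deg))
    return north_south, east_west, north_south * east_west

ns, ew, area = grid_cell_size_miles(0.25, 44.0)
print(f"{ns:.1f} mi north-south, {ew:.1f} mi east-west, about {area:.0f} sq mi per cell")
```

So a single cell is on the order of a couple hundred square miles, which in western Oregon can easily span a valley floor and a mountain ridge at once.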
On top of that, coastal Oregon suffers from a low number of observations (N), even in the towns. Lots of derived data would have to be used to normalize predictors. It makes me think that there would be lots of noise and not a whole lot of signal to be found.
As a sidebar, there is also the issue that iNat observations are not a random sample anyway. People take pictures of what they want, wherever they happen to be, and of whatever comes into focus in their cameras. So we reduce ourselves to the art of polling instead of the science of statistics. But we know from experience that polls can be reasonably predictive.
So, I am wondering if anyone has attempted such a project and whether they got any good results from it. Population and elevation seem like an obvious win, at least for some species. Another interesting spatial data set might be a discrete variable describing habitat (e.g. urban, marsh, montane).
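A discrete habitat variable would usually be fed to a model as indicator columns rather than a single number. A minimal sketch, assuming a made-up category list and function name (a real land-cover data set would have its own classes):

```python
# Hypothetical habitat categories; a real land-cover product would define its own list.
HABITATS = ["urban", "marsh", "montane"]

def one_hot_habitat(habitat):
    """Return one 0/1 indicator per known habitat; unknown habitats map to all zeros."""
    return [1 if habitat == known else 0 for known in HABITATS]

# Combine a continuous predictor (elevation in meters) with the habitat indicators.
def feature_vector(elevation_m, habitat):
    return [elevation_m] + one_hot_habitat(habitat)

example = feature_vector(1400, "montane")
```

Mapping unknown habitats to all zeros is just one choice; an explicit "other" category would work as well.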