Question about exporting data

I have been playing around with some data from iNaturalist, and when I go to export data I see a large yellow box listing other options for getting data. Specifically, the first line reads:

Large exports slow down our infrastructure and make it harder for us to release new changes. Here are some other options to consider:

I am curious what it means that large exports “make it harder for us to release new changes”. I make an export and then delete it from iNaturalist when I am done. I have looked at GBIF and the other methods for getting the data I want, but it has been much easier to get the data from iNat.

I thought maybe there was something I was missing and someone could explain it better.


i think the resource suck probably occurs while the export is actually being generated. (i don’t know exactly why it is a problem, but i’m guessing that a large query that hits many different parts of the database, or many records in a commonly used table, often has to wait for other processes on the system to complete before it can retrieve certain data. in the meantime, everything it has already retrieved stays locked so that other queries can’t access it, and so forth…)

i suspect it was a bigger issue back in the day, when a single person could kick off multiple large exports that could potentially collide with each other. it’s probably less of a problem now that a single person can kick off only one export at a time (a recent change), although i would still try to kick off large exports no more often than necessary…
