ok. i agree with the general idea being discussed here, but i wonder if it could be accomplished by updating the existing loader so that it can recover from flaky connections and maybe even crashed browsers?
@pisum, that sounds way too sensible!
@glmory you might want to vote for your own feature request (it doesn’t automatically do that :-)
(and thanks for the link to pyinaturalist, I didn’t know about that)
Digression here; I found that waiting until absolutely all the metadata has loaded, on all photos, before starting to work on them has significantly reduced freezing/crashing the browser. The other day I uploaded a batch of 22 observations with no problems(!) Of course it can take 5-10 minutes for everything to upload, but it’s better than redoing all that work.
I’m using Firefox in Sierra (Mac).
I can confirm this - other than uploading the photos, the main bottleneck seems to be loading the metadata. I found that I had to limit my batches to about 25 observations to make the process workable. But even then, I still found it was taking far longer than it really should, and every now and then I would have to abort it due to the browser locking up.
So in the end, I went down the same route as @glmory and wrote my own uploader. It also uses python, but not pyinaturalist, since I wanted more control over how the inat apis are used. Doing things this way dramatically reduces the total upload time, and I can just run it in the background while I do other things. It also allows me to add annotations at the same time and simplifies the process of grouping multiple photos.
Even if the issue with loading metadata was somehow fixed, I doubt whether I would want to use the web interface for uploading again. The only major thing missing from the current apis seems to be ID suggestions using inat’s computer vision - but I very rarely use that, so that’s not a big deal for me.
The metadata issue is so time-consuming that I have been trying to find a way of stripping all metadata except the date-time. There is a program available for Linux - ExifTool - that may be able to do it, but I haven’t tried it yet (in the queue of Things To Do).
If anyone has another way of doing this, I’d be thrilled to know about it!
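For anyone wanting to try this, here is a minimal sketch of how the stripping could be scripted from Python. It assumes ExifTool is installed and available on the PATH as `exiftool`; the helper names `build_strip_command` and `strip_metadata` are made up for illustration. The command deletes all metadata and then copies `DateTimeOriginal` back from the same file. It has not been tested against the iNat uploader, so whether it actually speeds anything up is an open question.

```python
import subprocess
from pathlib import Path


def build_strip_command(photo):
    """Build an exiftool command that keeps only the capture date-time."""
    return [
        "exiftool",
        "-all=",                # delete every metadata tag...
        "-tagsFromFile", "@",   # ...then copy tags back from the same file
        "-DateTimeOriginal",    # only the original capture date-time
        str(photo),
    ]


def strip_metadata(photo):
    """Run exiftool on one photo (assumes exiftool is on the PATH)."""
    # By default exiftool leaves a backup named <file>_original next to it.
    subprocess.run(build_strip_command(Path(photo)), check=True)
```

Note that this also drops GPS coordinates, so the location would have to be entered by hand for each observation.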
3 hours for 13 pictures in 8 observations … where can i get your uploader bazwal?
I don’t know that it is actually real metadata (eg EXIF strings) that is slowing it down. If I upload large GIF images with no metadata, it still seems to take a long while searching for the non-existent metadata.
That seems unusually excessive - what sort of internet connection do you have? It might be worth trying some of the online speed-testers to see what your upload speed is like (download speed is less relevant). And what is the average size (in megabytes) of the photos you are uploading? Do you crop them first?
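As a rough way of checking whether the connection alone could explain the time, the raw transfer time can be estimated from the photo sizes and the measured upload speed. This is only a back-of-the-envelope sketch: the function name and the example numbers are hypothetical, and it ignores protocol overhead, metadata loading and server-side processing.

```python
def estimated_upload_minutes(photo_count, avg_size_mb, upload_mbps):
    """Rough transfer time: megabytes -> megabits, divided by link speed."""
    total_megabits = photo_count * avg_size_mb * 8
    return total_megabits / upload_mbps / 60


# Hypothetical numbers: 13 photos of 5 MB each over a 1 Mbit/s uplink.
print(round(estimated_upload_minutes(13, 5, 1), 1))  # prints 8.7
```

With these made-up numbers the raw transfer comes to under ten minutes, so a three-hour upload would suggest something other than file size alone. A much slower uplink or much larger files would change the picture.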
My uploader is part of a bigger program at the moment so I’m not able to share it. But in any case, I don’t think it would help you much if you don’t have a reasonably good internet connection.
If you can send some samples to firstname.lastname@example.org (or make them available here via a Dropbox link or something similar - they just need to have their metadata intact) that would allow us to take a look and see if there’s anything about the photos themselves that is causing the slowdown, although RAM/internet speed are the likely culprits.
The system has to resize images that are too large, and that will contribute greatly to the processing time of an upload. You could test it by doing a similar size batch but with the photos all pre-scaled to the iNat maximums. I know my camera is set to take photos that are nearly double the iNat maximum dimensions (or 4x the number of pixels!). But even then, mine would not take more than 30 mins tops to upload 100 obs with approx 200 photos.
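To work out what size to pre-scale to, something like the following sketch could be used. The 2048 pixel default is the figure quoted by staff elsewhere in this thread and should be treated as an assumption here, and `fit_within` is a made-up helper name. It only computes the target dimensions; the actual resizing would be done in an image editor or imaging library.

```python
def fit_within(width, height, max_edge=2048):
    """Return (width, height) scaled down to fit inside max_edge x max_edge."""
    longest = max(width, height)
    if longest <= max_edge:
        return width, height  # already small enough, never upscale
    # Scale both sides by the same factor so the aspect ratio is preserved.
    new_width = max(1, round(width * max_edge / longest))
    new_height = max(1, round(height * max_edge / longest))
    return new_width, new_height


print(fit_within(4000, 3000))  # prints (2048, 1536)
```

A photo twice the maximum on each side (4096 x 4096) carries four times the pixels of a 2048 x 2048 one, which is where the "4x" figure comes from.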
I think bottlenecks are going to be (in order of impact):
- upload speed
- activity on the site at time of upload
- image size
I think the upload might be staggered, meaning that it doesn’t all get processed at once and the site comes to a grinding halt for everyone else while it processes. This might explain why an external uploader might be getting faster results. If this is the case, do we really want large processing volume burdens being developed? It’s fine for occasional situations, but if the number of people using it grows, then you might be getting faster processing of your uploads, but causing unacceptable delays in other parts of the system such as in just loading observation views! I would encourage you to contact the developers to make sure your impact will not be detrimental.
You are totally mistaken about all of this. If the developers were really worried about it, they wouldn’t have provided two fully documented public APIs and an interface for registering third-party applications. To quote from the API documentation I linked to earlier:
It seems pretty clear to me that the developers know what they’re doing and have already thought all this through - so please don’t start spreading FUD about third-party uploaders.
The whole point of using these APIs is to consume fewer resources when uploading. A third-party uploader does all the pre-processing client-side and uses the minimum number of operations to upload the final data to the server. By contrast, the web interface does all the pre-processing server-side, and uses a whole bunch of costly interactive features that many people neither need nor want. So, for the same number of records, a well-behaved third-party uploader should usually put less load on the server whilst also being more efficient for the client/user.
If you have questions about how iNat works, please just ask. Regarding the website uploader, we do not resize images in the client. Well, actually we do, but only to make a thumbnail to upload for vision suggestions. The final cutdowns happen on the server, and yes, there are performance implications for the site as a whole. Improving server-side image processing is definitely on our radar, though frankly, it only creates serious problems during periods of extreme usage, like the CNC or the Penang Incident. A third party client that cuts everything down to 2048x2048 before upload might feel a bit faster due to reduced server-side processing time… or it might be faster b/c you’re uploading less data. We also only upload a max of 3 observations at a time in the website uploader, so if a 3rd party uploader creates more simultaneous upload requests it might get the whole job done faster, but could theoretically function as a Denial of Service attack if it swamped all our server processes.
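For third-party uploaders, one way to stay in line with the website's limit of 3 simultaneous uploads is to cap the number of worker threads. This is a minimal sketch, not how any existing uploader is known to work: `upload_all` and the `upload_one` callable are hypothetical, and the real request logic (authentication, API calls, rate limits) is left out entirely.

```python
from concurrent.futures import ThreadPoolExecutor

MAX_PARALLEL_UPLOADS = 3  # matches the website uploader's limit quoted above


def upload_all(observations, upload_one, max_parallel=MAX_PARALLEL_UPLOADS):
    """Upload every observation, never running more than max_parallel at once."""
    with ThreadPoolExecutor(max_workers=max_parallel) as pool:
        # map() preserves input order and re-raises any upload error here.
        return list(pool.map(upload_one, observations))
```

Here `upload_one` would be whatever function sends a single observation; the pool simply guarantees that no more than three of those calls are in flight at any moment.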
I don’t know why the uploader is so slow for some people. I’m sure we all have our theories, but what would help us address the problem are concrete examples including excruciating amounts of contextual info: what browser you’re using, what your internet connection speed is like, specifically what files you’re trying to upload, etc. Some issues might be solved by a third party uploader, but not things like connection issues. While diagnoses might be helpful, reproducible examples will always be more helpful. There are a lot of variables.
Finally, we’re not going to make a standalone uploader, or at least not any time soon. We’ve toyed with the idea of a desktop app a lot, but ultimately, of the many potential uses for such an application (offline use, backups, etc.), faster image upload has never been on our radar. If you have the bandwidth to upload and the website’s not working, we should fix the website, not make more software that works better.
@bazwal someone is! And it’s not FUD… that is a competitive thing in advertising to chop down a competitor’s product. I am not competing with you, but I am interested in identifying and preventing potential future problems! note:
@kueda sorry, I do tend to speculate on what is inside the black box a bit. I’ll try to phrase questions rather than make speculations.
@kiwifergus FUD isn’t restricted to advertising, it has a broad range of applications. And I simply asked that you “don’t start spreading FUD”, which appeared likely to me since you didn’t seem to be aware of the terms of service for using the inat APIs. Hopefully my post made it clear why there is no need to have any concerns regarding third-party uploaders that abide by those terms.
From your wiki quote: “Fear, uncertainty, and doubt (often shortened to FUD ) is a disinformation strategy used in sales, marketing, public relations, politics, cults, and propaganda. FUD is generally a strategy to influence perception by disseminating negative and dubious or false information and a manifestation of the appeal to fear.”
I dunno… I think I was spot on, rather than “totally mistaken”
Sounds like we should close this feature request?
yes, please do! :)
For folks interested in continuing to talk about standalone uploaders, using the API, or if you have other questions about rules around 3rd party apps feel free to check the documentation, message email@example.com, or ask/discuss in the General category after reading up at:
*and, please remember to keep discussions on topic. If you want to have a one on one unrelated discussion, you can always directly message someone privately or start a new topic.