Box move owned files API is a synchronous call - box

The Box move owned files API call is synchronous (https://api.box.com/2.0/users/user_id/folders/folder_id). If the user has a large number of files, it can sometimes take hours. We are planning to implement it so that we make the call, let it time out, and then periodically check the number of items in the user's root folder; if it is empty, we assume the transfer is done. Can we rely on this approach?
Another question: if a file or folder is shared with the user, will the move owned items API call have any effect on the shared folder?

This approach does not work if the user has shared files or folders. The move owned items call does not move items the user does not own, so shared items remain in the root folder and the item count may never reach zero. Hence this approach won't work.
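If you still want to poll, a minimal sketch of checking the root folder's item count against the Box API might look like the following (this assumes a token that can act on the user in question, for example an admin token used with the As-User header; folder ID 0 is the root, and the helper name is made up). As noted above, unowned shared items stay behind, so the count may never reach zero.

```swift
import Foundation

// Hypothetical helper: polls the user's root folder item count via the Box API.
// `accessToken` must be able to act on that user's account (e.g. As-User).
func rootFolderItemCount(accessToken: String,
                         completion: @escaping (Int?) -> Void) {
    // GET /2.0/folders/0/items returns a "total_count" field.
    var request = URLRequest(url: URL(string: "https://api.box.com/2.0/folders/0/items?limit=1")!)
    request.setValue("Bearer \(accessToken)", forHTTPHeaderField: "Authorization")

    URLSession.shared.dataTask(with: request) { data, _, _ in
        guard let data = data,
              let json = try? JSONSerialization.jsonObject(with: data) as? [String: Any],
              let count = json["total_count"] as? Int else {
            completion(nil)
            return
        }
        // Caveat from the answer above: shared/unowned items are not moved,
        // so this count may never drop to zero for users with shared content.
        completion(count)
    }.resume()
}
```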

You can do this easily using an asynchronous framework like Node.js, which can upload multiple files simultaneously, up to your enterprise's connection limit. https://github.com/box/box-node-sdk

Related

SwiftyDropbox batchUploadFiles vs upload

I am uploading a large number of photos and annotations to Dropbox using the SwiftyDropbox SDK. I want to update the UI to reflect the upload status of each item, which is stored in Core Data. My understanding of batchUpload is that you pass it an array of URLs and it uploads them asynchronously. I would like to use batch upload, but I am not sure how to tell when a particular item is finished, since batchUpload operates on an array of URLs. Is there a way I can use batchUpload, versus just iterating over the array with the upload function?
It seems that upload would be the correct solution, as I can just add each item to a background thread asynchronously and update each one as it finishes. Looking for arguments to persuade me either way.
The batchUploadFiles method in the SwiftyDropbox SDK is advantageous in that it only has to take one lock to upload the entire batch of files. It only calls the response block once the entire batch is done, though, and the files are committed as a batch, so you wouldn't see individual uploads completing one by one. Instead you get the results for all of the files together at the end.
If you do need to see individual file uploads complete one by one for whatever reason, you would need to use individual upload calls, but that has the disadvantage of not batching the uploads, so you're more likely to run into lock contention.
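If per-item completion is the priority, a rough sketch of the individual-upload approach could look like the following (this assumes an already authorized DropboxClient; exact SwiftyDropbox signatures can vary between SDK versions, so treat it as an outline rather than drop-in code):

```swift
import SwiftyDropbox

// Assumes DropboxClientsManager has already been set up with an authorized client.
func uploadIndividually(fileURLs: [URL], client: DropboxClient) {
    for fileURL in fileURLs {
        let destinationPath = "/Photos/\(fileURL.lastPathComponent)"  // placeholder remote path
        // Each upload gets its own response callback, so the UI (or the Core Data
        // status flag) can be updated as soon as that one file finishes.
        client.files.upload(path: destinationPath, input: fileURL)
            .response { response, error in
                if let metadata = response {
                    print("Finished uploading \(metadata.name)")
                    // Update the corresponding Core Data record here.
                } else if let error = error {
                    print("Upload failed for \(fileURL.lastPathComponent): \(error)")
                }
            }
    }
}
```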

How to sync Core Data with referenced files?

I've just started reading various posts on how to sync files or Core Data using iCloud. The app I'm currently developing stores data in Core Data, with filenames kept as references to the image files stored in the app's Documents sandbox. So a related file (photo) is also created in the Documents directory every time a user makes a record in the database.
Everything would be fine if we only needed to sync files OR Core Data; however, I'm looking for a way to sync Core Data AND files. I'm worried about the case where new Core Data records arrive earlier than the image files for those records; in that case, data integrity would be broken. Ideally, all new related files would arrive first, followed by the Core Data updates. Is it possible to do that?
Not really, no. You send data to the cloud, but you have no way to control when it appears on other devices. iCloud is going to bring over your managed objects whenever it feels like it, regardless of the state of the external files. The only way you could make this happen would be to find and download any external files, wait for the download to finish, and only then bring up your Core Data stack. But that would mean locking the user out of the data store until the downloads finish, which is not a good idea.
When I faced a similar situation, I handled it like this:
Initiate downloads of all the external files and bring up the Core Data stack.
Modify the getter method for the image to check whether the file exists and has been downloaded.
If yes to both, proceed normally
If no, display a "loading..." UI element. This could be a spinner or a progress indicator. Listen for a custom "download complete" notification.
Whenever an external file finishes downloading, post that "download complete" notification. Re-check the file, and if it's ready, replace the "loading..." UI with the image.
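In Swift, steps 2 through 4 above might look roughly like this (the notification name, outlets, and the imageFileURL property are placeholders for however your app stores and downloads the files):

```swift
import UIKit

extension Notification.Name {
    // Placeholder name; post this from wherever your download code completes.
    static let externalFileDownloadComplete = Notification.Name("externalFileDownloadComplete")
}

final class PhotoRecordViewController: UIViewController {
    @IBOutlet weak var imageView: UIImageView!
    @IBOutlet weak var loadingSpinner: UIActivityIndicatorView!

    var imageFileURL: URL!   // derived from the filename stored in Core Data

    override func viewDidLoad() {
        super.viewDidLoad()
        NotificationCenter.default.addObserver(self,
                                               selector: #selector(downloadCompleted(_:)),
                                               name: .externalFileDownloadComplete,
                                               object: nil)
        refreshImage()
    }

    // Steps 2/3: check whether the file exists locally and is readable yet.
    func refreshImage() {
        if FileManager.default.fileExists(atPath: imageFileURL.path),
           let image = UIImage(contentsOfFile: imageFileURL.path) {
            loadingSpinner.stopAnimating()
            imageView.image = image
        } else {
            // Step 3: not downloaded yet, show a "loading..." UI element.
            loadingSpinner.startAnimating()
        }
    }

    // Step 4: re-check when a "download complete" notification arrives.
    @objc func downloadCompleted(_ note: Notification) {
        refreshImage()
    }
}
```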

Downloading multiple items in package

I need to allow users to download multiple images in a single download. This download will include an SQL file and images. Once the download completes, the SQL will execute, inserting text into an SQLite database. This text will include references to the downloaded images. The text and images are rendered in a UIWebView.
What is the best way to download everything in a single download? I was thinking of using a bundle, since it can be loaded at runtime, but I'm not sure of any limitations/restrictions in this scenario. I have tested putting the bundle into the Documents folder and then accessing resources inside it, and that seems to work fine in a simple test.
You're downloading everything through a socket, which only knows about bytes, so a bundle, or even a file, doesn't "naturally" transfer through it: the server side opens files, encodes them, and sends them into the connection, and the client reads from the socket and reconstructs the original file structure.
Assuming the application has a UI for picking which items need to be transferred, it could request all of them from the server, and the server could send them through the single connection with some delimitation you invent, so that the iPhone app can split the stream back into the individual files.
Another option is for the client to simply perform individual HTTP requests for the different files, through pretty straightforward use of NSURLConnection.
The former sounds like an attempt to optimize the latter. Have you already tested and verified that the latter is too slow/inefficient? It definitely is more complex to implement.
There is a latency issue with multiple HTTP connections run in sequence, but you can mitigate it by running several download connections in parallel -- for example through an NSOperationQueue with a limit of 2 to 5 concurrent download operations.
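For example, a minimal sketch of that parallel-download option (the URLs are made up, and URLSession/Data are used here instead of NSURLConnection just to keep the example short):

```swift
import Foundation

// Hypothetical list of files the user picked; replace with real URLs.
let remoteURLs = [
    URL(string: "https://example.com/package/data.sql")!,
    URL(string: "https://example.com/package/photo1.jpg")!,
    URL(string: "https://example.com/package/photo2.jpg")!
]

let documentsDir = FileManager.default.urls(for: .documentDirectory,
                                            in: .userDomainMask)[0]

// Limit concurrency as suggested above (2 to 5 parallel downloads).
let downloadQueue = OperationQueue()
downloadQueue.maxConcurrentOperationCount = 3

for remoteURL in remoteURLs {
    downloadQueue.addOperation {
        // A blocking download inside the operation keeps the example short;
        // production code would use URLSession download tasks with delegates.
        if let data = try? Data(contentsOf: remoteURL) {
            let target = documentsDir.appendingPathComponent(remoteURL.lastPathComponent)
            try? data.write(to: target)
        }
    }
}

downloadQueue.waitUntilAllOperationsAreFinished()
// At this point the SQL file and images are in Documents and the SQL can be executed.
```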

Patterns for accessing remote data with Core Data?

I am trying to write a Core Data application for the iPhone that uses an external data source. I'm not really using Core Data to persist my objects but rather for the object life-cycle management. I have a pretty good idea on how to use Core Data for local data, but have run into a few issues with remote data. I'll just use Flickr's API as an example.
The first thing is that if I need, say, a list of the recent photos, I need to grab them from an external data source. After I've retrieved the list, it seems like I should iterate and create managed objects for each photo. At that point, I can continue in my code and use the standard Core Data API to set up a fetch request and retrieve a subset of photos of, say, dogs.
But what if I then want to continue and retrieve a list of the user's photos? Since there's a possibility that these two data sets might intersect, do I have to perform a fetch request on the existing data, update what's already there, and then insert the new objects?
--
In the older pattern, I would simply have separate data structures for each of these data sets and access them appropriately: a recentPhotos set and a usersPhotos set. But since the general pattern of Core Data seems to be to use one managed object context, it seems (I could be wrong) that I have to merge my data with the main pool of data. That seems like a lot of overhead just to grab a list of photos. Should I create a separate managed object context for the different sets? Should Core Data even be used here?
I think what I find appealing about Core Data is that before (for a web service) I would make a request for certain data and either filter it in the request or filter it in code to produce a list I would use. With Core Data, I can just get a list of objects, add them to my pool (updating old objects as necessary), and then query against it. One problem I can see with this approach, however, is that if objects are deleted externally, I have no way of knowing, since I'm keeping my old data.
Am I way off base here? Are there any patterns people follow for dealing with remote data and Core Data? :) I've found a few posts of people saying they've done it, and that it works for them, but little in the way of examples. Thanks.
You might try a combination of two things. This strategy will give you an interface where you get the results of an NSFetchRequest twice: once synchronously, and once again when data has been loaded from the network.
1. Create your own subclass of NSFetchRequest that takes an additional block property to execute when the fetch is finished. This is for your asynchronous request to the network. Let's call it FLRFetchRequest.
2. Create a class to which you pass this request. Let's call it FLRPhotoManager. FLRPhotoManager has a method executeFetchRequest: which takes an instance of the FLRFetchRequest and:
- Queues your network request based on the fetch request and passes along the retained fetch request to be processed again when the network request is finished.
- Executes the fetch request against your Core Data cache and immediately returns the results.
Now when the network request finishes, update your core data cache with the network data, run the fetch request again against the cache, and this time, pull the block from the FLRFetchRequest and pass the results of this fetch request into the block, completing the second phase.
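A rough Swift sketch of that flow, keeping the FLR names from above (the Photo entity, the networking call, and the context setup are placeholders, not a definitive implementation):

```swift
import CoreData

// Fetch request that carries a completion block for the second (post-network) phase.
final class FLRFetchRequest: NSFetchRequest<NSManagedObject> {
    var completion: (([NSManagedObject]) -> Void)?
}

final class FLRPhotoManager {
    let context: NSManagedObjectContext

    init(context: NSManagedObjectContext) {
        self.context = context
    }

    // Phase 1: return cached results immediately, then refresh from the network.
    func executeFetchRequest(_ request: FLRFetchRequest) -> [NSManagedObject] {
        let cached = (try? context.fetch(request)) ?? []

        loadPhotosFromNetwork { [weak self] in
            guard let self = self else { return }
            // Phase 2: the cache has been updated; run the same request again
            // and hand the fresh results to the block stored on the request.
            let fresh = (try? self.context.fetch(request)) ?? []
            request.completion?(fresh)
        }
        return cached
    }

    // Placeholder for the asynchronous Flickr/network call that inserts or
    // updates Photo objects in the Core Data cache before calling `finished`.
    private func loadPhotosFromNetwork(finished: @escaping () -> Void) {
        DispatchQueue.global().async {
            // ... download, parse, upsert into Core Data ...
            DispatchQueue.main.async(execute: finished)
        }
    }
}

// Usage sketch:
// let request = FLRFetchRequest(entityName: "Photo")   // "Photo" is a placeholder entity
// request.completion = { fresh in /* reload the UI with the network-backed results */ }
// let cached = manager.executeFetchRequest(request)    // phase 1 results
```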
This is the best pattern I have come up with, but like you, I'm interested in other's opinions.
It seems to me that your first instincts are right: you should use fetch requests to update your existing store. The approach I used for an importer was the following: get a list of all the files that are eligible for importing and store it somewhere. I'm assuming here that getting that list is fast and lightweight (just a name and a URL or unique id), but that really importing something will take a bit more time and effort, and the user may quit the program or want to do something else before all the importing is done.
Then, on a separate background thread (this is not as hard as it sounds thanks to NSRunLoop and NSTimer, google on "Core Data: Efficiently Importing Data"), get the first item of that list, get the object from Flickr or wherever and search for it in the Core Data database (carefully read Apple's Predicate Programming Guide on setting up efficient, cached NSFetchRequests). If the remote object already lives in Core Data, update the information as necessary, if not insert. When that is done, remove the item from the to-be-imported list and move on to the next one.
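The find-or-update-or-insert step for a single item could look roughly like this in Swift (the "Photo" entity and the "flickrID"/"title" attributes are made-up names; error handling is left to the caller):

```swift
import CoreData

// Insert or update one remote item in the local store.
// `backgroundContext` should be a context confined to the import thread/queue.
func importPhoto(remoteID: String,
                 title: String,
                 into backgroundContext: NSManagedObjectContext) throws {
    let request = NSFetchRequest<NSManagedObject>(entityName: "Photo")
    request.predicate = NSPredicate(format: "flickrID == %@", remoteID)
    request.fetchLimit = 1

    let photo: NSManagedObject
    if let existing = try backgroundContext.fetch(request).first {
        // Already imported: update it with the latest remote information.
        photo = existing
    } else {
        // Not in the cache yet: insert a new managed object.
        photo = NSEntityDescription.insertNewObject(forEntityName: "Photo",
                                                    into: backgroundContext)
        photo.setValue(remoteID, forKey: "flickrID")
    }
    photo.setValue(title, forKey: "title")
}
```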
As for the problem of objects that have been deleted in the remote store, there are two solutions: periodic syncing or lazy, on-demand syncing. Does importing a photo from Flickr mean importing the original thing and all its metadata (I don't know what the policy is regarding ownership etc) or do you just want to import a thumbnail and some info?
If you store everything locally, you could just run a check every few days or weeks to see if everything in your local store is present remotely as well: if not, the user may decide to keep the photo anyway or delete it.
If you only store thumbnails or previews, then you will need to connect to Flickr each time the user wants to see the full picture. If it has been deleted, you can then inform the user and delete it locally as well, or mark it as not being accessible any more.
For a situation like this you could use Cocoa's archiving facilities to save the photo objects (and an index) to disk between sessions, and just overwrite it all every time the app calls home to Flickr.
But since you're already using Core Data, and like the features it provides, why not modify your data model to include a "source" or "callType" attribute? At the moment you're implicitly creating a bunch of objects with source "Flickr API", but you can just as easily treat the different API calls as unique sources and then store that explicitly.
To handle deletion, the simplest way would be to clear the data store each time it's refreshed. Otherwise you'd need to iterate over everything and only delete the photo objects with filenames that weren't included in the new results.
I'm planning to do something similar to this myself so I hope this helps.
PS: If you're not storing the photo objects between sessions at all, you could just use two different contexts and query them separately. As long as they're never saved, and the central store doesn't have anything in it already, it would work just like you describe.

Detect a file in transit?

I'm writing an application that monitors a directory for new input files by polling the directory every few seconds. New files may often be several megabytes, and so take some time to fully arrive in the input directory (e.g., on copy from a remote share).
Is there a simple way to detect whether a file is currently in the process of being copied? Ideally any method would be platform and filesystem agnostic, but failing that specific strategies might be required for different platforms.
I've already considered taking two directory listings separated by a few seconds and comparing file sizes, but this introduces a time/reliability trade-off that my superiors aren't happy with unless there is no alternative.
For background, the application is being written as a set of Matlab M-files, so no JRE/CLR tricks I'm afraid...
Edit: files are arriving in the input directory by a straight move/copy operation, either from a network drive or from another location on a local filesystem. This copy operation will probably be initiated by a human user rather than another application.
As a result, it's pretty difficult to place any responsibility on the file provider to add control files or use an intermediate staging area...
Conclusion: it seems like there's no easy way to do this, so I've settled for a belt-and-braces approach - a file is ready for processing if:
its size doesn't change in a certain period of time, and
it's possible to open the file in read-only mode (some copying processes place a lock on the file).
Thanks to everyone for their responses!
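For anyone curious, the check itself is only a few lines; here is a rough illustration of the same logic (in Swift rather than Matlab, purely as an illustration, since the idea carries over to any language):

```swift
import Foundation

// Belt-and-braces readiness check: a file is considered ready when its size has
// not changed since the previous poll AND it can be opened read-only
// (some copy processes hold a lock on the file until they finish).
final class ReadinessChecker {
    private var lastSizes: [URL: UInt64] = [:]

    func isFileReady(at url: URL) -> Bool {
        guard let attrs = try? FileManager.default.attributesOfItem(atPath: url.path),
              let size = attrs[.size] as? UInt64 else { return false }

        let unchanged = (lastSizes[url] == size)
        lastSizes[url] = size
        guard unchanged else { return false }

        // Second condition: the file can be opened for reading.
        guard let handle = try? FileHandle(forReadingFrom: url) else { return false }
        handle.closeFile()
        return true
    }
}

// Usage, once per poll cycle (keep one checker instance between polls):
// let ready = newFiles.filter { checker.isFileReady(at: $0) }
```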
The safest method is to have the application(s) that put files in the directory first put them in a different, temporary directory, and then move them to the real one (which should be an atomic operation even when using FTP or file shares). You could also use naming conventions to achieve the same result within one directory.
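Sketched from the writer's side (the paths are made up; the final moveItem is only a cheap, effectively atomic rename when the staging and watched directories are on the same volume):

```swift
import Foundation

// Writer side of the staging-directory approach: write into a staging directory,
// then move into the watched directory, so the polling reader never sees a
// half-written file. Paths are placeholders.
func publish(_ data: Data, as fileName: String) throws {
    let staging = URL(fileURLWithPath: "/data/incoming/.staging/" + fileName)
    let watched = URL(fileURLWithPath: "/data/incoming/" + fileName)

    try data.write(to: staging)                                // slow, partial writes happen here
    try FileManager.default.moveItem(at: staging, to: watched) // appears "all at once" to the poller
}
```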
Edit:
It really depends on the filesystem, and on whether its copy functionality even has the concept of a "completed file". I don't know the SMB protocol well, but if it has that concept, you could write an app that exposes an SMB interface (or patch Samba) and an API to get notified of completed file copies. Probably a lot of work, though.
This is a middleware problem as old as the hills, and the short answer is: no.
The two 'solutions' put the onus on the file uploader: (1) upload the file to a staging directory and then move it into the destination directory, or (2) upload the file and then create/upload a 'ready' file that indicates the state of the content file.
The first is the better of the two, but both are inelegant. The truth is that better communication media exist than the filesystem. Consider using some form of IPC that involves only a push or a pull (not both, as the filesystem does), such as an HTTP POST, a JMS or MSMQ queue, etc. Furthermore, this can also be synchronous, allowing the process receiving the file to acknowledge the content, even check it for worthiness, and hand the client a receipt - this is the righteous road to non-repudiation. Follow this, and you will never suffer arguments over whether a file was or was not delivered to your server for processing.
M.
One simple possibility would be to poll at a fairly large interval (2 to 5 minutes) and only acknowledge the new file the second time you see it.
I don't know of a way in any OS to determine whether a file is still being copied, other than maybe checking if the file is locked.
How are the files getting there? Can you set an attribute on them as they are written and then change the attribute when write is complete? This would need to be done by the thing doing the writing ... which sounds like it isn't an option.
Otherwise, caching the listing and treating a file as new if it has the same file size for two consecutive listings is the best way I can think of.
Alternatively, you could use the modified time on the file - the file has to be new and have a modified time that is at least x in the past. But I think this will be about equivalent to caching the listing.
If you are polling the folder every few seconds, it's not much of a time penalty, is it? And it's platform agnostic.
Also, Linux only: http://www.linux.com/feature/144666
Like cron but for files. Not sure how it deals with your specific problem, but it may be of use?
What is your OS? On Unix you can use the "lsof" utility to determine if a user has the file open for writing. Apparently the MS Windows Process Explorer has the same functionality somewhere.
Alternatively, you could just try an exclusive open on the file and bail out if this fails. But this can be a little unreliable, and it's easy to tread on your own toes.