I have my server maintaining the content with a file-system(i mean folder structure). The same folder structure is also maintained in my iPhone client application bundle too.
Now if there is a change in my server file system(Add,Delete,Update of a file in some folder in the hierarchy) i need to update the file system accordingly at the client. This means that i need a protocol to be followed b/w the server and the client.
Can any one suggest how can this be done?
--
Thanks and Regards,
U'suf
From what I can tell, there is no easy way. I was looking for an rsync equivalent, but I haven't found one.
In my case, I'm manually walking the tree asking the server for differences after a certain date and I remember the last successful sync date.
Not pretty. Could spend lots of time coming up with something sophisticated.
Related
Im considering taking web server from China to reduce site loading times from China/China users. Problem is, how to sync/keep same data between two sites? When editing content in the site it should update these changes to site in China server.
Server is running Linux, Apache and MySQL. Website is using WordPress.
FYI I'm already using CDN and site loading speed is still too long from China.
Basically your solution would need to...
Copy the entire contents of your http'd directory from the main server to the Chinese server.
Copy the entire contents of your MySQL database from the main server to the Chinese server.
Perform these tasks at a regular interval without manual intervention.
I can guide you to references that will help with each task and sometimes can show you a quick example. However, if you want to get it to work and especially if you want to optimize the process, you're going to have to look through the references yourself.
If I didn't do it this way this answer would get even more horrendously long that it already is.
Before we start you should remember...
Thing 0 - Please Try Not to be Intimidated by the Length of this Answer
I know I've written a lot, perhaps more than I should have, but I guarantee you are capable of implementing this in no more than a day. I have tried to be thorough but that does not mean that what I'm describing is particularly complicated.
Thing 1 - Shutdown your Chinese Server During Transfer
This transfer of data is going to make your Chinese server unusable while it's in progress, as you might have guessed. You need to make sure that you're Chinese server is not operational during the transfer. Otherwise the server might have only partial data available which could cause problems for both client and server, particularly in relation to MySQL.
Thing 2 - Use Compression as much as You Can
As time consuming as compression and decompression can be for large amounts of data, believe me it is nothing compared to the time you will waste sending the uncompressed data to China. Network usage, not processor time, is really going to be the limiting factor in getting the transfer done quickly. Try to send compressed files whenever possible.
Thing 3 - Try to Use Checksums
Sending all your data, particularly in compressed format, will leave it vulnerable to corruption in transit. Whenever you send a file I encourage you to use some kind of checksum on the data to verify that it has not been corrupted. For brevity I will not be showing you how to do this but I'm sure you're smart enough to figure out how to pepper in some verification.
In case you're not familiar with checksums, the Wikipedia article about them is pretty straight forward. The most commonly used are the MD5 and the SHA-1, but both of those are somewhat collision prone. I would recommend the SHA-2 (also called SHA-256/512) or the very new SHA-3.
Copying your Http'd Directory to the Chinese Server
As far as I know (and I could be wrong) there is no built in way to transfer files from one Apache server to another...so you're going to have to write your own script for this.
You're also going to need to have two separate scripts: one for the main server and one for the Chinese server. Here's a breakdown of what each script needs to do.
On your main server...
Log in as you're Apache server's user. (Reference for switching users.)
zip/gzip/tar.gz your http'd directory's contents. (Reference for zip. Reference for gzip. Reference for tar.)
scp (secure copy) the compressed file to your Chinese server. Make sure to copy it to the username that Apache runs under. (Reference for scp.)
Delete the compressed file.
Initiate the Chinese server's script (this will be discussed later).
You will likely be using a shell script for all of this, so I hope you're familiar with the terminal. A simple example would look like this.
#!/bin/sh
## First I'll define some variables to explain this better.
APACHE_USER="whatever your Apache server's username is (usually it's www-data)";
WWW_DIR="your http'd directory relative to ~ (usually it's /var/www)";
CHINA_HOST="the host name/IP address of your Chinese server"
CHINA_USER="Apache's username on the Chinese server";
CHINA_PWD="Apache's user password on the Chinese server";
CHINA_HOME="the home directory of the Apache user on your Chinese server";
## Now to the real scripting. I will be using zip for compression.
su - "$APACHE_USER";
zip -r copy.zip "$WWW_DIR";
scp copy.zip "$CHINA_USER#$CHINA_HOST:$CHINA_HOME" < echo $CHINA_PWD;
rm copy.zip;
## Then you initiate the next step of the process.
## Like I said this will be covered later.
On your Chinese server...
Log in as the Apache user.
Delete the content of the http'd directory (probably /var/www relative to ~).
Decompress the scp'd file (this will change depending on how you compressed it).
Copy the decompressed directory to the http'd directory (this step is unnecessary if you choose to compress with zip).
Deleted the compressed, scp'd file.
Notify main server to continue next step (again, will be discussed later).
This is pretty straight forward and I don't think you need another example for this part.
Copying the MySQL Database Contents
You can find a good reference for how to do this in this article from the MySQL website. Basically copying database contents is a built in feature. Try to make use of the compression options!
Performing these Tasks at Regular Intervals without Manual Intervention
Ok this is where things get kind of complicated.
The first thing you need to know is how to schedule tasks at regular intervals on Linux. This is done with a command line tool called crontab. You can see good examples for setting up cron jobs in this article, and the full crontab documentation here.
However what will take more skill than just scheduling the job at regular intervals will be synchronizing the data transfer. If you simply set one server to send data at a certain time and the other to receive it at a certain time, you will get many bugs. Be sure of that.
My recommendation would be to create a socket in the Chinese server that listens for instructions from the main server.
This can be done in a variety of languages. Because you're using Linux I would recommend doing this in C, but it can be done in almost any language including Bash.
A full example would be too much but basically this will be the flow of what you have to do.
Socket in China listens for connections.
Cron job in main server connects to China socket.
Main server authenticates itself.
Chinese server stops Apache, stops accepting requests.
Chinese server acknowledges authentication approved.
Main server scp's website contents to Chinese server.
Main server tells Chinese server that scp is complete.
Chinese server replaces Apache's http'd directory's contents with the data that has been scp'd.
Chinese server announces success to main server.
Main server copies MySQL data.
Main server tells Chinese server process is complete.
Chinese server resumes Apache service.
Chinese server notify's main server that service is resumed.
Socket is closed.
Chinese server goes back to listening for connection from main server.
I hope this helps!
It seems rather pointless to have everybody creating the same client for a project in Perforce, so, is there any one one could create a "public" client in Perforce from where everybody could sync from?
Edit: I meant clients like the ones you create in Perforce from a client spec
It's easier to understand the architecture, I believe, if you use the term 'workspace' rather than 'client'. Perforce applications manage files in a designated area of your local disk, called your workspace. As the name implies, your workspace is where you do most of your work. You can have more than one client workspace, even on the same workstation.
Since two different users are generally working independently, on separate workstations or laptops, they each need their own copy of the code, and they each need their own workspace so that they can control when they sync up with the changes in the server.
If you and I try to share a single copy of the code, on a single workstation, we'll find ourselves quickly confused about whose changes are whose; it's much easier for us to work independently, and to merge our changes as separate submissions to the server.
If the issue in your case is that client definitions are complex, with very intricate view definitions, then you may wish to investigate the 'template client' feature: set up a single master client with the view and options that you prefer, and then your other users can use 'client -t' to create workspace definitions that copy the view and options details from the template client.
It's possible to do this, but not advisable. Since Perforce keeps a server-side record of what files are synced to each client, you could run into a situation where:
User Fred syncs using the shared client and gets a fresh set of files.
Before any changes are committed, user Jim syncs using the shared client and gets nothing because the Perforce server thinks that the client already has an up to date set of files.
Jim could get around this using "p4 sync -f" which will force all the latest files to be synced to his workspace, but that's a kludge around the way Perforce is designed to be used.
Perforce clients are very lightweight in terms of the resources they take up on the server, so it's better not to have shared clients.
I tried to find a more complete explanation of why clients should not be shared in the online Perforce documentation, but it's not very helpful. The book "Practical Perforce" has the best overview I've seen if you happen to have a copy around.
Use a template workspace as Bryan mentioned, or consider using streams. In the streams framework you define the stream view (composition) once, and workspaces are generated automatically.
p4 sync -f is too slow. Because firstly it will delete all the files in your local and then reload the files from central depot! there is a tricky way to do. It is to create a havelist and do sync, when wanting do sync -f. details is 1,get the clientspec, 2, save it to local. 3, delete the client 4, create a same client using the saved clientspec. Therefore we save the time for delete local files.
I am building an internal iOS application (so - it won't ever be in the app store), and I need to keep a directory of content synchronized between a server and each of the instances of the iOS application. This would be easy enough if I just wanted to delete and re-download this content each time, but I would rather use something similar to rsync to only download the elements that have changed.
I haven't found any good way to utilize rsync. I considered looking at Objective-Git as a possibility here, but at a quick glance it looked like there is still a lot of the support for remote repositories that isn't supported yet.
As a final note, while this won't be in the app store, I will not be jailbreaking these devices and I would prefer to not rely on any private API's (although if there was an elegant solution that utilized private API's I might consider it).
Thoughts?
ADDITIONAL NOTE: This needs to be an isolated solution. I won't be relying on outside services (like Dropbox, Box.net, etc...). This needs to work solely between the device and the server (which is on a local network with the device).
Use HTTP to list the contents of each folder on the server.
Compare last modification time of each file with those on the device, and identify added/removed files.
Get added and modified files, remove deleted files.
It sounds like you're maybe asking for a library that already does this, but if you don't find one it's obviously moderately easy to write this from the ground up using stat(2) on the server and the same or a higher-level equivalent on the iOS devices. Have the iPhone send a tree of files with their modification date to the server and get back a list of insert/delete/update operations to do with the url (or whatever) for each one so you can do them incrementally on a background thread. Have the information from the server for new/updated files include the mod date that the server has so you can set it to be the same on the iOS device and send that when asking the server for the status of each file (kind of hack using the file system to store that, but it works).
Why not just set up a RESTful interface and do it across HTTP; that way you could query the modification times easily enough to determine whether client or server files need to be updated. You might also want to keep track of what files on the client have been synced, so you can easily know which files to add or delete. This can be done with a simple .sync file or using a plist / sqlite / etc.
If you'll consider FTP, there are some pretty advanced client libraries available.
For example, the iOS Chilkat bundle includes an FTP client library that supports synchronization in both directions. It's not free, but it's pretty cheap -- and you get a ton of other stuff that will likely prove useful someday. Here's an example of iOS pulling down all additions and changes (mode 2):
http://www.example-code.com/ios/ftp_syncLocalTree.asp
One caveat -- judging solely from the example, it doesn't appear to synchronize deletions. If this is a requirement, you could do it yourself without too much effort immediately following a sync.
acrosync (see https://acrosync.com/library.html) seems like a good fit given the initial question, however I haven't used it myself yet.
I'm testing out Microsoft Sync Framework to try and see if it'll be suitable for a task that I'm working on. One of the things I'd like to be able to do is to have the option to not just send changed files, but instead to send all of the files (for example, if I'm syncing to a client machine for the first time, and so want to send all files).
I can't seem to find an example of this in the documentation, so any advice would be welcome.
if you're synching for the first time, then there is nothing special to configure as it will sync everything.
if you've already synched and want to re-send all files regardless of whether they've changed or not, just delete the metadata file and that should remove all knowledge of what has been synched.
I'm writing an application that monitors a directory for new input files by polling the directory every few seconds. New files may often be several megabytes, and so take some time to fully arrive in the input directory (eg: on copy from a remote share).
Is there a simple way to detect whether a file is currently in the process of being copied? Ideally any method would be platform and filesystem agnostic, but failing that specific strategies might be required for different platforms.
I've already considered taking two directory listings separaetd by a few seconds and comparing file sizes, but this introduces a time/reliability trade-off that my superiors aren't happy with unless there is no alternative.
For background, the application is being written as a set of Matlab M-files, so no JRE/CLR tricks I'm afraid...
Edit: files are arriving in the input directly by straight move/copy operation, either from a network drive or from another location on a local filesystem. This copy operation will probably be initiated by a human user rather than another application.
As a result, it's pretty difficult to place any responsibility on the file provider to add control files or use an intermediate staging area...
Conclusion: it seems like there's no easy way to do this, so I've settled for a belt-and-braces approach - a file is ready for processing if:
its size doesn't change in a certain period of time, and
it's possible to open the file in read-only mode (some copying processes place a lock on the file).
Thanks to everyone for their responses!
The safest method is to have the application(s) that put files in the directory first put them in a different, temporary directory, and then move them to the real one (which should be an atomic operation even when using FTP or file shares). You could also use naming conventions to achieve the same result within one directory.
Edit:
It really depends on the filesystem, on whether its copy functionality even has the concept of a "completed file". I don't know the SMB protocol well, but if it has that concept, you could write an app that exposes an SMB interface (or patch Samba) and an API to get notified for completed file copies. Probably a lot of work though.
This is a middleware problem as old as the hills, and the short answer is: no.
The two 'solutions' put the onus on the file-uploader: (1) upload the file in a staging directory and then move it into the destination directory (2) upload the file, and then create/upload a 'ready' file that indicates the state of the content file.
The 1st one is the better, but both are inelegant. The truth is that better communication media exist than the filesystem. Consider using some IPC that involves only a push or a pull (and not both, as does the filesystem) such as an HTTP POST, a JMS or MSMQ queue, etc. Furthermore, this can also be synchronous, allowing the process receiving the file to acknowledge the content, even check it for worthiness, and hand the client a receipt - this is the righteous road to non-repudiation. Follow this, and you will never suffer arguments over whether a file was or was not delivered to your server for processing.
M.
One simple possibility would be to poll at a fairly large interval (2 to 5 minutes) and only acknowledge the new file the second time you see it.
I don't know of a way in any OS to determine whether a file is still being copied, other than maybe checking if the file is locked.
How are the files getting there? Can you set an attribute on them as they are written and then change the attribute when write is complete? This would need to be done by the thing doing the writing ... which sounds like it isn't an option.
Otherwise, caching the listing and treating a file as new if it has the same file size for two consecutive listings is the best way I can think of.
Alternatively, you could use the modified time on the file - the file has to be new and have a modified time that is at least x in the past. But I think this will be about equivalent to caching the listing.
It you are polling the folder every few seconds, its not much of a time penalty is it? And its platform agnostic.
Also, linux only: http://www.linux.com/feature/144666
Like cron but for files. Not sure how it deals with your specific problem - but may be of use?
What is your OS. In unix you can use the "lsof" utility to determine if a user has the file open for write. Apparently somewhere in the MS Windows Process Explorer there is the same functionality.
Alternativly you could just try an exclusive open on the file and bail out of this fails. But this can be a little unreliable and its easy to tread on your own toes.