What is the best way to transfer files between servers constantly? - file-transfer

I need to transfer files between servers ... not a one-off move, but continuous moves. What is the best way?
scp, ftp, rsync ... or something else?
Of course, if it's a "common" method (like ftp) I would restrict it to work only between the servers' IPs.
I also need a SECURE way to transfer files ... I mean, I want to be totally sure that the files have been moved successfully.
Does rsync have a way to know that the files were moved successfully? Maybe an option to check size or checksum or whatever.
Note: The servers are in different locations.

Try rsync, a utility that keeps copies of a file in sync between two computer systems.
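As a hedged sketch (the host and paths below are placeholders), a periodic rsync over SSH covers both the encryption and the verification concerns:
rsync -az --checksum -e ssh /data/outgoing/ backup@server2.example.com:/data/incoming/
Here -e ssh encrypts the transfer, -z compresses, and --checksum makes rsync compare full checksums instead of just size and modification time when deciding what to resend; rsync additionally verifies each transferred file against a checksum on arrival. Running this from cron covers the "continuous" part.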

Related

mutt + offlineimap with only a few folders offline

I have been trying to understand offlineimap configuration with mutt, but I probably do not fully understand it. In the end, I realised that what I need is to have only, e.g., Inbox and Sent offline. That configuration can be found on the internet, but I also need to be able to access the other folders in mutt without having to download them for offline use.
E.g. I want all mail in Inbox to be downloaded to the computer and mutt to access it from the local repository. But I also need to access the folder Inbox/SomeMore without having to reconfigure mutt and offlineimap and, most importantly, without downloading the whole content of that folder to the computer.
Is this doable? And exactly how?
offlineimap's job is to download mail and make it available in offline situations. There is no way to temporarily download the content of some mail folders. It might be possible to go for a hacky solution: use the folderfilter option to restrict which folders are synced, and additionally set up mutt to access the other IMAP folders directly.
You can specify a folderfilter that limits the sync to specific folders (everything not listed is excluded). Instead of adding just a subfolder's name to the list, it may even be possible to refer to it as INBOX/foo (in case there are multiple folders with the same name):
folderfilter = lambda folder: folder in ['INBOX', 'Sent', 'Drafts', 'Junk', 'foo', ...]
PS: If folderfilter is not specified at all, ALL remote folders will be synchronized.
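One possible way to put the pieces together, as a minimal sketch (the host, user, and folder names below are assumptions, not taken from the question):
[Repository Remote]
type = IMAP
remotehost = imap.example.com
remoteuser = user@example.com
# only these folders are synced into the local Maildir
folderfilter = lambda folder: folder in ['INBOX', 'Sent']
On the mutt side, keep folder pointed at the local Maildir for the synced folders, and (assuming mutt is built with IMAP support) add the rarely used folder as an additional online mailbox, for example:
mailboxes +INBOX +Sent imaps://user@imap.example.com/INBOX/SomeMore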

SFTP file uploading and downloading at the same time

A cronjob runs every 3 hours to download a file using SFTP. The scheduled program is written in Perl and the module used is Net::SFTP::Foreign.
Can Net::SFTP::Foreign download files that are only partially uploaded over SFTP?
If so, do we need to check the SFTP file's modified date to verify that the copy has completed?
Suppose someone is uploading a new file over SFTP and the upload/copy is still in progress. If a download is attempted at the same time, do I need to code for the possibility of fetching only part of the file?
It's not a question of the SFTP client you use; that's irrelevant. It's about how the SFTP server handles the situation.
Some SFTP servers may lock the file being uploaded, preventing you from accessing it, while it is still being uploaded. But most SFTP servers, particularly the common OpenSSH SFTP server, won't lock the file.
There's no generic solution to this problem. Checking for timestamp or size changes may work for you, but it's hardly reliable.
There are some common workarounds to the problem:
Have the uploader upload a "done" file once the upload finishes. Make your program wait for the "done" file to appear.
Have a dedicated "upload" folder and have the uploader (atomically) move the uploaded file to a "done" folder once the upload finishes. Make your program look in the "done" folder only.
Have a file naming convention for files being uploaded (".filepart") and have the uploader (atomically) rename the file to its final name after the upload. Make your program ignore the ".filepart" files (see the download-side sketch after this list).
See (my) article Locking files while uploading / Upload to temporary file name for an example of implementing this approach.
Also, some FTP servers have this functionality built in. For example, ProFTPD with its HiddenStores directive.
A gross hack is to periodically check the file attributes (size and time) and consider the upload finished if the attributes have not changed for some time interval.
You can also make use of the fact that some file formats have a clear end-of-file marker (like XML or ZIP), so you can tell when you have downloaded an incomplete file.
For details, see my answer to SFTP file lock mechanism.
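For the ".filepart" convention, the download side only needs to skip the in-progress names. A minimal sketch using Net::SFTP::Foreign (host, user, and paths are placeholders):
use strict;
use warnings;
use Fcntl ':mode';
use Net::SFTP::Foreign;

my $sftp = Net::SFTP::Foreign->new('sftp.example.com', user => 'downloader');
$sftp->die_on_error("Unable to connect to remote host");

# list the upload directory, skipping files that are still being uploaded
my $entries = $sftp->ls('/upload', no_wanted => qr/\.filepart$/);
for my $entry (@$entries) {
    next unless S_ISREG($entry->{a}->perm);   # skip directories and other non-files
    my $name = $entry->{filename};
    $sftp->get("/upload/$name", "incoming/$name")
        or warn "download of $name failed: " . $sftp->error;
}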
The easiest way to do that, when the upload process is also under your control, is to upload files using temporary names (for instance, foo-20170809.tgz.temp) and rename them once the upload finishes (the Net::SFTP::Foreign put method supports the atomic option, which does just that). Then, on the download side, filter out the files whose names correspond to temporary files.
Anyway, the Net::SFTP::Foreign get and rget methods can be instructed to resume a transfer by passing the option resume => 1.
Also, if you have full SSH access to the SFTP server, you could check whether some other process is still writing to the file to be downloaded using fuser or a similar tool (though note that even then, the file may be incomplete if, for instance, there is a network issue and the uploader needs to reconnect before resuming the transfer).
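If you do control the upload side, a hedged sketch of that atomic upload with Net::SFTP::Foreign (host, user, and file names are placeholders):
use strict;
use warnings;
use Net::SFTP::Foreign;

my $sftp = Net::SFTP::Foreign->new('sftp.example.com', user => 'uploader');
$sftp->die_on_error("Unable to connect to remote host");

# transfer to a temporary name, then rename to the final name only once complete
$sftp->put('foo-20170809.tgz', '/upload/foo-20170809.tgz', atomic => 1)
    or die "upload failed: " . $sftp->error;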
You can check the size of the file:
1. Connect to the SFTP server.
2. Check the file size.
3. Sleep for 5-10 seconds.
4. Check the file size again.
5. If the size did not change, download the file; if it changed, go back to step 3.
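A rough sketch of that polling loop with Net::SFTP::Foreign (the host, file path, and 10-second interval are assumptions):
use strict;
use warnings;
use Net::SFTP::Foreign;

my $sftp = Net::SFTP::Foreign->new('sftp.example.com', user => 'downloader');
$sftp->die_on_error("Unable to connect to remote host");

my $remote    = '/upload/data.csv';
my $last_size = -1;
while (1) {
    my $attrs = $sftp->stat($remote)
        or die "stat failed: " . $sftp->error;
    last if $attrs->size == $last_size;   # size stable between checks, assume the upload finished
    $last_size = $attrs->size;
    sleep 10;                             # wait before checking the size again
}
$sftp->get($remote, 'data.csv')
    or die "download failed: " . $sftp->error;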

FTP transfer server to server using SSH/command line

I have a bunch of vendors that make their FTP servers available so I can download images of their products. Some of them like to put the images into multiple subfolders, using the collection or style name and then the SKU. For example, they will make a folder structure like:
Main folder
---> Collection A
------> Sku A
----------> SKUApicture1.jpg, SKUApicture2.jpg
------> sku B
----------> SKUBpicture1.jpg, SKUBpicture2.jpg
---> Collection B
------> Sku C
----------> SKUCpicture1.jpg, SKUCpicture2.jpg
------> sku D
----------> SKUDpicture1.jpg, SKUDpicture2.jpg
Until now, I have found it easiest to log onto my server via SSH, navigate to the folder I want, and then log on to my vendor's FTP, at which point I put in the username and password, navigate to the folder I want, and take all the images using mget. If all (or most) of the files are in one folder, this is simple.
The problem is that mget won't take any folders or subfolders; it will only take files within the given folder. In the above example, my vendor has over 10 folders and each one has 100+ subfolders, so navigating to each one isn't an option.
Also, the industry I deal in isn't tech-savvy, so asking their "tech people" to enable/allow SCP, SFTP, rsync, etc., is likely not an option.
Downloading all the images locally and re-uploading them to my server also isn't practical, as this folder is over 10GB.
I'm looking for a command (mget or other) that will enable me to take ALL files and subfolders, as is, and copy straight to my server (via SSH).
Thanks
NOTE: For this particular server I tried rsync, but got an error telling me it wasn't compatible with that command. I doubt I have the command wrong, but if you want to post the proper way to rsync, I'll be more than happy to try it again and provide the exact error.
Have you tried something like
wget -r ftp://username:password@ftp.example.com/
It should recursively get all the files from the remote FTP server.
You can use lftp:
lftp -e 'mirror <remote download dir> <local download dir>' -u <username>,<pass> <host>
Taken from Copying Folder Contents with Subdirectories Over FTP.
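For example, with made-up credentials and paths, a filled-in invocation could look like this:
lftp -e 'mirror --verbose /images /home/me/vendor-images; quit' -u vendoruser,vendorpass ftp.vendor.example.com
The mirror command descends into subdirectories on its own, which covers the collection/SKU folder tree above.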
Have you considered using SFTP? You said that FTP works the way you want, and SFTP behaves much the same way: an FTP client with SFTP support works just as you are used to, but connects over SSH.

Copying updated Files between Networks

Is there a way that I can copy only updated files from one network to another? The networks aren't connected in any way, so the transfer will need to go via CD (or USB, etc.).
I've had a look at things like rsync, but that seems to require the two networks to be connected.
I am copying from a Windows machine, but it's going onto a network with both Windows and Linux machines (although Windows is preferable due to the way the network is set up).
You can rsync from the source to the USB drive, use the USB drive as a buffer, and then rsync from the USB drive to the target. To benefit from the rsync algorithm and reduce the amount of copied data, you need to keep the data on the USB drive between runs.
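As a hedged sketch, assuming the drive mounts at /mnt/usb and the source and target paths are placeholders, the two runs could look like:
# on the source network: refresh the buffer on the USB drive
rsync -av --delete /data/project/ /mnt/usb/project-buffer/
# move the drive to the other network, then:
rsync -av /mnt/usb/project-buffer/ /data/project/
Keeping the buffer directory on the drive between runs is what lets rsync skip the files that have not changed.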

Need an opinion on a method for pulling data from a file with Perl

I am having a conflict of ideas with a script I am working on. The conflict is that I have to read a bunch of lines from VMware files. As of now, I just use SSH to probe every file for each virtual machine while the file stays on the server. The reason I now think this is a problem is that I have 10 virtual machines and about 4 files each that I probe for file paths and such. This opens a new SSH channel every time I refer to the SSH object I have created using Net::OpenSSH. When all is said and done, I have probably opened about 16-20 SSH objects. Would it just be easier in a lot of ways if I SCP'd the files over to the machine that needs to process them and then did most of the work on the local side? The script I am making is a backup script for ESXi, and it will end up storing the files I need to read from anyway.
Any opinion would be most helpful.
If the VMs do the work locally, it's probably better in the long run.
In the short term, roughly the same amount of resources will be used, but if you were to migrate these instances to other hardware, then of course you'd see gains from distributing the processing.
Also, from a maintenance perspective, it's probably more convenient for each VM to host the local process, since I'd imagine that if you need to tweak it for a specific box, it would make more sense to keep it there.
Aside from the scalability benefits, there aren't really any other pros or cons.
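If you go the copy-first route, a minimal sketch (hostname and file paths are placeholders, not from the question) is to reuse a single Net::OpenSSH connection per host, pull the files down with scp_get, and parse the local copies:
use strict;
use warnings;
use Net::OpenSSH;

my $ssh = Net::OpenSSH->new('esxi01.example.com');   # one connection, reused for every copy
$ssh->error and die "SSH connection failed: " . $ssh->error;

# copy the files the backup will store anyway, then read them locally
my @remote_files = ('/vmfs/volumes/datastore1/vm1/vm1.vmx');
for my $file (@remote_files) {
    $ssh->scp_get($file, 'backup/')
        or die "scp of $file failed: " . $ssh->error;
}
# ...parse the local copies in backup/ here instead of probing them over SSH...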