Network file transfer with on-the-fly compression
We often transfer a large number of files, or very large files, over the network from one computer to another. FTP is the default choice for transferring a few files, and SCP is the typical choice for transferring a large number of files.
If you transfer files from one computer to another over a slow network (such as copying files from a home computer to the office or vice versa), the following tip might be helpful. This technique works as follows:
1) Perform on-the-fly compression of the files at the source computer.
2) Transfer the compressed data over the network.
3) Perform on-the-fly decompression of the files at the target computer.
This technique uses just the ssh and tar commands, without creating any temporary files.
Let us call the source computer HostA and the target computer HostB. We need to transfer a directory (/data/files/) containing a large number of files from HostA to HostB.
1) Command without on-the-fly compression
Run this command on HostB
# scp -r HostA:/data/files /tmp/
This command recursively copies the /data/files directory from HostA to /tmp/ on HostB.
2) Command with on-the-fly compression
Run this command on HostB
# ssh HostA "cd /data/; tar zcf - files" | tar zxf -
This command recursively copies /data/files from HostA to HostB, and does so a lot faster on a slow network.
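Note that the one-liner unpacks into whatever directory it is run from, whereas the scp example wrote to /tmp/. The following is a minimal sketch of a wrapper that takes an explicit destination. The function name pull_dir and its argument order are hypothetical, and it assumes the paths contain no single quotes. It also uses && instead of ; so that tar does not run if the cd on the remote side fails.

```shell
#!/bin/sh
# pull_dir HOST PARENT NAME DEST   (hypothetical helper, not part of ssh or tar)
# Copies PARENT/NAME from HOST into DEST on the local machine,
# compressing on the remote side and decompressing locally.
pull_dir() {
    host=$1; parent=$2; name=$3; dest=$4
    mkdir -p "$dest" || return 1
    # Remote side: archive NAME with gzip compression and write it to STDOUT.
    # Local side: read the stream from STDIN and unpack it under DEST.
    ssh "$host" "cd '$parent' && tar zcf - '$name'" | tar -xzf - -C "$dest"
}

# Example call, equivalent to the scp command shown earlier:
#   pull_dir HostA /data files /tmp
```

The function returns the exit status of the local tar, so a failed or truncated stream shows up as a non-zero status.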
Let us take a look at this command in detail:
1) ssh HostA "..." : From HostB, connect to HostA via SSH and run the quoted commands there.
2) cd /data/ : On HostA, switch to the /data/ directory.
3) tar zcf - files : Tar the 'files' directory with gzip compression (z) and send the archive to STDOUT (the "-" after f).
4) | : Pipe STDOUT of the remote command, carried back over the SSH connection, to STDIN of tar on HostB.
5) tar zxf - : On HostB, decompress and untar the data coming in through STDIN.
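The tar-to-tar pipe can be tried on a single machine before SSH is involved. This is only a sketch: the two temporary directories stand in for HostA and HostB, and the sample file is made up for illustration.

```shell
#!/bin/sh
# Local dry run of the tar pipe: both ends run on this machine, no SSH.
src=$(mktemp -d)   # stands in for /data/ on HostA
dst=$(mktemp -d)   # stands in for the working directory on HostB
mkdir -p "$src/files/sub"
echo "sample" > "$src/files/sub/a.txt"

# Left side: archive 'files' with gzip compression and write it to STDOUT.
# Right side: read the stream from STDIN, decompress it and unpack it in $dst.
(cd "$src" && tar zcf - files) | (cd "$dst" && tar zxf -)

# diff -r exits non-zero if the two directory trees differ.
diff -r "$src/files" "$dst/files" && echo "trees match"
```

Replacing the left-hand subshell with the ssh invocation gives back the networked version.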
To show how useful this technique is, we transferred 45M worth of files from HostA to HostB over a DSL connection. Here are the results:
1) No compression method: 12 min 59 sec
2) On-the-fly compression method: 2 min 33 sec
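To put the two timings side by side, the speedup can be worked out with a few lines of shell arithmetic. The variable names are illustrative; the numbers are the ones reported above.

```shell
#!/bin/sh
# Convert both timings to seconds.
plain=$((12 * 60 + 59))        # 779 seconds without compression
compressed=$((2 * 60 + 33))    # 153 seconds with on-the-fly compression

# Shell arithmetic is integer-only, so use awk for the ratio.
speedup=$(awk -v a="$plain" -v b="$compressed" 'BEGIN { printf "%.1f", a / b }')
echo "speedup: ${speedup}x, saved: $((plain - compressed)) s"
# prints: speedup: 5.1x, saved: 626 s
```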
This method is most effective with large uncompressed files or directories containing a mix of different files. If the files being transferred are already compressed (for example, archives or JPEG images), this method won’t be effective, because compressing them again saves little.
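One way to judge in advance whether a directory is worth compressing is to compare the size of the plain tar stream with the gzip-compressed one. The sketch below is meant to run on the source computer. The function name compress_ratio is hypothetical, and it reads the directory twice, so it is best tried on a representative subset.

```shell
#!/bin/sh
# compress_ratio DIR   (hypothetical helper)
# Prints the compressed stream size as a percentage of the raw tar stream size.
# A value close to 100 suggests on-the-fly compression will gain little.
compress_ratio() {
    parent=$(dirname "$1")
    name=$(basename "$1")
    raw=$(tar cf - -C "$parent" "$name" | wc -c)      # bytes without compression
    packed=$(tar zcf - -C "$parent" "$name" | wc -c)  # bytes with gzip compression
    echo $((packed * 100 / raw))
}

# Example call:
#   compress_ratio /data/files
```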