Fast parallel file copying with parsyncfp

Introduction

parsyncfp is a parallel wrapper around rsync that speeds up the transfer of large file collections by running several rsync processes at once. Each process synchronizes its own chunk of files, so available bandwidth and CPU are used more efficiently than with a single rsync stream.

This guide covers when parsyncfp helps, how to install it, and how to run a basic transfer between two servers.

Warning

Parallel copying speeds up transfers only on fast disks and a stable network. On slow disks or network-mounted filesystems (NFS, SMB, FUSE), increasing the number of parallel processes often gives no speed gain and may instead push extra load onto the source, the target, or the link between them. Start with a small number of processes, measure the result, and raise the count only if the disks and the network can keep up.

How parsyncfp works

parsyncfp uses the fpart utility to split the source file tree into chunks of roughly equal size, then launches several rsync processes in parallel, one per chunk. Because fpart splits the tree incrementally, parsyncfp can begin transferring files before the full source tree scan completes, which matters for very large directory trees.
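The chunk-then-parallelize idea can be illustrated with standard tools. The sketch below is not how parsyncfp is implemented: it deals file names out round-robin instead of balancing chunks by size as fpart does, and the function name make_chunks, the chunk.N file naming, and the /tmp/chunks path are invented for the illustration.

```shell
#!/bin/sh
# Illustration only: split the files under a source directory into N list files.
# fpart balances chunks by size; this sketch simply deals names out round-robin.
make_chunks() (
    src_dir=$1      # directory to scan
    nchunks=$2      # number of chunk files to create
    out_dir=$3      # where the chunk.N list files are written
    mkdir -p "$out_dir" || exit 1
    out_abs=$(cd "$out_dir" && pwd)
    cd "$src_dir" || exit 1
    # Paths are relative to src_dir, which is what rsync --files-from expects.
    find . -type f | LC_ALL=C sort |
        awk -v n="$nchunks" -v out="$out_abs" '{
            f = out "/chunk." ((NR - 1) % n)
            print > f
        }'
)

# Each list could then feed one rsync process, for example:
#   make_chunks /dir/local/www 4 /tmp/chunks
#   for f in /tmp/chunks/chunk.*; do
#       rsync -a --files-from="$f" /dir/local/www/ user@target_host:/var/dir/ &
#   done
#   wait
```

The sketch has to finish scanning before any rsync starts and breaks on file names that contain newlines. parsyncfp avoids the first problem by starting transfers while fpart is still scanning.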

The tool needs to be installed on the source server only. rsync handles the receiving side as usual.

The same author maintains a successor called parsyncfp2 (pfp2), which includes bug fixes and improvements over parsyncfp and additionally supports sending from several source hosts at once. For new deployments, parsyncfp2 is the recommended choice. This guide covers classic parsyncfp; the pfp2 command syntax is largely compatible.

When parsyncfp is useful

parsyncfp gives the biggest gain when:

the source contains millions of small files, where rsync spends most of its time on metadata operations rather than on actual data transfer;
both the source and the target are backed by fast storage (NVMe, SSD, or a well-tuned RAID);
the network between source and target is fast and stable (1 Gbps or more, low packet loss);
a single rsync process is bottlenecked by CPU or by per-stream throughput, not by the disk or the link.

For small directories, slow spinning disks, or a saturated network, plain rsync is usually enough.
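A rough way to judge whether a source tree is dominated by small files is to compare the file count with the total size. The helper below is a sketch; the name tree_stats is made up here, and what counts as a "small" average is left to the reader.

```shell
#!/bin/sh
# Print file count, total size in KB, and average file size in KB for a tree.
tree_stats() (
    dir=$1
    files=$(find "$dir" -type f | wc -l | tr -d ' ')
    total_kb=$(du -sk "$dir" | awk '{ print $1 }')
    if [ "$files" -gt 0 ]; then
        avg_kb=$((total_kb / files))
    else
        avg_kb=0
    fi
    echo "files=$files total_kb=$total_kb avg_kb=$avg_kb"
)

# Example: tree_stats /dir/local/www
```

A very high file count combined with a low average size suggests that metadata operations will dominate, which is the case where parallel rsync processes tend to help most.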

The ~/.parsyncfp directory

parsyncfp creates a cache directory called ~/.parsyncfp on the source server. Inside it you will find the fpcache subdirectory, which holds the fpart log, the PID files of the running rsync processes, and the chunk files that list which paths each rsync should copy. Log files are date-stamped and are not overwritten between runs, so previous runs can be reviewed later.

If you want to run several parsyncfp instances at the same time, point each one to a separate cache location with the --altcache option. parsyncfp detects other running instances at startup and warns about them, so accidental overlapping runs become visible early.

Installing dependencies

parsyncfp itself relies on rsync (the transport) and fpart (to split the source tree into chunks). Install both on the source server.

On Debian and Ubuntu:

apt-get install rsync fpart

On RHEL, CentOS Stream, AlmaLinux, and Rocky Linux, fpart is not in the base repositories. Enable the EPEL repository first, then install both packages:

dnf install epel-release
dnf install rsync fpart

Some EPEL packages depend on packages from the PowerTools or CRB repository, which is disabled by default. Enable the right one for your version:

# RHEL/AlmaLinux/Rocky Linux 9 and newer
dnf config-manager --set-enabled crb
# RHEL/AlmaLinux/Rocky Linux 8
dnf config-manager --set-enabled powertools

Info

If your distribution does not ship fpart at all (for example, RHEL 10 or its rebuilds at the time of writing), or you prefer not to use EPEL, build fpart from source. The official repository is the fpart GitHub repository; the README lists a small set of build dependencies (a C compiler, make, and the standard C library headers).

Installing parsyncfp

parsyncfp is a single Perl script hosted on GitHub. Download it, make it executable, and move it to a directory on your PATH so you can call it without typing a full path:

wget https://raw.githubusercontent.com/hjmangalam/parsyncfp/master/parsyncfp
chmod +x parsyncfp
sudo mv parsyncfp /usr/local/bin/

Verify that the script runs:

parsyncfp --help

If the help screen appears, parsyncfp is ready to use.
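If the help screen does not appear, a missing dependency is a likely cause. The following sketch lists which of the required tools are not on the PATH; the function name missing_deps is made up for this example.

```shell
#!/bin/sh
# Print the name of every listed command that is not found on PATH.
missing_deps() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
}

# parsyncfp is a Perl script and needs rsync and fpart on the source server:
#   missing_deps perl rsync fpart parsyncfp
```

Empty output means that all four commands were found.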

Setting up SSH access to the target

parsyncfp runs rsync over SSH, so the source server must be able to log in to the target without a password. Generate a key pair on the source (if one is not already present) and copy the public key to the target:

ssh-keygen
ssh-copy-id user@target_host

Replace user@target_host with the user name and the IP or hostname of your target server. After this step, the command below should connect without prompting for a password:

ssh user@target_host
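An interactive login can succeed through an agent or a cached password, so a non-interactive test is a stricter check. The sketch below uses the OpenSSH BatchMode option, which makes ssh fail instead of prompting. The function name check_ssh and the SSH_BIN override are assumptions of this example; the override exists only so the check can be exercised without a real server.

```shell
#!/bin/sh
# Return 0 if a key-based login works without any prompt, non-zero otherwise.
# SSH_BIN is a hypothetical override used for testing; it defaults to ssh.
check_ssh() {
    "${SSH_BIN:-ssh}" -o BatchMode=yes -o ConnectTimeout=5 "$1" true
}

# Example:
#   check_ssh user@target_host && echo "key login OK" || echo "key login failed"
```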

Running a transfer

A typical parsyncfp command looks like this:

parsyncfp --NP=8 --altcache=/dir/local/tmp --startdir=/dir/local/ www/ user@192.168.67.1:/var/dir/

This command starts 8 rsync processes, uses /dir/local/tmp as the cache directory, and copies the contents of /dir/local/www to /var/dir/ on the target host 192.168.67.1.

Info

Pay attention to the trailing slash on the source argument: www/ copies the contents of the www directory into the target (as in the example above), while www (without a slash) copies the www directory itself, creating /var/dir/www/ on the target. This is the same behavior as plain rsync.

Key command parameters

--NP sets the number of parallel rsync processes. A reasonable starting point is 4 to 8; increase the count gradually, and only if the disks and the network are not yet saturated. Setting --NP equal to the number of CPU cores is rarely the right choice on its own.
--altcache specifies an alternative cache directory. It is useful when you run several parsyncfp instances at once, or when the default location is on slow storage. Avoid pointing it at a tmpfs-backed location (such as /tmp on many distributions), since the chunk lists for very large source trees can grow large and you do not want them eating into RAM.
--startdir sets the working directory used as the base for the source paths that follow.

The arguments after --startdir are the source directories to copy (one or several, space-separated), followed by the rsync-style target in the form user@host:/path/.
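To make the argument order concrete, the sketch below prints (but does not run) a command with two source directories. The helper name show_pfp_cmd and the second directory logs are hypothetical; only the options already described above are used.

```shell
#!/bin/sh
# Print (not run) a parsyncfp command for several source directories.
# Arguments: process count, start directory, target, then one or more sources.
show_pfp_cmd() (
    np=$1; startdir=$2; target=$3
    shift 3
    echo "parsyncfp --NP=$np --startdir=$startdir $* $target"
)

show_pfp_cmd 4 /dir/local user@target_host:/var/dir/ www logs
# prints: parsyncfp --NP=4 --startdir=/dir/local www logs user@target_host:/var/dir/
```

Printing the command before running it is a cheap way to confirm that the target comes last and that every source is relative to the start directory. The helper does not handle directory names that contain spaces.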

For the full list of options, run parsyncfp --help or see the upstream documentation.

Useful additional options

Once the basic transfer works, a few options are worth knowing for production runs:

--maxbw=500000 caps the total bandwidth used by all rsync processes, in KB/s. Useful when the link is shared with other services.
--maxload=12 pauses spawning new rsync processes once the system load average rises above the given value. Helpful on busy source servers, but see the warning below before using it.
--chunksize=5G controls how large each fpart chunk is. Larger chunks reduce per-chunk overhead on transfers with many small files; smaller chunks improve parallelism on transfers with a few very large files.
--rsyncopts="-a -s -x" lets you pass extra flags directly to the underlying rsync processes. Use it to enable compression (-z), set a custom block size, or apply other rsync tuning. Do not pass --delete through --rsyncopts; see the Limitations section below for why this is unsafe.
--verbose=2 increases logging detail, which makes troubleshooting easier on the first runs.
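Because --maxbw takes KB/s while link speeds are usually quoted in Mbit/s, a small conversion helps when choosing a cap. The sketch below assumes 1 Mbit = 1,000,000 bits and 1 KB = 1000 bytes; if the tool counts 1024-byte units, the result differs by about 2%. The function name maxbw_kbs is made up for the example.

```shell
#!/bin/sh
# Convert a share of a link speed (Mbit/s) into KB/s.
# Assumes 1 Mbit = 1,000,000 bits and 1 KB = 1000 bytes.
maxbw_kbs() (
    link_mbit=$1    # link speed in Mbit/s
    percent=$2      # share of the link to allow, 1-100
    echo $(( link_mbit * 1000 / 8 * percent / 100 ))
)

maxbw_kbs 10000 40   # 40% of a 10 Gbit/s link, prints 500000
```

Under these assumptions, the value 500000 from the --maxbw example corresponds to roughly 4 Gbit/s.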

Warning

The upstream author has documented a known issue with --maxload: when parsyncfp goes through repeated suspend and resume cycles, some source files may not be transferred to the target. If you use --maxload, set the threshold high enough that suspensions never trigger, or verify the result with a separate consistency check after the transfer finishes.
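One possible consistency check is to compare the list of files on both sides. The sketch below works on two local directories and compares names only, not contents; the function names list_files and missing_on_target are made up for this example. For a remote target, the second list would have to be produced on the target, for example over ssh.

```shell
#!/bin/sh
# Print the sorted relative paths of all regular files under a directory.
list_files() (
    cd "$1" && find . -type f | LC_ALL=C sort
)

# List files that exist under the source tree but not under the copy.
missing_on_target() (
    src_list=$(mktemp) && dst_list=$(mktemp) || exit 1
    list_files "$1" > "$src_list"
    list_files "$2" > "$dst_list"
    # comm -23 keeps lines that appear only in the first (source) list.
    LC_ALL=C comm -23 "$src_list" "$dst_list"
    rm -f "$src_list" "$dst_list"
)

# Example: missing_on_target /dir/local/www /mnt/copy_of_www
```

An alternative that also covers file contents is a plain rsync dry run with -n and -i (optionally with --checksum), which itemizes what a further rsync would still transfer.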

Limitations

Even on suitable hardware, parsyncfp has trade-offs that are worth knowing before you increase the process count:

parsyncfp does not safely support rsync's --delete option. Each parallel rsync process sees only its own chunk of the source, so if --delete is passed through, every process will try to delete files on the target that are not in its chunk, which means files that legitimately belong to other chunks. This can wipe out large portions of the target. Use plain rsync (or a separate cleanup pass) when you need --delete semantics.
parsyncfp does not preserve a global ordering of files across the parallel rsync processes, so the order in which files arrive on the target is not predictable. This matters only for workflows that read the target while the copy is still running and depend on arrival order or directory listing order.
On HDD arrays, many parallel rsync processes turn what would be a sequential read pattern into a near-random one, which can be slower than a single rsync. SSDs and NVMe drives are far less affected.
On filesystems with very large numbers of small files, the per-file metadata overhead can dominate the transfer time. Parallelism helps up to a point, then plateaus.
Each rsync process opens its own SSH connection to the target, so high --NP values multiply the connection setup overhead. Over WAN links with packet loss or high latency this can make congestion worse rather than better. In that case, fewer processes with a tuned rsync (compression, larger block size) often perform better than many parallel ones.
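For the --delete limitation, a separate cleanup pass with plain rsync could look like the sketch below. It relies on the behaviour described in the rsync man page: combining --existing with --ignore-existing stops rsync from updating any file, so only the deletion of extraneous files remains. The function name cleanup_pass, the apply keyword, and the RSYNC_BIN override are assumptions of this example, and the pass defaults to a dry run.

```shell
#!/bin/sh
# Deletion-only pass with plain rsync, run after the parallel copy has finished.
# --existing plus --ignore-existing stops rsync from transferring anything,
# so only --delete has an effect. Defaults to a dry run (-n).
# RSYNC_BIN is a hypothetical override used for testing; it defaults to rsync.
cleanup_pass() (
    src=$1; target=$2; mode=${3:-dry-run}
    dry_flag=-n
    [ "$mode" = "apply" ] && dry_flag=
    # dry_flag is left unquoted on purpose so an empty value adds no argument.
    "${RSYNC_BIN:-rsync}" -a -v $dry_flag --delete --existing --ignore-existing "$src" "$target"
)

# cleanup_pass /dir/local/www/ user@target_host:/var/dir/          # preview only
# cleanup_pass /dir/local/www/ user@target_host:/var/dir/ apply    # really delete
```

Review the dry-run output before applying. The source and target here must be the same pair used for the copy, including the trailing slash, or rsync will treat correctly copied files as extraneous.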
