Copying Files with Fast Multithreading Using parsyncfp

Introduction

In today's world of computing, multitasking has become an essential requirement. To use the available resources efficiently and improve the performance of our applications, we often rely on multithreading. This step-by-step guide walks you through copying files quickly and in parallel with the parsyncfp tool.

What is Multithreading?

In programming, multithreading allows you to do multiple things simultaneously. You can make your applications faster and more responsive this way.

Let's define multithreading and look at how it works.

Multithreading is when multiple threads run simultaneously within one program. The threads run independently, and each has its own execution context. Multithreading improves the performance and responsiveness of your apps by running numerous tasks in parallel.
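
To make this concrete, here is a minimal shell sketch of running several tasks at once. In the shell, the units of work are background processes rather than threads, which is also how parsyncfp gets its parallelism (several rsync processes). The task function and its one-second sleep are placeholders invented for illustration.

```shell
# Run several independent tasks at once and wait for all of them.
# Each task is a background *process*, the shell's stand-in for a thread.

# Hypothetical task: pretend to work for a moment, then report.
task() {
    sleep 1
    echo "task $1 done"
}

run_parallel() {
    for n in "$@"; do
        task "$n" &     # '&' starts the task without waiting for it
    done
    wait                # block until every background task has finished
}

run_parallel 1 2 3
```

The three tasks finish in roughly the time of one, and the order of the output lines can differ from run to run because the tasks are independent.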

What is parsyncfp?

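Parsyncfp is a wrapper script around rsync. It collects files, by size or by number, into chunk files that are then fed to rsync chunk by chunk. Strictly speaking, the parallelism comes from several rsync processes running at once rather than from threads inside one program.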
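
The idea of size-based chunking can be sketched in a few lines of shell. This is only an illustration of the principle, not fpart's actual algorithm: the chunk_by_size function, the file names, and the sizes are all made up for the example.

```shell
# Greedy size-based chunking: reads "SIZE PATH" lines on stdin and prints
# "CHUNK PATH", starting a new chunk when the current one would exceed
# the limit (in the same unit as SIZE). Not fpart's actual algorithm.
chunk_by_size() {
    limit=$1
    chunk=0
    used=0
    while read -r size path; do
        # Open a new chunk if this file does not fit in a non-empty one.
        if [ "$used" -gt 0 ] && [ $((used + size)) -gt "$limit" ]; then
            chunk=$((chunk + 1))
            used=0
        fi
        used=$((used + size))
        echo "$chunk $path"
    done
}

# Made-up file list: sizes in MB, chunk limit 100 MB.
printf '%s\n' '60 www/a.iso' '50 www/b.tar' '30 www/c.log' '200 www/d.img' '10 www/e.txt' |
    chunk_by_size 100
```

A file larger than the limit simply gets a chunk of its own. A list of paths like the one produced for each chunk is the kind of input rsync accepts through its --files-from option.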

Parsyncfp (pfp for short) is the successor to parsync. It hands the chunking over to fpart, which starts producing chunk files while it is still walking the source tree. In this way, pfp can begin transferring files before the recursive descent of the source directory has completed. When dealing with very large directory trees, this feature can be very useful.

Additionally, pfp can resume a transfer from where it left off, which is especially useful after a system crash. A transfer can also be paused and resumed, so you can fit it around other work on the machine.

The ~/.parsyncfp files

By default, the cache directory is ~/.parsyncfp. It contains the fpcache directory, which holds the fpart log, all the PID files, and the chunk files. Because fpart chunking is so fast, parsyncfp no longer provides cache reuse. Log files are date-stamped and are not overwritten. You can specify an alternative location for the cache with the altcache option. Giving each instance its own cache location also lets multiple parsyncfps run simultaneously; however, they will detect each other's fparts running at startup and will ask you about it. In the multi-host version, you will also be warned about rsyncs already running on the SEND hosts.

Copying Files with parsyncfp: Step-by-Step Guide

Step 1. Install pfp on the file source. It needs to be installed only on the SOURCE machine.

Step 2. Copy your SSH key to the remote machine (you may need to generate a key pair with ssh-keygen first)

ssh-keygen
ssh-copy-id 1.1.1.1
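
If you are not sure whether a key pair already exists, a small check like the following can help. It is a sketch under assumptions: the key path (an ed25519 key in ~/.ssh) and the needs_keygen helper are illustrative choices, and the script only prints the command to run rather than running it.

```shell
# Create a key pair only if none exists, then install it on the destination.
# KEY and DEST are illustrative; adjust them to your setup.
KEY="$HOME/.ssh/id_ed25519"
DEST="1.1.1.1"

needs_keygen() {
    # Succeeds when the public key file is missing.
    [ ! -f "$1.pub" ]
}

if needs_keygen "$KEY"; then
    echo "no key at $KEY: run ssh-keygen -t ed25519 -f $KEY"
else
    echo "key found: run ssh-copy-id -i $KEY.pub $DEST"
fi
# Afterwards, this should succeed without asking for a password:
#   ssh -o BatchMode=yes "$DEST" true
```

Password-free login matters here because the parallel rsync processes cannot each stop to prompt for a password.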

Step 3. Install fpart, which parsyncfp uses to build the chunk files

apt-get install fpart

Step 4. If fpart is not included in your distribution's packages, you can find it at https://github.com/martymac/fp... Once it is installed, run parsyncfp:

./parsyncfp --NP=10 --altcache=/dir/local/tmp --startdir='/dir/local/' www 192.168.67.1:/var/dir/

Some Key Terms

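  • altcache - the directory where intermediate files (the cache) will be stored. This option is not required.
  • NP - specifies the number of parallel rsync processes to start.
  • startdir - the directory that contains the folders to be copied (here /dir/local/).
  • www - this is the folder that will be copied from /dir/local/ to /var/dir/ - you can specify several folders separated by a space.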
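
To see how these pieces fit together, the command can be assembled from named settings and printed for review before you run it. The build_pfp_cmd helper below is hypothetical and only echoes the command line; the values mirror the example above.

```shell
# Assemble the parsyncfp command from named settings and print it for review.
# Values mirror the example above; build_pfp_cmd is a hypothetical helper.
build_pfp_cmd() {
    np=$1 cache=$2 start=$3 dest=$4
    shift 4
    # Remaining arguments are the folders (relative to startdir) to copy.
    echo "./parsyncfp --NP=$np --altcache=$cache --startdir=$start $* $dest"
}

build_pfp_cmd 10 /dir/local/tmp /dir/local/ 192.168.67.1:/var/dir/ www
```

Passing more folder names after the destination argument of the helper, for example www and logs, places them side by side in front of the destination, which is the space-separated form described in the list.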

Source: https://github.com/hjmangalam/parsyncfp