Parallel file transfer on Linux

Hi all,

I had a request to copy a ton of files from one file system to another,  I know that there are tools that can help with that like rsync but due to some requirements and me wanted to do some scripting I put something together  to help with this request. This is not the 1st time I do something like this but it is the 1st time I share 🙂

What I’m sharing is now what I did for the request I mentioned but you will get an idea

The script will copy a list of files from one server to another. This list I usually create by using find like this

find /Directory_which_I_want_to_copy -type f > file_list.txt

The script will receive some parameters as listed below

parallel_xfer.ksh    

Also a requirement for this to work is that you can ssh to the target server without a password.

It will keep X parallel sessions running at all times until there are new files to start copying it, After all copies are started, it will monitor them until completion. Also the script assumes that the source and target directory destination is the same but this is easily changed if needed.

The logging needs to be improved but it will show the file it started as well their processes count

Hope it helps

Elisson

#!/bin/ksh
DBLIST=${1}
DEST_DIR=${2}
SERVER=${3}
NUM_SESS=${4}
STARTED_COUNT=0
RUN_COUNT=0

trim() {
    local var=$@
    var="${var#"${var%%[![:space:]]*}"}"   # remove leading whitespace characters
    var="${var%"${var##*[![:space:]]}"}"   # remove trailing whitespace characters
    echo "$var"
}


FILE_COUNT="$(cat ${DEST_DIR}/$DBLIST | wc -l)"
cd ${DEST_DIR}
for FILE in $(cat $DBLIST)
do
 STARTED_COUNT=$((${STARTED_COUNT}+1))
 if [ ${RUN_COUNT} -le ${NUM_SESS} ]
 then
   sftp -Cq USER@${SERVER}:${FILE} ${DEST_DIR}/. >/dev/null 2>/dev/null &
   echo "`date` - Transferring file ${FILE} to ${DEST_DIR} - ${STARTED_COUNT}/$(trim ${FILE_COUNT})"
   sleep 5
 fi
 echo "\n"

 RUN_COUNT=$(ps -ef | grep " ${$} " | grep sftp | grep -v grep | wc -l)

 while [ ${RUN_COUNT} -ge ${NUM_SESS} ]
 do
  RUN_COUNT=$(ps -ef | grep " ${$} " | grep sftp | grep -v grep | wc -l)
  echo "`date` - $(trim ${RUN_COUNT}) transfer processes running"
  echo "`date` - Amount of GB transferred `du -sg ${DEST_DIR}`\n"
  sleep 60
 done
done

while [ $(ps -ef | grep " ${$} " | grep sftp | grep -v grep | wc -l) -gt 0  ]
do
 RUN_COUNT=$(ps -ef | grep " ${$} " | grep sftp | grep -v grep | wc -l)
 echo "`date` - $(trim ${RUN_COUNT}) transfer processes running"
 echo "`date` - Amount of GB transferred - `du -sg ${DEST_DIR}`\n"
 sleep 60
done
echo "`date` - Transfered completed"

2 Comments

Leave a Comment

This site uses Akismet to reduce spam. Learn how your comment data is processed.