Bash – Jan's Blog of Bioinformatics Bits

From time to time I find myself in a situation where a particular step in my analysis feels like it could run faster. If the software/command doesn’t support multithreading, you have several options for speeding it up. The easiest is to use GNU parallel or xargs. However, GNU parallel is Perl-based, which doesn’t always work well in virtual environments, and xargs is not very flexible.
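As a baseline, here is a minimal sketch of the xargs route, assuming an xargs that supports `-P` (the GNU and BSD versions do). The sample names and the `echo` command are made-up placeholders, not part of any real analysis.

```shell
# Run up to 4 jobs at a time; {} is replaced by one input line per job.
# "sample1".."sample3" are hypothetical inputs used only for illustration.
results=$(printf '%s\n' sample1 sample2 sample3 \
    | xargs -P 4 -I{} echo "processed {}" \
    | sort)   # parallel jobs finish in no fixed order, so sort the output
echo "$results"
```

This is fine for a single command template, but it gets awkward as soon as each job needs its own pipeline or redirection.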

After I lost my patience with making GNU parallel work in all my environments, I searched for an alternative. The best solution ended up being rush (Wei Shen; GitHub). It’s based on gargs (Brent Pedersen; GitHub) and written in Go, so you won’t have the compatibility problems you can get with GNU parallel.

The user interface is very similar to GNU parallel and so far it has been working very well. It also has some nice features such as resuming failed jobs, setting variables (similar to awk), removing suffixes/replacing strings, etc.

Personally, I still find it most convenient to echo the commands I want to run in parallel into a file (one command per line) and then run rush on that file:

threads=5

# Prepare multiplications of numbers 1-10 by a thousand, one command per line
# (the redirect after "done" overwrites any previous cmds.txt)
for i in {1..10}; do
    echo "echo \"$i*1000\" | bc"
done > cmds.txt

# Run commands in parallel with rush and save completed commands
cat cmds.txt | rush '{}' -j "$threads" --verbose -c -C cmds-done.txt
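On a machine where rush is not installed, the same one-command-per-line file can still be run in parallel with xargs. The following is only a rough fallback sketch, not a replacement for rush: it does not resume failed jobs or record completed commands. The scratch directory and the use of shell arithmetic instead of bc are my own choices for the illustration.

```shell
threads=4
workdir=$(mktemp -d)        # scratch directory so nothing gets overwritten

# One command per line, as above, but using shell arithmetic instead of bc
for i in 1 2 3 4 5; do
    echo "echo \$(( $i * 1000 ))"
done > "$workdir/cmds.txt"

# Hand each line to "sh -c" as a single argument, up to $threads at a time.
# NUL-separated input stops xargs from interpreting quotes in the commands.
tr '\n' '\0' < "$workdir/cmds.txt" \
    | xargs -0 -n 1 -P "$threads" sh -c \
    | sort -n > "$workdir/results.txt"

cat "$workdir/results.txt"
```

The results are sorted because jobs running in parallel can finish in any order.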

P.S. I just found out that the author of GNU parallel lists a lot of alternatives, with detailed descriptions of the differences, in the GNU parallel documentation.

Removing everything before or after the first or last occurrence of a character (here ".") in Bash:

$ var="hello.world.txt"
$ echo ${var%.*} # Remove all after last "."
hello.world
$ echo ${var%%.*} # Remove all after first "."
hello
$ echo ${var#*.} # Remove all before first "."
world.txt
$ echo ${var##*.} # Remove all before last "."
txt
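Where this becomes handy is in splitting file names inside a pipeline. A small sketch, assuming a made-up path (`data/reads/sample1.fastq.gz` is not a real file, and the variable names are only for illustration):

```shell
# Hypothetical input path, used only for illustration
path="data/reads/sample1.fastq.gz"

file="${path##*/}"      # Remove all up to the last "/"  -> sample1.fastq.gz
stem="${file%%.*}"      # Remove all after the first "." -> sample1
last_ext="${file##*.}"  # Remove all before the last "." -> gz
no_gz="${file%.*}"      # Remove all after the last "."  -> sample1.fastq

echo "$stem $last_ext $no_gz"
```

A single `%` or `#` removes the shortest match and a doubled `%%` or `##` removes the longest, which is the whole difference between "first" and "last" here.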

The same in R:

> vari <- "hello.world.txt"
> sub("\\.[^.]+$", "", vari) # Remove all after last "."
[1] "hello.world"
> gsub("\\..*", "", vari) # Remove all after first "."
[1] "hello"
> sub(".*?\\.", "", vari) # Remove all before first "."
[1] "world.txt"
> gsub(".*\\.", "", vari) # Remove all before last "."
[1] "txt"
