IGV batch snapshot from command line

Right now I am processing a lot of public sequencing datasets from SRA. As always in this situation, proper QC is necessary to uncover all the adaptor sequences, UMIs, and so on, especially for datasets where we care about precise 5′/3′ ends. In my case, it’s Ribo-Seq and we want to look at the 5′ ends in great detail.

For a long time, my QC pipeline was missing a detailed alignment check. I mean really seeing the alignment itself at the base level, not just some summary statistics. This helps me spot extensive and repetitive soft-clipping, mismatches, etc., which can point to a preprocessing error. Since I didn’t want to open each BAM in IGV by hand, I needed something more automated. I knew IGV has a batch mode but I had never looked into it in detail.

To take IGV snapshots automatically, without opening the IGV GUI for every BAM, you need to get xvfb-run first. xvfb-run creates a fake (virtual) X11 display, which lets you launch IGV from the command line without opening its GUI (inspiration from here, which links to the IGV Snapshot Automator tool). Once you have this you can start the fun. It’s possible to do it without xvfb-run, but then the script opens the IGV GUI every time you make a snapshot, which is not always possible (for example, some ssh connections don’t allow X11 forwarding).
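Before going further it is worth checking that the tools are actually on your PATH. The snippet below is a minimal sketch; `require_cmd` is a hypothetical helper name, not part of IGV or xvfb. On Debian/Ubuntu, xvfb-run ships in the `xvfb` package.

```shell
# Hypothetical helper: succeed if the named command is on PATH,
# otherwise print a message to stderr and return a non-zero status.
require_cmd() {
    if command -v "$1" >/dev/null 2>&1; then
        return 0
    fi
    echo "missing command: $1" >&2
    return 1
}

# Usage (commented out so the snippet has no side effects):
# require_cmd xvfb-run || echo "on Debian/Ubuntu: sudo apt-get install xvfb"
# require_cmd java
```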

Then, you have to make your batch (~config) file, which instructs IGV what to do. A simple example is available on the IGV website. I would recommend two snapshots targeted at a region of a housekeeping gene: one which captures an exon-exon boundary to see a splicing event, and one more zoomed-in to see the alignment at single-base resolution. For human with the Ensembl genome I am using this one:

genome /home/jan/data/hg38/genome/genome.fa
load /home/jan/projects/test/samples/test/test.bam
snapshotDirectory /home/jan/projects/test/samples/test/Screenshots/
preference SAM.SHOW_SOFT_CLIPPED true
goto 12:112022058-112022719
sort position
snapshot splice.png
goto 12:112022150-112022250
sort position
snapshot zoom.png
exit

It instructs IGV to load a user-supplied reference genome (can be a FASTA), load my BAM, set the snapshot directory, set a temporary preference, then go to a specific position, sort the reads, take a snapshot and save it under a specific name, go to another location and do the same, and finally exit IGV.
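With many samples, writing this file by hand gets tedious. The following is a minimal sketch of generating it, using only the batch commands from the example above. The function name `write_igv_batch` and the `region_N.png` snapshot names are assumptions made for illustration.

```shell
# Hypothetical helper: write_igv_batch GENOME BAM OUTDIR REGION...
# Prints an IGV batch script to stdout, with one snapshot per region.
write_igv_batch() {
    genome=$1
    bam=$2
    outdir=$3
    shift 3
    printf 'genome %s\n' "$genome"
    printf 'load %s\n' "$bam"
    printf 'snapshotDirectory %s\n' "$outdir"
    printf 'preference SAM.SHOW_SOFT_CLIPPED true\n'
    i=0
    for region in "$@"; do
        i=$((i + 1))
        printf 'goto %s\n' "$region"
        printf 'sort position\n'
        # Snapshot names are numbered in the order the regions were given.
        printf 'snapshot region_%s.png\n' "$i"
    done
    printf 'exit\n'
}

# Usage (paths are placeholders):
# write_igv_batch genome.fa sample.bam Screenshots/ \
#     12:112022058-112022719 12:112022150-112022250 > igv-batch.config
```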

The IGV manual doesn’t say much about the preference command, nor does it list the available options (or I haven’t found them). Preferences let you adjust the behaviour of IGV in the same way as the Preferences tab in the IGV GUI. The easiest way to see the possible settings is to read $HOME/igv/prefs.properties. This lists all the preferences you have changed manually in the IGV GUI. Then you just have to guess what is what. Some options are listed on the IGV GitHub, but the easiest is to copy your favourite preferences and just use those.
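Copying preferences over can be semi-automated. This sketch assumes the file consists of simple `KEY=VALUE` lines plus `#` comments, and turns each entry into a `preference KEY VALUE` batch line. `prefs_to_batch` is a hypothetical helper name. Review the output before pasting it into a batch file, since not every stored setting is necessarily meaningful in batch mode.

```shell
# Hypothetical helper: prefs_to_batch FILE
# Converts KEY=VALUE lines into "preference KEY VALUE" batch lines.
prefs_to_batch() {
    # Drop comment lines, then split each remaining line on the first '=' only.
    # Blank lines and lines without '=' are silently skipped.
    sed -n \
        -e '/^[[:space:]]*#/d' \
        -e 's/^\([^=]*\)=\(.*\)$/preference \1 \2/p' \
        "$1"
}

# Usage:
# prefs_to_batch "$HOME/igv/prefs.properties"
```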

The next trick is to wrap the regular java execution in xvfb-run. For example, you can copy the regular IGV launch script at IGV/igv.sh and replace the line

exec java -showversion --module-path="${prefix}/lib" -Xmx16g \

with the line

xvfb-run --auto-servernum --server-num=1 java -showversion --module-path="${prefix}/lib" -Xmx16g \
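The same replacement can be scripted with sed instead of edited by hand. This is a sketch that assumes the launch line starts with `exec java ` as shown above; the exact line may differ between IGV versions, so check the result. `make_batch_launcher` is a hypothetical helper name.

```shell
# Hypothetical helper: make_batch_launcher IN OUT
# Copies the launch script IN to OUT, swapping "exec java" for the xvfb-run call.
make_batch_launcher() {
    # Leading indentation (if any) is preserved via the captured group.
    sed 's/^\([[:space:]]*\)exec java /\1xvfb-run --auto-servernum --server-num=1 java /' \
        "$1" > "$2"
}

# Usage:
# make_batch_launcher IGV/igv.sh igv-batch.sh
```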

And the last step is to launch IGV itself. Let’s say you have saved the batch/config file as igv-batch.config and the modified launch script as igv-batch.sh; then you can just execute:

sh igv-batch.sh -b igv-batch.config
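For more than one sample, the same command can be wrapped in a loop over batch files. The sketch below assumes one batch file per BAM. `run_igv_batches` and the `IGV_LAUNCHER` variable are hypothetical names; the variable only exists so the loop can be dry-run with `echo` instead of starting IGV.

```shell
# Hypothetical helper: run_igv_batches BATCH_FILE...
# Runs the launcher once per batch file and keeps going if one sample fails.
run_igv_batches() {
    failed=0
    for batch in "$@"; do
        # IGV_LAUNCHER defaults to the command used above; left unquoted on
        # purpose so "sh igv-batch.sh" splits into a command and its argument.
        if ! ${IGV_LAUNCHER:-sh igv-batch.sh} -b "$batch"; then
            echo "IGV failed for $batch" >&2
            failed=$((failed + 1))
        fi
    done
    # Succeed only if every batch file ran cleanly.
    [ "$failed" -eq 0 ]
}

# Usage (directory layout is a placeholder):
# run_igv_batches samples/*/igv-batch.config
```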

Now you can enjoy your beautiful IGV snapshots.

Part of a splice junction with a general overview. You can see there is some soft-clipping at the beginning and the end of the reads.
After zooming in to single-base resolution we can see there are 4 additional nucleotides at both ends which need to be trimmed (or clipped) away.