Recently, I have fallen in love with samtools depth. It’s fast, easy to use and does exactly what I want it to do – get coverage of individual bases in the BAM file.
After some experiment and figuring out why the heck I am missing some covered regions I have found out that default behavior of is NOT to count any of the following: secondary mapping, qc-fail and read duplicates.
Luckily, in the new version of samtools (>=1.10) you can allow counting of the mappings/reads by setting -g and then either list them separated by comma or using hex or decimal numbers.
For example to get all the regions and count secondary, qc-fail and duplicated mappings/read:
samtools-1.10 depth -a -d 0 -g SECONDARY,QCFAIL,DUP in.bam
Note: Brent Pedersen wrote mosdepth (publication here) which should be faster than samtools depth especially for large mappings (such as genome). From the GitHub: “per-base depth about 2x as fast samtools depth–about 25 minutes of CPU time for a 30X genome.“.
Edit: I have tested very briefly tested mosdepth and it doesn’t seem to be that much faster. I would even say samtools depth is faster. The testing set was transcriptome BAM file.