Differential expression testing of genes with variable expression and/or large test groups

In standard differential gene expression testing, the null hypothesis is usually that there is no difference between the tested groups (log2 fold-change equals zero). Of course, this is almost never exactly true, but we treat very small differences as if they were zero and say the groups are approximately the same.

However, there are many situations where this doesn’t hold. The simplest one: the sets are so different that most genes have a logFC different from 0. This can happen when the test groups are very large or highly variable, or when the applied treatment is very harsh. Then the question becomes: how large does a difference have to be to be biologically significant? In other words, how different do the two sets (or genes, for example) have to be to count as really different?
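To make the "very large groups" case concrete, here is a minimal Python sketch of how a classic test against logFC = 0 behaves as the group size grows. The numbers are my own illustrative assumptions (a tiny fixed logFC of 0.05, a per-sample standard deviation of 0.5, a plain normal approximation), and the function names are hypothetical; this is not what DESeq2 or edgeR compute internally.

```python
import math

def two_sided_p(z):
    """Two-sided p-value for a standard normal statistic z."""
    return math.erfc(abs(z) / math.sqrt(2))

def point_null_p(lfc, sd, n_per_group):
    """Wald-type p-value for H0: logFC = 0 with two equally sized groups.

    Assumes the observed logFC equals `lfc` exactly (no sampling noise),
    so only the effect of the group size is visible.
    """
    se = sd * math.sqrt(2.0 / n_per_group)  # standard error of the difference
    return two_sided_p(lfc / se)

TRUE_LFC = 0.05  # assumed tiny, biologically uninteresting difference
SD = 0.5         # assumed per-sample standard deviation on the log2 scale

p_values = {n: point_null_p(TRUE_LFC, SD, n) for n in (10, 100, 1000, 10000)}
for n, p in p_values.items():
    print(f"n per group = {n:>5}: p = {p:.3g}")
```

With only a handful of samples the tiny difference is nowhere near significant, but with thousands of samples per group the same difference gets an extremely small p-value, even though nothing about the biology changed.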

One possible solution is the so-called banded hypothesis test, where the null hypothesis is not exact equality but that the difference is below a threshold. Wolfgang Huber has a nice Twitter thread about it, which cites the DESeq2 publication Moderated estimation of fold change and dispersion for RNA-seq data with DESeq2 (section Hypothesis tests with thresholds on effect size). edgeR has its own implementation of a very similar concept, as highlighted by Davis McCarthy, who cites the edgeR publication Testing significance relative to a fold-change threshold is a TREAT.

The method, if I dare to oversimplify it, tests whether the differential expression in an RNA-Seq/microarray experiment is greater than a given (biologically meaningful) threshold. A change has to be of sufficient magnitude to be considered biologically significant, which helps filter out small but consistent changes that become statistically significant in large datasets.
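As a rough illustration of the idea, here is a minimal Python sketch of a threshold test on a single gene: the null hypothesis is |logFC| <= theta and the alternative is |logFC| > theta. It assumes you already have a logFC estimate and its standard error and uses a plain normal approximation. The function names and example numbers are my own, and the published DESeq2 and TREAT methods differ in the details, so take it as a sketch of the concept rather than a reimplementation.

```python
import math

def normal_sf(z):
    """Upper-tail probability P(Z > z) for a standard normal Z."""
    return 0.5 * math.erfc(z / math.sqrt(2))

def threshold_p(lfc, se, theta=0.0):
    """Approximate p-value for H0: |logFC| <= theta vs HA: |logFC| > theta.

    With theta = 0 this reduces to the usual two-sided Wald test.
    """
    z = (abs(lfc) - theta) / se  # how far the estimate lies beyond the band
    return min(1.0, 2.0 * normal_sf(z))

# Hypothetical genes: (name, estimated logFC, standard error)
genes = [
    ("small_but_precise", 0.3, 0.05),  # tiny change, measured very precisely
    ("large_change", 1.5, 0.2),        # change well beyond a twofold threshold
]

for name, lfc, se in genes:
    p_classic = threshold_p(lfc, se, theta=0.0)
    p_banded = threshold_p(lfc, se, theta=1.0)  # require more than a twofold change
    print(f"{name}: classic p = {p_classic:.3g}, threshold p = {p_banded:.3g}")
```

The small but precisely measured gene is highly significant against logFC = 0 but not at all against the threshold of 1, while the large change stays significant under both hypotheses.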

An incorrect null hypothesis might be one of the issues in the publication Exaggerated false positives by popular differential expression methods when analyzing human population samples, where the authors evaluated the suitability of edgeR and DESeq2 for testing differential expression in population-sized groups. Of course, in this case the classic null hypothesis doesn’t make sense and is broken from the very beginning: most of the genes will not have logFC = 0. If you scroll to the methods and check the code, you can see they used the standard edgeR/DESeq2 commands, which are not designed for such comparisons. I believe that if they had used the threshold tests above, they would have come to completely different conclusions.

The problems with this paper are also summarized in Michael Love’s Tweet, which highlights issues with the experimental design and the execution itself.
