The manuscript “Optimally Adaptive Test for High Dimensional Hypotheses via Minimax Deficiency” by Jingkun Qiu, Professor Song Xi Chen, and Associate Professor Yumou Qiu was recently accepted for publication in the Journal of the American Statistical Association. The paper introduces a new statistical testing procedure that reliably detects hidden signals in large-scale datasets without needing to know the signal strength or density in advance.
From genomics to climate science to artificial intelligence, modern datasets often track thousands or millions of features at once, so the number of dimensions far exceeds the sample size and noise far outweighs signal. A basic but crucial question in such settings is whether there is any genuine signal hidden among the noise, and the answer depends on something researchers usually cannot know in advance: how many signals there are and how strong they are.
Over the past two decades, statisticians have developed three main families of tests for this type of high dimensional or even ultra-high dimensional problem, each tuned to a different kind of signal. The sum-of-squares (L2) test excels when many small signals are spread across the data; the maximum-type (L∞) test shines when only a few strong effects are present; and the higher criticism (HC) test is built for sparse and faint signals. The literature has noted that each test is powerful in one scenario but can be weak in others, yet in real applications researchers rarely know beforehand which scenario they are in.
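To make the contrast between the three families concrete, the following is a minimal sketch in Python, not the authors' implementation. It assumes independent z-scores with unit variance, uses a textbook form of each statistic, and picks hypothetical function names (`l2_statistic`, `max_statistic`, `hc_statistic`) and simulation settings purely for illustration. The HC statistic is taken in the commonly used form that maximizes over the smallest half of the sorted p-values, which is also an assumption here.

```python
import math
import numpy as np


def l2_statistic(z):
    """Standardized sum of squares; roughly N(0, 1) under the null when p is large."""
    p = z.size
    return float((np.sum(z**2) - p) / math.sqrt(2.0 * p))


def max_statistic(z):
    """Maximum-type statistic: the largest squared z-score."""
    return float(np.max(z**2))


def hc_statistic(z, alpha0=0.5):
    """Higher criticism over the smallest alpha0 fraction of sorted p-values."""
    p = z.size
    # Two-sided p-values: P(|N(0, 1)| > |z|) = erfc(|z| / sqrt(2)).
    pvals = np.sort([math.erfc(abs(v) / math.sqrt(2.0)) for v in z])
    # Keep p-values strictly inside (0, 1) so the denominator stays positive.
    pvals = np.clip(pvals, 1e-300, 1.0 - 1e-16)
    k = np.arange(1, p + 1)
    hc = math.sqrt(p) * (k / p - pvals) / np.sqrt(pvals * (1.0 - pvals))
    m = max(1, int(alpha0 * p))
    return float(np.max(hc[:m]))


# Illustrative simulation; every setting below is an assumption, not from the paper.
rng = np.random.default_rng(0)
p = 10_000
noise = rng.standard_normal(p)

dense = noise.copy()
dense[: p // 2] += 0.5  # many small signals spread across the data

sparse = noise.copy()
sparse[:5] += 8.0  # only a few strong signals

for name, z in [("null", noise), ("dense", dense), ("sparse", sparse)]:
    print(
        f"{name:>6}: L2={l2_statistic(z):10.2f}  "
        f"max={max_statistic(z):10.2f}  HC={hc_statistic(z):14.2f}"
    )
```

In a simulation like this, the L2 statistic typically responds most to the dense setting and the maximum-type statistic to the sparse one, which is the scenario dependence described above.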
The classical tool for examining the power of such tests has long been the “detection boundary”, which characterizes the phase transition between signals a test can and cannot detect, as a function of signal strength and density. The authors show that this criterion cannot distinguish the powers of two competing tests that share the same detection boundary. To address this, they introduce two sharper measures, the minimax relative deficiency (MRD) and the minimax absolute deficiency (MAD), which capture the higher order power information not shown in the detection boundary.
With the two newly developed measures, the authors introduce an optimally adaptive test that combines all three types of tests through the power enhancement approach. The proposed test is robust to the unknown signal strength and density with sharp optimal relative deficiency and nearly optimal absolute deficiency over the whole signal density regime. The method is also extended to non-Gaussian data with correlated variates, broadening its practical reach.
The approach applies wherever high dimensional signal detection matters, including identifying differentially expressed genes and spotting trends in spatial and temporal records. For instance, the authors conduct a climate change study to detect the impact of human activities (anthropogenic forcings) on two geophysical variables, the sea surface temperature and the precipitation flux, in the North Pacific Ocean. The framework is also closely related to a fast-emerging challenge, namely detecting watermarks that distinguish text generated by large language models from human writing.

Figure: Empirical results of detecting the impact of greenhouse gases (GHG) on precipitation flux (PREC_F) in the North Pacific Ocean. (f) is the proposed 3-in-1 test and (a)-(e) are existing tests in the literature. The tests are performed for each grid, with blue indicating less significant results and red indicating more significant results.
Jingkun Qiu is the first author of this paper. He is a Ph.D. candidate at Guanghua School of Management, Peking University. The co-authors are Professor Song Xi Chen, who serves as Jingkun Qiu's Ph.D. advisor, and Associate Professor Yumou Qiu. This work was partially supported by the Fundamental and Interdisciplinary Disciplines Breakthrough Plan of the Ministry of Education of China and the National Natural Science Foundation of China.
For the paper, refer to: