| Title: | Methods Based on the e-Closure Principle |
|---|---|
| Description: | Implements several methods for False Discovery Rate control based on the e-Closure Principle, in particular the Closed e-Benjamini-Hochberg and Closed Benjamini-Yekutieli procedures. |
| Authors: | Jelle Goeman [aut, cre] |
| Maintainer: | Jelle Goeman <[email protected]> |
| License: | GPL-3 |
| Version: | 0.9.5 |
| Built: | 2026-06-01 17:09:17 UTC |
| Source: | https://github.com/cran/eClosure |
Applies the e-Closure version of the Benjamini-Hochberg (BH) procedure.
closedBH() returns the number of rejections at a given level, while
closedBH.adjust() returns adjusted p-values, one per hypothesis.
    closedBH(p, alpha = 0.05)
    closedBH.adjust(p, cap = TRUE)
p |
Numeric vector of p-values, one per hypothesis. Values must lie
in |
alpha |
Numeric scalar in |
cap |
Logical. If |
closedBH() returns the size $r$ of the largest closedBH-significant set. The
$r$ hypotheses with the smallest p-values form one such set. This gives
the maximum number of hypotheses that can be reported while maintaining FDR
control.
closedBH.adjust() returns the adjusted p-value of each hypothesis: the
smallest level $\alpha$ at which that hypothesis is among the rejections
of closedBH(). Because the rejection set always consists of the hypotheses
with the smallest p-values, a hypothesis of sorted rank $i$ is rejected at
level $\alpha$ exactly when closedBH() returns at least $i$; the adjusted
p-value is the smallest such $\alpha$. Consequently,
sum(closedBH.adjust(p, cap = FALSE) <= alpha) equals
closedBH(p, alpha = alpha).
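The same link between adjusted p-values and rejection counts holds for the standard (non-closed) BH procedure, which can be checked in base R without the package. The sketch below uses p.adjust() rather than closedBH.adjust(), so it only illustrates the principle; bh_count is a hypothetical helper name introduced here.

```r
# Standard BH step-up count: largest k with p_(k) <= k * alpha / m (0 if none).
# This is the ordinary BH procedure, not the closed version from this package.
bh_count <- function(p, alpha) {
  m <- length(p)
  ok <- which(sort(p) <= seq_len(m) * alpha / m)
  if (length(ok)) max(ok) else 0L
}

p <- c(0.0001, 0.0004, 0.0019, 0.0095, 0.0201, 0.0278, 0.0298, 0.0344,
       0.0459, 0.3240, 0.4262, 0.5719, 0.6528, 0.7590, 1.0000)
alpha <- 0.05

# Counting BH-adjusted p-values at or below alpha gives the same number
n_adjusted <- sum(p.adjust(p, method = "BH") <= alpha)
n_stepup <- bh_count(p, alpha)
n_adjusted == n_stepup
```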
All of the heavy lifting (the breadth-first traversal over set sizes and the
built-in next.r recursion) is performed in C++; these wrappers only sort
out the user-facing arguments and forward them. closedBH() runs in
time; closedBH.adjust() runs in time and
memory.
closedBH() returns a single non-negative integer $r$: the $r$
hypotheses with the smallest p-values form a valid rejection set, and a
value of 0 means no non-empty set can be rejected.
closedBH.adjust() returns a numeric vector of adjusted p-values, in the
same order as the input p, each lying in .
Benjamini, Y., & Hochberg, Y. (1995). Controlling the false discovery rate: a practical and powerful approach to multiple testing. Journal of the Royal Statistical Society: Series B, 57(1), 289–300.
Xu, Z., Solari, A., Fischer, L., de Heide, R., Ramdas, A., & Goeman, J. (2025). Bringing closure to false discovery rate control: A general principle for multiple testing. arXiv preprint arXiv:2509.02517.
closedBY() for the analogous Benjamini-Yekutieli procedure.
closedSu() for the analogous Su procedure.
closedeBH() for the analogous procedure based on e-values.
p.adjust() for standard non-simultaneous multiple testing corrections.
    p <- c(
      0.0001, 0.0004, 0.0019, 0.0095, 0.0201,
      0.0278, 0.0298, 0.0344, 0.0459, 0.3240,
      0.4262, 0.5719, 0.6528, 0.7590, 1.0000
    )

    closedBH(p, alpha = 0.05)
    closedBH(p, alpha = 0.10)

    padj <- closedBH.adjust(p)
    sum(padj <= 0.05)  # matches closedBH(p, alpha = 0.05)
Applies the closed testing version of the Benjamini-Yekutieli (BY)
procedure. The standard BY procedure controls the false discovery rate (FDR)
at level $\alpha$ under arbitrary dependence but only provides a single
set of rejections. The closed BY procedure provides simultaneous FDR
control: for every set of hypotheses, it determines whether that set can
be reported as discoveries while maintaining FDR control at level
$\alpha$, regardless of which other sets are inspected.
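For comparison, the standard (non-closed) BY procedure is available in base R through p.adjust(). The sketch below shows that BY-adjusted p-values are the BH-adjusted p-values multiplied by the harmonic number $1 + 1/2 + \dots + 1/m$ and capped at 1, which is the price BY pays for validity under arbitrary dependence. The variable names are illustrative only, and nothing here calls closedBY().

```r
p <- c(0.0001, 0.0004, 0.0019, 0.0095, 0.0201, 0.0278, 0.0298, 0.0344,
       0.0459, 0.3240, 0.4262, 0.5719, 0.6528, 0.7590, 1.0000)
m <- length(p)

# Harmonic number h_m = 1 + 1/2 + ... + 1/m
h_m <- sum(1 / seq_len(m))

# BY adjustment reconstructed from the BH adjustment
by_manual <- pmin(1, h_m * p.adjust(p, method = "BH"))
by_builtin <- p.adjust(p, method = "BY")
isTRUE(all.equal(by_manual, by_builtin))

# Size of the single rejection set of standard BY at level 0.05
n_by <- sum(by_builtin <= 0.05)
n_by
```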
    closedBY(p, set = NULL, alpha = 0.05, approximate = FALSE)
p |
Numeric vector of p-values, one per hypothesis. Values must lie
in |
set |
Optional subsetting vector for |
alpha |
Numeric scalar in |
approximate |
Logical. If |
The closed BY procedure is based on a local e-value for every
intersection hypothesis. A set of hypotheses is closedBY-significant —
and therefore a valid simultaneous rejection — if and only if, for every
subset , the local e-value exceeds
.
This guarantees post-hoc FDR control: you may report any closedBY-significant
set as your discovery set without inflating the FDR above $\alpha$,
even if the choice of set was data-driven.
The function has two modes:
Set-checking mode (when set is supplied): Returns TRUE if the
specified set is closedBY-significant (i.e., can be reported as a valid
simultaneous rejection at level $\alpha$), and FALSE otherwise.
Discovery mode (when set = NULL): Returns the size $r$ of the
largest closedBY-significant set. The $r$ hypotheses with the smallest
p-values always form one such set, so $r$ is the maximum number of
hypotheses that can be reported while maintaining simultaneous FDR control.
Note that closedBY significance is not a monotone property: a set of size
$r$ being closedBY-significant does not imply that all smaller sets are as well.
The exact algorithm therefore checks all set sizes, while the approximate
algorithm (approximate = TRUE) uses a faster bisection strategy that may
occasionally underestimate the largest significant set.
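Why bisection can underestimate is easiest to see on a toy case. The sketch below is a generic illustration, not the package's C++ implementation: sig is a made-up, non-monotone pattern of which set sizes are significant, and is_sig, largest_exact and largest_bisect are hypothetical names introduced here.

```r
# Hypothetical pattern over set sizes 1..8 (not derived from real data):
# sizes 1, 2, 5 and 6 are significant, the others are not.
sig <- c(TRUE, TRUE, FALSE, FALSE, TRUE, TRUE, FALSE, FALSE)
is_sig <- function(r) sig[r]

# Exhaustive scan: check every size and keep the largest significant one
largest_exact <- function(m, is_sig) {
  hits <- which(vapply(seq_len(m), is_sig, logical(1)))
  if (length(hits)) max(hits) else 0L
}

# Bisection: correct only if significance is monotone in the set size
largest_bisect <- function(m, is_sig) {
  lo <- 0L       # largest size known to be acceptable (0 = empty set)
  hi <- m + 1L   # smallest size treated as not acceptable
  while (hi - lo > 1L) {
    mid <- (lo + hi) %/% 2L
    if (is_sig(mid)) lo <- mid else hi <- mid
  }
  lo
}

r_exact <- largest_exact(8L, is_sig)
r_bisect <- largest_bisect(8L, is_sig)
c(exact = r_exact, bisect = r_bisect)
```

In this toy pattern the bisection discards everything above size 4 after a single failed check, so it never sees that sizes 5 and 6 are significant. Its answer is still a significant size, only a smaller one.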
If set is supplied: a single logical value. TRUE indicates that the
specified set is closedBY-significant and can be reported as a simultaneous
rejection at FDR level $\alpha$. FALSE indicates it cannot.
If set = NULL: a single non-negative integer $r$. The $r$
hypotheses with the smallest p-values form a valid simultaneous rejection
set. A return value of 0 means no non-empty set can be rejected.
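Turning the returned count into an actual discovery set takes a little care when the count is 0 or when p-values are tied. The sketch below uses a hypothetical helper, top_r_set, that marks exactly r hypotheses by rank; how ties should be broken is an assumption made here (order() takes the earlier index first) and is not something the package documentation specifies.

```r
# Mark exactly the r hypotheses with the smallest p-values.
# Works for r = 0 (nothing selected) and never selects more than r under ties.
top_r_set <- function(p, r) {
  selected <- logical(length(p))
  selected[order(p)[seq_len(r)]] <- TRUE
  selected
}

p <- c(0.30, 0.001, 0.02, 0.02, 0.75)

top_r_set(p, 0)       # all FALSE
top_r_set(p, 2)       # selects the 2nd and 3rd hypothesis

# A threshold comparison would pick up both tied p-values instead
sum(p <= sort(p)[2])
```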
Benjamini, Y., & Yekutieli, D. (2001). The control of the false discovery rate in multiple testing under dependency. The Annals of Statistics, 29(4), 1165–1188.
Goeman, J. J., & Solari, A. (2011). Multiple testing for exploratory research. Statistical Science, 26(4), 584–597.
Xu, Z., Solari, A., Fischer, L., de Heide, R., Ramdas, A., & Goeman, J. (2025). Bringing closure to false discovery rate control: A general principle for multiple testing. arXiv preprint arXiv:2509.02517.
p.adjust() for standard p-value-based non-simultaneous multiple testing corrections,
including the BY procedure (method = "BY").
closedeBH() for the analogous procedure based on e-values.
    set.seed(42)

    # 20 null hypotheses (p ~ Uniform(0,1)) and 10 non-nulls
    # (p ~ Beta(0.1, 1), smaller on average)
    p <- c(runif(20), rbeta(10, 0.1, 1))

    # --- Discovery mode ---
    # Find the maximum number of simultaneous rejections at FDR level 5%
    r <- closedBY(p, alpha = 0.05)
    cat("Largest simultaneous rejection set:", r, "\n")

    # The r hypotheses with the smallest p-values form a valid discovery set
    discovery_set <- p <= sort(p)[r]
    cat("P-values in discovery set:", round(sort(p[discovery_set]), 4), "\n")

    # --- Set-checking mode ---
    # Check whether a researcher-defined set is a valid simultaneous rejection
    candidate_set <- p < 0.01
    closedBY(p, set = candidate_set, alpha = 0.05)

    # --- Exact vs. approximate ---
    r_exact <- closedBY(p, alpha = 0.05, approximate = FALSE)
    r_approx <- closedBY(p, alpha = 0.05, approximate = TRUE)
    cat("Exact:", r_exact, " Approximate:", r_approx, "\n")
Applies the closed testing version of the e-BH (e-values Benjamini-Hochberg)
procedure. The standard eBH procedure controls the false discovery rate (FDR)
at level $\alpha$ but only provides a single set of rejections. The closed
eBH procedure provides simultaneous FDR control: for every set of
hypotheses, it determines whether that set can be reported as discoveries
while maintaining FDR control at level $\alpha$, regardless of which
other sets are inspected.
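As a point of reference, the standard (non-closed) eBH procedure of Wang and Ramdas (2022) rejects the $k^*$ hypotheses with the largest e-values, where $k^*$ is the largest $k$ such that the $k$-th largest e-value is at least $m / (k\alpha)$. The base R sketch below implements only that standard rule; ebh_count is a hypothetical helper name and the e-values are made up for illustration.

```r
# Standard e-BH count: largest k with e_(k) >= m / (k * alpha), where e_(k)
# is the k-th largest e-value (0 if no k qualifies).
# This is the ordinary eBH procedure, not the closed version from this package.
ebh_count <- function(e, alpha) {
  m <- length(e)
  e_sorted <- sort(e, decreasing = TRUE)
  ok <- which(e_sorted >= m / (seq_len(m) * alpha))
  if (length(ok)) max(ok) else 0L
}

# Illustrative e-values (invented for this example)
e <- c(250, 120, 45, 30, 8, 2.5, 1.1, 0.9, 0.4, 0.1)

k <- ebh_count(e, alpha = 0.05)
rejected <- order(e, decreasing = TRUE)[seq_len(k)]
k
rejected
```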
    closedeBH(e, set = NULL, alpha = 0.05, approximate = FALSE)
e |
Numeric vector of e-values, one per hypothesis. E-values must be non-negative; each is interpreted as the evidence against its null hypothesis. The e-values should have expectation at most 1 under the null hypothesis. |
set |
Optional subsetting vector for |
alpha |
Numeric scalar in |
approximate |
Logical. If |
The closed eBH procedure is based on the concept of mean consistency. A
set of hypotheses is mean consistent — and therefore a valid
simultaneous rejection — if and only if:
is satisfied jointly for all , when hypotheses are tested.
In practice, this condition guarantees
that the set is a valid closed-testing rejection, providing post-hoc FDR
control: you may report any mean-consistent set as your discovery set without
inflating the FDR above $\alpha$, even if the choice of set was
data-driven.
The function has two modes:
Set-checking mode (when set is supplied): Returns TRUE if the
specified set is mean consistent (i.e., can be reported as a valid
simultaneous rejection at level $\alpha$), and FALSE otherwise.
Discovery mode (when set = NULL): Returns the size $r$ of the
largest mean-consistent set. The $r$ hypotheses with the largest
e-values always form one such set. This gives the maximum number of
hypotheses that can be reported while maintaining simultaneous FDR control.
Note that mean consistency is not a monotone property: a set of size
$r$ being mean consistent does not imply that all smaller sets are as well.
The exact algorithm therefore checks all set sizes, while the approximate
algorithm (approximate = TRUE) uses a faster bisection strategy that may
occasionally underestimate the largest consistent set.
If set is supplied: a single logical value. TRUE indicates that the
specified set is mean consistent and can be reported as a simultaneous
rejection at FDR level $\alpha$. FALSE indicates it cannot.
If set = NULL: a single non-negative integer $r$. The $r$
hypotheses with the largest e-values form a valid simultaneous rejection
set. A return value of 0 means no non-empty set can be rejected.
Benjamini, Y., & Hochberg, Y. (1995). Controlling the false discovery rate: a practical and powerful approach to multiple testing. Journal of the Royal Statistical Society: Series B, 57(1), 289–300.
Wang, R., & Ramdas, A. (2022). False discovery rate control with e-values. Journal of the Royal Statistical Society: Series B, 84(3), 822–852.
Xu, Z., Solari, A., Fischer, L., de Heide, R., Ramdas, A., & Goeman, J. (2025). Bringing closure to false discovery rate control: A general principle for multiple testing. arXiv preprint arXiv:2509.02517.
p.adjust() for non-simultaneous multiple testing corrections.
    set.seed(42)

    # 20 null hypotheses (e ~ Exp(1)) and 10 non-nulls
    # (e ~ Exp(0.1), larger on average)
    e <- c(rexp(20, rate = 1), rexp(10, rate = 0.1))

    # --- Discovery mode ---
    # Find the maximum number of simultaneous rejections at FDR level 5%
    r <- closedeBH(e, alpha = 0.05)
    cat("Largest simultaneous rejection set:", r, "\n")

    # The r hypotheses with the largest e-values form a valid discovery set
    discovery_set <- e >= sort(e, decreasing = TRUE)[r]
    cat("E-values in discovery set:",
        round(sort(e[discovery_set], decreasing = TRUE), 2), "\n")

    # --- Set-checking mode ---
    # Check whether a researcher-defined set is a valid simultaneous rejection
    candidate_set <- e > 3
    closedeBH(e, set = candidate_set, alpha = 0.05)

    # --- Exact vs. approximate ---
    r_exact <- closedeBH(e, alpha = 0.05, approximate = FALSE)
    r_approx <- closedeBH(e, alpha = 0.05, approximate = TRUE)
    cat("Exact:", r_exact, " Approximate:", r_approx, "\n")
Applies the closed testing improvement of the Su (2018) procedure. The
standard Su procedure controls the false discovery rate (FDR) at level
$\alpha$ under the PRDN assumption but only provides a single set of
rejections. The closed Su procedure provides simultaneous FDR control:
for every set of hypotheses, it determines whether that set can be reported
as discoveries while maintaining FDR control at level $\alpha$,
regardless of which other sets are inspected.
    closedSu(p, set = NULL, alpha = 0.05, approximate = FALSE)
p |
Numeric vector of p-values, one per hypothesis. |
set |
Optional subsetting vector for |
alpha |
Numeric scalar in |
approximate |
Logical. If |
A set of hypotheses is closed-Su-significant — and therefore a
valid simultaneous rejection — if and only if, for every
and , the combined sorted
vector of the largest p-values
in and the largest p-values outside satisfies
where is the Lambert W correction
factor (Xu et al., 2025, Section 6.2).
The function has two modes:
Set-checking mode (when set is supplied): Returns TRUE if the
specified set is closed-Su-significant, and FALSE otherwise.
Discovery mode (when set = NULL): Returns the size $r$ of the
largest closed-Su-significant set consisting of the $r$ smallest
p-values.
If set is supplied: a single logical (TRUE/FALSE).
If set = NULL: a single non-negative integer (0 = no rejection).
Su, W. J. (2018). The FDR-Linking Theorem. arXiv preprint arXiv:1812.08965.
Xu, Z., Solari, A., Fischer, L., de Heide, R., Ramdas, A., & Goeman, J. (2025). Bringing closure to false discovery rate control: A general principle for multiple testing. arXiv preprint arXiv:2509.02517.
    set.seed(42)
    p <- c(runif(20), rbeta(10, 0.1, 1))

    # Discovery mode
    r <- closedSu(p, alpha = 0.05)

    # Set-checking mode
    closedSu(p, set = p < 0.01, alpha = 0.05)
Provides functions for applying the closed Benjamini-Hochberg (BH), closed e-Benjamini-Hochberg (eBH), closed Benjamini-Yekutieli (BY), and closed Su procedures within the e-Closure framework for simultaneous FDR control in multiple hypothesis testing.
The core functions are:
closedBH: applies the closed BH procedure
closedeBH: applies the closed eBH procedure
closedBY: applies the closed BY procedure
closedSu: applies the closed Su procedure
Jelle Goeman
Xu, Z., Solari, A., Fischer, L., de Heide, R., Ramdas, A., & Goeman, J. (2025). Bringing closure to false discovery rate control: A general principle for multiple testing. arXiv preprint arXiv:2509.02517.