Fit k-mer probe set models

Given a PBMExperiment of probe-level summaries returned by probeFit and a list of k-mers, this function applies probe set aggregation to obtain k-mer level estimates of affinity and variance on the scale of log2 signal intensities. Additionally, if contrasts = TRUE, effect size and variance estimates are also returned for differential k-mer affinities against a baseline condition specified with the baseline argument.

The output can be passed to kmerTestContrast, kmerTestAffinity, or kmerTestSpecificity to perform various statistical tests with the estimated k-mer statistics.

kmerFit(
  pe,
  kmers = uniqueKmers(8L),
  positionbias = TRUE,
  method = c("dl2", "dl"),
  contrasts = TRUE,
  baseline = NULL,
  outlier_cutoff = stats::qnorm(0.995),
  outlier_maxp = 0.2,
  verbose = FALSE
)

Arguments

pe	a PBMExperiment object containing probe-level summarized intensity data returned by `probeFit`.
kmers	a character vector of k-mers. (default = `uniqueKmers(8L)`)
positionbias	a logical value whether to correct for bias due to position of k-mer along probe sequence. (default = TRUE)
method	a character name specifying the method to use for estimating cross-probe variance in each k-mer probe set. Currently, the two-step DerSimonian-Kacker ("dl2") and non-iterative DerSimonian-Laird ("dl") methods are supported. (default = "dl2")
contrasts	a logical value whether to compute contrasts for all columns against a specified `baseline` column. (default = TRUE)
baseline	a character string specifying the baseline condition across `pe` columns to use when calculating contrasts. If not specified (NULL), the baseline value is guessed by looking for "ref" in the column names of `pe`. If a unique matching value is not found, an error is thrown. This parameter is ignored when `contrasts = FALSE`. (default = NULL)
outlier_cutoff	a numeric threshold used for filtering probes from k-mer probe sets before fitting each k-mer level model. The threshold is applied to the absolute value of an approximate robust studentized residual computed for each probe in each probe set and can be turned off by setting the value to NULL. The default is the 99.5th percentile of the standard normal distribution (approximately 2.58), which leaves roughly 0.5% in each tail. (default = `stats::qnorm(0.995)`)
outlier_maxp	a numeric threshold on the maximum proportion of probes to filter for each k-mer probe set according to `outlier_cutoff`. This should be set to a reasonably small value to avoid over-filtering based on the approximate residual threshold. (default = 0.2)
verbose	a logical value whether to print verbose output during analysis. (default = FALSE)
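
The baseline guessing rule described for `baseline` can be illustrated with a short base R sketch. The helper name `guess_baseline` and the case-insensitive match are assumptions made for illustration; they are not taken from the package source.

```r
# Hypothetical helper (not part of the package): guess the baseline column by
# looking for "ref" in the column names, and fail unless exactly one matches.
# Case-insensitive matching is an assumption of this sketch.
guess_baseline <- function(cols, pattern = "ref") {
  hits <- grep(pattern, cols, ignore.case = TRUE, value = TRUE)
  if (length(hits) != 1L) {
    stop("could not determine a unique baseline; specify `baseline` explicitly")
  }
  hits
}

# Made-up column names for illustration only
guess_baseline(c("HOXC9-REF", "HOXC9-R193K", "HOXC9-K195R"))
```

When no column or more than one column matches, the sketch stops with an error, mirroring the documented behavior of requiring a unique match.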

Value

SummarizedExperiment of estimated k-mer affinities and differences with some or all of the following assays:

"affinityEstimate": k-mer affinities.
"affinityVariance": k-mer affinity variances.
"contrastDifference": (optional) k-mer differential affinities with baseline condition.
"contrastAverage": (optional) k-mer average affinities with baseline condition.
"contrastVariance": (optional) k-mer differential affinity variances.

If computed, the values of the "contrast" assays will be NA for the specified baseline condition.

Details

By default, probe intensities are corrected within each k-mer probe set to account for biases introduced by where the k-mer lies along the probe sequence. Bias correction is performed such that the mean cross-probe intensity for each k-mer is (typically) unchanged. This bias correction step only serves to reduce the cross-probe variance and improve downstream inference for each k-mer.

For many low affinity k-mers, probe sets may include several probes with high intensity due to the k-mer sharing a probe with a separate high affinity k-mer. These probes do not provide meaningful affinity information for the lower affinity k-mer. To adjust for this possibility, outlier probes are filtered from each k-mer probe set after position bias correction, but before aggregation. Probes with large approximate studentized residuals are filtered from each probe set according to a user-specified threshold (outlier_cutoff). However, to prevent overfiltering, a maximum proportion of probes to filter from any probe set should also be specified (outlier_maxp).
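
The interaction between outlier_cutoff and outlier_maxp can be sketched in base R as follows. This is only an illustration: the helper `filter_probe_set` is hypothetical, and the residual used here (distance from the median scaled by the MAD) is an assumed stand-in for the approximate robust studentized residual, whose exact form is not given above.

```r
# Hypothetical helper: decide which probes to keep in one k-mer probe set.
# x: log2 intensities of the probes in the set (after position bias correction).
filter_probe_set <- function(x, cutoff = stats::qnorm(0.995), maxp = 0.2) {
  keep <- rep(TRUE, length(x))
  if (is.null(cutoff)) return(keep)          # filtering turned off

  s <- stats::mad(x)
  if (s == 0) return(keep)                   # no spread, nothing to flag

  # Assumed robust residual: center on the median, scale by the MAD
  r <- abs(x - stats::median(x)) / s
  flagged <- which(r > cutoff)

  # Never drop more than a proportion `maxp` of the probe set;
  # if too many probes are flagged, drop only the most extreme ones
  max_drop <- floor(maxp * length(x))
  if (length(flagged) > max_drop) {
    flagged <- flagged[order(r[flagged], decreasing = TRUE)][seq_len(max_drop)]
  }
  keep[flagged] <- FALSE
  keep
}

# One probe (the 7th) is far brighter than the rest of the set
x <- c(8.1, 8.0, 8.2, 7.9, 8.1, 8.0, 12.5, 8.2, 7.9, 8.0)
which(!filter_probe_set(x))
```

With three extreme probes out of ten and the default outlier_maxp of 0.2, the sketch would drop only the two most extreme, which is the over-filtering safeguard described above.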

After bias correction and probe filtering, a meta analysis model is fit to each probe set. Under this model, cross-probe variances are estimated using either the DerSimonian and Kacker (2007) or DerSimonian and Laird (1986) estimator. The estimated k-mer affinities and variances are included in the returned SummarizedExperiment as two assays, "affinityEstimate" and "affinityVariance".
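
As a rough illustration of the two estimators, the following base R sketch fits a random-effects model to a single probe set. The function names `mm_tau2` and `fit_probe_set` are hypothetical, and the code follows the general method-of-moments form from DerSimonian and Kacker (2007) rather than the package's internal implementation.

```r
# y: probe-level estimates for one k-mer probe set
# v: their (within-probe) variances
# a: weights used in the method-of-moments estimator
mm_tau2 <- function(y, v, a) {
  ybar <- sum(a * y) / sum(a)
  q <- sum(a * (y - ybar)^2)
  num <- q - (sum(a * v) - sum(a^2 * v) / sum(a))
  den <- sum(a) - sum(a^2) / sum(a)
  max(0, num / den)                          # truncate negative estimates at 0
}

fit_probe_set <- function(y, v, method = c("dl2", "dl")) {
  method <- match.arg(method)
  if (length(y) < 2L) stop("need at least two probes to estimate cross-probe variance")

  # Step 1: DerSimonian-Laird, i.e. inverse-variance weights
  tau2 <- mm_tau2(y, v, a = 1 / v)

  # Step 2 (two-step estimator only): re-estimate with random-effects weights
  if (method == "dl2") {
    tau2 <- mm_tau2(y, v, a = 1 / (v + tau2))
  }

  # Weighted mean and its variance under the random-effects model
  w <- 1 / (v + tau2)
  c(estimate = sum(w * y) / sum(w), variance = 1 / sum(w), tau2 = tau2)
}

# Made-up probe-level values for illustration
fit_probe_set(y = c(9.8, 10.4, 10.1, 9.5), v = c(0.04, 0.05, 0.03, 0.06))
```

In this sketch, `estimate` and `variance` play the role of one entry in the "affinityEstimate" and "affinityVariance" assays, respectively.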

If contrasts = TRUE, k-mer differential affinities, the corresponding variances, and average affinities are also returned as three assays, "contrastDifference", "contrastVariance", and "contrastAverage". Positive differential affinities indicate higher affinity relative to the baseline condition.
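
The relationship between the affinity and contrast assays can be sketched with plain matrices. The values and condition names below are made up, and two details are assumptions of this sketch rather than statements about the package: the average is taken as the simple mean of the condition and baseline estimates, and the contrast variance is the sum of the two variances (i.e. the estimates are treated as independent).

```r
# Hypothetical k-mer x condition matrices of affinity estimates and variances
est <- matrix(c(10.0, 9.0, 8.0,    # "ref" column
                10.5, 9.0, 7.0),   # "mut" column
              nrow = 3,
              dimnames = list(c("AAAAAAAA", "AAAAAAAC", "AAAAAAAG"),
                              c("ref", "mut")))
vars <- matrix(0.1, nrow = 3, ncol = 2, dimnames = dimnames(est))
baseline <- "ref"

# Each column is compared against the baseline column (recycled column-wise)
contrast_diff <- est - est[, baseline]
contrast_avg  <- (est + est[, baseline]) / 2
contrast_var  <- vars + vars[, baseline]     # assumes independent estimates

# Contrast values are undefined for the baseline condition itself
contrast_diff[, baseline] <- NA
contrast_avg[, baseline]  <- NA
contrast_var[, baseline]  <- NA

contrast_diff
```

A positive entry in `contrast_diff` corresponds to a k-mer with higher estimated affinity in that condition than in the baseline.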

References

If using method = "dl2" cross-probe variance estimator:

DerSimonian, R., & Kacker, R. (2007). Random-effects model for meta-analysis of clinical trials: an update. Contemporary Clinical Trials, 28(2), 105-114.

If using method = "dl" cross-probe variance estimator:

DerSimonian, R., & Laird, N. (1986). Meta-analysis in clinical trials. Controlled Clinical Trials, 7(3), 177-188.

Cross-probe variance estimation code adapted from:

Viechtbauer, W. (2010). Conducting meta-analyses in R with the metafor package. Journal of Statistical Software, 36(3), 1-48. URL: http://www.jstatsoft.org/v36/i03/
