Universal PBM experiments are often performed with several conditions of interest, e.g. allelic variants, assayed on separate arrays of the same plate with few replicates. Within and across plates, probe intensities can vary for biologically uninteresting reasons, such as concentration differences. To explicitly correct for these differences, normalization is performed in two steps.
First, normalization is performed within replicates (plates) with the assumption that biologically uninteresting differences only affect probe intensities multiplicatively. Normalization factors are estimated for each sample relative to a baseline condition on each plate. The baseline should ideally be a replicate wild type or other natural reference condition included in each replicate (plate). This function includes approaches for performing this step of normalization.
Second, normalization is performed across replicates (plates). More detail on this procedure can be found in the normalizeAcrossReplicates documentation.
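The multiplicative assumption in the first step means that a nuisance difference between a sample and its baseline shows up as a constant offset on the log scale. The following is a minimal sketch with simulated intensities, not the package's implementation; all object names are hypothetical.

```r
set.seed(1)

# Simulated probe intensities for a baseline sample (hypothetical data).
baseline <- rlnorm(1000, meanlog = 8, sdlog = 1)

# A second sample that differs from the baseline only by a factor of 1.5.
sample_raw <- 1.5 * baseline

# On the log2 scale, the multiplicative factor becomes a constant offset.
log_diff <- log2(sample_raw) - log2(baseline)
scale_factor <- 2^mean(log_diff)

# Dividing by the estimated factor recovers the baseline intensities.
sample_norm <- sample_raw / scale_factor
```

In real data the offset is not constant across probes, which is why the methods described here estimate it from a subset of lower-tail probes.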
The approaches to normalization implemented in this function make a fundamental assumption that lower-tail probe intensities are distributed similarly across the conditions being normalized. This assumption is generally satisfied for allelic variants of the same transcription factor or transcription factors with similar binding affinities. However, this assumption may not always hold, e.g. if comparing proteins of completely different families. In these cases, normalization should be performed with caution, and analyses and plots comparing the distributions of lower-tail probe intensities should be explored.
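One way to explore the lower-tail assumption is to compare the shape of the lower-tail log-intensity distributions between two conditions after removing their location. The sketch below uses simulated log2 intensities and a hypothetical helper, lower_tail_profile; neither is part of the package, and the cutoff of 0.6 simply mirrors the default value of q.

```r
set.seed(2)

# Hypothetical log2 intensities: a shared background for all probes,
# plus a small set of bound probes whose signal differs by condition.
background <- rnorm(5000, mean = 8, sd = 0.5)
bound <- rep(c(0, 1), times = c(4500, 500))
cond_a <- background + 3 * bound
cond_b <- background + 1 * bound + log2(1.5)  # also shifted by a nuisance factor

# Median-centered quantiles of the probes at or below the q-th percentile.
lower_tail_profile <- function(x, q = 0.6, probs = seq(0.1, 0.9, by = 0.1)) {
  tail_vals <- x[x <= quantile(x, q, names = FALSE)]
  quantile(tail_vals - median(tail_vals), probs, names = FALSE)
}

profile_a <- lower_tail_profile(cond_a)
profile_b <- lower_tail_profile(cond_b)

# A small gap suggests the lower tails have a similar shape.
max_gap <- max(abs(profile_a - profile_b))
```

A large gap between the two profiles would be a reason to treat the normalized values with caution.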
normalizeWithinReplicates(
  pe,
  assay = SummarizedExperiment::assayNames(pe)[1],
  method = c("tmm", "quantile"),
  q = 0.6,
  qlower = 0,
  qdiff = 0.2,
  group = "id",
  stratify = "condition",
  baseline = NULL,
  verbose = FALSE
)
Argument | Description |
---|---|
pe | a SummarizedExperiment object containing GPR intensity information. |
assay | a string name of the assay to normalize. (default = SummarizedExperiment::assayNames(pe)[1]) |
method | a string specifying the method to use for normalization. Must be one of "tmm" or "quantile". |
q | a percentile between 0 and 1 specifying either the upper quantile of probes to include for normalization when method = "tmm" or the quantile used for normalization when method = "quantile". (default = 0.6) |
qlower | a percentile between 0 and 1- |
qdiff | a percentile between 0 and 0.5 specifying the additional fraction of lower-tail probes to filter based on the deviation from the baseline condition when method = "tmm". (default = 0.2) |
group | a character string specifying a column in |
stratify | a character string specifying a column in |
baseline | a character string specifying the baseline condition in the |
verbose | a logical value indicating whether to print verbose output during analysis. (default = FALSE) |
Original PBMExperiment object with assay containing within-replicate normalized intensities ("normalized") and a new column added to the colData, "withinRepScale", containing the inverse of the scaling factors used to normalize intensities. If an assay with the same name is already included in the object, it will be overwritten.
The trimmed mean of M-values ("tmm") method implemented in this function for cross-sample normalization within replicates is based on the popular TMM method for RNA-seq data included in the edgeR package. In brief, a normalization factor is estimated as the trimmed mean of probe-level log-scale differences between the baseline condition and sample using the lower [qlower, q] percentile probes. Probes are ordered by the log-scale average intensity across the baseline condition and sample. The trimmed mean is calculated excluding the top and bottom qdiff fraction of probes.
Unlike RNA-seq expression estimates, PBM data show near-constant variance in log-scale differences as a function of the log-scale mean intensities. Therefore, a simplified variant of the original TMM method is used, where precision weights are not introduced.
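The unweighted variant described above can be sketched in a few lines of base R. This is an illustration under stated assumptions, not the package's code: the function name tmm_factor, the simulated data, and the sign convention (sample relative to baseline) are all hypothetical.

```r
# Simplified, unweighted TMM-style scaling factor (illustrative only).
tmm_factor <- function(baseline, sample, q = 0.6, qlower = 0, qdiff = 0.2) {
  m <- log2(sample) - log2(baseline)        # log-scale differences
  a <- (log2(sample) + log2(baseline)) / 2  # log-scale average intensities

  # Keep probes whose average intensity lies in the [qlower, q] percentile range.
  a_bounds <- quantile(a, c(qlower, q), names = FALSE)
  keep <- a >= a_bounds[1] & a <= a_bounds[2]

  # Trimmed mean without precision weights: drop the top and bottom
  # qdiff fraction of the retained differences, then return to the raw scale.
  2^mean(m[keep], trim = qdiff)
}

set.seed(3)
# Simulated data: the sample is the baseline scaled by 1.5 with mild noise,
# and the 200 brightest probes carry an additional 4-fold signal.
baseline <- 2^rnorm(2000, mean = 8, sd = 1)
sample_raw <- 1.5 * baseline * 2^rnorm(2000, mean = 0, sd = 0.1)
top <- order(baseline, decreasing = TRUE)[1:200]
sample_raw[top] <- 4 * sample_raw[top]

est <- tmm_factor(baseline, sample_raw)
sample_norm <- sample_raw / est
```

Because only lower-tail probes enter the estimate, the extra signal on the brightest probes should not distort the factor, which is the motivation for restricting to the [qlower, q] range.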
The quantile-based ("quantile") method should not be confused with what is commonly referred to as "quantile normalization." Here, quantile-based normalization computes scaling factors across