Universal PBM experiments are often performed with several conditions of interest, e.g. allelic variants, assayed on separate arrays of the same plate with few replicates. Within and across plates, probe intensities can vary for biologically uninteresting reasons, such as concentration differences. To explicitly correct for these differences, normalization is performed in two steps.
First, normalization is performed within replicates (plates). More detail on this procedure can be found in the normalizeWithinReplicates documentation.
Second, normalization is performed across replicates (plates) under the assumption that biologically uninteresting differences between replicates affect probe intensities both multiplicatively and additively on the log-scale. A single log-scale multiplicative normalization factor is first estimated for all samples within a replicate. Then, a log-scale additive normalization factor is estimated such that the median intensities of the baseline samples in each replicate are equal. More details on this calculation are provided below.
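As a compact way to read the two-step model, the correction applied to probe i in replicate r can be written as an affine map on the log2 scale. The symbols below are notation introduced here for illustration and are not names used by the package.

```latex
\tilde{y}_{r,i} = \alpha_r + \beta_r \, y_{r,i},
\qquad y_{r,i} = \log_2 x_{r,i}
```

Here x is the input probe intensity, the slope term plays the role of the log-scale multiplicative correction shared by all samples in replicate r, and the intercept term plays the role of the log-scale additive correction chosen so that the baseline medians agree across replicates.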
normalizeAcrossReplicates(
  pe,
  assay = SummarizedExperiment::assayNames(pe)[1],
  group = "id",
  stratify = "condition",
  baseline = NULL,
  verbose = FALSE
)
Argument | Description |
---|---|
pe | SummarizedExperiment object containing GPR intensity information. |
assay | a string name of the assay to normalize. (default = |
group | a character string specifying a column in |
stratify | a character string specifying a column in |
baseline | a character string specifying the baseline condition in the |
verbose | a logical value whether to print verbose output during analysis. (default = FALSE) |
Original PBMExperiment object with assay containing cross-replicate normalized intensities ("normalized") and new columns added to the colData, "acrossRepMultScale" and "acrossRepAddScale", containing the inverse of the log-scale multiplicative and additive scaling factors used to normalize intensities. If an assay with the same name is already included in the object, it will be overwritten.
The following procedure is used to estimate the log-scale multiplicative factor for each replicate. First, a cross-replicate reference is computed for each baseline condition (specified by stratify= and baseline=) by taking the cross-replicate mean quantiles of the observed log2 intensities. Next, a per-replicate log multiplicative scaling factor is computed by taking the median ratio of the rank-ordered and median-centered log-probe intensities between the baseline samples in each replicate and the reference distribution. Visually, this can be interpreted as the approximate slope of the quantile-quantile (QQ) plot generated using log-scale intensities. To reduce the impact of outlier probes, scaling factors are estimated using only the middle 80
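The estimation steps above can be sketched in base R on simulated data. This is an illustrative sketch, not the package's implementation: the data, the object names (log_int, reference, mult_factor) and the choice to keep the middle 80% of rank-ordered probes are all assumptions made here, the last one being a reading of the truncated sentence above.

```r
set.seed(1)

# Hypothetical log2 intensities of the baseline sample in three replicates:
# one shared signal, rescaled and shifted per replicate, plus a little noise.
n_probes <- 1000
signal <- rnorm(n_probes, mean = 10, sd = 1)
true_scale <- c(rep1 = 1.0, rep2 = 1.5, rep3 = 0.75)
true_shift <- c(rep1 = 0.0, rep2 = 0.5, rep3 = -0.3)
log_int <- sapply(names(true_scale), function(r) {
  true_shift[[r]] + true_scale[[r]] * signal + rnorm(n_probes, sd = 0.01)
})

# Cross-replicate reference: mean of each quantile (rank) across replicates.
sorted <- apply(log_int, 2, sort)
reference <- rowMeans(sorted)

# Median-center the rank-ordered values before taking ratios.
center <- function(x) x - median(x)
ref_centered <- center(reference)

# Assumption: use only the middle 80% of ranks to limit outlier influence.
keep <- seq(floor(0.1 * n_probes) + 1, ceiling(0.9 * n_probes))

# Per-replicate factor: median ratio to the reference, i.e. roughly the
# slope of the QQ plot of the replicate against the reference.
mult_factor <- apply(sorted, 2, function(x) {
  median((center(x) / ref_centered)[keep])
})

round(mult_factor, 2)
```

Because the reference is an average over replicates, each factor is relative to the average spread, so the ratios between factors recover the ratios between the simulated scales.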
After log-scale multiplicative factors have been estimated to correct for differences in log-scale variance across replicates, a second log-scale additive factor is estimated for each replicate to correct for differences in log-scale shift. A "global median" intensity is first calculated across replicates by taking the geometric mean of the median intensities in all baseline samples across replicates. This "global median" is computed using the input probe intensities, i.e. without any cross-replicate normalization. The log-scale additive factor is then estimated as the difference between the median normalized probe intensity of the baseline sample in each replicate and the "global median". While the log-scale additive factor is estimated using only baseline samples, the normalization is applied to all samples in the replicate.
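The additive step can be illustrated with a short base R sketch on simulated baseline samples. The data and object names are hypothetical, the multiplicative step is skipped for brevity (all scaling factors are assumed to equal 1), and the sign convention used when applying the factor is an assumption rather than the package's documented behavior.

```r
set.seed(2)

# Hypothetical raw baseline intensities for three replicates (columns).
n_probes <- 1001
raw <- sapply(c(rep1 = 9.5, rep2 = 10, rep3 = 10.8), function(mu) {
  2^rnorm(n_probes, mean = mu, sd = 1)
})

# "Global median": geometric mean of the per-replicate median intensities,
# computed on the input (not cross-replicate normalized) values.
rep_medians <- apply(raw, 2, median)
global_median <- exp(mean(log(rep_medians)))

# Stand-in for intensities after the multiplicative step (assumption:
# simply the log2 input values).
scaled_log <- log2(raw)

# Additive factor: replicate median minus the log2 global median.
add_factor <- apply(scaled_log, 2, median) - log2(global_median)

# Subtracting the factor from every probe aligns the baseline medians.
normalized_log <- sweep(scaled_log, 2, add_factor, FUN = "-")
apply(normalized_log, 2, median)
```

In a full experiment the same per-replicate factor would be subtracted from every sample in that replicate, not only from the baseline sample used to estimate it.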
Cross-replicate normalization is first carried out for replicates containing a baseline sample as described above. Replicates without a baseline sample are then normalized to already normalized replicates using overlapping conditions in the stratify= column.
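One way to picture this fallback is to treat a condition shared with an already normalized replicate as a surrogate reference. The sketch below is an assumption about how such an alignment could look, not a description of the package internals: the condition name "mutA", the object names, and the reuse of the median-ratio and median-difference estimators are all hypothetical.

```r
set.seed(3)
n_probes <- 1000
signal <- rnorm(n_probes, mean = 10, sd = 1)

# Hypothetical condition "mutA" measured in an already normalized replicate
# (ref_mutA) and in a replicate that has no baseline sample (new_mutA).
ref_mutA <- signal + rnorm(n_probes, sd = 0.01)
new_mutA <- 0.4 + 1.2 * signal + rnorm(n_probes, sd = 0.01)

center <- function(x) x - median(x)

# Assumption: middle 80% of ranks, as in the baseline-based estimate.
keep <- seq(floor(0.1 * n_probes) + 1, ceiling(0.9 * n_probes))

# Scale: median ratio of rank-ordered, median-centered log intensities.
scale_new <- median((center(sort(new_mutA)) / center(sort(ref_mutA)))[keep])

# Rescale around the median, then remove the remaining shift in medians.
rescaled <- median(new_mutA) + center(new_mutA) / scale_new
shift_new <- median(rescaled) - median(ref_mutA)
aligned <- rescaled - shift_new

c(scale = scale_new, shift = shift_new)
```

The estimated scale and shift would then be applied to every sample in the replicate lacking a baseline, in the same way as for replicates normalized directly against the baseline.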