Statistical Analysis of the Kdo2-Lipid A timecourse experiments
The experiments and pre-processing of data
For most lipid classes studied in the KDO timecourse experiments, three biological replicate experiments were performed, each with three technical replicates (3x3 design). Control and KDO-treated samples were collected for measurement at 0.5, 1, 2, 4, 8, 12, and 24 h; control samples were also collected at time 0. Data from time 0 are not included in the analysis. Negative and zero measurement values are removed before analysis.
Statistical analysis is performed on the normalized measurements of lipid amount (pmol/µg DNA) for each individual lipid analyte. The analysis consists of two steps: model selection and significance testing. For model selection, we start with a "maximal" model:
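The filtering rules described above can be sketched as follows. This is only an illustration: the record layout (treatment, time in hours, biological replicate, technical replicate, amount) and the sample values are hypothetical and not taken from the experiments.

```python
# Hypothetical records: (treatment, time_h, bio_rep, tech_rep, amount in pmol/ug DNA)
records = [
    ("control", 0.0, 1, 1, 5.2),   # time 0: excluded from the analysis
    ("control", 0.5, 1, 1, 4.8),
    ("KDO", 0.5, 1, 1, 6.1),
    ("KDO", 1.0, 1, 2, 0.0),       # zero measurement: removed
    ("control", 2.0, 2, 1, -0.3),  # negative measurement: removed
    ("KDO", 24.0, 3, 3, 7.9),
]


def preprocess(rows):
    """Keep only rows measured after time 0 with a strictly positive amount."""
    return [r for r in rows if r[1] > 0 and r[4] > 0]


clean = preprocess(records)
print(len(clean), "of", len(records), "records kept")
```

Strictly positive values are required in any case once the logarithm of the measurement is taken.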
log(Xijkl) = m + aKi + cjTj + dkRk + eijkl
where Xijkl is the normalized measured lipid amount, with subscripts i, j, k, and l indexing, in sequence, the KDO treatment, the time point, the biological replicate (i.e., cell batch), and the technical replicate. The independent variables are indicators: Ki = 1 with KDO treatment and 0 without, Tj = 1 if the sample is from the jth time point and 0 otherwise, and Rk = 1 if the sample is from the kth biological replicate and 0 otherwise. eijkl denotes the random experimental error. The constants m, a, cj, and dk are estimated from the data by linear regression. Note that the model treats time as a categorical rather than a numerical variable and hence permits the fitted response to be a non-linear function of time.
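As an illustration of the indicator coding, the sketch below builds one row of the regression design matrix for a single observation. It assumes, as an illustrative convention not stated in the text, that the first time point and the first biological replicate serve as reference levels, so that their indicators are dropped and the intercept remains identifiable. The function name design_row is hypothetical.

```python
TIMES = [0.5, 1, 2, 4, 8, 12, 24]  # sampling times in hours, from the text
BIO_REPS = [1, 2, 3]               # three biological replicates


def design_row(treated, time_h, bio_rep):
    """Return [1, K, T_2..T_7, R_2, R_3] for one observation.

    The leading 1 multiplies the intercept m. The first time point and the
    first biological replicate are reference levels (an assumption here).
    """
    row = [1.0, 1.0 if treated else 0.0]
    row += [1.0 if time_h == t else 0.0 for t in TIMES[1:]]
    row += [1.0 if bio_rep == r else 0.0 for r in BIO_REPS[1:]]
    return row


# KDO-treated sample at 2 h from biological replicate 3
print(design_row(True, 2, 3))
```

Because each time point has its own coefficient, the fitted values are not constrained to any particular shape over time.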
This "maximal" model is simplified by successively removing individual terms (independent variables) until the Akaike Information Criterion
AIC = 2k - 2ln(L)
is minimized, where k is the number of parameters and L is the maximized value of the likelihood function for the estimated model. AIC balances the competing goals of improving goodness of fit and reducing model complexity. For example, in the case of 15-deoxy-PGJ2, the model minimizing AIC is
log(Xijkl) = m + aKi + cjTj + eijkl
This simplified model is used in the significance test for this lipid.
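The selection procedure can be sketched as a greedy backward elimination. The sketch assumes Gaussian errors, so that the maximized log-likelihood of a least-squares fit follows from its residual sum of squares (RSS), and it counts the error variance as one of the k parameters. The RSS values in TOY_RSS are invented purely for illustration, as are the names gaussian_aic, backward_select, and toy_aic; this is not the original analysis code.

```python
import math


def gaussian_aic(rss, n, n_coef):
    """AIC = 2k - 2 ln(L) for a least-squares fit with Gaussian errors."""
    k = n_coef + 1  # regression coefficients plus the error variance
    log_lik = -0.5 * n * (math.log(2 * math.pi) + math.log(rss / n) + 1)
    return 2 * k - 2 * log_lik


def backward_select(terms, aic_of):
    """Drop the term whose removal lowers AIC most; stop when none helps."""
    current = frozenset(terms)
    best = aic_of(current)
    while current:
        candidates = [(aic_of(current - {t}), t) for t in sorted(current)]
        cand_aic, drop = min(candidates)
        if cand_aic >= best:
            break
        current, best = current - {drop}, cand_aic
    return current, best


N_OBS = 126                        # hypothetical: 2 x 7 x 3 x 3, nothing removed
N_COEF = {"K": 1, "T": 6, "R": 2}  # coefficients per term beyond the intercept
TOY_RSS = {                        # invented residual sums of squares
    frozenset({"K", "T", "R"}): 10.0,
    frozenset({"K", "T"}): 10.1,
    frozenset({"K", "R"}): 20.0,
    frozenset({"T", "R"}): 18.0,
    frozenset({"K"}): 20.5,
    frozenset({"T"}): 18.2,
    frozenset({"R"}): 28.0,
    frozenset(): 28.5,
}


def toy_aic(term_set):
    """AIC of the toy model containing the intercept plus term_set."""
    n_coef = 1 + sum(N_COEF[t] for t in term_set)
    return gaussian_aic(TOY_RSS[term_set], N_OBS, n_coef)


selected, best_aic = backward_select(["K", "T", "R"], toy_aic)
print(sorted(selected))
```

With these invented numbers, removing the biological-replicate term barely changes the fit while saving two parameters, so the procedure stops at a model of the same form as the 15-deoxy-PGJ2 example.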
The statistical significance (p-value) of each term in the simplified model is assessed by ANOVA, with the sums of squares (SS) and mean squares (MS) of the terms computed using the hierarchical method, i.e., each term in the ANOVA is adjusted for all the terms that precede it in the model. All test statistics (F values) use the residual MS from the full simplified model as the denominator.
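For a simplified model of the form log(Xijkl) = m + aKi + cjTj + eijkl, the hierarchical computation can be written out as below. The order of entry (K before T) is an assumption, since the text does not state it. The symbols are introduced here for illustration only: RSS(.) is the residual sum of squares of the least-squares fit containing the listed terms, n is the number of observations, p is the number of fitted coefficients, and df is the number of coefficients a term contributes (1 for K and 6 for T with seven time points).

```latex
\begin{aligned}
SS_K &= RSS(m) - RSS(m + K), \\
SS_T &= RSS(m + K) - RSS(m + K + T), \\
MS_{\text{term}} &= SS_{\text{term}} / df_{\text{term}}, \qquad
MS_{\text{res}} = RSS(m + K + T) / (n - p), \\
F_{\text{term}} &= MS_{\text{term}} / MS_{\text{res}}.
\end{aligned}
```

Each F value is then compared with an F distribution on (df of the term, n - p) degrees of freedom to obtain the p-value.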
All computations were performed in the statistical computing environment R.