Statistical Analysis of the Kdo2-Lipid A/compactin time course experiments

The experiments and pre-processing of data

Three biologically replicated experiments were performed in the KDO/compactin time course experiments. Control, KDO-treated, compactin-treated, and KDO+compactin-treated samples were collected for measurement at 0.5, 1, 2, 4, 8, 12, and 24 hrs, and control samples were also collected at time 0. There are no technically replicated measurements except for the duplicated glycerophospholipid measurements of the first experiment, in which case the average of the 2 technically replicated measurements is used in the statistical analysis. Data from time 0 are not included in the analysis, and negative and zero measurement values are removed before analysis.
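The pre-processing rules above can be sketched in a few lines of Python. This is only an illustration, not the code used for the analysis (which was done in R): the record layout and the key names `time_h` and `values` are hypothetical, and averaging technical replicates before discarding non-positive values is an assumption, since the order of these two steps is not specified.

```python
from statistics import mean

def preprocess(records):
    """Apply the pre-processing rules to raw measurements.

    `records` is an iterable of dicts with the hypothetical keys
    'time_h' (collection time in hours) and 'values' (a list holding one
    measurement, or two when technical replicates exist).
    """
    cleaned = []
    for rec in records:
        if rec["time_h"] == 0:
            continue  # time-0 samples are excluded from the analysis
        amount = mean(rec["values"])  # averages technical replicates; no-op for one value
        if amount <= 0:
            continue  # negative and zero measurements are removed
        cleaned.append({**rec, "amount": amount})
    return cleaned
```

Each surviving record carries a single positive `amount` and can be used directly to form the treatment/control ratios.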

Treatment/Control Ratios

Statistical analysis is performed on the treatment/control ratios for each individual lipid analyte. The analysis consists of 2 steps: model selection and significance testing. For model selection, we start with a "maximal" model:

log(X_{ijkl}) = 0 + a K_i + b C_j + c_k T_k + d_l R_l + e_{ijkl}

where X_{ijkl} is the treatment/control ratio of measured lipid amounts, with subscripts i, j, k, and l indexing, in sequence, the KDO treatment, compactin treatment, time point, and biological replicate (i.e., cell batch). The independent variables are indicator variables: K_i = 1 with KDO treatment and 0 without, C_j = 1 with compactin treatment and 0 without, T_k = 1 if the sample is at the kth time point and 0 otherwise, and R_l = 1 if the sample belongs to the lth biological replicate and 0 otherwise. e_{ijkl} denotes the random experimental error. The constants a, b, c_k, and d_l are estimated from the data by linear regression. Note that the model treats time as a categorical rather than a numerical variable and hence permits the modeled response (the right-hand side) to be a non-linear function of time.
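To make the indicator coding concrete, the following minimal Python sketch builds the row of independent variables for one observation. The function name `design_row` and the column order are illustrative assumptions. The time points come from the text and the three replicates are numbered 1 to 3 for convenience.

```python
TIME_POINTS = [0.5, 1, 2, 4, 8, 12, 24]  # hours, as listed in the text
REPLICATES = [1, 2, 3]                   # three biological replicates

def design_row(kdo, compactin, time_h, replicate):
    """Return [K, C, T_1..T_7, R_1..R_3] as 0/1 indicators for one sample."""
    row = [1 if kdo else 0, 1 if compactin else 0]
    row += [1 if time_h == t else 0 for t in TIME_POINTS]    # time as a category
    row += [1 if replicate == r else 0 for r in REPLICATES]  # replicate as a category
    return row

# A KDO-only sample collected at 2 hrs from replicate 3:
example = design_row(kdo=True, compactin=False, time_h=2, replicate=3)
```

The sketch keeps every indicator so that it mirrors the equation as written. Because the time indicators and the replicate indicators each sum to 1, fitting software such as R typically drops a reference level to keep the design matrix full rank.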

This "maximal" model is simplified by successively removing individual terms (independent variables) until the Akaike Information Criterion

AIC = 2k - 2ln(L)

is minimized, where k is the number of estimated parameters (not to be confused with the time index k above) and L is the maximized value of the likelihood function for the estimated model. AIC balances the competing goals of improving goodness of fit and reducing model complexity. For example, in the case of cholesterol the model minimizing AIC is

log(X_{ijkl}) = 0 + a K_i + c_k T_k + d_l R_l + e_{ijkl}

This simplified model is used in the significance test for this lipid.
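As an illustration of how AIC trades fit against complexity, the sketch below evaluates AIC = 2k - 2ln(L) for a least-squares fit with Gaussian errors, where the maximized log-likelihood depends on the data only through the residual sum of squares (RSS). Counting the error variance as one extra parameter is an assumption about the convention, and the RSS values, sample size, and coefficient counts in the usage lines are hypothetical numbers, not results from the study.

```python
import math

def gaussian_aic(rss, n, n_coef):
    """AIC = 2k - 2 ln(L) for a linear model fitted by least squares.

    rss    : residual sum of squares of the fitted model
    n      : number of observations
    n_coef : number of estimated regression coefficients
    """
    k = n_coef + 1  # assumption: the error variance counts as a parameter
    # Maximized Gaussian log-likelihood, with variance estimate rss / n
    log_lik = -0.5 * n * (math.log(2 * math.pi) + math.log(rss / n) + 1)
    return 2 * k - 2 * log_lik

# Hypothetical comparison: dropping one term raises the RSS only slightly,
# so the smaller model has the lower AIC and would be preferred.
aic_larger = gaussian_aic(rss=4.00, n=63, n_coef=11)
aic_smaller = gaussian_aic(rss=4.05, n=63, n_coef=10)
prefer_smaller = aic_smaller < aic_larger
```

Removing a term is favored whenever the penalty saved (2 per parameter) exceeds the increase in -2ln(L) caused by the poorer fit.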

The statistical significance (p-value) of each individual term in the simplified model is assessed by ANOVA, with the sums of squares (SS) and mean squares (MS) of the terms computed using the hierarchical (sequential) method, i.e., each term in the ANOVA is adjusted for all the terms that precede it in the model. All test statistics (F values) use the MS of residuals from the full simplified model as the denominator.
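The hierarchical computation can be sketched as follows, assuming the RSS is already known for the empty model and after each term is added in model order. The function name, the input layout, and the numbers in the usage lines are hypothetical. The sketch stops at the F values; each p-value would then be the upper-tail probability of the F distribution with the term's and the residual degrees of freedom.

```python
def sequential_anova(steps, rss_null, df_resid):
    """Sequential (hierarchical) ANOVA table.

    steps    : list of (term, df_term, rss_after_adding_term), in model order
    rss_null : RSS of the model before any term is added
    df_resid : residual degrees of freedom of the full simplified model
    Returns a list of (term, df, SS, MS, F) tuples.
    """
    rss_final = steps[-1][2]          # RSS of the full simplified model
    ms_resid = rss_final / df_resid   # common denominator of every F value
    table = []
    rss_prev = rss_null
    for term, df_term, rss_after in steps:
        ss = rss_prev - rss_after     # SS adjusted for the preceding terms only
        ms = ss / df_term
        table.append((term, df_term, ss, ms, ms / ms_resid))
        rss_prev = rss_after
    return table

# Hypothetical RSS values for a model with terms K, T, and R added in that order
table = sequential_anova(
    steps=[("K", 1, 8.0), ("T", 6, 5.0), ("R", 2, 4.0)],
    rss_null=10.0,
    df_resid=40,
)
```

Because each SS is the drop in RSS at the moment its term enters the model, the results depend on the order of the terms.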

All computation was done using the statistical computing package R.

Lipid Amount Measured

Statistical analysis is also performed on the measured amounts of each individual lipid analyte. The approach is the same as that for the treatment/control ratios described above, except that the "maximal" model is defined to be

log(X_{ijkl}) = m + a K_i + b C_j + c_k T_k + d_l R_l + e_{ijkl}

where X_{ijkl} is the measured lipid amount and m is the overall mean; the other variables and constants are defined as in the analysis of ratios.