File Name: linear and nonlinear models fixed effects random effects and mixed models .zip
- A brief introduction to mixed effects modelling and multi-model inference in ecology
- Select a Web Site
- Fixed effects model
A brief introduction to mixed effects modelling and multi-model inference in ecology
These results can inform researchers in the selection of cancer cell lines (CCLs) to include in drug studies. Additionally, this study illustrates the need for assessing the dose-response functional form and the use of nonlinear mixed-effects (NLME) models to achieve more stable estimates of drug response parameters. Over the past decades, CCLs have been widely used to study the biological processes in cancer, as well as for in vitro drug screening to discover and assess the effectiveness of anticancer therapeutics [1].
Moreover, using in vitro CCL models to study cancer pharmacogenomics [2] can help to understand resistance and sensitivity to therapies currently used in cancer treatment, explore genomic factors associated with drug response, and develop more anticancer drugs [3].
Recently, two independent large-scale studies, the Cancer Cell Line Encyclopedia (CCLE) [4] and the Genomics of Drug Sensitivity in Cancer (GDSC) [1, 5], were completed, in which drug response information was collected on a number of therapeutic agents in addition to extensive molecular information.
The use of large-scale drug studies on CCLs depends on the reliability and reproducibility of drug response assessments. Despite the wide use of CCLs for drug response studies, inconsistency in drug-response data and poor concordance between the mutational profiles of CCLs and those of patient tumors have been reported [6, 7].
Multiple factors are likely to contribute to these observed inconsistencies in the drug-response data, including methodological and analytical challenges due to the differences in assay types, maximum tested drug concentration, range of tested drug concentrations, and drug sensitivity measurements employed by different studies [6, 8].
In response, the authors of CCLE and GDSC reported significantly better agreement between the two pharmacological data sets after incorporating both analytical and biological considerations [9]. However, discrepancies in the measured drug sensitivities still persist. Furthermore, consistency can be achieved when biologically grounded analysis methods are combined with standardized assay methods and laboratory conditions [11]. Testing of drug sensitivity has been a routine procedure in clinical and laboratory research.
Dose-response data collected on CCLs are often sigmoidal in shape, and thus nonlinear logistic models are often used to model the data for each cell line individually [4, 5, 10, 12]. Analysis of dose-response data typically focuses on the EC50. However, the dose-response curves can differ in other aspects, such as the slope of the curve or the area under the curve. Analyzing each cell line individually for a given drug is highly problematic, since the variation exhibited across the cell lines is ignored, and this information could be harnessed to inform the response profile of any single cell line.
Incorporating variation across all the cell lines in dose-response curve fitting can reduce noise and lead to more reliable inferences for any single CCL. Recently, the nonlinear mixed-effects (NLME) model has become an important approach to improve the accuracy of EC50 estimates (or similar parameters) through the borrowing of information across all CCLs [15]. Such models allow one to account for the repeated measures aspect in the data.
The objectives of this study are first to determine differences in model functional form. Then, for each cancer type and drug combination, an NLME model was fit to the drug-response data collected on the CCLs that had the same nonlinear functional form. Using the CCL-specific estimated random effects from the NLME model, a set of cell lines was determined to be consistently sensitive or resistant to a large number of drugs.
These findings can aid cancer researchers in the selection of cell lines to include in their experiments by eliminating CCLs that are always sensitive or resistant to drugs, as results from such cell lines may not be generalizable. Moreover, this study illustrated the need for assessing model functional form for drug-response data and the ability to model all cell lines simultaneously using an NLME model to provide more stable estimates of drug response parameters. In total, the analysis consisted of in vitro 8-point dose-response data collected on 24 compounds assessed on cancer cell lines.
Supplemental Fig. The drug-response data were generated over 9 drug concentrations (2-fold dilution series) or 5 drug concentrations (4-fold dilution series). However, it should be noted that the two studies used different experimental protocols, such as differences in the pharmacological assay types and the range of drug concentrations [6].
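The two dilution designs mentioned above can be compared with a little arithmetic. The following is a minimal sketch, assuming a hypothetical top concentration of 10 (arbitrary units) that does not come from either study; the function name is also illustrative. It shows that 9 points at 2-fold spacing and 5 points at 4-fold spacing cover the same concentration range, only at different resolution.

```python
def dilution_series(max_conc, fold, n_points):
    """Return n_points concentrations, highest first, each `fold` times lower than the last."""
    return [max_conc / fold**k for k in range(n_points)]


# Hypothetical top concentration (arbitrary units), for illustration only.
MAX_CONC = 10.0

two_fold = dilution_series(MAX_CONC, fold=2, n_points=9)
four_fold = dilution_series(MAX_CONC, fold=4, n_points=5)

# Ratio of the highest to the lowest concentration in each design.
span_two_fold = two_fold[0] / two_fold[-1]     # 2**8
span_four_fold = four_fold[0] / four_fold[-1]  # 4**4

print(span_two_fold, span_four_fold)
```

Both designs span a 256-fold range, so the 4-fold series trades dose resolution for fewer wells rather than covering a narrower range.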
Typically, a dose-response model is fitted to the in vitro dose-response data for each CCL individually for a given drug. From this nonlinear model, the EC50 is estimated either as the inflection point of the sigmoidal curve (the relative EC50) or as an absolute EC50. The absolute and relative EC50 will be similar in settings in which the top and bottom asymptotes of the dose-response curve are close to 100% and 0%, respectively.
However, the absolute EC50 is not estimable in all cases, while the relative EC50 is estimable in all cases, with the relative EC50 being recommended for use by Sebaugh. Thus, we have chosen to use the relative EC50, which we refer to as just the EC50 in the remainder of the paper.
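The distinction between the two EC50 definitions can be made concrete with a small calculation. The sketch below assumes a four-parameter logistic curve of the form f(x) = bottom + (top - bottom) / (1 + (x / relative_ec50)^slope), with responses on a 0 to 100 percent scale; the function and parameter names are illustrative and not taken from the paper. Under that assumption, the absolute EC50 is the dose at which the curve crosses 50 percent, and it does not exist when 50 percent lies outside the two asymptotes.

```python
def absolute_ec50(slope, bottom, top, relative_ec50, level=50.0):
    """Dose at which the assumed 4P logistic curve crosses `level` percent response.

    Assumed curve: f(x) = bottom + (top - bottom) / (1 + (x / relative_ec50) ** slope)
    Returns None when `level` is not strictly between the asymptotes,
    i.e. when the absolute EC50 is not estimable.
    """
    if not (bottom < level < top):
        return None
    # Solve f(x) = level for x: (x / relative_ec50) ** slope = (top - level) / (level - bottom)
    ratio = (top - level) / (level - bottom)
    return relative_ec50 * ratio ** (1.0 / slope)


# Asymptotes at 100 and 0: the absolute and relative EC50 coincide.
same = absolute_ec50(slope=1.0, bottom=0.0, top=100.0, relative_ec50=2.0)

# Top asymptote at 80: the 50 percent crossing occurs at a lower dose.
shifted = absolute_ec50(slope=1.0, bottom=0.0, top=80.0, relative_ec50=2.0)

# Top asymptote at 40: the curve never reaches 50 percent.
missing = absolute_ec50(slope=1.0, bottom=0.0, top=40.0, relative_ec50=2.0)

print(same, shifted, missing)
```

The relative EC50 is a parameter of the curve itself, which is why it remains available even in the third case.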
The most commonly used parametric nonlinear function for the observations on a given cell line is either a three-parameter (3P) or four-parameter (4P) logistic regression. Although the functional form of the model remains the same for all cell lines e. Recently, nonlinear mixed-effects (NLME) models [15, 17] for repeated measures dose-response data have become popular due to their flexible covariance structure, which allows for the joint modeling of multiple in vitro measurements taken on a set of CCLs.
An advantage of using the NLME model is that both within-cell line and between-cell line variation are accounted for, which improves the statistical estimation of parameters.
Moreover, the problem of extreme estimates caused by considering a single drug-cell line combination is reduced by using the NLME model, where information is borrowed across cell lines and individual cell line parameters are shrunk towards the population-level parameters [15, 16]. Let x_ij denote the corresponding jth drug dose assayed for the ith cell line. The relationship between the dose-response data and the drug doses can be described by a parametric nonlinear model.
One commonly used functional form is the 4P logistic regression model. Under the classical assumption of normally distributed response values, the estimation of parameters in the 4P logistic regression model simplifies to a nonlinear least squares approach, where the nonlinear least squares estimates are obtained by minimizing the weighted residual sum of squares. However, in this study all responses are weighted equally.
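The least squares objective described above can be sketched in a few lines. This example assumes one common parameterization of the 4P logistic curve, the log-logistic form bottom + (top - bottom) / (1 + exp(slope * (ln x - ln ec50))); the paper's own equation is not reproduced here, so the exact form, the parameter names, and the dose and parameter values are assumptions for illustration. Fixing bottom at 0 would give a 3P variant under the same assumption.

```python
import math


def logistic_4p(x, slope, bottom, top, ec50):
    """Assumed 4P logistic: bottom + (top - bottom) / (1 + exp(slope * (ln x - ln ec50)))."""
    return bottom + (top - bottom) / (1.0 + math.exp(slope * (math.log(x) - math.log(ec50))))


def weighted_rss(params, doses, responses, weights=None):
    """Weighted residual sum of squares; all weights equal 1 when `weights` is None."""
    if weights is None:
        weights = [1.0] * len(doses)
    return sum(
        w * (y - logistic_4p(x, *params)) ** 2
        for x, y, w in zip(doses, responses, weights)
    )


# Hypothetical 8-point, 2-fold dose series and noise-free responses.
doses = [8.0 / 2**k for k in range(8)]
true_params = (1.5, 5.0, 95.0, 1.0)  # slope, bottom, top, ec50
responses = [logistic_4p(x, *true_params) for x in doses]

rss_at_truth = weighted_rss(true_params, doses, responses)
rss_shifted = weighted_rss((1.5, 5.0, 95.0, 2.0), doses, responses)

print(rss_at_truth, rss_shifted)
```

An optimizer would search over the four parameters for the values that minimize `weighted_rss`; here the objective is only evaluated at the generating parameters and at a deliberately wrong EC50 to show that the latter scores worse.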
A Nelder-Mead derivative-free optimization algorithm is used to minimize the sum of squares of the residuals. The choice of the 4P logistic regression model is fairly typical in dose-response data analysis, although for some dose-response data the 4P logistic regression model does not provide an adequate fit. The nonlinear 3P and 4P logistic regression models, described in Eqs 1 and 2, can be fit using the drc package in R. Due to the behavior of the data, it may happen that neither the 4P nor the 3P logistic regression model provides an adequate fit to the data.
In such a case, other linear or nonlinear models can be fit to the data. The best fitting model was then determined for each CCL-drug combination. Then, for each cancer site and drug combination, the model that fit most of the CCLs the best was selected as the best fitting model. The NLME model is a generalization of both the linear mixed-effects model and the standard nonlinear fixed-effects model [15]. To model the drug response data for all CCLs for a given drug simultaneously, an NLME model is considered in which the functional form is either the 3P or the 4P logistic regression model.
This can be accomplished with a hierarchical model framework by adding the following second-level model to the nonlinear model outlined in Eqs 1 and 2. The within-cell variation in Eq. Here, for a given drug, the regression parameters for the EC50 and slope can vary from cell line to cell line, resulting in a random effect for each of these two parameters (Supplemental Fig.).
As a different NLME model was fit for all cell lines of a given cancer type and drug, we standardized the estimated cell line random effects for each drug to enable comparison across drugs. Similar to Haibe-Kains, Benjamin, et al. For a given drug, we fit two nonlinear logistic regression models (3P and 4P logistic) and a linear model (LM) to the dose-response data for each cell line, with the best fitting model determined based on AIC (Supplemental Fig.).
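The AIC comparison among the three candidate models can be illustrated as follows. This is a minimal sketch, assuming Gaussian errors so that AIC can be computed from the residual sum of squares up to an additive constant shared by all models fit to the same data. The residual sums of squares below are invented for illustration and are not results from the paper, and the parameter counts assume a straight line with intercept and slope for the LM.

```python
import math


def gaussian_aic(rss, n_obs, n_params):
    """AIC for a least squares fit with Gaussian errors, up to an additive constant.

    The residual variance is counted as one extra estimated parameter.
    """
    return n_obs * math.log(rss / n_obs) + 2 * (n_params + 1)


def best_model(rss_by_model, n_params_by_model, n_obs):
    """Return the name of the lowest-AIC model and the AIC of every candidate."""
    aics = {
        name: gaussian_aic(rss, n_obs, n_params_by_model[name])
        for name, rss in rss_by_model.items()
    }
    return min(aics, key=aics.get), aics


# Hypothetical residual sums of squares for one cell line and drug (8 dose points).
rss_by_model = {"LM": 900.0, "3P": 120.0, "4P": 110.0}
n_params_by_model = {"LM": 2, "3P": 3, "4P": 4}

winner, aics = best_model(rss_by_model, n_params_by_model, n_obs=8)
print(winner, aics)
```

In this made-up case the 4P model has the smallest residual sum of squares, but the 3P model wins on AIC because the small improvement in fit does not justify the extra parameter.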
From the NLME model, the estimated random effect for the EC50 was utilized to identify the outlier cell lines, as described in the following section.
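One simple way to picture the standardization and outlier screening is a z-score per drug followed by a cutoff. The sketch below is an assumption made for illustration: the paper's exact standardization and its confidence-interval and distribution-based rules are not reproduced here, and the cell line names, random effect values, and cutoff of 2 are hypothetical. A positive standardized random effect for the EC50 is read as a higher EC50 than the population value, that is, a more resistant cell line.

```python
from statistics import mean, stdev


def standardize(random_effects):
    """Z-score a {cell_line: EC50 random effect} mapping for a single drug."""
    values = list(random_effects.values())
    center, scale = mean(values), stdev(values)
    return {cell_line: (value - center) / scale for cell_line, value in random_effects.items()}


def flag_outliers(sres, cutoff=2.0):
    """Label cell lines whose standardized random effect exceeds the cutoff in magnitude."""
    return {
        cell_line: ("resistant" if z > 0 else "sensitive")
        for cell_line, z in sres.items()
        if abs(z) > cutoff
    }


# Hypothetical EC50 random effects for ten cell lines on one drug.
random_effects = {
    "CL01": -0.20, "CL02": -0.10, "CL03": 0.00, "CL04": 0.10, "CL05": 0.20,
    "CL06": -0.15, "CL07": 0.05, "CL08": 0.10, "CL09": -0.10, "CL10": 3.00,
}

sres = standardize(random_effects)
flagged = flag_outliers(sres)
print(flagged)
```

Repeating this per drug and counting how often each cell line is flagged would give a rough picture of which lines are consistently at one extreme.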
To determine the cell lines that are consistently sensitive or resistant to a number of drugs assayed in the CCLE and GDSC, and are therefore potential cell lines to remove from future drug studies, the estimated random effects from the NLME model were examined for each cell line and drug combination, as outlined in the previous section. A summary of the model fits across cancer types and drugs is presented in Fig.
In contrast, the 3P logistic model fit the best for the majority of the drugs for the biliary tract and salivary gland CCLs. Heatmap of the proportion of cell lines where the best functional model using AIC was either the 3P logistic nonlinear model, the 4P logistic nonlinear model, or the linear model (LM), computed across 24 drugs and 23 cancer types in the CCLE.
The grey color represents situations where there was no drug treatment for a given cancer type. At the drug level i. Modeling assumptions were assessed for 32 NLME models, as outlined and presented in the Supplemental Methods. The estimated cell line random effects for the EC50 for each drug were standardized to enable comparison across drugs. Two approaches were considered to identify the outlier cell lines using the standardized random effects (SREs) for the EC50 parameter. For the CCLE data, NLME modeling was completed for 15 out of 23 cancer types with more than 10 cell lines fitting the most common functional form for this cancer type and drug.
As the NLME model was fit to cell lines that fit a 3P or 4P model for a given drug, not all cell lines were included in each model. As an example, Fig. The distributions of SREs for these three cell lines are also above 0, and thus these cell lines appear to be resistant to the majority of drugs.
Similar to the analysis completed on the CCLE, we identified outlier cell lines in the GDSC across 29 out of 54 cancer types with more than 10 cell lines.
The goals of this study were first to determine whether there was a common functional form for in vitro drug response data generated on cell lines, and second to use a nonlinear mixed-effects model to determine which cell lines appear to be sensitive or resistant to a majority of the drugs tested in either the CCLE or the GDSC studies. As the model for curve fitting has a direct impact on the measurement of drug potency e.
We observed that the best fitting dose-response relationship is often a three- or four-parameter logistic (3P or 4P logistic) nonlinear model. However, for some dose-response data, neither the 4P nor the 3P logistic model provided an adequate fit to the data. These results illustrate the need to assess the functional form for in vitro studies to ensure the most accurate and precise estimates of drug response, such as the EC50 or the area under the dose-response curve.
Further work is needed to assess other nonlinear models besides the 3P and 4P logistic, such as the Cedergreen-Ritz-Streibig five-parameter model used in Moyer et al. The use of an NLME model allows joint modeling of all the CCL drug response data, with the ability to model the cell line-to-cell line variation along with the within-cell line variation.
Moreover, the flexible covariance structure allows both within-cell line and cell line-to-cell line variation to be accounted for, which improves the statistical analysis. Such an NLME model uses the information across the cell lines to adjust the extreme estimates caused by considering a single drug-cell line combination. Therefore, an NLME model was used to detect sensitive or resistant cell lines with the SREs for the EC50 parameter using two approaches, one based on determination of outliers using the boundaries of confidence intervals and one based on the distribution of SREs for a CCL.
As passage information for the cell lines was not available in the dataset, we were not able to assess whether this was a factor that might explain why these CCLs were outliers. In conclusion, the results from this study can aid basic scientists in the selection of cell lines to include in experiments to ensure that results from in vitro drug screens are generalizable. Additionally, this study illustrated the need for assessing functional form for the drug-response data and the ability to model all cell lines simultaneously using an NLME model to provide more accurate estimates of drug response parameters.
Garnett, M. Systematic identification of genomic markers of drug sensitivity in cancer cells. Nature.
Wang, L. Genomics and drug response.
Select a Web Site
Generalized linear mixed models (GLMMs) are an extension of linear mixed models to allow response variables from different distributions, such as binary responses. Alternatively, you could think of GLMMs as an extension of generalized linear models. The general form of the model in matrix notation is: To recap: So our grouping variable is the doctor. Not every doctor sees the same number of patients, ranging from just 2 patients all the way to 40 patients, averaging about
Linear and Nonlinear Models: Fixed Effects, Random Effects, and Mixed Models, by Grafarend E. W. English.
Fixed effects model
A mixed-effects model is a statistical model that incorporates both fixed effects and random effects. Fixed effects are population parameters assumed to be the same each time data is collected, and random effects are random variables associated with each sample individual from a population.
Linear Mixed Effects models are used for regression analyses involving dependent data. Such data arise when working with longitudinal and other study designs in which multiple observations are made on each subject. Some specific linear mixed effects models are.
The use of linear mixed effects models (LMMs) is increasingly common in the analysis of biological data. Whilst LMMs offer a flexible approach to modelling a broad range of data types, ecological data are often complex and require complex model structures, and the fitting and interpretation of such models is not always straightforward.
In statistics, a fixed effects model is a statistical model in which the model parameters are fixed or non-random quantities. This is in contrast to random effects models and mixed models in which all or some of the model parameters are random variables.