Linear regression and nonlinear regression share the same goal: finding optimized model parameter values that minimize the sum of squared errors between modeled and observed values of the dependent variable. You might even say that linear regression is simply a special case of nonlinear regression, one for which simpler mathematics can be derived to calculate the optimized parameter values directly.
This module will walk through solving a linear regression with a nonlinear regression tool, to build confidence in interpreting the statistics returned by an iterative nonlinear regression algorithm. Then, a nonlinear hyperbolic saturation model is derived and used in examples of performing a nonlinear regression on a constructed data set with known statistical properties. As with the linear regression exercises in the previous modules, this material is intended to build scaffolding toward understanding the propagation of error through nonlinear regressions using Monte Carlo techniques.
If you worked through the review of linear regression, you should have code for a single iteration of a Monte Carlo analysis that looks something like the following. Material here will start with this code, so you may want to start with a copy of your script, or copy and paste from below to follow along.
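For reference, here is a minimal sketch of such a single-iteration script. The specific parameter values (slope, intercept, and noise standard deviation) are illustrative assumptions, not values from the previous modules; substitute your own.

```r
# One Monte Carlo realization of a linear data set with known statistical
# properties, followed by a linear regression to recover the parameters.
# All parameter values below are illustrative assumptions.
set.seed(42)                      # reproducibility for this example
true.slope     <- 2.0             # assumed "true" slope
true.intercept <- 5.0             # assumed "true" intercept
noise.sd       <- 1.0             # assumed standard deviation of the noise

x <- seq(0, 10, by = 0.5)                          # independent variable
y <- true.intercept + true.slope * x +             # deterministic model
     rnorm(length(x), mean = 0, sd = noise.sd)     # one realization of noise

fit <- lm(y ~ x)                  # analytical least-squares regression
summary(fit)                      # compare estimates to the true values

plot(x, y)                        # observations for this realization
abline(fit)                       # overlay the fitted line
```

Running this repeatedly (without the fixed seed) gives a different realization of the noise, and therefore slightly different parameter estimates, each time; that variability is what the full Monte Carlo analysis quantifies.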
The primary difference between linear and nonlinear regressions is that the algorithm for finding optimized parameter values for a nonlinear model typically requires an iterative trial-and-error approach. An iterative approach is needed because the mathematics for calculating the statistics often cannot be solved analytically, and thus must be solved numerically. Another benefit of the numerical approach is generality: it should work for well-constrained models regardless of whether the mathematics could be solved analytically. For example, when applied to the same data set, an iterative numerical solution algorithm applied to a linear model (e.g., using nls()) should provide nearly the same statistics as a simple linear regression analysis (e.g., using lm()). We can convince ourselves that these tools provide almost identical results by performing a nonlinear regression with a linear model and seeing whether we get the same results as a linear regression on the same data. (17:00 min)
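The comparison described above can be sketched as follows. The data-generating values and the starting guesses passed to nls() are illustrative assumptions; note that nls(), unlike lm(), requires starting guesses for the parameters.

```r
# Compare an analytical linear regression (lm) with an iterative
# nonlinear regression algorithm (nls) applied to the same linear model.
# Data-generating values below are illustrative assumptions.
set.seed(1)
x <- 1:20
y <- 3 + 0.7 * x + rnorm(length(x), sd = 0.5)

fit.lm  <- lm(y ~ x)                           # analytical solution
fit.nls <- nls(y ~ b0 + b1 * x,                # same model, solved iteratively
               start = list(b0 = 1, b1 = 1))   # nls() needs starting guesses

coef(fit.lm)
coef(fit.nls)   # should agree with lm() to several decimal places
```

Because the model is linear in its parameters, the iterative algorithm converges essentially immediately, and the two sets of coefficients (and their standard errors, via summary()) should be nearly indistinguishable.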
After an introduction to the use of nls(), we can start thinking about an example of a nonlinear model with which to use it. Part of having an intuitive feel for a given nonlinear equation may start with understanding how it is derived. Understanding the derivation also often helps with intuition about how the curve will change when the parameter values are changed. This module will use a hyperbolic saturation function as an example. Hyperbolic saturation models are frequently applied to enzyme kinetics (the Michaelis-Menten equation) and the growth of microbial biomass (the Monod equation). Let's develop some intuition for the hyperbolic saturation function by deriving it from the most basic definition of a hyperbola (9:23 min).
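One way to sketch the derivation: start from a rectangular hyperbola referred to its asymptotes, place the asymptotes at x = -K and y = y_max, and require the curve to pass through the origin.

```latex
\begin{aligned}
(x + K)\,(y - y_{\max}) &= c
  && \text{hyperbola with asymptotes } x = -K,\; y = y_{\max} \\
(0 + K)\,(0 - y_{\max}) &= c
  && \text{require the curve to pass through the origin} \\
\Rightarrow\quad c &= -K\,y_{\max} \\
(x + K)\,(y - y_{\max}) &= -K\,y_{\max} \\
y &= y_{\max} - \frac{K\,y_{\max}}{x + K}
   = \frac{y_{\max}\,x}{K + x}
\end{aligned}
```

Note that at x = K the function returns y_max/2, which is why K is called the half-saturation constant.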
Perhaps this would be a good time to build some numerical intuition for a hyperbolic saturation function by writing a script that lets you quickly visualize a comparison of two hyperbolic saturation curves with differing parameter values. Let's see what happens when you change the value of the half-saturation parameter without changing anything else. Here is a script to get you started, though this may be a good opportunity to see whether you can write such a script on your own. You will want to create a vector of values for your independent variable. Apply the mathematics of the hyperbolic saturation model to the independent variable twice, each time applying a different value for the half-saturation constant, to generate two different vectors of predictions for the dependent variable. Then, plot both vectors of predictions against the vector of values for the independent variable on the same axes for comparison.
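A minimal sketch of the steps just described might look like the following; the particular values of y.max and the two half-saturation constants are illustrative assumptions.

```r
# Visualize two hyperbolic saturation curves that differ only in the
# half-saturation constant K. Parameter values are illustrative assumptions.
y.max <- 10                          # assumed maximum (horizontal asymptote)
K.1   <- 1                           # first half-saturation constant
K.2   <- 5                           # second half-saturation constant

x   <- seq(0, 20, by = 0.1)          # vector of independent variable values
y.1 <- y.max * x / (K.1 + x)         # predictions using K.1
y.2 <- y.max * x / (K.2 + x)         # predictions using K.2

plot(x, y.1, type = "l", ylim = c(0, y.max),
     xlab = "x", ylab = "y")         # first curve as a solid line
lines(x, y.2, lty = 2)               # second curve as a dashed line
legend("bottomright", legend = c("K = 1", "K = 5"), lty = c(1, 2))
```

Both curves approach the same asymptote y.max, but the curve with the larger half-saturation constant rises more slowly: each curve reaches half of y.max exactly where x equals its K.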
When working with a new model, playing with it numerically in this fashion is a good idea before you try to fit the model to data with tools like nls(). If you have observations you will eventually be trying to fit, you might even use trial and error with the parameter values to fit the data manually, using visualizations similar to those generated by the code above. Knowing what kind of numbers to expect from a nonlinear regression tool, based on a visual fit to the data, is very helpful in developing an intuitive feel for the model, as well as for what you should expect to see from tools like nls().
After developing some intuition for nls() and for the hyperbolic saturation model, you are now ready to test whether a nonlinear regression behaves as expected when applied to artificial data with known statistical properties. This requires the same code as running a single iteration of a Monte Carlo propagation of error for a nonlinear regression. (14:36 min)
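Such a single Monte Carlo realization for the hyperbolic saturation model might be sketched as follows; the "true" parameter values, noise level, and starting guesses are all illustrative assumptions.

```r
# One Monte Carlo realization of data from a hyperbolic saturation model
# with known statistical properties, followed by a nonlinear regression
# with nls() to recover the parameters. Values are illustrative assumptions.
set.seed(7)
true.ymax <- 10                      # assumed "true" maximum
true.K    <- 2                       # assumed "true" half-saturation constant
noise.sd  <- 0.3                     # assumed noise standard deviation

x <- seq(0.5, 20, by = 0.5)
y <- true.ymax * x / (true.K + x) +              # deterministic model
     rnorm(length(x), mean = 0, sd = noise.sd)   # one realization of noise

fit <- nls(y ~ ymax * x / (K + x),
           start = list(ymax = 8, K = 1))   # rough starting guesses
summary(fit)                    # compare estimates to the true values

plot(x, y)                      # observations for this realization
lines(x, predict(fit))          # overlay the fitted curve
```

The parameter estimates and standard errors reported by summary(fit) should be consistent with the known values used to construct the data; wrapping this realization in a loop is the basis of the full Monte Carlo propagation of error.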
Click this link to download the MS PowerPoint file
The embedded Google viewer below sometimes provides poor renderings of Microsoft files. Use the link above to download the original file with proper formatting.
Example of a script that compares results from an analytical linear regression solution (e.g., lm()) to results from a nonlinear regression algorithm applied to a linear model (e.g., nls()).
Example of a script that performs a nonlinear regression on a single Monte Carlo realization constructed with a hyperbolic saturation model.
We will be covering the details of R data structures like vectors, lists, matrices, and data frames as we need them. However, the relevant details on the R data structures we will use the most have been compiled into a single document. These notes attempt to cover the aspects of R data structures that I have seen cause the worst misconceptions, or the hardest-to-find bugs, in students' code.
Notes on the basic R data structures used in this class (link to the full page HTML version)
Rmarkdown source code for detailed notes on basic R data structures (download the Rmd file)
Notes on the basic R data structures used in this class (download the postscript PDF file)
Most exercises will not require sophisticated graphing skills, and the materials for this class provide examples using base R graphics. However, base R graphics are an incredibly flexible graphing tool, and understanding just a few fundamentals gives you the capacity to tweak graphs to look exactly as you would like. The following is a document I have started (generated by Rmarkdown) that at least initiates a deeper dive into graphing with base R.
A deeper dive into graphing in base R (link to the full page HTML version)
A deeper dive into graphing in base R (download the fully encapsulated HTML version)
Rmarkdown source code for a deeper dive into graphing in base R (download the Rmd file)
A deeper dive into graphing in base R (download the postscript PDF file)