Any science needs to connect its observations and measurements faithfully with the systematic implications it draws from them.
Analysis methods are a filter and a lens in the process leading from observations to knowledge and value.
In the communication between the data generating process and knowledge, analysis methods can be low- or high-fidelity. The details of how this connection is performed matter.
All but the most trivial of scientific questions will require modeling. The numerous decisions and assumptions made in the process of regression modeling and estimation determine the accuracy of any signal or pattern that is the target of scientific interest.
Modeling is not simply the plumbing between data and conclusions. Most observational data in health science are generated by largely nonrandom and poorly understood mechanisms of subject selection, exposure assignment, measurement error, and missing data, and they are analyzed to evaluate effects that are latent under these mechanisms.
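As a small illustration of why nonrandom exposure assignment matters, the sketch below simulates a hypothetical cohort in which sicker patients are more likely to be treated. Everything in it is an assumption made for illustration: the variable names, the probabilities, and the single binary confounder (`severity`). In this toy setup the treatment lowers risk by 0.10 within each severity stratum, yet the crude comparison suggests harm because treated patients are sicker on average.

```python
import random

def simulate(n=20000, seed=0):
    """Simulate a hypothetical cohort with confounded treatment assignment."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        severity = rng.random() < 0.5            # hypothetical confounder, 50% prevalence
        p_treat = 0.8 if severity else 0.2       # sicker patients are treated more often
        treated = rng.random() < p_treat
        # Assumed outcome model: baseline risk 0.2, +0.4 if severe, -0.1 if treated
        p_outcome = 0.2 + (0.4 if severity else 0.0) - (0.1 if treated else 0.0)
        outcome = rng.random() < p_outcome
        rows.append((severity, treated, outcome))
    return rows

def risk(rows, treated, severity=None):
    """Observed outcome risk among rows with the given treatment (and stratum)."""
    sel = [o for s, t, o in rows
           if t == treated and (severity is None or s == severity)]
    return sum(sel) / len(sel)

rows = simulate()

# Crude risk difference: ignores how treatment was assigned
crude = risk(rows, True) - risk(rows, False)

# Stratified risk difference, averaged with equal weights because the
# two severity strata are equally common in this simulated population
adjusted = sum(risk(rows, True, s) - risk(rows, False, s)
               for s in (False, True)) / 2

print(f"crude risk difference:    {crude:+.3f}")
print(f"adjusted risk difference: {adjusted:+.3f}")
```

The adjustment works here only because the confounder is known, measured without error, and fully captured by two strata. Real observational data rarely offer those guarantees, which is why the assumptions behind a model deserve as much scrutiny as the model itself.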
Modeling well is a craft, and much more than facility with a toolkit of regression techniques. A successful program of scientific modeling requires a rigorous understanding of the specific purposes of the research, of the analytic strategy and modeling tools employed, and of the raw material (data and assumptions) fed to the model, including the processes that generated the data.
Modeling well begins with asking good questions.
The craft of good science is a principled, well-integrated, consistent, and reliable process for translating data into meaning.
Current areas of focus:
GoodScience is advising on modern applied methods for observational data analysis
Using subject-matter-based structural causal models to guide model building and causal inference
Longitudinal analysis methods for observational studies
Design and evaluation of rigorous inference using external comparators
Simulation and prototyping to optimize evidence generation strategies
A "systems-thinking" perspective is crucial for the analytic design connecting the data-generating process with the inference-generating process. By identifying the numerous sources of information loss or distortion (noise) in the health research process and applying the best set of solutions, we can create a more efficient, reliable, and productive evidence generation system. This has practical implications for the value produced by any evidence generation endeavor and, ultimately, for improving outcomes for patients.