Post date: Aug 26, 2012 3:41:56 PM
People commonly classify empirical studies as experiments, quasi-experiments, field studies, archival studies, and simulations. In turn, field studies can be further divided into surveys and ethnographies.
Experiments (and quasi-experiments)
The distinguishing feature of experiments and quasi-experiments is the use of an intervention or treatment of some kind, whose effects you are trying to assess. Typically, they also involve pre- and post-measurements of the dependent variable. See the first half of the design elements page for a detailed discussion.
Experiments are seen as our best chance for determining causality, for two main reasons. First, the independent variables are manipulated. One basic concept of cause is that if changing X changes Y, then X is a cause (as opposed to simply observing that people with X=1 tend to have Y=1, while people with X=2 have Y=2). If I kick a ball down the field, it sure seems like I caused the ball's new location. This applies to both true experiments and quasi-experiments. Second, and this applies to true experiments only, random assignment of cases to treatment groups controls for other factors. If you allow independent variables to be assigned naturally (as in, say, a study comparing health outcomes of smokers and non-smokers), there are a thousand other factors that go along with that assignment, and there is no way to know which are creating the effect on Y. In other words, people who smoke are probably doing other things differently as well, such as eating fast food, not exercising, and spending a lot of time outside of buildings in the middle of winter congregating with other smokers. In addition, there is a host of factors that led certain people to smoke and led others not to smoke, all of which could be related to outcomes.
But there is more to it than that. Suppose you look at whether taking classes on passing standardized tests like the GMAT actually improves test scores. The correlation between taking the classes and test scores provides some evidence. But suppose the reason that people study for the GMAT is that they previously took the GMAT and did poorly. This provides motivation for doing well on the second try that might not have been there before, and it is this motivation, rather than the classes, that drives performance. In an observational study, this can't be ruled out. But if people are randomly assigned to take the class or not, then we know that nothing associated with performance caused people to belong to one group or the other. There could not be endogeneity (where the independent variable is influenced by factors that also affect the dependent variable).
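The GMAT scenario can be sketched as a toy simulation (the specific numbers, such as the base score of 500 and the size of the motivation effect, are invented for illustration). In the observational version, a hidden motivation variable drives both class-taking and scores, so the class looks effective even though it does nothing; random assignment severs that link.

```python
import random

random.seed(42)

def observational_sample(n=10_000):
    """Motivated people (e.g., those who did poorly before) take the class.
    The class has NO true effect; motivation drives the scores."""
    data = []
    for _ in range(n):
        motivation = random.gauss(0, 1)          # hidden confounder
        takes_class = motivation > 0             # motivated people enroll
        score = 500 + 50 * motivation + random.gauss(0, 30)
        data.append((takes_class, score))
    return data

def randomized_sample(n=10_000):
    """Class assignment is a coin flip, unrelated to motivation."""
    data = []
    for _ in range(n):
        motivation = random.gauss(0, 1)
        takes_class = random.random() < 0.5
        score = 500 + 50 * motivation + random.gauss(0, 30)
        data.append((takes_class, score))
    return data

def group_gap(data):
    """Mean score of class-takers minus mean score of non-takers."""
    treated = [s for t, s in data if t]
    control = [s for t, s in data if not t]
    return sum(treated) / len(treated) - sum(control) / len(control)

print(group_gap(observational_sample()))  # large gap: pure confounding
print(group_gap(randomized_sample()))     # near zero: the true (null) effect
```

The observational comparison shows a large score gap even though the class does nothing, while the randomized comparison correctly shows roughly zero.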
Unfortunately, while there could not be endogeneity in the experiment (because randomly assigning treatments guaranteed that the independent variable was causally independent of every other variable), that doesn't mean there isn't endogeneity in the real world. The experiment would convincingly show that, at least in one test, the classes made a difference. But, in the real world, it could still be that past performance determines present performance.
Note: treatments in experiments tend to be categorical variables, such as being assigned diet 1, diet 2, or diet 3. But in principle, they don't have to be. For example, you might give experimental subjects a quiz, then randomly assign fake grades from 0 to 100 to see how it affects their confidence. So a treatment is basically the same thing as an independent variable, except that you know that values of the independent variable were not caused by anything, since they were randomly assigned by the experimenter.
Field studies (including surveys and secondary data analyses)
Field studies are observational studies. You go out into the "field" (broadly defined to include any natural setting or existing data) and observe nature as it happens. Field studies can often mimic experiments in the sense that one might observe several different situations that (appear to) vary by a single variable, very much like a treatment. For example, you compare communities that use electronic voting with communities that don't. Field studies range from doing surveys to participant observation and ethnographic interviewing. See the second half of the design elements page for more information.
Much research aims at identifying causality, though some studies are primarily descriptive or exploratory. The challenge with field studies is that when we fit a model explaining Y as a function of the Xs (Y = b0 + b1X1 + b2X2 + ...), we don't know what caused the X variables. It could be that whatever causes them causes Y directly, so that the X variables are mere bystanders.
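To see what the bystander problem looks like in a fitted model, here is a small sketch (the variables and effect sizes are invented): Z causes both X and Y, and X has no effect of its own, yet the naive slope of Y on X is large. Partialling Z out of both variables (the Frisch-Waugh trick) drives the slope toward zero.

```python
import random

random.seed(7)
n = 5_000

z = [random.gauss(0, 1) for _ in range(n)]                 # common cause
x = [zi + random.gauss(0, 0.5) for zi in z]                # X driven by Z
y = [2.0 * zi + random.gauss(0, 0.5) for zi in z]          # Y driven by Z, not X

def slope(xs, ys):
    """OLS slope of ys on xs (with intercept)."""
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    cov = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
    var = sum((a - mx) ** 2 for a in xs)
    return cov / var

naive = slope(x, y)

# Frisch-Waugh: the partial slope of Y on X controlling for Z equals the
# slope of the Y-residuals on the X-residuals after removing Z from each.
bx = slope(z, x); rx = [xi - bx * zi for xi, zi in zip(x, z)]
by = slope(z, y); ry = [yi - by * zi for yi, zi in zip(y, z)]
partial = slope(rx, ry)

print(naive)    # large: X looks like a strong cause of Y
print(partial)  # near zero: controlling for Z, X is a bystander
```

The naive regression attributes Z's effect to X; once Z is controlled, X's coefficient collapses, which is exactly what we cannot check in a field study when Z is unobserved.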
Archival studies / secondary data analysis
Archival studies analyze existing records and documents that were created for purposes other than research. Rather than generating new data through experiments, surveys, or observations, researchers examine pre-existing materials such as company reports, government statistics, newspaper articles, meeting minutes, financial records, or historical documents.
The key advantage of archival studies is that the data already exist, making them relatively inexpensive and allowing researchers to study phenomena over long time periods or examine events that would be impossible to recreate. For example, you might study organizational decision-making by analyzing decades of board meeting minutes, or examine economic trends using government census data from the past century.
However, archival studies face unique challenges. Since the data were generally not collected for research purposes, they may not perfectly match what the researcher needs. Important variables might be missing, definitions may change over time, or the records might be incomplete. Additionally, researchers must consider potential biases in how the original data were created and preserved; for instance, companies might selectively retain documents that make them look good.
Like other observational studies, archival research struggles with establishing causality. When analyzing the relationship between variables in historical records, researchers cannot control for all possible confounding factors or determine the direction of causation. Nevertheless, archival studies remain valuable for understanding long-term patterns, testing theories with real-world data, and investigating questions where experiments would be impractical or unethical.
Simulations
Simulation studies tend to be controversial on a number of grounds, and some people do not regard them as empirical studies at all. A simulation creates a world (say, in a computer) that has certain rules, and then the researcher observes how these rules play themselves out. Experiments are done by varying the rules. In my view, simulations are empirical in the sense that, since we cannot analytically calculate the outcomes of a given set of rules, we observe and record what happens. But simulations are not empirical in the sense of the real world being examined. See the simulations page for a full discussion.