
Qualitative (Dummy) Variables


 
See also "Equality of Slopes Model" for an example of the use of dummy variables.


Qualitative or categorical variables are incorporated into a regression model using dummy or indicator variables.

(let DV = Dummy Variable)


Reference Level = the level against which the other levels are compared.


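EXAMPLES of Parameterization for Reference-Level (0/1 coded) models

Note: the 0/1 coding shown here is reference-level (treatment) coding. Strictly speaking, sigma-restricted coding assigns -1 rather than 0 to one level so that the level effects sum to zero.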

Example for two-level categorical variable:

    Level    DV
    L1        0    (reference level)
    L2        1

    model equation:  Y = b0 + b1*DV


Example for three-level categorical variable:

    Level    DV1    DV2
    L1        0        0        (reference level)
    L2        1        0
    L3        0        1

    model equation:  Y = b0 + b1*DV1 + b2*DV2
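
The coding tables above can be generated mechanically. The following is a minimal sketch in plain Python, not any package's actual routine; the function name dummy_code and its parameters are hypothetical and used only for illustration. It creates one 0/1 column for every level except the reference level, so the reference level is coded as all zeros.

```python
def dummy_code(values, levels, reference):
    """Return one row of 0/1 dummy variables per observation.

    One column is created for every level except the reference level,
    so the reference level is coded as all zeros.
    """
    columns = [lvl for lvl in levels if lvl != reference]
    return [[1 if v == col else 0 for col in columns] for v in values]


# Three-level variable with L1 as the reference level (columns: DV1, DV2).
observations = ["L1", "L2", "L3", "L2"]
rows = dummy_code(observations, levels=["L1", "L2", "L3"], reference="L1")
for obs, row in zip(observations, rows):
    print(obs, row)
```

Passing a different value for reference changes which level is coded as all zeros, which mirrors the option some packages offer for choosing the reference level.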


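Software packages generally assign a default reference level, typically the first or last level in sort order (for example, R defaults to the first level and SAS to the last).  Many allow the user to specify which level is the reference level.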


EXAMPLES of Parameterization for Overparameterized models

Example for three-level categorical variable:

    Level    DV1    DV2    DV3
    L1        1        0        0
    L2        0        1        0
    L3        0        0        1


    model equation:  Y = b0 + b1*DV1 + b2*DV2 + b3*DV3
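
In the overparameterized coding, DV1 + DV2 + DV3 equals 1 for every observation, which duplicates the intercept column. The sketch below (assuming NumPy is available; the variable names are illustrative only) checks this by comparing the rank of the design matrix with its number of columns. Because the overparameterized matrix is rank deficient, the coefficients are not uniquely determined without an added constraint or a generalized inverse.

```python
import numpy as np

# One observation per level is enough to show the column structure.
dummies = np.eye(3)              # columns DV1, DV2, DV3 for levels L1, L2, L3
intercept = np.ones((3, 1))      # column that multiplies b0

# Overparameterized design: intercept plus one dummy per level.
X_over = np.hstack([intercept, dummies])
# Reference-level design: drop DV1 so that L1 becomes the reference level.
X_ref = np.hstack([intercept, dummies[:, 1:]])

rank_over = np.linalg.matrix_rank(X_over)
rank_ref = np.linalg.matrix_rank(X_ref)

print("overparameterized:", X_over.shape[1], "columns, rank", rank_over)
print("reference-level:  ", X_ref.shape[1], "columns, rank", rank_ref)
```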



Interpretation of Coefficients

Equality of Intercepts  (see also this page)
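In the model equations above, the coefficients for the dummy variables modify the model intercept term (b0).
  • With 0/1 coding and a reference level, b0 is the intercept for the reference level, and each dummy coefficient (b1, b2, ...) is the difference between that level's intercept and the reference level's intercept.
  • Testing whether the dummy coefficients equal zero tests the hypothesis that the intercept terms for each level of the categorical variable are equal.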
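
As an illustration of how the intercept relates to the reference level, the sketch below fits Y = b0 + b1*DV1 + b2*DV2 by ordinary least squares with NumPy. The response values are made up for the example and L1 is assumed to be the reference level. With no other explanatory variables in the model, b0 comes out as the mean of the reference level, and b1 and b2 as the differences between the other level means and the reference mean.

```python
import numpy as np

# Hypothetical responses for a three-level variable (L1 is the reference).
y_by_level = {"L1": [10.0, 12.0], "L2": [15.0, 17.0], "L3": [20.0, 24.0]}

rows, y = [], []
for level, values in y_by_level.items():
    for value in values:
        # Columns: intercept, DV1 (1 for L2), DV2 (1 for L3)
        rows.append([1.0, float(level == "L2"), float(level == "L3")])
        y.append(value)

b0, b1, b2 = np.linalg.lstsq(np.array(rows), np.array(y), rcond=None)[0]

means = {level: np.mean(values) for level, values in y_by_level.items()}
print("b0 =", round(b0, 6), " mean(L1) =", means["L1"])
print("b1 =", round(b1, 6), " mean(L2) - mean(L1) =", means["L2"] - means["L1"])
print("b2 =", round(b2, 6), " mean(L3) - mean(L1) =", means["L3"] - means["L1"])
```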

Equality of Slopes  (see also this page)
If an interaction term with another model explanatory variable (X) is included, then the interaction term coefficient modifies the slope term for that explanatory variable (X).
  • Testing whether the interaction coefficients equal zero tests the hypothesis that the slope terms for each level of the categorical variable are equal.
  • Statistica offers a specific regression model routine called the "homogeneity of slopes" model.
  • Coding of this model using scripted packages (SAS, R) is relatively straightforward.

    Example model equation with two-level DV:   Y = b0 + b1*X + b2*DV + b3*X*DV
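
In this equation, the fitted line for DV = 0 is Y = b0 + b1*X, and for DV = 1 it is Y = (b0 + b2) + (b1 + b3)*X, so b2 is the difference in intercepts and b3 is the difference in slopes. The sketch below checks this with NumPy on noise-free, made-up data (the lines Y = 1 + 2X and Y = 4 + 5X are assumptions chosen for the example); it is an illustration of the parameterization, not a substitute for a package's homogeneity-of-slopes routine.

```python
import numpy as np

x = np.array([0.0, 1.0, 2.0, 0.0, 1.0, 2.0])
dv = np.array([0.0, 0.0, 0.0, 1.0, 1.0, 1.0])

# Noise-free illustrative data: DV=0 follows Y = 1 + 2X, DV=1 follows Y = 4 + 5X.
y = np.where(dv == 0, 1.0 + 2.0 * x, 4.0 + 5.0 * x)

# Design matrix columns: intercept, X, DV, X*DV (the interaction term).
X = np.column_stack([np.ones_like(x), x, dv, x * dv])
b0, b1, b2, b3 = np.linalg.lstsq(X, y, rcond=None)[0]

print("DV=0: intercept", round(b0, 6), "slope", round(b1, 6))
print("DV=1: intercept", round(b0 + b2, 6), "slope", round(b1 + b3, 6))
```

With real data, the hypothesis of equal slopes would be assessed from the estimate of b3 and its standard error, which a statistical package reports alongside the fit.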



 