Associations for objects described by qualitative and inhomogeneous variables
Symmetrical Gower coefficient (S15)
Coefficient of Estabrook and Rogers (S16)
The symmetrical Gower coefficient (S15) may be used for heterogeneous data sets (i.e. data sets that combine several variable types). It calculates a partial similarity between two objects for each variable describing them, and the final similarity score is the average of these partial similarities. Binary variables, qualitative and semi-quantitative variables, and quantitative variables are each treated differently.
Binary variables can be evaluated symmetrically or asymmetrically.
Qualitative and semi-quantitative variables receive a partial similarity of 1 when the two objects share the same value or state, and 0 otherwise.
For quantitative variables, a dissimilarity is calculated by dividing the absolute difference between the two objects' values for a given variable by the range of that variable across all objects. The partial similarity is then one minus this dissimilarity.
Missing data may be accounted for by including a Kronecker delta as a weight for each variable: it equals 1 when both objects have a value for that variable and 0 otherwise, so that variables with missing values are excluded from the average.
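The rules above for the symmetrical Gower coefficient can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a reference implementation: the function name, the `kinds` labels, the use of `None` to mark missing values, and the choice to return a partial similarity of 1 when a variable's range is zero are all hypothetical conventions introduced here. Binary variables are treated symmetrically (two zeros count as agreement).

```python
from typing import Optional, Sequence


def gower_similarity(
    obj1: Sequence[object],
    obj2: Sequence[object],
    kinds: Sequence[str],
    ranges: Sequence[Optional[float]],
) -> float:
    """Average of partial similarities between two objects.

    kinds[j] is "binary", "qualitative" (also used for semi-quantitative
    variables) or "quantitative"; ranges[j] is the range of variable j
    across all objects and is only read for quantitative variables.
    """
    total = 0.0
    n_compared = 0
    for a, b, kind, rng in zip(obj1, obj2, kinds, ranges):
        # Kronecker delta: weight 0 (skip) if either value is missing.
        if a is None or b is None:
            continue
        if kind == "quantitative":
            # Assumption: a zero range means all objects are identical
            # for this variable, so the partial similarity is 1.
            s = 1.0 - abs(a - b) / rng if rng else 1.0
        else:
            # Binary (symmetrical), qualitative, semi-quantitative:
            # 1 if the states match, 0 otherwise.
            s = 1.0 if a == b else 0.0
        total += s
        n_compared += 1
    if n_compared == 0:
        raise ValueError("no variable is available for both objects")
    return total / n_compared


# Hypothetical objects: one binary, one qualitative, two quantitative variables.
x1 = [1, "red", 2.0, None]
x2 = [1, "blue", 4.0, 3.0]
kinds = ["binary", "qualitative", "quantitative", "quantitative"]
ranges = [None, None, 8.0, 5.0]
print(gower_similarity(x1, x2, kinds, ranges))
```

In this example the partial similarities are 1 (binary match), 0 (qualitative mismatch) and 1 - 2/8 = 0.75 (quantitative); the fourth variable is skipped because one value is missing, so the result is 1.75 / 3.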
Similar to the Gower coefficient, the coefficient of Estabrook and Rogers (S16) computes the average partial similarity across all variables for which information exists for both objects under consideration. The partial similarities are calculated somewhat differently, however, and the calculation can be defined on a per-variable basis. Further, each partial similarity depends not only on the difference between the two variable values or states, but also on a user-defined parameter, which may be variable specific. Should the difference exceed this parameter, that partial similarity is set to zero.
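The thresholding behaviour of the Estabrook and Rogers coefficient can be illustrated with a short Python sketch. The text above does not state the partial similarity function, so the sketch assumes the form commonly given in the numerical ecology literature, f(d, k) = 2(k + 1 - d) / (2k + 2 + dk) when d <= k and 0 otherwise, where d is the absolute difference between the two values and k is the user-defined parameter. This formula should be checked against a reference before use. The function names and the use of `None` for missing values are hypothetical.

```python
from typing import Optional, Sequence


def partial_similarity(d: float, k: float) -> float:
    """Assumed partial similarity function f(d, k).

    Returns 0 when the difference d exceeds the threshold k,
    and 1 when d is 0.
    """
    if d > k:
        return 0.0
    return 2.0 * (k + 1.0 - d) / (2.0 * k + 2.0 + d * k)


def estabrook_rogers_similarity(
    obj1: Sequence[Optional[float]],
    obj2: Sequence[Optional[float]],
    thresholds: Sequence[float],
) -> float:
    """Average partial similarity over variables known for both objects.

    thresholds[j] is the user-defined parameter k for variable j.
    """
    total = 0.0
    n_compared = 0
    for a, b, k in zip(obj1, obj2, thresholds):
        # Skip variables with a missing value in either object.
        if a is None or b is None:
            continue
        total += partial_similarity(abs(a - b), k)
        n_compared += 1
    if n_compared == 0:
        raise ValueError("no variable is available for both objects")
    return total / n_compared


# Hypothetical objects described by four ordered variables.
x1 = [2, 5, 1, None]
x2 = [2, 4, 4, 3]
k = [1, 1, 2, 1]
print(estabrook_rogers_similarity(x1, x2, k))
```

Here the differences are 0, 1 and 3 with thresholds 1, 1 and 2, giving partial similarities of 1, 0.4 and 0 (the last difference exceeds its threshold); the fourth variable is skipped, so the result is 1.4 / 3.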