Correlation Extravaganza

Post date: May 30, 2011 4:58:58 AM

From the Stats With Cats blog, here is the chart accompanying the post Secrets of Good Correlations, which offers the most comprehensive collection of correlation coefficient possibilities that I have ever encountered!

Other highlights include Grasping at Flaws and Ten Tactics used in the War on Error.

Finally, I won't forget Fifty Ways to Fix Your Data, which includes these lyrics:

(Sing to the tune of “Fifty Ways to Leave Your Lover” by Paul Simon)

The problem is all about your scales, she said to me

The R-squares will be better if you’ve matched ’em mathematically

It’s just a way to make your model fit nicely

There must be fifty ways to fix your data

She said it’s really not my preference to transform

’Cause sometimes, the new scales confuse, overfit, or misinform

But I’ll Box-Cox ’em all if it means they’ll fit the norm

There must be fifty ways to fix your data

Fifty ways to fix your data

Take the tails for a trim, Kim

Try a replace, Grace

You can use the rank, Hank

Just try ’em and see

Make it more smooth, Suz

Lots of functions you can choose

A higher degree, Dee

Will get you more fee.

The lyrics are followed by various pointers and tips on data quality, and the post concludes with another grand chart summarizing techniques for data transformations.

To quote the Stats With Cats author, "government statistician" Charlie Kufs:

When your instructor gave you a dataset in Statistics 101, that was it. You did what the assignment called for, got the desired answer, and you were finished. But it doesn’t work that way in the real world overflowing with data but lacking in wisdom. Sometimes you have to put more effort into making sense of things.

Here is the best part of all!

Statistics is the mortar that brings data and metadata together to make building blocks of information into a temple of wisdom. Transformations are like mason’s tools. They can smooth, reshape, adjust, add texture, augment, condense, and on and on. Suffice it to say that with transformations, there must be at least fifty ways to fix your data.
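
For anyone who wants to try a few of the "fixes" named in the lyrics, here is a minimal sketch in plain Python of three of them: trimming the tails, using the rank, and Box-Cox. The function names (trim_tails, rank_transform, box_cox), the sample data, and the fixed lambda are my own illustrative assumptions, not anything taken from the Stats With Cats post. In practice the Box-Cox lambda is usually estimated from the data (for example by maximum likelihood) rather than picked by hand.

```python
import math


def trim_tails(values, fraction=0.1):
    """Drop the lowest and highest `fraction` of the sorted values."""
    ordered = sorted(values)
    k = int(len(ordered) * fraction)  # number of points cut from each tail
    return ordered[k:len(ordered) - k]


def rank_transform(values):
    """Replace each value with its 1-based rank; ties share the average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        # Extend j to cover the run of values tied with position i.
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    return ranks


def box_cox(values, lam):
    """Box-Cox transform with a fixed lambda; data must be strictly positive."""
    if any(v <= 0 for v in values):
        raise ValueError("Box-Cox requires strictly positive values")
    if lam == 0:
        return [math.log(v) for v in values]  # the lambda = 0 case is the log
    return [(v ** lam - 1) / lam for v in values]


# Hypothetical right-skewed sample with one extreme value.
data = [1.0, 2.0, 2.0, 4.0, 8.0, 16.0, 32.0, 64.0, 128.0, 1000.0]

trimmed = trim_tails(data, fraction=0.1)  # removes 1.0 and 1000.0
ranks = rank_transform(data)              # the two 2.0 values share rank 2.5
logged = box_cox(data, lam=0)             # natural log of each value
```

Each function changes the scale in a different way: trimming discards the extremes, ranking keeps only the ordering, and Box-Cox compresses the long right tail. As the lyrics warn, the new scales can confuse, overfit, or misinform, so it is worth checking the result rather than assuming the fit got better.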