Correlation Extravaganza
Post date: May 30, 2011 4:58:58 AM
From the Stats With Cats blog, here is the chart accompanying the post Secrets of Good Correlations, which offers the most comprehensive collection of correlation coefficient possibilities that I have ever encountered!
Other highlights include Grasping at Flaws and Ten Tactics used in the War on Error.
Finally, I won't forget Fifty Ways to Fix Your Data, which includes these lyrics:
(Sing to the tune of “Fifty Ways to Leave Your Lover” by Paul Simon)
The problem is all about your scales, she said to me
The R-squares will be better if you’ve matched ’em mathematically
It’s just a way to make your model fit nicely
There must be fifty ways to fix your data
She said it’s really not my preference to transform
’Cause sometimes, the new scales confuse, overfit, or misinform
But I’ll Box-Cox ’em all if it means they’ll fit the norm
There must be fifty ways to fix your data
Fifty ways to fix your data
Take the tails for a trim, Kim
Try a replace, Grace
You can use the rank, Hank
Just try ’em and see
Make it more smooth, Suz
Lots of functions you can choose
A higher degree, Dee
Will get you more fee.
The lyrics are followed by various pointers and tips on data quality, and the post concludes with another grand chart summarizing techniques for data transformations.
To quote the Stats With Cats author, "government statistician" Charlie Kufs:
When your instructor gave you a dataset in Statistics 101, that was it. You did what the assignment called for, got the desired answer, and you were finished. But it doesn’t work that way in the real world overflowing with data but lacking in wisdom. Sometimes you have to put more effort into making sense of things.
Here is the best part of all!
Statistics is the mortar that brings data and metadata together to make building blocks of information into a temple of wisdom. Transformations are like mason’s tools. They can smooth, reshape, adjust, add texture, augment, condense, and on and on. Suffice it to say that with transformations, there must be at least fifty ways to fix your data.
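For readers who want to try a few of the fixes the lyrics mention, here is a minimal Python sketch of three of them: a Box-Cox power transform ("I'll Box-Cox 'em all"), a rank transform ("You can use the rank, Hank"), and a tail treatment in the spirit of "Take the tails for a trim, Kim". None of this code comes from the Stats With Cats post. The function names, the sample data, and the choice to clip the tails to fixed bounds (rather than drop the extreme points) are my own assumptions for illustration.

```python
import math


def box_cox(values, lam):
    """Box-Cox power transform with a fixed, user-chosen lambda.

    Uses log(x) when lam == 0 and (x**lam - 1) / lam otherwise.
    Only defined for strictly positive values.
    """
    if any(v <= 0 for v in values):
        raise ValueError("Box-Cox requires strictly positive values")
    if lam == 0:
        return [math.log(v) for v in values]
    return [(v ** lam - 1) / lam for v in values]


def rank_transform(values):
    """Replace each value with its 1-based rank; ties share the average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        # Extend j to cover the run of tied values starting at position i.
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    return ranks


def clip_tails(values, lower, upper):
    """Pull values outside [lower, upper] back to the nearest bound.

    This clips extremes instead of deleting them, so the sample size is kept.
    """
    return [min(max(v, lower), upper) for v in values]


# Hypothetical, made-up sample with one extreme value.
data = [1.0, 2.0, 2.0, 4.0, 100.0]

logged = box_cox(data, 0)              # log scale compresses the large value
ranked = rank_transform(data)          # only the ordering survives
clipped = clip_tails(data, 1.0, 10.0)  # the extreme value is pulled in to 10.0
```

Here lambda and the clipping bounds are picked by hand. In practice they would be chosen from the data and, as the lyrics warn, the new scales can confuse, overfit, or misinform, so it is worth checking any transformed result against the original.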