Avoiding pitfalls and wrestling with dirty data