Learning Over Dirty Data