Data Cleanup Tips for Microsoft Excel

Post date: 17-Jan-2011 06:50:51

 

Clean Up Extra Spaces or Characters Using Find and Replace

This is simple; it just takes a series of tasks. If you have what you believe is multiple spaces between words or numbers within the cells, or perhaps you have 8 different formats for phone numbers, use Find and Replace on the column.

To get rid of spaces within cells. Maybe there's 2 spaces between some of your words and you have multiple words in the cells.

To cleanup phone lists when 8 people have submitted them all differently. Once you've deleted all the extra characters, and you have just 9 digits, you can then format the cells as a phone number, and it works just right.

Use the TRIM Function to Remove Spaces

The TRIM function removes spaces to the left and right of your data, but not spaces between data within a single cell.

Insert a blank column to the right of the column you want to trim. For our purposes, we'll use column A as the column we want to trim, and B2 is where we'll put our first formula.

=TRIM(A2)

Copy the formula down using the fill handle. In order to work with the trimmed text, highlight the cells, hit EditàCopy, then hit EditàPaste Special, Values, OK. That removes the underlying formulas. You can now move your column heading over to your new column of trimmed data, and delete the original column.

Use the CLEAN Function to Remove Non-Printing Characters

Thanks to reader "Andy" for pointing out to me that the CLEAN function belonged here!

Insert a blank column to the right of the column you want to clean. For our purposes, we'll use column A as the column we want to clean, and B2 is where we'll put our first formula.

=CLEAN(A2)

Copy the formula down using the fill handle. In order to work with the cleaned text, highlight the cells, hit EditàCopy, then hit EditàPaste Special, Values, OK. That removes the underlying formulas. You can now move your column heading over to your new column of cleaned data, and delete the original column.

Vertical to Horizontal Addresses

Pretty much everyone can figure out that, in the demonstration below, we need to use formulas. But what they can't figure out is how to copy them properly.

Create three "sets". A set is the formula, and the corresponding blank rows that follow it. In the case below, we make 3 sets, and select them. From here, we copy down. DO NOT STOP dragging until you've copied all the way down. If you do, your "set" becomes however many cells are selected at the time you begin dragging again.

If you have addresses that have two lines and four lines, then you'll be better off using simple COUNTs to determine how many lines there are in the address, then cut those addresses out and work on those on a different sheet. Here, I see if the 3rd cell below the name is blank. If it is, I do nothing. If there's something in it, we're alerted that there's a problem with this record.

Sorting as a Data Cleanup Tool

In the scenario above, however, we could still easily use formulas to get our data separated into columns. So what if we have many that have city addresses in Columns E and F? We can sort them once we've converted them to values. Sorting makes data cleanup a breeze. It helps to find wrongly spelled city names, bad zip codes, and other data that's been fouled by the scanning process.