Complete Excel


Complete Excel

  • Complete Excel
  • Cleaning Data in Excel
  • Pivot Table

Complete Excel

Why do we clean data in Excel ?

  • Complete Excel / Why do we clean data in Excel

Complete Excel why is it so important because for me this is one of the most important things that often gets overlooked so we’ll go through why we clean data and we’ll take a look at some of the common issues you might find when you’re working with data that isn’t necessarily your own so maybe you’ve downloaded data from a system or maybe you’ve been sent a spreadsheet by somebody else and you need to tidy it up we’ll take a look at some of the common issues that might arise and then we’ll go through and we’ll look at how we can deal with those so I’m going to show you things like how to quickly adjust rows and columns how you can delete out blank rows from your data and also quickly fill any blank cells we’ll take a look at things like removing duplicates, This Step will help when upload it to Power BI

  • Common Issues

If we don’t have our data clean we’re going to have problems so this is why it is so important so what are some of the issues that we might come across well I’ve got a few of them listed out here

    • Inconsistent column widths and row Heights
    • Adjust Rows & Columns

    • Columns formatted incorrectly

    • Blank Rows

    • Empty Cells

HomeAttention Make Sure to Select Table NOT Entire Sheet (CTRL +END)

Always better to have a value in here than just leave the cell blank if we’re doing some kind of analysis so even if you just put a zero in here that is better than nothing so how do we automatically input a zero into all of the blank cells well again we can select we can select the entire data set if you wanted to control a or if it’s just in specific columns you can just select those columns and again we’re going to use go to special so Ctrl G again special and we’re going to choose blanks this time it’s just going to highlight those blank cells and if I want to put a zero in all of those in one go I just press zero and control enter and it’s going to fill them all in.

Selected Empty  Cells

Those blank cells and if I want to put a zero in all of those in one go I just press zero and control enter and it’s going to fill them all in okay so go to special type in zero control enter

Complete Excel / Now Empty Cells Filled in with Zero

    • Unwanted Formatting Borders
    • Duplicated Rows
    • Text Not Split
    • Inconsistent Case
    • Replace Abbreviations
    • Incorrect Spacing
    • Headers Not Specify
    • Number Formatted as Text

Complete Excel  / Conti.

  • Removing Duplicates

  • Messy Formatting

Here let’s deal with the formatting because we’ve got these Light Blue shaded backgrounds on some cells we’ve got orange and I can also see that I’ve got some weird sort of borders up here as well so I want to make everything more consistent and just remove all of this formatting so I’m going to select all of my data so I can click anywhere in my data little keyboard shortcut to select all of your data control a to select everything now that control a keyboard shortcut is worth noting if you’re clicked outside of your data set so if I’m just clicked randomly on a cell here and I press Ctrl a that’s going to select the entire spreadsheet if you just want to select your data make sure you’re clicked within your data when you press Ctrl a this is a really simple one we can go to the Home tab in the editing group we have a clear option just here and we have a few things that we can clear but I just want to clear the formatting so I’m just going to choose clear formats and that’s going to remove those weird borders and the background shading that I have there nice and straightforward

  • Delete out Blank Rows

Here well something which is actually quite hard to see just by looking at your data is if you have any duplicates in your data set and when I say duplicate I mean an exact duplicate so where everything every single column is duplicated

So it’s a true duplicate how can we remove those or how can we check for those well in Excel we do have a remove duplicates button which you will find on the data ribbon and remove duplicates is over in the data tools group and I think it’s this one just here yes now the thing with this button is when you click it

You can tell Excel that your data has headers which we do and then we can choose which columns we’re looking for duplicates in so if I want to only remove exact duplicate records where we have exactly the same thing in every single cell duplicated

I want to make sure that I have all of the columns selected just here so I’m only looking for exact duplicate records let’s click on OK and you can see it’s found five duplicate values and it’s removed them click on OK again and they’re gone okay so a really nice quick way to remove exact duplicates from your data set

  • Convert Numbers Stored as Text

Look here at column f because we’ve got a couple of little things going on here now notice that um manufacturing price we have values in here so this is uh basically three dollars this is supposed to be but take a look at these two just here we have little green triangles next to them now this is because the number in this cell for whatever reason Excel thinks this is text as opposed to a number and we can tell that because text is always aligned to the left hand side of the cell whereas numbers are always aligned to the right so for the rest of these These are aligned to the right so those are correct but for these two here they are aligned to the left hand side of the cell so it thinks it’s text

Complete Excel  / Conti.

Now I want to make sure that these are converted back to numbers now there are a couple of different ways that you can do this if I’ve just got one or two the easiest way to do this is simply to click on the little triangle and just say convert to number and that’s literally going to take it from a number back to text now let me just undo that control Z because

What if I have quite a few of these through a very large data set maybe I’ve got over a hundred of them I don’t want to have to go through each one selecting the green the triangle and converting them to text

So there is a little trick that we can use to do these all in one go and this does seem a little bit strange but let me show you how this works what I’m going to do is somewhere over in a blank cell I’m going to type in the number one and what we’re going to do is we’re going to copy this number one control C I’m going to go to the column but I want to apply it to so let’s select all of that data (CTRL + Shift + ↓) control shift down arrow to select everything and then we’re going to go to paste we’re going to go into paste special and we’re going to say multiply click on OK and Will convert everything to a number so that is a much quicker way of doing things

  • Add Columns

Well let’s talk about number formatting let’s deal with that now making sure that the text or the columns are formatted correctly with the correct uh number formatting is really important now for the most part if we take a look at the home ribbon in this number group most things are formatted as general and General is kind of your middle of the road formatting you can really apply General to to most things particularly text columns

There are some columns though that you want to make sure that you have the correct formatting such as if you have currency or numbers or dates you want to make sure we apply the correct formatting for those so let’s take a look through here units sold that’s a number

So I want to make sure that we have this set to number and again we can adjust those decimal places I’m going to take those down to zero now all of the next few these ones are basically currencies so Total Revenue & ADR & Rev sales

So I’m going to select all of these columns and I’m going to press Ctrl 1 to bring up my format cells dialog box now

I’m going to change these to accounting format now people always ask me what is the difference between Currency format and accounting format

I tend to use accounting format because with a counting format what you’ll find is that when you have numbers in a Cell accounting format make sure that the decimal place is always lined up and that the currency symbol is kind of shoved across to the left hand side to keep it out of the way of the numbers if we use currency format the currency symbol is right next to your number and the decimal places

Press Ctrl + 1 and select Account

I prefer accounting format I find it a lot easier to read so I’m going to apply that I’m going to take my decimal places down to zero and click on OK to apply that formatting

Dates formatting

let’s move across to the date column now again in this day column I’ve got some strange numbers and this is a very common thing to find when you’re working in Excel you’ll have a date column and you’ll look in that date column and these look nothing like dates and you’re left wondering

What on Earth these exactly are well if you want to know what these numbers actually are these this is basically the number of days past the first of the first 1900 so according to excel the beginning of time was the first of the first 1900 and this is just the number of days past that date

So effectively dates are simply numbers and but we don’t display them as number we don’t display

them as dates until we apply date formatting so if you take a look at column B currently this is set to

Complete Excel General formatting so it’s just displaying the number underneath so all we need to do here is change this to short date format and then we get our correct dates

  • Remove Weird Spaces (line breaks & non-printing characters)

We have a few different issues so we have some inconsistent case so you can see that some of these are in capital letters and some of them are in lower case uh not lowercase sorry proper case where the first letter is capitalized

I want everything to look like that so I need to convert the uppercase words into proper case also notice that I have some weird indentation going on here so if you look just before Hotel Name in row two we’ve got a large gap just there the same thing down here we have some weird large gaps so I want to make sure that I remove any weird spacing issues from here as well

Now we can use Excel text functions to do both of these things in one go so let’s take a look at how we would do that now when I’m going through and cleaning up or tidying up a specific column

To add in a helper column  brand new empty column is to just

  1. Select where you want to add and then Ctrl shift plus give you a new column
  2. Change the case Use functions There are three functions
    1. Upper  =UPPER(A2)
    2. Lower =LOWER(A2)
    3. Proper =PROPER(A2)

Now take a look at a couple of things in here it hasn’t done anything for the spacing that we have at the beginning just here so we want to combine it with another Excel function called trim and what trim does is it will remove any erroneous spaces at the beginning of words or at the end of words

Complete Excel  / Conti.

We can simply combine trim with our proper command so let’s go up to the formula bar

  1. Upper  =Trim(UPPER(A2))
  2. Lower =Trim(LOWER(A2))
  3. Proper =Trim(PROPER(A2))

You can add in here which will deal with things like um line breaks when people press Alt Enter in between two words and it looks something like that if we have line breaks we can effectively remove those using another command called clean and clean will also remove any non-printing characters that it finds in a cell

  1. Upper  =Clean(Trim(UPPER(A2)))
  2. Lower =Clean(Trim(LOWER(A2)))
  3. Proper =Clean(Trim(PROPER(A2)))

One little trick here because we don’t want two columns with the same thing we only want to keep our clean column look what happens if I try and delete out this column just here I’m going to get a whole bunch of reference errors now we get those because the formulas that we’re using in that cell

Complete Excel  / Conti.

So these formulas just here reference column B so if we delete column B that is why we’re getting those reference errors so what we need to do and

This is a little trick that I use all the time is we need to basically use paste special and just paste the values because that is going to remove all of the underlying formulas so what we can do here is we can select column Ctrl C to copy and then we’re going to go to paste and we’re going to say paste the values only so what that does is it just leaves the text that we can see but it removes the formulas underneath so now you can see when I click in the formula bar that trim and proper formula has disappeared which means we can now delete out that column safely and nothing’s going to happen to our text all right so really useful little functions

  • Find & Replace

Complete Excel

Let’s take a look let’s deal with this Hotels  in here I’m going to do a find and replace so we’re going to select column B and let’s press Ctrl H to bring up replace so this is where we can just simply say what we want to find and then what we want to replace it with so I want it to find Casa and I want it to replace it with the capitalized version of Casa I can click replace all it’s made 365 Replacements click on OK and

Complete Excel

This is very straightforward you can go up to the review ribbon and we have the spelling just here yes I want to continue checking it’s telling me that all of my spelling is correct which is great

  • Functions to Extract Data

  • Format Data as a Table

Complete Excel  / Conti.

I’m going to do here the final step I always put my data into a table before I even attempt to analyze it so a quick way of doing that is to press Ctrl t it’s going to pick up where my table is located you can see those marching ants around the outside very important to specify my table has headers click on OK

Now I have my data in an Excel table now notice what happens as soon as I do this I now get a new ribbon called table design and we can see that at the top here and this is where I have all different kinds of tools um for my table

So if I want to change the color of my table maybe I don’t like this style particularly I can change it from there I can add a total row things like that another thing that we can do here is we can give our table a name

Complete Excel

So when we create a table in Excel in this Properties Group you can see it’s called table one by default and every time you create a table Excel is going to give it that really generic name table one table two table three So on and so forth now is always advantageous to give your table a more meaningful name

So I’m going to call my table sales underscore data and again when you’re naming your tables you can’t have any spaces in those names so if you’ve got two words like I do here you need to separate them with an underscore or just have no space in the middle there

Complete Excel  / Conti.

Tables so important well tables Auto expand so if you create some kind of chart or pivot table and then you want to add more data into the bottom of this table next month it means that the table will expand to accommodate the new data and it means that any charts that are built off of this data can be updated simply by clicking one button

Complete Excel

Complete Excel / Conti.

Pivot Table

A very quick pivot table just to show you how this works once we’ve cleaned up our data so I’ve got my nice looking clean data set I’m going to put it into a pivot table first of all so from the table design ribbon that I’m currently clicked on in the tools group I’m going to choose summarize with pivot table so it’s picked up my table range

I’m going to put this on a new worksheet so let’s click on OK and here is my blank pivot table report if you’re not used to working with pivot tables what it basically does is it Imports all of the column headings of our data and puts them into this pivot table Fields area and we can see all of the column headings listed just here we can then drag and drop these column headings into any of these areas to build a pivot table and the cool thing about pivot tables is that they are completely Dynamic

Let’s do something like let’s go for Hotel let’s drag that into rows and maybe I want to see the ADR, Occupancy by Month

So I’m going to drag Hotel  into there and now I have my pivot table if I hadn’t cleaned my data I could have some very weird things going on in there

I could have blanks the data might not be adding up correctly it wouldn’t look as tidy as it looked if I hadn’t cleaned my data first and it could be very confusing for anybody who’s looking at this pivot table report so really important to clean your data now what else can we do here with this pivot table well let’s put it into a quick pivot chart

Complete Excel  / Conti.

So I can show you some of the differences so we are going to do a little bit of formatting here I’m going to right click and we’re going to go to Value field I’m going to right click and we’re going to go to Value field settings because I want to tidy up these numbers just here so if we go into number format I’m going to apply accounting format to these numbers in the pivot table I’m going to take the decimal places down let’s click on OK

Home

Attention if its Sum Or Average

 

I have a small data set here I’m going to create a very quick pivot chart but I don’t want to include the grand total in my Pivot chart

So I’m going to turn that off first of all so let’s go up to the design tab to Grand totals and I’m going to say let’s turn off Grand totals for rows and columns

Complete Excel  / Conti.

Now I can create a little pivot chart and we’ll just do a very basic column chart like so and we can add that into our worksheet so this is all coming from our cleaned up data set if we hadn’t done any of that cleaning this is not going to look as nice as it does now we could make this look a lot nicer we could spend a lot more time formatting this

I am going to show you a couple of little tricks here which I think are worthwhile once we get our pivot chart into here in general I tend to hide these gray filter buttons

So I’m going to right click and say hide all field buttons on charts I like to keep my charts looking as clean as possible

If we want to make any changes to the way that these numbers are formatted if we go back to our pivot table right click and go into value field settings and into number format we can apply some custom number formatting of Our Own so when we choose custom we can type into here

Check Final Project on Daily Blog

check out related Posts