Data Cleaning & Analysis


Moondollars Coffee is a national chain of coffee shops. The company recently purchased another small coffee shop chain, Perks. Due to this merger, Moondollars extracted customer information from Perks’ customer relationship management system. Moondollars realized that the data they extracted is of low quality and needs to be cleaned to properly analyze the small chain’s existing customer base alongside their own.  Using the data cleaning techniques you learned, clean the new data so that it is consistent with the existing data.  Then analyze the data to answer the questions.

Don't use plagiarized sources. Get Your Custom Essay on
Data Cleaning & Analysis
Just from $13/Page
Order Essay





  • Review the tabs of data provided for you in the Moondollars_Perks.xlsx file. Note that the tab prefixed with “Perks” shows the data extracted from Perks’ systems and needs to be cleaned.
  • Perform data cleaning on customer data: The goal is to clean the data so that it can be imported into the existing Moondollars’ systems seamlessly and eventually combined with the existing data.  Thus, the data should be cleaned to meet the following standards:
    • All first names and last names should start with a capital letter and not have an extra spaces or symbols before or after the name
    • Remove any blank rows from the customer list.
    • Remove any duplicate customers from the list. Customers are considered duplicate if their ID, first name, and last name all match another customer.
    • Street address, city, and state should each be in their own columns (Make sure the columns match with Moondollars list format)
    • All zip codes should be 5 digits long. If they are not five digits, the zip code is not valid. Create a conditional column that shows whether a zip code is valid or invalid.
    • Create a custom column called “Source” that contains a P so that the Perks’ customers are marked as coming from the Perks source data.


  • Conduct an analysis of Moondollars’ and Perks’ customer bases: Using visualizations, answer the question below about Moondollars’ and Perk’s customer bases.
    • Similar to Coca-Cola’s recent “Share a Coke” campaign (see image) that placed names on Coke bottles to remind consumers to share a coke with friends, Moondollars would like to pre-print names on the cups used in the coffee shops to entice customers to purchase an extra item for friends. What are the most popular first names in Moondollars’ customer base? What are the most popular first names in Perk’s customer base? If they were to choose a single name, which name should be chosen?  Create one or more visualizations that answers these questions in a report page in Power BI.

Leave a Reply

Your email address will not be published. Required fields are marked *