AutomateMyJob
Back to BlogExcel Tips

Power Query 101: Clean Messy Data in Minutes

Jennifer Walsh11 min read

Power Query 101: Clean Messy Data in Minutes

You've received a data export, and it's a mess. Headers in the wrong place, dates formatted as text, columns that need splitting, and values scattered across rows instead of columns. You could spend hours with formulas and manual editing—or you could let Power Query do it in minutes.

The Problem

Real-world data is rarely clean. You deal with:

  • CSV exports with inconsistent formatting
  • Multiple files that need combining
  • Text that needs parsing into columns
  • Data types that Excel doesn't recognize
  • Headers buried in the wrong rows
  • Transformations you repeat every week

The Solution

Power Query is Excel's built-in data transformation engine. It records every step you take to clean data, then replays those steps automatically whenever you refresh. Clean once, refresh forever.

What You'll Need

  • Excel 2016 or later (Power Query is built-in; also available as a free add-in for Excel 2010 and 2013)
  • A messy data source (CSV, Excel file, or database)
  • About 20 minutes to learn the basics

Step 1: Launch Power Query

Get Data from Excel Table:

  1. Click in your data
  2. Data tab > From Table/Range
  3. If prompted to create a table, click OK

Get Data from External File:

  1. Data tab > Get Data > From File > From Text/CSV (or From Workbook)
  2. Navigate to your file and click Import
  3. Power Query Editor opens with a preview

Step 2: Understanding the Power Query Editor

The editor has four main areas:

Query Settings (right): Shows all applied steps. Click any step to see data at that point.

Formula Bar (top): Shows the M code for the selected step. You don't need to write M code—it's generated automatically.

Data Preview (center): Shows your transformed data.

Queries Pane (left): Lists all queries in this workbook.

Step 3: Common Transformations

Remove Unwanted Rows

Remove Top Rows (header junk):

  1. Home > Remove Rows > Remove Top Rows
  2. Enter the number of rows to remove
  3. Click OK

Use First Row as Headers:

  1. Home > Use First Row as Headers
  2. Your data now has proper column names

Remove Blank Rows:

  1. Home > Remove Rows > Remove Blank Rows
  2. All empty rows disappear

Transform Columns

Split Column:

  1. Select the column to split
  2. Transform > Split Column > By Delimiter
  3. Choose delimiter (comma, space, custom)
  4. Decide: split at each occurrence or just the first

Example: Split "John Smith" into "John" and "Smith"

Change Data Type:

  1. Click the icon left of the column header (ABC, 123, etc.)
  2. Select the correct type: Text, Whole Number, Decimal, Date, etc.

Replace Values:

  1. Select the column
  2. Transform > Replace Values
  3. Enter value to find and replacement
  4. Click OK

Example: Replace "N/A" with blank

Filter Data

Basic Filter:

  1. Click the dropdown arrow in a column header
  2. Uncheck values you want to remove
  3. Or use Text Filters/Number Filters for conditions

Remove Errors:

  1. Click column dropdown
  2. Uncheck "(Error)" if present
  3. Or: Home > Remove Rows > Remove Errors

Add Calculated Columns

Add Custom Column:

  1. Add Column > Custom Column
  2. Name your column
  3. Write a formula: [Column1] * [Column2]
  4. Click OK

Common Custom Column Formulas:

Prompt
// Concatenate
[FirstName] & " " & [LastName]

// Conditional
if [Amount] > 1000 then "High" else "Low"

// Extract year from date
Date.Year([OrderDate])

// Clean whitespace
Text.Trim([CustomerName])

Unpivot Data

When data is in a "wide" format (months as columns), unpivot to make it "long":

  1. Select the columns to keep (like Name, Region)
  2. Transform > Unpivot Other Columns
  3. Your month columns become "Attribute" and "Value" columns
  4. Rename as needed

The Complete Solution

Here's a typical Power Query workflow:

Scenario: Monthly sales CSV with messy data

Applied Steps:

  1. Source (auto-generated)
  2. Promoted Headers - First row becomes headers
  3. Removed Top Rows (2) - Skip junk rows
  4. Changed Type - Dates and numbers recognized
  5. Removed Blank Rows - Clean up gaps
  6. Replaced "N/A" with null - Standard missing data
  7. Filtered Status = "Active" - Only current records
  8. Split Name column - First and Last separate
  9. Added Total column - Quantity × Price
  10. Sorted by Date - Most recent first

Step Configuration

StepPurpose
SourceConnects to your data file
Promoted HeadersUses first row as column names
Changed TypeEnsures proper data types
Filtered RowsRemoves unwanted records
Split ColumnSeparates combined values
Added CustomCreates calculated columns

Loading Your Data

When transformations are complete:

Load to Table:

  1. Home > Close & Load
  2. Data appears in a new worksheet as a Table

Load to Existing Location:

  1. Home > Close & Load To...
  2. Choose Table, PivotTable, or Connection Only
  3. Specify location

Connection Only: Keeps the query without loading data—useful for intermediate queries or to reduce file size.

Refreshing Your Data

The magic of Power Query is repeatable transformations:

Manual Refresh:

  • Click in your data table
  • Data > Refresh (or Ctrl+Alt+F5)

Refresh All Queries:

  • Data > Refresh All

Auto-Refresh:

  1. Right-click query in Queries & Connections pane
  2. Properties
  3. Set "Refresh every X minutes"

Common Variations

Variation 1: Combine Multiple Files

Folder Source:

  1. Data > Get Data > From File > From Folder
  2. Navigate to folder containing files
  3. Click Combine > Combine & Transform
  4. All files are merged with transformations applied to each

Perfect for monthly reports in separate files!

Variation 2: Append Queries (Union)

Stack queries vertically (same columns, more rows):

  1. Home > Append Queries
  2. Select queries to combine
  3. Click OK

Variation 3: Merge Queries (Join)

Combine queries horizontally (like VLOOKUP):

  1. Home > Merge Queries
  2. Select matching columns from each query
  3. Choose join type (Left, Right, Inner, Full)
  4. Expand the new column

Pro Tips

  • Name your steps: Right-click > Rename for clarity
  • Add comments: Right-click > Properties > Description
  • Duplicate before experimenting: Right-click query > Duplicate
  • Reference queries: Build from existing queries instead of duplicating steps
  • Use Parameters: Make file paths or filter values dynamic

Troubleshooting

IssueCauseSolution
"Query failed"Source file movedUpdate source step with new path
Wrong data typesAuto-detect guessed wrongChange Type step manually
Refresh takes foreverToo much dataFilter earlier in the process
Missing columnsSource changedCheck column names match
Circular referenceQuery references itselfUse separate staging queries

Power Query vs. Formulas

TaskFormulasPower Query
One-time clean✅ Quick setup❌ Overkill
Repeated clean❌ Manual each time✅ Automatic refresh
Multiple files❌ Very complex✅ Built-in
Complex transforms❌ Formula nightmare✅ Visual steps
Data volume❌ Slows workbook✅ Handles millions

Real-World Applications

Finance: Expense Report Processing

  • Import CSV from expense system
  • Filter to current period
  • Split vendor name from description
  • Categorize by keyword matching
  • Aggregate by category and department

HR: Employee Data Consolidation

  • Combine regional spreadsheets
  • Standardize column names
  • Clean up date formats
  • Fill in missing manager data from lookup
  • Remove terminated employees

Sales: CRM Data Cleanup

  • Import daily export
  • Remove duplicates by email
  • Parse full name into first/last
  • Calculate days since last contact
  • Flag accounts needing attention

Conclusion

Power Query changes how you think about data preparation. Instead of dreading that weekly data cleanup, you build the transformation once and click Refresh forever. Every step is recorded, documented, and repeatable.

Start with a simple import and basic transformations. As you get comfortable, add more complex steps—merges, custom columns, unpivots. Within a few weeks, you'll be combining multiple files, performing sophisticated data reshaping, and wondering how you ever survived without it.

The time you invest learning Power Query pays dividends on every future data task. It's not just a tool—it's a superpower.

Your messy data doesn't stand a chance.

Sponsored Content

Interested in advertising? Reach automation professionals through our platform.

Share this article