5. Shape and transform data with Power Query (M) – part 2

The matches data contain several information in a flat format, We will split this data in several entities. Follow the steps in the previous post to remove the first 3 rows of the Matches query and expand the binary column on the remaining row.

Rename the query to Matches Original. We will keep a copy of the original query, but we will disable load and will not included it in report refresh.

Tournaments table

Duplicate the Matches Original query and name it Tournamets. Select the columns tourney_name, surface and tourney_level and right click to choose Remove Other Columns, then right click and Remove Duplicates. Rename the columns Tournament, Surface and Level respectively.

Add an Index Column from the Add Column ribbon bar and name it Tournament Id.

Matches table

Make another duplicate of the Matches Original query and name it Matches. Keep the columns: tourney_name, tourney_date, winner_id, loser_id, score, round and minutes and rename them Tournament, Date, Winner Id, Opponent Id, Score, Round and Minutes respectively. Add an index column as shown above, and name it Match Id.

1. Merge data

The next thing we will do, is to bring the Tournament Id value corresponding to each Tournament. Choose the Merge Queries option in the Home ribbon bar. We have the option to merge in the existing query, or merge queries as a new one. In our case we merge in the same query. Choose the table you want to merge – Tournaments, and select a matching column – Tournament column, then click OK.

Click the header image of the newly merged table column and keep selected only the Tournament Id.

You can now delete the Tournament column. Your screen should look like the image below.

2. Transform integer to date

Follow the steps to transform the Date column from integer to data type Date.

Click Add Column tab and select Extract > Range, set Starting Index to 4 and Number of Characters to 2. This will give us the month.

Similarly, extract the day by choosing Extract > Last Characters and set Count to 2. Extract the year by choosing Extract > First Characters and set Count to 4.

Remove the old Date column. Select all three columns created above and choose Merge Columns in the Transform ribbon bar menu. Set the separator to Custom, use a / as the separator, then name the column Date.

Finally change the Data Type to Date.

Date dimension

We are going to create another query for our date dimension. More specifically, we will use M script to generate de date dimension in Power Query. For this project, we will generate only date, year, month and month name. But keep in mind that depending on your project specific business needs, you may need a wider date dimension, or store information about the fiscal year for example.

From the Home menu, select New Source > Blank query and in the Advance Editor paste the script below and name the query Dates.

let
    FromYear = 2017, // set the start year of the date dimension
    ToYear=2021, // set the end year of the date dimension
    FromDate=#date(FromYear,1,1),
    ToDate=#date(ToYear,12,31),
    Source=List.Dates(
        FromDate,
        Duration.Days(ToDate-FromDate)+1,
        #duration(1,0,0,0)
    ),
    #"Converted to Table" = Table.FromList(Source, Splitter.SplitByNothing(), null, null, ExtraValues.Error),
    #"Renamed Columns" = Table.RenameColumns(#"Converted to Table",{{"Column1", "Date"}}),
    #"Changed Type" = Table.TransformColumnTypes(#"Renamed Columns",{{"Date", type date}}),
    #"Inserted Year" = Table.AddColumn(#"Changed Type", "Year", each Date.Year([Date]), Int64.Type),
    #"Inserted Month" = Table.AddColumn(#"Inserted Year", "Month", each Date.Month([Date]), Int64.Type),
    #"Inserted Month Name" = Table.AddColumn(#"Inserted Month", "Month Name", each Date.MonthName([Date]))
in
    #"Inserted Month Name"

Your screen should look like this:

Click Save and Apply to save your changes.

What’s next
Follow me in the next post, where we leave Power Query for now and switch over to the Model view.


Leave a comment