Power BI Advanced Table Functions

Getting to know advanced DAX table functions in Power BI

Laten we beginnen. Het is Gratis
of registreren met je e-mailadres
Power BI Advanced Table Functions Door Mind Map: Power BI Advanced Table Functions

1. CROSSJOIN

1.1. This produces Cartesian Products from two tables (or table functions) that do not necessarily have to be related

1.2. We will typically use it inside other functions

1.2.1. For example, let's use it to find the minimum [Total Sales] per product and per month

1.2.1.1. Min Monthly Sale by Product = MINX( CROSSJOIN( VALUES( Products[Product Name] ), VALUES( Dates[Month & Year] ) ), [Total Sales] )

1.2.1.1.1. Again, we don't have to use physical tables for the "Table" arguments of CROSSJOIN; in this example we use the VALUES table function to return single column tables of unique row values

1.2.1.1.2. Here we see the measure in context

1.2.1.1.3. Although CROSSJOIN produces the Cartesian Product of product names and month+year combos, the evaluation context converts this into reduced virtual tables for each row of this table visual (i.e. one virtual table per city in this example)

2. ROW

2.1. Returns a single row table

2.1.1. You basically specify a column label followed by either a DAX expression to return a value for that column or alterntively you supply a literal column value, then repeat this pattern for as many columns as you need

2.1.1.1. The instructor does not have a good use case for it, but here is an example:

2.1.1.1.1. Table Function Example = ROW( "Group", "Top", "Min", 0, "Max", 1 )

3. UNION

3.1. Unions together tables (physical or virtual) with identical columns

3.1.1. Here's a very basic example that uses ROW to define each table to be unioned:

3.1.1.1. Table Function Example = UNION( ROW( "Group", "Top", "Min", 0.8, "Max", 1 ), ROW( "Group", "Middle", "Min", 0.45, "Max", 0.85 ), ROW( "Group", "Bottom", "Min", 0.1, "Max", 0.35 ) )

3.1.1.1.1. Here's the result

3.1.1.1.2. Bear in mind that ROW can be replaced by any table expressions that return the same combo of columns, so it's possible to create much more dynamic results than this example

4. CALCULATETABLE

4.1. Belongs to same family as CALCULATE; where CALCULATE calculates an expression in an altered filter context, CALCULATETABLE calculates a table expression in an altered context

4.1.1. It is easier to understand by following a worked example

4.1.1.1. Imagine we start by wanting a measure that gives us the total number of unique products sold

4.1.1.1.1. Products Sold = COUNTROWS( VALUES( 'Sales'[Product Description Index] ) )

4.1.1.2. Now imagine we want to see the same count but for the previous month; this is where CALCULATETABLE comes in

4.1.1.2.1. Products Sold Previous Month = COUNTROWS( CALCULATETABLE( VALUES('Sales'[Product Description Index]), DATEADD( 'Dates'[Date], -1, MONTH ) ) )

4.1.1.3. Finally, we can combine the two in a third measure that calculates the difference in distinct products sold this month vs previous month

4.1.1.3.1. Products Sold Increase = VAR ProductsSold = [Products Sold] VAR ProductsSoldPM = [Products Sold Previous Month] RETURN ProductsSold - ProductsSoldPM

5. INTERSECT and EXCEPT

5.1. These two functions do exactly what you'd expect, coming from a SQL background

5.2. They are most likely to be used with virtual tables produced by other table functions rather than physical tables in your data model

5.3. Let's see them in action via an example: imagine that we want to know all the distinct products sold this month compared with the distinct products sold the previous month

5.3.1. Products Sold = COUNTROWS( VALUES( 'Sales'[Product Description Index] ) )

5.3.1.1. This measure does not feature INTERSECT or EXCEPT but is included in the example to serve as a comparison reference measure

5.3.2. Products Sold This Month And Previous = COUNTROWS( INTERSECT( VALUES( 'Sales'[Product Description Index] ), CALCULATETABLE( VALUES( 'Sales'[Product Description Index] ), DATEADD( 'Dates'[Date], -1, MONTH ) ) ) )

5.3.2.1. We use VALUES (a table function) for the left table argument in the INTERSECT function, and CALCULATETABLE (another table function) to change the context to get the products sold in the previous month

5.3.2.1.1. For this to work as a measure, we wrap it in COUNTROWS to get the count of distinct of products sold both in the current month and the previous one

5.3.3. Products Sold This Month Not Previous = COUNTROWS( EXCEPT( VALUES( 'Sales'[Product Description Index] ), CALCULATETABLE( VALUES( 'Sales'[Product Description Index] ), DATEADD( 'Dates'[Date], -1, MONTH ) ) ) )

5.3.3.1. We use VALUES (a table function) for the left table argument in the EXCEPT function, and CALCULATETABLE (another table function) to change the context to get the products sold in the previous month

5.3.3.1.1. For this to work as a measure, we wrap it in COUNTROWS to get the count of distinct of products sold only in the current month and not the previous one

5.3.4. Here we can see the measures in report context

6. GENERATE

6.1. This function is the equivalent of the SQL CROSS APPLY operator

6.1.1. If you need the equivalent of the OUTER APPLY operator, you'll need the GENERATEALL function instead

6.2. It takes two table arguments

6.2.1. It is most useful when you ensure the second table is relatable to the first

6.3. Here is an example: imagine we want a customised range of dates between 01-Jan-2020 and 03-Jan-2020, plus 09-Jan-2020 and 12-Jan-2020

6.3.1. In this case, we want our left side table to specify the custom ranges and the right side table to provide the individual dates that match the range

6.3.2. Table Function Example = VAR CustomDates = UNION( ROW("First Date", DATE(2020,1,1), "Last Date", DATE(2020,1,3)), ROW("First Date", DATE(2020,1,9), "Last Date", DATE(2020,1,12)) ) VAR DatesExpanded = GENERATE( CustomDates, DATESBETWEEN( 'Dates'[Date], [First Date], [Last Date] ) ) RETURN DatesExpanded

6.3.2.1. Note how we use the DATESBETWEEN function to leverage the Date column in our Dates table and include columns from the left-side table argument (CustomDates) to feed in the start and end arguments

6.3.2.1.1. The result is a cross apply of the two tables

7. High level uses for table functions

7.1. With iterating functions like SUMX. The first argument for iterating functions is a table, but for more advanced analytics, we can use a virtual table for this, which is what all table functions return.

7.2. With CALCULATE, for the filter arguments we can use any table function, not just FILTER. This can be done to improve calculation performance.

8. In order to check and verify our understanding of table functions, we can make use of the option to create new tables in the Power BI Desktop Report Editor

8.1. When we click the option to create a new table (e.g. via Table view, Table Tools | New Table), we are prompted to enter DAX

8.1.1. For example, we can explore the ADDCOLUMNS table function

8.1.1.1. Table Function Example = ADDCOLUMNS( VALUES( 'Products'[Product Name] ), "Profit Margin", [Profit Margin], "Quantity Sold", [Total Sales Quantity] )

8.1.1.1.1. Now we can visualise the result of the table function

9. ADDCOLUMNS

9.1. We can use this with an iterator function to create an advanced measure

9.1.1. For example, imagine we want to calculate sales quantity but only for high margin products

9.1.1.1. High Frequency Sales = SUMX( ADDCOLUMNS( VALUES( 'Products'[Product Name] ), "Product Profit Margin", [Profit Margin], "Product Quantity", [Total Sales Quantity] ), IF( [Product Profit Margin] > 0.32, [Product Quantity], 0 ) )

9.1.1.1.1. I don't think "High Frequency Sales" is a good name for this measure, but it's what the course instructor gave it (High Margin Sales Quantity seems better to me)

9.1.1.1.2. We use ADDCOLUMNS for the table argument of SUMX, noting that SUMX is going to iterate over every row returned by ADDCOLUMNS and then implement its expression defined by the 2nd argument for each row.

9.1.1.1.3. Note how the SUMX expression (i.e. the IF function) allows us to refer to the two virtual table columns produced by ADDCOLUMNS

9.1.1.1.4. Let's look at the measure calculated in a report context

10. SUMMARIZE

10.1. Like ADDCOLUMNS, we typically use this for the table argument of an iterator function like SUMX

10.1.1. For example, imagine we want to summarize our sales data by product name and channel name, and sum up sales amount when it meets some "high revenue" threshold

10.1.1.1. Summarize Example = SUMX( SUMMARIZE( Sales, Products[Product Name], Channels[Channel], "High Revenue", [High Revenue Sales] ), [High Revenue] )

10.1.1.1.1. SUMX expects a table for its first argument, and we supply SUMMARIZE, which generates a virtual table

10.1.1.1.2. In the SUMMARIZE function, we choose 'Sales' as the table argument and 'Products'[Product Name] and 'Channels'[Channel] as grouping columns

10.1.1.1.3. "High Revenue" is the name assigned to a custom column for the SUMMARIZE virtual table and for the expression that defines the value for that custom column, we can use pre-existing measures if we want ([High Revenue Sales] in this example)

10.1.1.1.4. The second argument of SUMX is the expression [High Revenue]

10.1.1.1.5. Let's look at the measure calculated in report context

11. Combining table functions

11.1. Example: combine FILTER with SUMMARIZE

11.1.1. Table Function Example = FILTER( SUMMARIZE( Sales, Products[Product Name], Channels[Channel], "High Revenue", [High Revenue Sales] ), [High Revenue] > 5000000 )

11.1.1.1. Remember that wherever we see "Table" as an argument, we are not limited to selecting a physical table; we can use a table function such as SUMMARIZE

11.1.1.2. FILTER is itself a table function

11.1.1.3. We can use table functions in this way to greatly reduce the number of rows that an outer table function has to iterate over, thereby making the calculation significantly more efficient

11.1.1.4. If we use the DAX to create a table (rather than a measure) then we can visualise it as a table

11.1.1.4.1. Just remember that when the same DAX is used to create a measure, this potentially becomes many virtual tables by context (e.g. slicer selections, dimension attributes added to the same visual, etc.)

12. SUMMARIZECOLUMNS

12.1. The course instructor does a poor job at explaining this function; he says he never really uses it and finds he can do what he needs with SUMMARIZE or ADDCOLUMNS

12.1.1. Here is the example given as a table rather than a measure:

12.1.1.1. Table Function Example = FILTER( SUMMARIZECOLUMNS( Products[Product Name], Channels[Channel], "Sales", [Total Sales] ), [Sales] > 6000000 )

12.1.1.1.1. Resultant table

12.1.2. SUMMARIZECOLUMNS is actually an advanced function that you use to form full table queries

12.1.2.1. It supports specifying multiple grouping columns, multiple filters and multiple custom columns

12.1.2.2. It can only be used to create DAX-based tables; you ***cannot*** use it as part of a measure definition

12.1.2.2.1. This is because SUMMARIZECOLUMNS cannot be executed in row context or filter context caused by context transition

12.1.2.2.2. If you attempt to use SUMMARIZECOLUMNS inside a measure definition (e.g. as an argument of SUMX) you will see this error:

12.1.2.2.3. If you are trying to create a measure, this is the scenario where you must use the simpler SUMMARIZE and ADDCOLUMNS functions because these ***can*** be executed in row and filter context

12.1.2.2.4. You can still create reporting visuals from SUMMARIZECOLUMNS; you just have to create the visual using only the available columns from the table created by SUMMARIZECOLUMNS

13. GROUPBY

13.1. This function is similar to SUMMARIZE but it has some peculiar syntax requirements

13.1.1. GROUPBY is best used when you want to group by columns that have been created via DAX, such as via ADDCOLUMNS, whereas SUMMARIZE is best used when you want to group by columns that are part of the data model

13.1.1.1. There can be a substantial performance penalty if you use GROUPBY or SUMMARIZE in the wrong use case, but my course instructor was not aware of this and the example given works, plus the data set is so small that you do not notice a performance penalty

13.1.1.1.1. Let's imagine we want to calculate the maximum sales per product per city, and we can achieve this with the following measure: