XYZ Analysis for Dynamics 365 FO in PowerBI

XYZ analysis is used to categorize products based on the variance of their demand. Products with a low demand variance, i.e. same quantity demanded regulary, are categorized with X, products with an unstable demand Y and products with a high variance in demand as Z.

The categorization is based on a calculated measure, often referred to as variance coeffizient. This coefficient is calculated by the standard deviation of the demand divided by the mean.

Video

Here is a video tutorial how to build the XYZ analysis in PowerBI

Example

Here is an example of three products with different demand over a year. Toilet paper is needed every month in the same quantity, car tires have a higher demand in spring and autum, firework is demanded only on special ocasions.

Demand of different products
Demand of different products in a year

Prepare Data in PowerBI

The basis of the XYZ analysis will be the SalesInvoiceLines entity. At least three columns are needed. The InvoiceDate, the InvoicedQuantity and the Product Name. In this example I renamed the dataset to “Demand”

In PowerQuery create two new columns, one for the year based on the InvoiceDate and one for the month, also based on the InvoiceDate. Afterwards remove the InvoiceDate column

YEAR  = Date.Year([InvoiceDate])
MONTH = Date.Month([InvoiceDate]) 

Next, remove the InvoiceDate column and group the records by ProductName, Year and Month and aggregate the InvoicedQuantity column. Here is an example:

Aggreage demand by product name, year and month
Aggregation in PowerBI

The second dataset contains the XYZ data template, including the ProductName, Year and Month. For simplicity you can enter the 12 records for Year / Month combinations manually. Add an additional column containing the distinct list of ProductNames and expand the rows.

ProductName = List.Distinct(Demand[ProductName])
Product name, year and month

Finally, merge the two datasets to a new one using a left outer join based on the second dataset. As result you get a list of ProductName, Year, Month, Qty combination for each product and every month no matter if there was an acutal demand or not. If there was no demand, the Qty will be null and needs to be converted to 0.

Left Outer Join on calendar and demand
Merge queries in PowerBI

The resulting dataset has records for eacht month and product. It can be used to calculated the mean, standard deviation and variation coefficient. To do so, create a new measure in PowerBI. It will calculate a coefficient value that can be used to categorize the products in X, Y and Z.

Variation Coefficient in PowerBI

Create an Email Tag Cloud with PowerBI and Cognitive Services

PowerBI and Cognitive Services are a powerful combination. A nice example is a tag cloud based on the key phrases in your daily emails. This example requires the following cloud components:

  • PowerBI (of course)
  • Cognitive Services for Key Phrase extraction
  • Exchange Online
  • Flow and Table Storage in Azure

Cloud Infrastructure

First, go to your Azure portal and create a new Cognitive Services Resource. In the creation wizard place the cognitive services to a data center near your Office subscription. I’d also recomend to creata a seperate resource group where you place all the services.

Cognitive Services in Azure

At the Cognitive Services Overview tab, copy the Endpoint URL. From the Cognitive Services > Key tab also copy the Key1. You need both to connecto to the cognitive services.

Azure Storage Account

Next create a new stroage account. Like in the Cognitive services place it in the same resource group and same data center. After the storage account has been created successfuly go to the overview tab.

Azure Table Storage

Select “Tables” and create a new table. Give it a useful name e.g. keystorage. A table storage can be used to place structured data, which require at least to fields a RowKey and a PartitionKey. It is up to you to provide meaningful values to theses fields when inserting data.

Copy the storage account name and from the Access Keys tab the Key1 value. You will need both to connect to the storage account.

Implement transformation pipeline in Flow (first naive approach)

Now, lets create the extraction logic using Flow. There are some limitations with this approach that will result in errors. A more stable version of the flow is discussed at the end. Go to https://flow.microsoft.com and create a new triggered flow from blank.

Automated Flow from Blank

The trigger for the flow is Outlook > When a new email arrives.

Because almost all my mails are HTML formated, I need to add the Content Conversion > HTML to Text step to remove the HTML code from the email body.

The third step in the flow is the key phrase extraction. Therefore add the Text Analysis > Key Phrase extraction step. There you need to provide the Cognitive Services Account Key and Endpoint. The text to analyze is the output from the HTML to Text step.

The last step writes the key phrases to the Azure Table Storage. Like in the Cognitive Services step, you have to provide the name and a key. From the Table dropdown select the table you have create earlier in the Azure portal. The entity has to be a JSON string. In my example the Partition is always 1 and the Row key is a Guid. Because, one mail will have more than one key phrase, the insert is encapsulated in an Apply-to-each block

{
“PartitionKey”:”1″,
“RowKey”:@{guid()},
“KeyPhrase”:@{items(‘Apply_to_each’)}
}

Keyword Extraction Flow

Test your flow by sending an Email to your account. All the steps should succeed

Keyword Extraction Flow Test

You can use the Azure Storage Explorer in the Azure portal to lookup the phrases extracted from the email. In this example I sent an email from my company account, to my private mail account. The flow extracted the key words from the mail (Signature).

Azure Storage Explorer

Tag Cloud in PowerBI

In PowerBI add a new data source from the Azure Table storage. Again you need to provide the storage name and one of the keys. After connecting successfuly to the table, open the transformation window an take a look at the retrieved keys. You can remove the PartitionKey, RowKey and Timestamp from the data set.

Azure Table Storage in PowerBI

In the PowerBI report window, from the Visuals, klick on the Elipsis (…) and search for the Word Cloud in the marketplace. Add the Word Cloud Visual to PowerBI

Word Cloud Visual for PowerBI

Add the visual to the PowerBI report window. Set the Key Phrases as category in the visual.

Word Cloud in PowerBI Desktop

PowerBI Online Service and automated Refresh

Publish the PowerBI report to your workspace. Within PowerBI Online, go to your workspace and navigate to the dataset. From the Elipsis (…) open the settings page. Provide the Key for Azure Table storage.

Azure Table Storage Connection

Now you can also schedule the automatic refresh

Automatic Refresh from Azure Table Storage in PowerBI Online Services

Implement transformation pipline with a more stable Flow

Unfortunatelly, the text processing in Cognitive Services is limited to 5120 characters. In many cases, Emails contain more characters than this and the flow will fail with an error from the Cognitive Services. One way to address this issue, is to implement a loop that cuts the Email body into pieces of 5120 characters or less before feeding it to Cognitive Services. However, Flow is not very developer focused and requires some workarounds for simple tasks like assigning function calls with a variable to itself e.g substring()

In the first place, delcare 4 variables

Some required variables in Flow

Next execute the HTML to Text block. An optimization is to use the Builtin Data-Operations action Compose to trim() the result to remove blanks from the start and end, and populate the STRLEN and EMAILBODY. Whereas the STRLEN requires a function: length(outputs(‘Trim_Text’))

Set the variables in Flow

Next, create a Do-While Loop from the Control elements in Flow. The condition for the Loop is STRLEN <= 0 because we are cutting the Email into pieces until nothing is left

A loop to cut the Email into pieces of 5120 characters (or less)

Within the Loop, create a IF decision depending on the STRLEN. If the STRLEN variable is less then 5120, the STRLEN is set to 0 to end the Loop. The variable TEXT is set to the EMAILBODY.

Email body is shorter than 5120

If the Emailbody is longer than 5120 characters, the first 5120 characters are copied to the TEXT variable: substring(variables(‘EMAILBODY’),0,5120)

Next the variable STRLEN is reduced by 5120: sub(length(variables(‘EMAILBODY’)),5120)

In the third step, the variable EMAILBODY_SHORT is set to the substring starting at 5121 till the end of the original EMAILBODY. Is is done, because Flow does not support variable asignment by a function that contains the variable itself: substring(variables(‘EMAILBODY’),5121,sub(variables(‘STRLEN’),1))

In the last step the orignial EMAILBODY variable is set to be the EMAILBODY_SHORT. It contains now the body without the first 5120 characters.

Email body is larger than 5120

Within the loop, after the IF condition, Cognitive Services are called with the TEXT variable and the results are written to the Azure Table Storage like in the first naive implementation.

Save Cognitive Services Results to Azure Table Storage

More Optimization

There are three additional ways to optimize this solution.

One may argue, that cutting the text into pieces might cut a releveant word for the Word Cloud into pieces and therefore cannot be recognized by Cognitive Services, e.g. Micros … oft. One way to address this is to modify the substring function, by checking the last index of “_” (Blank) and cut there.

Another issue is that Cognitive Services are not aware of all stop words. Especially if using Non-English Key Phrases you may end up with a messy cloud. However, there are public available lists of stopwords in certain languages out there, that can be loaded into PowerBI and used to exclude certain findings from Cognitive Services. The Word Cloud visual provides an Exclude property where you can provide stop words to exclude.

In the example from above, the language for Cognitive Services is set to DE (german). Howerver, this might not be optimal if you receive Emails in different languages. An optimzation could be to use Cognitive Service to detect the language, and switch the Key Phrase Detection Call for the most common languages in your Email inbox, in my case German and English.

Flow Download (package)

Please find the Flow Package in the Sources Onedrive Folder. Import the .zip File in your Flow Tenant. You need to map Outlook, Cognitive Services, Azure Table Storage, etc. to your configurations.

Connect Azure Data Lake Storage with PowerBI dataflow

PowerBI dataflow performs ETL (Extract Transform Load) workloads in the cloud. PowerBI Pro and Premium Users get dataflow storage without additional charges. However, this storage is managed by PowerBI and you cannot access it directly. Therefor BYOSA (Bring Your Own Storage Account) is support to connect you own Azure storage account with PowerBI dataflow. I’ve made a video, following the documentation, how to connect an Azure storage account with PowerBI. Please find my video youtube:

Configure Azure Data Lake storage with PowerBI dataflow

Configure PowerBI on Dynamics 365 FO developer VM

I’ve created a video tutorial how to configure PowerBI on a stand alone Dynamics 365 Finance and Operations developer VM

PowerBI finance report grouped by main accounts and cost center

A customer recently asked to create a PowerBI report in order to group and compare ledger postings. The report had to meet the following requirements:

  • Compare two time periods
  • Group postings by account and type (e.g. Assets)
  • Postings on a certain cost center need to be shown separately
  • Grouping of accounts and cost centers are defined by a key user in Excel
    e.g. 1001 – 1104 and 5001 – 5002 are “Assets”
  • Cost center groups are defined by using wildcard style
    e.g. 601500 – 61500 and Costcenter ?6?? is Development/Compliance

Dynamics AX ledger postings grouped by account and cost center

Report Data from Dynamics AX

First, we loaded the report data from Dynamics AX into PowerBI. The data contained the LedgerJournalTrans joined with the LedgerAccountView. The Dimenions were renamed to Department, CostCenter and CostUnit.

select
T.TRANSDATE, T.AMOUNTCURCREDIT, T.AMOUNTCURDEBIT, T.ACCOUNTNUM, V.ACCOUNTNAME, DIMENSION as DEPARTMENT, DIMENSION2_ as COSTCENTER, DIMENSION3_ as COSTUNIT
from LedgerJournalTrans as T
join LEDGERACCOUNTVIEW as V
on T.ACCOUNTNUM = V.ACCOUNTNUM
where ACCOUNTTYPE <= 2

Report Definition Data

Next we defined and loaded the report definition from Excel. We used to sheets, one for the definition of the two time frames and one for the group definition

date range for ledger postings

grouping definition for ledger postings

Like initially described the Accounts sheet defines the grouping of postings regarding their account number. For example in line 2 and 3 postings on  accounts 1101 .. 1104 and 5001 .. 5002 shall be grouped as “Assets”. Postings on accounts 601500 .. 606300 with a cost center where the 2nd character is 6 shall be grouped separately as “Development/Compliance” regardless if they are also part of the group “Expenses”.

Both Excel worksheets were loaded into PowerBI. The Accounts was modifed to replace an empty value in the Costcenter column with the text “%”. This was done to use the Costcenter value in a SQL statement with a Like clause (see section “Calling function from the account definition”).

Query Parameter

We added 3 parameters to the PowerBI called FromAccountNum, ToAccountNum and CostCenter. The default values are the smallest account number for the FromAccountNum, the largest account number for the ToAccountNum and the text % for the CostCenter.

PowerBI parameters

Next we changed the query of LedgerJournalTrans and added the parameter to the query. This can be done used “Edit query” on the data set and opening “Advanced Editor”

Parameters in PowerBI query

The new query text in the advanced editor looked like this

let
Source = Sql.Database(“localhost”, “DynamicsAx”, [Query=”select T.TRANSDATE, T.AMOUNTCURCREDIT,T.AMOUNTCURDEBIT, T.ACCOUNTNUM,V.ACCOUNTNAME,DIMENSION as DEPARTMENT,DIMENSION2_ as COSTCENTER,DIMENSION3_ as COSTUNIT #(lf)from LedgerJournalTrans as T#(lf)join LEDGERACCOUNTVIEW as V on T.ACCOUNTNUM = V.ACCOUNTNUM #(lf)where ACCOUNTTYPE <= 2 AND T.ACCOUNTNUM >= “&FromAccountNum&” AND T.ACCOUNTNUM <= “&ToAccountNum&” AND T.DIMENSION2_ like ‘”&CostCenter&”‘ #(lf)order by T.ACCOUNTNUM,T.TRANSDATE#(lf)”, CreateNavigationProperties=false])
in
Source

With the parameter in the query and the default values set to the parameters the dataset did not change. Next we added a new function to the LedgerJournalTrans. This can be done from the context menu of the query “Create Function”. PowerBI inspects the query statement and creates function parameter for each parameter in the query. In this case FromAccountNum, ToAccountNum and CostCenter.

Calling function from the Account Definitions

In PowerBI a function call can be used like a row wise join. A function can be added by using the used defined function button in the query editor. So we added the function call to the Accounts dataset, i.e. each account definition row fetches all postings from the LedgerJournalTrans with the corresponding accounts and costcenter.

PowerBI calling user defined function

The query parameter are mapped to the fields in the Accounts table.

Parameter values for used defined function

PowerBI will popup a warning that function calls can have a security impact. In the actual version (Mai 2018) PowerBI was quite annoying with security warnings and required a restart (close&open) to stop asking again and again. Finally, expand the Accounts table and the function call results by clicking on the Arrow Right-Left Button next to “GetPostings”. Per default PowerBI adds the fields from the function call result with the function name prefix e.g. GetPostings.Transdate, GetPostings.AmountCurCredit, etc.

As you can see below the Account definition 1101 – 1104 was expanded with all resulting rows from the LedgerJournalTrans that have an account between 1101 and 1104 and any Costcenter (%)

Expanding used defined function call results

Calculate Period Amounts

To get the amount values for each of the two periods, defined in the Daterange Excel sheet, we added 4 additional columns to the Accounts. A debit and credit column for period 1 and period 2.

Ledger postings in period

The code looks like this

CreditDate1 = IF(Accounts[GetPostings.TRANSDATE]>=FIRSTDATE(Daterange[FromDate1]);IF(Accounts[GetPostings.TRANSDATE]<=FIRSTDATE(Daterange[ToDate1]);Accounts[GetPostings.AMOUNTCURCREDIT];0);0)

 

DebitDate1 = IF(Accounts[GetPostings.TRANSDATE]>=FIRSTDATE(Daterange[FromDate1]);IF(Accounts[GetPostings.TRANSDATE]<=FIRSTDATE(Daterange[ToDate1]);Accounts[GetPostings.AMOUNTCURDEBIT];0);0)

 

CreditDate2 = IF(Accounts[GetPostings.TRANSDATE]>=FIRSTDATE(Daterange[FromDate2]);IF(Accounts[GetPostings.TRANSDATE]<=FIRSTDATE(Daterange[ToDate2]);Accounts[GetPostings.AMOUNTCURCREDIT];0);0)

 

DebitDate2 = IF(Accounts[GetPostings.TRANSDATE]>=FIRSTDATE(Daterange[FromDate2]);IF(Accounts[GetPostings.TRANSDATE]<=FIRSTDATE(Daterange[ToDate2]);Accounts[GetPostings.AMOUNTCURDEBIT];0);0)

Display results in matrix

Finally we used a matrix to display the values from the Accounts dataset and grouped it by Name and Accounts.

PowerBI matrix for ledger postings

Security considerations

Injecting range values in a query is not the best way to do this. A better way would be to refactor the LedgerJournalTrans query into a stored procedure and provide the FromAccountNum, ToAccountNum and Costcenter as parameter to the SP.

Calculate and Visualize the Lorenz Curve in Power BI

The Lorenz curve (1) is tool in statistics to visualize the concentration of values within a dataset. A typical well known example is the concentration of wealth within the population. For example in 2017 the top 1% of the US population controls 36,8% of wealth (2). However, the Lorenz curve can also be utilized to visualize the product groups share on revenue.

Demo Data

There are 5 product groups (A,B,C,D,E,F) and multiple sales. The data is stored in the table “Sales” including three columns ProductGroup, Product and Qty. For demonstration purpose, each product group has the same share on total sold products: 20%

ProductGroup

Product

Qty

A A123 2
A A234 2
B B123 4
C C123 1
C C123 1
C C123 1
C C234 1
D D123 4
E E123 2
E E234 2

The invoices are stored in a separate table called Invoice, it contains three columns: The ProductGroup, SalesId and InvoiceAmount.

ProductGroup

SalesId

InvoiceAmount

A S-0001 5000
A S-0002 6000
E S-0003 1000
E S-0004 10000
D S-0005 11000
B S-0006 4000
B S-0007 4000
B S-0008 5000
C S-0009 5000
C S-0010 5000
C S-0011 4000

In order to visualize the lorenz curve, a data set is required that holds the cumulated frequency of Qty and InvoiceAmount, in ascending order based on the InvoiceAmount. . The first simple SQL Statements and results look like this

select ProductGroup, sum(Qty) as QtySum 
from Sales
group by ProductGroup

ProductGroup

QtySum

A 4
B 4
C 4
D 4
E 4

select ProductGroup, sum(InvoiceAmount) as InvoiceAmountSum
from Invoice
group by ProductGroup
order by InvoiceAmountSum asc

ProductGroup

InvoiceAmountSum

A 11000
D 11000
E 11000
B 13000
C 14000

Now comes the tricky part. A SQL statement is required which creates the cumulated sums. To do this the SQL function row_number is used to create a line number in the result set. Then the same result set is joined with all records smaller or equal the current row number and the InvoiceAmount is summed up. The SQL Statement looks like this.

select temp1.idx, temp1.ProductGroup, temp1.InvoiceAmountSum,
sum(temp2.InvoiceAmountSum) as CumulateSum
from
(

select ROW_NUMBER() over (order by InvoiceAmountSum asc) as idx, ProductGroup, InvoiceAmountSum
from (
    select ProductGroup, sum(InvoiceAmount) as InvoiceAmountSum
    from Invoice
    Group by ProductGroup)
temptable

)
as temp1
inner join
(
    select ROW_NUMBER() over (order by InvoiceAmountSum asc) as idx, ProductGroup, InvoiceAmountSUm
    from (
        select ProductGroup, sum(InvoiceAmount) as InvoiceAmountSum
        from Invoice
        Group by ProductGroup)
    temptable
)
as temp2 on temp1.idx >= temp2.idx
group by temp1.idx, temp1.ProductGroup, temp1.InvoiceAmountSum

The result looks like this

Idx

ProductGroup

InvoiceAmountSum

CumulatedSum

1 A 11000 11000
2 D 11000 22000
3 E 11000 33000
4 B 13000 46000
5 C 14000 60000

Finally  the SQL query needs to bee extended to include also the cumulated sales qty. Moreover, a line starting with 0 values is required to make the Lorenz cure look correct. This is done by adding a row using UNIOIN. Here is the complete SQL statement and the result set:

select temp1.idx, temp1.ProductGroup, temp1.InvoiceAmountSum,
sum(temp2.InvoiceAmountSum) as CumulatedSum, s.SalesQty,
sum(s.SalesQty) as CumulatedSales
from
(

select ROW_NUMBER() over (order by InvoiceAmountSum asc) as idx, ProductGroup, InvoiceAmountSum
from (
    select ProductGroup, sum(InvoiceAmount) as InvoiceAmountSum
    from Invoice
    Group by ProductGroup
    )
temptable

)
as temp1
inner join
(
    select ROW_NUMBER() over (order by InvoiceAmountSum asc) as idx,   
    ProductGroup, InvoiceAmountSUm
    from (
        select ProductGroup, sum(InvoiceAmount) as InvoiceAmountSum
        from Invoice
        group by ProductGroup
    )
    temptable
)
as temp2 on temp1.idx >= temp2.idx

inner join
(
    Select ProductGroup, sum(Qty) as SalesQty
    from Sales as s 
    group by ProductGroup
)
as s on s.ProductGroup = temp1.ProductGroup

group by temp1.idx, temp1.ProductGroup, temp1.InvoiceAmountSum, s.SalesQty

union

select 0 as idx,” as ProductGroup, 0 as InvoiceAmountSum,0 as CumulatedSum,
0 as SalesQty,0 as CumulatedSales

Idx

ProuctGroup

InvoiceAmountSum

CumulatedSum

SalesQty

CumulatedQty

0   0 0 0 0
1 A 11000 11000 4 4
2 D 11000 22000 4 8
3 E 11000 33000 4 12
4 B 13000 46000 4 16
5 C 14000 60000 4 20

 

Power BI

The next step is load the data into Power BI and calculate the relative cumulated frequency  for Sales and Invoice. The is done by diving the cumulated frequency by the total numbers.

  1. Open a new Power BI report
  2. Select Get Data from SQL Server and provide the connection information
    In the details copy & paste the SQL Statement from above
  3. Rename the data set to SalesInvoice

After this there should be a dataset with the values from the table above in Power BI. Add two additional columns.to calculate the relative cumulated frequency. Format the columns as percentage.

RelativeFreqInvoice = SalesInvoice[CumulatedSum] / 
                      sum(SalesInvoice[InvoiceAmountSum])

and

RelativeFreqSales = SalesInvoice[CumulatedSales] / sum
                    (SalesInvoice[SalesQty])

The dataset in Power BI should look like this

Idx

Prouct
Group

InvoiceAmount
Sum

Cumulated
Sum

Sales
Qty

Cumulated Qty

Relative
FreqInvoice

Relative FreqSales

0   0 0 0 0 0% 0%
1 A 11000 11000 4 4 18% 20%
2 D 11000 22000 4 8 37% 40%
3 E 11000 33000 4 12 55% 60%
4 B 13000 46000 4 16 77% 80%
5 C 14000 60000 4 20 100% 100%

Go to the report designer and add a new line chart. Set as Axis the RelativeFreqSales. To get the diagonal line for the Lorenz curve add the RelativeFreqSales again as Value. Then also add the RelativeFreqInvoice as Value in the diagram properties. Now there are two lines, the 45° line and the lorenz curve underneath. In this example it should look like this:

Lorenz Curve in Power BI

1) https://en.wikipedia.org/wiki/Lorenz_curve 
2) http://money.cnn.com/2017/09/27/news/economy/inequality-record-top-1-percent-wealth/index.html

Visualize the firms contribution to total revenue in Power BI

We recently discussed at work how to display the contribution of multiple firms to the total revenue in Power BI, in a way that the firms can be filtered but the total revenue stays constant. We came up with a very simple no-code solution.

Data from Dynamics AX 2012 R3

All the required data can be fetched from the CustInvoiceJour table in Dynamics AX. There are three datasets in the Power BI data model. First, the invoices including the InvoiceAmountMST, InvoiceYear and DataAreaId. Second the sum of InvoiceAmountMST grouped by Year and DataAreaId. Finally, a table that contains the years. Here are the SQL statements to load the data.

select
InvoiceId, InvoiceAmountMST, InvoiceDate, DataAreaId,
year(InvoiceDate) as InvoiceYear
from CustInvoiceJour

Total Revenue:

select
sum(InvoiceAmountMST) as InvoiceAmountTotal,
year(InvoiceDate) as InvoiceYear, DATAAREAID
from CustInvoiceJour
group by year(InvoiceDate),DATAAREAID
order by DATAAREAID

In Power BI create a new dataset and enter the years manually, like 2010, 2011, 2012, etc.

Power BI Years Table

In Power BI go to the data model an connect the Invoices and Total Revenue with the dataset that contains the years

image

Line and stacked Column

Add a new Line and stacked Column chart to the Power BI Report. Set the field Year from the Year table as shared axis. From the CustInvoiceJour set the DataAreaId as Colum Series and the InvoiceAmountMST as Column  values. Set the InvoiceAmountTotal from the Totals table as line values. In the format tab disable the Y-Axis property Show Secondary.

Diagram settings 

Your diagram should look like this. Filtering the companies does not effect the total revenue line.

image