Use Power BI dataflow to decouple report design from ETL logic in an ERP upgrade project

A common requirement during an ERP upgrade project (e.g. from AX 2012 to D365 Finance) and its transition phase is to include both systems in the BI or reporting environment. Because of its tight integration with Dynamics, PowerBI is in many cases the preferred reporting and BI platform. PowerBI is capable of combining different data sources like OData feeds from D365 and SQL connections via a gateway. However, for the person developing reports, it can become complicated to integrate cloud and on-premises data sources. For example, to create a sales report, one would need to include the Customers, SalesInvoiceHeader and SalesInvoiceLine entities as well as the CustTable, DirPartyTable, CustInvoiceJour and CustInvoiceTrans tables.

Different data sources in one PowerBI report

One way to address this issue is to separate ETL logic from report design. PowerBI supports this approach with dataflows. Dataflows let you place Power Query logic directly in the Microsoft cloud and offer reusable data artefacts. People designing reports simply connect to the dataflow and are not concerned with the ETL logic required to combine data from the old AX installation and the new Dynamics 365 ERP cloud environment.

Use PowerBI dataflow to decouple ETL logic from report design

Example

From your PowerBI workspace, create a new dataflow and add a new entity. Choose the OData feed for Dynamics 365 and provide the URL for the CustomersV3 entity.

OData feed for entities from Dynamics 365 Finance

Clicking Next opens the Power Query editor and loads the customers from Dynamics 365 Finance. Remove all the fields you don’t need in your application. In this example I’m using DataAreaId, Account, Name, Group, Address, and the delivery mode and terms.

PowerBI dataflow based on the Dynamics 365 Finance OData CustomersV3 entity
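
For reference, the resulting Power Query (M) for this step looks roughly like the following sketch. The environment URL and the selected column names are placeholders and will differ in your tenant:

let
    // Environment URL and column names are placeholders; keep only the fields you need
    Source = OData.Feed("https://yourenvironment.operations.dynamics.com/data/CustomersV3"),
    SelectedColumns = Table.SelectColumns(Source,
        {"dataAreaId", "CustomerAccount", "OrganizationName", "CustomerGroupId",
         "AddressDescription", "DeliveryMode", "DeliveryTerms"})
in
    SelectedColumns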

For an on-premises AX 2012 installation you need to install a data gateway so that PowerBI can access the local SQL database. Once the gateway is in place, create a new dataflow in PowerBI and use the SQL Server connection. I’d recommend creating a view on the database instead of loading the tables directly into PowerBI.

CREATE VIEW [dbo].[PBIX_Customer] AS
SELECT
    CUSTTABLE.DATAAREAID,
    DIRPARTYTABLE.NAME,
    CUSTTABLE.ACCOUNTNUM,
    CUSTTABLE.CUSTGROUP,
    CUSTTABLE.TAXGROUP,
    LOGISTICSPOSTALADDRESS.ADDRESS,
    CUSTTABLE.DLVTERM,
    CUSTTABLE.DLVMODE
FROM CUSTTABLE
JOIN DIRPARTYTABLE
    ON CUSTTABLE.PARTY = DIRPARTYTABLE.RECID
JOIN DIRPARTYLOCATION
    ON DIRPARTYTABLE.RECID = DIRPARTYLOCATION.PARTY
JOIN LOGISTICSPOSTALADDRESS
    ON DIRPARTYLOCATION.LOCATION = LOGISTICSPOSTALADDRESS.LOCATION
WHERE
    LOGISTICSPOSTALADDRESS.VALIDFROM <= GETDATE()
    AND LOGISTICSPOSTALADDRESS.VALIDTO >= GETDATE()
GO

Choose SQL Server data source for PowerBI dataflow

Select the data gateway and provide a user to access the database

Connect a PowerBI dataflow to your on-premises AX 2012 database using a gateway

Select the view and load the AX 2012 data into PowerBI. Save the dataflow.

Dynamics AX 2012 customer data via data gateway
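
The dataflow query for the AX 2012 side then boils down to a few lines of M. The server and database names below are placeholders for your own environment:

let
    // Placeholder server and database names for the on-premises AX 2012 SQL Server
    Source = Sql.Database("ax2012sqlserver", "MicrosoftDynamicsAX"),
    PBIX_Customer = Source{[Schema = "dbo", Item = "PBIX_Customer"]}[Data]
in
    PBIX_Customer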

After you have created both dataflows, return to your workspace, go to your dataflows and refresh both to load the data.

Refresh dataflow from Dynamics 365 Finance and Dynamics AX 2012

Next, create a third dataflow to combine the data from the Dynamics 365 Finance and AX 2012 dataflows. This time, choose to link entities from the other dataflows:

Link PowerBI entities via dataflow

Select both dataflows

Select PowerBI dataflows to merge

In the Power Query Online editor, rename the fields in both dataflow entities so that you can append the queries. Be aware that Power Query is case sensitive and dataAreaId is not the same as DATAAREAID. When you have done this, append both queries as a new query.

Append queries in PowerBI

From the new query, make sure to remove duplicate customers.

Remove duplicates in Power Query Online
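
Put together, the merge logic of the third dataflow is a short M query along these lines. The entity and column names are only examples and have to match your own dataflows:

let
    // CustomersV3 and PBIX_Customer are the linked entities from the two dataflows (names are examples).
    // Rename so both queries share exactly the same (case-sensitive) column names before appending.
    D365Customers = Table.RenameColumns(CustomersV3,
        {{"dataAreaId", "DATAAREAID"}, {"CustomerAccount", "ACCOUNTNUM"}, {"OrganizationName", "NAME"}}),
    AXCustomers = PBIX_Customer,
    Appended = Table.Combine({D365Customers, AXCustomers}),
    Customer = Table.Distinct(Appended, {"DATAAREAID", "ACCOUNTNUM"})
in
    Customer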

If you have a PowerBI Pro but not a Premium subscription, disable the load of the underlying queries.

Disable load when using PowerBI Pro

Save and refresh the dataflow. From the settings, schedule the refresh and endorse the dataflow as “Promoted” or “Certified”. This is not strictly necessary, but it adds a label to the dataflow so your report designers can see that they can trust the data source. In PowerBI Desktop, open Get Data and choose PowerBI dataflows as the data source:

Get data from PowerBI dataflow

Select the merged Customer data source.

Promoted and certified PowerBI dataflows

You can use the dataflows in your PowerBI data model without having to worry about the logic behind them.

Linked dataflow sources in a PowerBI data model

Conclusion

Using dataflows has several advantages. It helps you decouple ETL logic from report design, which matters especially when working with older versions of Dynamics AX, where deeper knowledge of the data structure is required. Another advantage is reuse: typically you are not creating one single report, but several reports that require the same dimensions, e.g. customers. By using dataflows you don’t need to maintain the load and merge logic in multiple PowerBI files.

Video: Configure PowerBI for Dynamics 365 Finance in a Cloud-hosted Environment

I’ve recorded a walkthrough on how to configure PowerBI for Dynamics 365 Finance and Supply Chain Management and deploy the standard dashboards.

PowerBI dataflow for Self-Service BI

A typical challenge in a BI project is integrating data from different sources, for example locally stored files, ERP databases, cloud services, etc. PowerBI (Desktop), on the other hand, is designed for power users to develop reports quickly. However, power users may have the business knowledge but in most cases lack the technical knowledge to integrate all the data they need. With PowerBI dataflows it is possible to split the workload into a technical, IT-related part and a business analysis part.

PowerBI dataflow hides the complexity of integrating different data sources and provides a ready-to-use data source for PowerBI Desktop. I’ve recorded a video on how to integrate Excel expenses from a local folder with Dynamics 365 Sales customers and promote the result as a certified data source. The power user then accesses this promoted data source in the report.

https://youtu.be/1umwqpiozbM

Create an Email Tag Cloud with PowerBI and Cognitive Services

PowerBI and Cognitive Services are a powerful combination. A nice example is a tag cloud based on the key phrases in your daily emails. This example requires the following cloud components:

  • PowerBI (of course)
  • Cognitive Services for Key Phrase extraction
  • Exchange Online
  • Flow and Table Storage in Azure

Cloud Infrastructure

First, go to the Azure portal and create a new Cognitive Services resource. In the creation wizard, place the Cognitive Services resource in a data center near your Office subscription. I’d also recommend creating a separate resource group where you place all the services.

Cognitive Services in Azure

On the Cognitive Services Overview tab, copy the Endpoint URL. From the Cognitive Services > Keys tab, also copy the Key1 value. You need both to connect to Cognitive Services.

Azure Storage Account

Next, create a new storage account. As with the Cognitive Services resource, place it in the same resource group and the same data center. After the storage account has been created successfully, go to its Overview tab.

Azure Table Storage

Select “Tables” and create a new table. Give it a useful name, e.g. keystorage. Table storage can be used to store structured data, which requires at least two fields: a RowKey and a PartitionKey. It is up to you to provide meaningful values for these fields when inserting data.

Copy the storage account name and, from the Access Keys tab, the Key1 value. You will need both to connect to the storage account.

Implement transformation pipeline in Flow (first naive approach)

Now, let’s create the extraction logic using Flow. There are some limitations with this approach that will result in errors; a more stable version of the flow is discussed at the end. Go to https://flow.microsoft.com and create a new automated flow from blank.

Automated Flow from Blank

The trigger for the flow is Outlook > When a new email arrives.

Because almost all my mails are HTML formatted, I need to add the Content Conversion > HTML to Text step to remove the HTML code from the email body.

The third step in the flow is the key phrase extraction, so add the Text Analytics > Key Phrases step. There you need to provide the Cognitive Services account key and endpoint. The text to analyze is the output of the HTML to Text step.
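
Under the hood this connector calls the Text Analytics key phrases endpoint. Just for illustration, a direct call could look roughly like the M sketch below; the endpoint, key and API version are assumptions, and the flow itself simply uses the connector:

let
    // Endpoint, key and API version are placeholders; the flow uses the Text Analytics connector instead
    endpoint = "https://westeurope.api.cognitive.microsoft.com",
    requestBody = "{""documents"":[{""id"":""1"",""language"":""de"",""text"":""Mail body goes here""}]}",
    response = Web.Contents(endpoint & "/text/analytics/v3.0/keyPhrases",
        [Headers = [#"Ocp-Apim-Subscription-Key" = "<Key1>", #"Content-Type" = "application/json"],
         Content = Text.ToBinary(requestBody)]),
    keyPhrases = Json.Document(response)[documents]{0}[keyPhrases]
in
    keyPhrases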

The last step writes the key phrases to Azure Table Storage. As in the Cognitive Services step, you have to provide the account name and a key. From the Table dropdown, select the table you created earlier in the Azure portal. The entity has to be a JSON string. In my example the PartitionKey is always 1 and the RowKey is a GUID. Because one mail will have more than one key phrase, the insert is encapsulated in an Apply-to-each block:

{
  "PartitionKey": "1",
  "RowKey": "@{guid()}",
  "KeyPhrase": "@{items('Apply_to_each')}"
}

Keyword Extraction Flow

Test your flow by sending an email to your account. All the steps should succeed.

Keyword Extraction Flow Test

You can use the Storage Explorer in the Azure portal to look up the phrases extracted from the email. In this example I sent an email from my company account to my private mail account. The flow extracted the key words from the mail (in this case mostly from the signature).

Azure Storage Explorer

Tag Cloud in PowerBI

In PowerBI, add a new data source for Azure Table Storage. Again, you need to provide the storage account name and one of the keys. After connecting successfully to the table, open the transformation window and take a look at the retrieved key phrases. You can remove the PartitionKey, RowKey and Timestamp columns from the data set.

Azure Table Storage in PowerBI
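
The generated query looks roughly like the following sketch; the storage account and table names are placeholders, and the exact navigation steps produced by the connector may differ slightly:

let
    // Storage account and table names are placeholders
    Source = AzureStorage.Tables("mystorageaccount"),
    keystorage = Source{[Name = "keystorage"]}[Data],
    Expanded = Table.ExpandRecordColumn(keystorage, "Content", {"KeyPhrase"}),
    Result = Table.RemoveColumns(Expanded, {"PartitionKey", "RowKey", "Timestamp"})
in
    Result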

In the PowerBI report window, click the ellipsis (…) in the Visualizations pane and search for the Word Cloud visual in the marketplace. Add the Word Cloud visual to PowerBI.

Word Cloud Visual for PowerBI

Add the visual to the PowerBI report canvas and set the key phrases as the category in the visual.

Word Cloud in PowerBI Desktop

PowerBI Online Service and automated Refresh

Publish the PowerBI report to your workspace. In PowerBI Online, go to your workspace and navigate to the dataset. From the ellipsis (…), open the settings page and provide the key for Azure Table Storage.

Azure Table Storage Connection

Now you can also schedule the automatic refresh.

Automatic Refresh from Azure Table Storage in PowerBI Online Services

Implement the transformation pipeline with a more stable Flow

Unfortunately, text processing in Cognitive Services is limited to 5120 characters per document. In many cases emails contain more characters than this, and the flow will fail with an error from Cognitive Services. One way to address this issue is to implement a loop that cuts the email body into pieces of 5120 characters or less before feeding them to Cognitive Services. However, Flow is not very developer focused and requires some workarounds for simple tasks, like assigning the result of a function call such as substring() back to the same variable it references.
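
The chunking idea itself is simple. Expressed in Power Query M just for readability (the flow below implements the same logic with variables and a loop, since it has to run inside Flow):

let
    // Split a long text into pieces of at most 5120 characters
    SplitIntoChunks = (body as text) as list =>
        let
            size = 5120,
            count = Number.RoundUp(Text.Length(body) / size),
            chunks = List.Transform(List.Numbers(0, count), each Text.Middle(body, _ * size, size))
        in
            chunks
in
    SplitIntoChunks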

First, declare four variables: TEXT, STRLEN, EMAILBODY and EMAILBODY_SHORT.

Some required variables in Flow

Next, execute the HTML to Text block. An optimization is to use the built-in Data Operations action Compose to trim() the result, removing blanks from the start and end, and then populate STRLEN and EMAILBODY. STRLEN requires a function: length(outputs('Trim_Text'))

Set the variables in Flow

Next, create a Do-Until loop from the Control actions in Flow. The exit condition for the loop is STRLEN <= 0, because we keep cutting the email into pieces until nothing is left.

A loop to cut the Email into pieces of 5120 characters (or less)

Within the loop, create an If condition depending on STRLEN. If the STRLEN variable is less than 5120, STRLEN is set to 0 to end the loop and the variable TEXT is set to the EMAILBODY.

Email body is shorter than 5120

If the email body is longer than 5120 characters, the first 5120 characters are copied to the TEXT variable: substring(variables('EMAILBODY'),0,5120)

Next, the variable STRLEN is reduced by 5120: sub(length(variables('EMAILBODY')),5120)

In the third step, the variable EMAILBODY_SHORT is set to the remainder of the original EMAILBODY, i.e. everything after the first 5120 characters (substring() is zero-based, so the remainder starts at index 5120). This extra variable is needed because Flow does not support assigning a variable with an expression that references the variable itself: substring(variables('EMAILBODY'),5120,variables('STRLEN'))

In the last step, the original EMAILBODY variable is set to EMAILBODY_SHORT. It now contains the body without the first 5120 characters.

Email body is larger than 5120

Within the loop, after the If condition, Cognitive Services is called with the TEXT variable and the results are written to Azure Table Storage, just like in the first naive implementation.

Save Cognitive Services Results to Azure Table Storage

More Optimization

There are three additional ways to optimize this solution.

One may argue that cutting the text into pieces might split a word that is relevant for the word cloud, e.g. Micros … oft, so that it can no longer be recognized by Cognitive Services. One way to address this is to modify the substring logic to check for the last index of a blank (" ") before the cut and cut there.
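
A sketch of that variant, again in Power Query M for readability; the actual flow would compute the cut position with an equivalent expression:

let
    // Back off to the last blank before the 5120-character limit so no word is split in half
    NextCutPosition = (body as text) as number =>
        let
            size = 5120,
            window = Text.Start(body, size),
            lastBlank = Text.PositionOf(window, " ", Occurrence.Last)
        in
            if Text.Length(body) <= size or lastBlank < 0 then size else lastBlank
in
    NextCutPosition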

Another issue is that Cognitive Services is not aware of all stop words. Especially when using non-English key phrases, you may end up with a messy cloud. However, there are publicly available lists of stop words for many languages that can be loaded into PowerBI and used to exclude certain findings from Cognitive Services. The Word Cloud visual provides an Exclude property where you can supply these stop words.

In the example above, the language for Cognitive Services is set to DE (German). However, this might not be optimal if you receive emails in different languages. An optimization could be to use Cognitive Services to detect the language first and then switch the key phrase detection call depending on the most common languages in your inbox, in my case German and English.

Flow Download (package)

Please find the Flow package in the Sources OneDrive folder. Import the .zip file into your Flow tenant. You need to map Outlook, Cognitive Services, Azure Table Storage, etc. to your own connections.

Dynamically Colored R Diagram in Power BI using Earthtones

Power BI integrates R to perform complex analysis and sophisticated visualization. Earthtones is an R library that takes a screenshot from Google Maps for a given geo coordinate and extracts the landscape colors. Earthtones can be used to color diagrams based on the local color scheme.

Earthtones

The package can be found on GitHub, together with a description of how to download and install it. Using earthtones is easy: the function get_earthtones takes latitude and longitude, a zoom level and the number of colors to extract as parameters. The earthtones for Steyr look like this:

get_earthtones(latitude=48.045,longitude=14.422,zoom=15,number_of_colors=8)

Steyr Earthtones

Power BI Data Model

The data model in this example is very simple. There are two Excel sheets: one for the revenue by city and item group, and another for the geo coordinates (longitude/latitude) and the optimal zoom level per city.

Excel Sheet Revenue per City and Item Group

City Geo Coordinates

The Power BI model is very simple; both data sources are linked by the city name.

Power BI Data Model

R Boxplot diagram in Power BI

In this example a simple boxplot is used to visualize the revenue by item group. A data slicer for the column city is used to filter the data. The R diagram takes the following columns as input:

  • Longitude
  • Latitude
  • Zoom
  • City
  • Price
  • Group

If only one city is selected, the R script should gather the city’s earthtone colors and format the diagram accordingly. If more than one city is selected, the diagram should be formatted in red, blue and green. The following script loads the earthtones library and counts the distinct city names in the dataset (note that the dataset columns have German names: Stadt for city, Preis for price and Gruppe for group). If there is more than one distinct name in the dataset, the color variable is set to red, blue, green. Otherwise, earthtones is used to get the city’s typical color scheme.

library(earthtones)

numCities <- length(unique(dataset$Stadt))
if (numCities > 1) {
  color <- c("red", "blue", "green")
} else {
  color <- get_earthtones(latitude = dataset$Lat[1],
                          longitude = dataset$Lon[1],
                          zoom = dataset$Zoom[1],
                          number_of_colors = 3,
                          include.map = FALSE)
}

boxplot(Preis ~ Gruppe, dataset, col = (color), ylab = "Revenue", xlab = "Item Group")

The R script in Power BI looks like this:

R Script and Boxplot in Power BI

If a city is selected, for example San Francisco, the diagram is formatted in the colors blue, gray and brown.

R Diagram in Power BI with dynamic color

The colors fit the blue sea, the bay and the city seen from space.

R Earthtone for San Francisco

If another city, for example Cairo, is selected, the diagram is formatted in dark green, dark brown and light brown.

R Diagram in Power BI with dynamic color

That fits the city’s local color scheme: the brown buildings, the green plants along the Nile and the desert sand.

R Earthtone for Cairo