Calculate and Visualize the Lorenz Curve in Power BI

The Lorenz curve (1) is tool in statistics to visualize the concentration of values within a dataset. A typical well known example is the concentration of wealth within the population. For example in 2017 the top 1% of the US population controls 36,8% of wealth (2). However, the Lorenz curve can also be utilized to visualize the product groups share on revenue.

Demo Data

There are 5 product groups (A,B,C,D,E,F) and multiple sales. The data is stored in the table “Sales” including three columns ProductGroup, Product and Qty. For demonstration purpose, each product group has the same share on total sold products: 20%

ProductGroup

Product

Qty

A A123 2
A A234 2
B B123 4
C C123 1
C C123 1
C C123 1
C C234 1
D D123 4
E E123 2
E E234 2

The invoices are stored in a separate table called Invoice, it contains three columns: The ProductGroup, SalesId and InvoiceAmount.

ProductGroup

SalesId

InvoiceAmount

A S-0001 5000
A S-0002 6000
E S-0003 1000
E S-0004 10000
D S-0005 11000
B S-0006 4000
B S-0007 4000
B S-0008 5000
C S-0009 5000
C S-0010 5000
C S-0011 4000

In order to visualize the lorenz curve, a data set is required that holds the cumulated frequency of Qty and InvoiceAmount, in ascending order based on the InvoiceAmount. . The first simple SQL Statements and results look like this

select ProductGroup, sum(Qty) as QtySum 
from Sales
group by ProductGroup

ProductGroup

QtySum

A 4
B 4
C 4
D 4
E 4

select ProductGroup, sum(InvoiceAmount) as InvoiceAmountSum
from Invoice
group by ProductGroup
order by InvoiceAmountSum asc

ProductGroup

InvoiceAmountSum

A 11000
D 11000
E 11000
B 13000
C 14000

Now comes the tricky part. A SQL statement is required which creates the cumulated sums. To do this the SQL function row_number is used to create a line number in the result set. Then the same result set is joined with all records smaller or equal the current row number and the InvoiceAmount is summed up. The SQL Statement looks like this.

select temp1.idx, temp1.ProductGroup, temp1.InvoiceAmountSum,
sum(temp2.InvoiceAmountSum) as CumulateSum
from
(

select ROW_NUMBER() over (order by InvoiceAmountSum asc) as idx, ProductGroup, InvoiceAmountSum
from (
    select ProductGroup, sum(InvoiceAmount) as InvoiceAmountSum
    from Invoice
    Group by ProductGroup)
temptable

)
as temp1
inner join
(
    select ROW_NUMBER() over (order by InvoiceAmountSum asc) as idx, ProductGroup, InvoiceAmountSUm
    from (
        select ProductGroup, sum(InvoiceAmount) as InvoiceAmountSum
        from Invoice
        Group by ProductGroup)
    temptable
)
as temp2 on temp1.idx >= temp2.idx
group by temp1.idx, temp1.ProductGroup, temp1.InvoiceAmountSum

The result looks like this

Idx

ProductGroup

InvoiceAmountSum

CumulatedSum

1 A 11000 11000
2 D 11000 22000
3 E 11000 33000
4 B 13000 46000
5 C 14000 60000

Finally  the SQL query needs to bee extended to include also the cumulated sales qty. Moreover, a line starting with 0 values is required to make the Lorenz cure look correct. This is done by adding a row using UNIOIN. Here is the complete SQL statement and the result set:

select temp1.idx, temp1.ProductGroup, temp1.InvoiceAmountSum,
sum(temp2.InvoiceAmountSum) as CumulatedSum, s.SalesQty,
sum(s.SalesQty) as CumulatedSales
from
(

select ROW_NUMBER() over (order by InvoiceAmountSum asc) as idx, ProductGroup, InvoiceAmountSum
from (
    select ProductGroup, sum(InvoiceAmount) as InvoiceAmountSum
    from Invoice
    Group by ProductGroup
    )
temptable

)
as temp1
inner join
(
    select ROW_NUMBER() over (order by InvoiceAmountSum asc) as idx,   
    ProductGroup, InvoiceAmountSUm
    from (
        select ProductGroup, sum(InvoiceAmount) as InvoiceAmountSum
        from Invoice
        group by ProductGroup
    )
    temptable
)
as temp2 on temp1.idx >= temp2.idx

inner join
(
    Select ProductGroup, sum(Qty) as SalesQty
    from Sales as s 
    group by ProductGroup
)
as s on s.ProductGroup = temp1.ProductGroup

group by temp1.idx, temp1.ProductGroup, temp1.InvoiceAmountSum, s.SalesQty

union

select 0 as idx,“ as ProductGroup, 0 as InvoiceAmountSum,0 as CumulatedSum,
0 as SalesQty,0 as CumulatedSales

Idx

ProuctGroup

InvoiceAmountSum

CumulatedSum

SalesQty

CumulatedQty

0   0 0 0 0
1 A 11000 11000 4 4
2 D 11000 22000 4 8
3 E 11000 33000 4 12
4 B 13000 46000 4 16
5 C 14000 60000 4 20

 

Power BI

The next step is load the data into Power BI and calculate the relative cumulated frequency  for Sales and Invoice. The is done by diving the cumulated frequency by the total numbers.

  1. Open a new Power BI report
  2. Select Get Data from SQL Server and provide the connection information
    In the details copy & paste the SQL Statement from above
  3. Rename the data set to SalesInvoice

After this there should be a dataset with the values from the table above in Power BI. Add two additional columns.to calculate the relative cumulated frequency. Format the columns as percentage.

RelativeFreqInvoice = SalesInvoice[CumulatedSum] / 
                      sum(SalesInvoice[InvoiceAmountSum])

and

RelativeFreqSales = SalesInvoice[CumulatedSales] / sum
                    (SalesInvoice[SalesQty])

The dataset in Power BI should look like this

Idx

Prouct
Group

InvoiceAmount
Sum

Cumulated
Sum

Sales
Qty

Cumulated Qty

Relative
FreqInvoice

Relative FreqSales

0   0 0 0 0 0% 0%
1 A 11000 11000 4 4 18% 20%
2 D 11000 22000 4 8 37% 40%
3 E 11000 33000 4 12 55% 60%
4 B 13000 46000 4 16 77% 80%
5 C 14000 60000 4 20 100% 100%

Go to the report designer and add a new line chart. Set as Axis the RelativeFreqSales. To get the diagonal line for the Lorenz curve add the RelativeFreqSales again as Value. Then also add the RelativeFreqInvoice as Value in the diagram properties. Now there are two lines, the 45° line and the lorenz curve underneath. In this example it should look like this:

Lorenz Curve in Power BI

1) https://en.wikipedia.org/wiki/Lorenz_curve 
2) http://money.cnn.com/2017/09/27/news/economy/inequality-record-top-1-percent-wealth/index.html

Über erpcoder
Software Engineer and Dynamics AX enthusiast working in Research & Development for InsideAx

Kommentar verfassen

Trage deine Daten unten ein oder klicke ein Icon um dich einzuloggen:

WordPress.com-Logo

Du kommentierst mit Deinem WordPress.com-Konto. Abmelden /  Ändern )

Google+ Foto

Du kommentierst mit Deinem Google+-Konto. Abmelden /  Ändern )

Twitter-Bild

Du kommentierst mit Deinem Twitter-Konto. Abmelden /  Ändern )

Facebook-Foto

Du kommentierst mit Deinem Facebook-Konto. Abmelden /  Ändern )

w

Verbinde mit %s

%d Bloggern gefällt das: