Market Segmentation with Qlik Set Analysis and Qlik Set Operations

In this post, we’ll review two elusive methods within Qlik by which key business questions can be addressed: Qlik Set Analysis and Qlik Set Operations.

A common business objective is to expand product sales or determine strategic effectiveness.  These problems generally take a form like one of the following questions and are asked with an eye toward historic performance.

  • Which of my current customers purchased my product?
  • Which of my current clients are benefitting from my programs?

Qlik provides an array of tools to aid in the answers to these questions.  We will use Qlik Set Analysis to identify customers with specific characteristics or behaviors and then combine this with Qlik Set Operations to further understand where we might expect opportunities.

Qlik Set Analysis

Our sample data set is a list of fictitious customers and their orders.  We know their geographic details and their order history.  From here we can begin to glean some historical trends and target behavior, geographic or other attribute data from which to identify additional sales opportunities.

Let’s begin by identifying those customers purchasing motorcycles.  Using Qlik Set Analysis we can identify those customers who have purchased motorcycles in the past.  One way to do this is the following:

COUNT( { $ <PRODUCTLINE={"Motorcycles"}> } Distinct CUSTOMERNAME)

In the table below we see the customer’s name, a count of customers and a count of customers who have purchased motorcycles.

Qlik Table

Negating this, we might then expect to find those customers NOT purchasing motorcycles.

COUNT({$<PRODUCTLINE-={"Motorcycles"}>} Distinct CUSTOMERNAME)
Qlik Table Example

We see the 2nd and 3rd measure columns above are not mutually exclusive.  Why is this? 

What is being identified in the set are the ORDERS rather than the CUSTOMERS and while this is equivalent for the first case, it is clearly not for its negation in the second case. 

A more effective method to achieve this and retain the ability to effectively identify the complimentary set is to use the P() and E() functions provided by Qlik for this purpose.

Instead of:

COUNT( { $ <PRODUCTLINE={"Motorcycles"}> } Distinct CUSTOMERNAME)

We use:

COUNT({$<CUSTOMERNAME=P({<PRODUCTLINE={"Motorcycles"}>})>}Distinct CUSTOMERNAME)

This is read as ‘Which customers have EVER purchased motorcycles’ where P() indicates Possible.

To achieve the complimentary set of those customers who have NEVER purchased motorcycles [where E() indicates Excluded] we can do one of the following:

                COUNT({$<CUSTOMERNAME=E({<PRODUCTLINE={“Motorcycles”}>})>}Distinct CUSTOMERNAME)

– OR –

COUNT({$<CUSTOMERNAME-=P({<PRODUCTLINE={“Motorcycles”}>})>}Distinct CUSTOMERNAME)

We can now observe that for every customer they either HAVE or HAVE NOT purchased motorcycles.  (Note – as written, the Set Analysis will retain context of any dimensional selections due to the $ notation).  As confirmation of this fact, we can see that the sum of the two groups (49 + 43) sum to the total (92).

Qlik Set Operations

As it stands, this can be useful, however the methods’ value is amplified when combined with other sets via Qlik Set Operations.

COUNT({$
                <CUSTOMERNAME=P({<PRODUCTLINE={"Motorcycles"}>})>
    *
<CUSTOMERNAME=P({<PRODUCTLINE={"Planes"}>})> 
    } Distinct CUSTOMERNAME)

The Motorcycle set element is multiplied (*) with the Planes set element to give us the intersection of these two sets.  In this case, we have those customers who have EVER purchased both Motorcycles AND Planes.  We can then quickly manipulate the sets to answer which ever questions we’d like to pose.

Which customers have EVER purchased motorcycles, but NEVER purchased Planes?

COUNT({$
                <CUSTOMERNAME=P({<PRODUCTLINE={"Motorcycles"}>})>
    *
<CUSTOMERNAME=E({<PRODUCTLINE={"Planes"}>})>    
    } Distinct CUSTOMERNAME)

Alternatively:

COUNT({$
                <CUSTOMERNAME=P({<PRODUCTLINE={"Motorcycles"}>})>
    -
<CUSTOMERNAME=P({<PRODUCTLINE={"Planes"}>})>    
    } Distinct CUSTOMERNAME)

Qlik Set Operations Summary

Qlik Set Operations Summary

Combining Qlik Set Analysis and Qlik Set Operations

If, instead of seeking simple attribute identifiers, we wish to understand behavioral thresholds, i.e., Sales above $175k, we can leverage search in a more advanced Qlik Set Analysis.

SUM({$<CUSTOMERNAME=P({<CUSTOMERNAME={"=SUM(SALES)>=175000"}>})>} SALES)

This can be further altered and combined via Qlik Set Analysis Functions P() and E() and Qlik Set Operations (* and -) to identify a very specific subset of customers for potential analysis.

Those customers…

SUM( {$
                // never having over 175k in sales (see E() exclude function below)
                <CUSTOMERNAME=E({<CUSTOMERNAME={"=SUM(SALES)>=175000"}>})>
     *
// who have ever purchased Planes (see P() possible function below, * operator above)
    <CUSTOMERNAME=P({<PRODUCTLINE={"Planes"}>})>
     -
//but are not located in USA or Australia (see subtraction operator above)
    <CUSTOMERNAME=P({<COUNTRY={"USA","Australia"}>})>
    } SALES)

See the ‘Combined’ column below for the sales of the specified set of customers.

We now have the ability to ask and answer questions which can target subsets of customers based on any attribute or behavior and which can be easily and reliably manipulated without lengthy or complex editing.

Share this post