Pattern Analysis Tutorial
In this tutorial we will use the Asia sample network, which is included with Bayes Server, to demonstrate how to use the Pattern Analysis tool.
The Pattern Analysis tool, which is similar to the Auto Insight tool, allows us to determine how one state of a discrete variable of interest (the target) differs from the others, in terms of the other variables in the network.
The Pattern Analysis tool also supports continuous target variables.
- Open the Asia sample network included with Bayes Server, either from the Start Page or by clicking Open on the File menu.
The Asia network with no evidence set should look like this...
- Click on Has Bronchitis to select the node, without setting evidence on it.
- From the Analyze menu click Insight and then Pattern Analysis.
The Pattern Analysis dialog is displayed.
- Confirm that Has Bronchitis has automatically been selected as the Target (Hypothesis) variable (since it was selected when we launched the Pattern Analysis tool). If not, select it in the drop down.
If a discrete target variable has more than two states, say {S1, S2, S3}, then a target state of S1 is compared with all of the remaining states together, {S2, S3}, rather than with a single state.
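To make the one-versus-rest comparison concrete, here is a minimal sketch of how the distribution of another variable X given "NOT S1" can be formed as a prior-weighted mixture of its distributions given S2 and S3. The function name, the state names and all probabilities are hypothetical and chosen only for illustration; this is not Bayes Server's API.

```python
def rest_distribution(target_prior, conditionals, target_state):
    """Return P(X | Target != target_state).

    target_prior: dict mapping each target state to P(Target = state).
    conditionals: dict mapping each target state to the list P(X | Target = state).
    """
    rest = [s for s in target_prior if s != target_state]
    rest_mass = sum(target_prior[s] for s in rest)
    n_values = len(conditionals[target_state])
    return [
        sum(target_prior[s] * conditionals[s][i] for s in rest) / rest_mass
        for i in range(n_values)
    ]

# Hypothetical numbers, for illustration only.
prior = {"S1": 0.5, "S2": 0.3, "S3": 0.2}
cond = {"S1": [0.9, 0.1], "S2": [0.4, 0.6], "S3": [0.1, 0.9]}

given_s1 = cond["S1"]                               # P(X | Target = S1)
given_not_s1 = rest_distribution(prior, cond, "S1")  # P(X | Target in {S2, S3})
print(given_s1, given_not_s1)
```

The two resulting distributions are what would then be compared for each variable.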
The options page should look like the following:
- Click Run
The results page should look like the following:
We can clearly see that, with no evidence set, Dyspnea is the variable that differs the most between Has Bronchitis=True and Has Bronchitis=False, with a strength (JS divergence) of 36%.
Strength: The distance between the test distribution when Target Variable = Target State and the test distribution when Target Variable = NOT Target State. The strength, which is reported using Jensen-Shannon divergence, can range between 0% and 100%.
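As a rough illustration of the strength measure, the sketch below computes a Jensen-Shannon divergence between two discrete distributions, using the Dyspnea probabilities quoted in this tutorial (80.80% and 13.16%). It assumes base-2 logarithms and equal weighting of the two distributions; the tutorial does not state which variant Bayes Server uses, so treat this as an assumption. The function name is illustrative and not part of any Bayes Server API.

```python
import math

def js_divergence(p, q):
    """Jensen-Shannon divergence (base 2, equal weights) between two discrete distributions."""
    def kl(a, b):
        # Kullback-Leibler divergence; terms with zero probability contribute nothing.
        return sum(x * math.log2(x / y) for x, y in zip(a, b) if x > 0.0)

    m = [0.5 * (x + y) for x, y in zip(p, q)]  # midpoint distribution
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

# [P(Dyspnea=True), P(Dyspnea=False)] under each state of Has Bronchitis.
dyspnea_given_true = [0.8080, 0.1920]
dyspnea_given_false = [0.1316, 0.8684]

strength = js_divergence(dyspnea_given_true, dyspnea_given_false)
print(f"{strength:.0%}")  # roughly 36% under the assumptions above
```

With base-2 logarithms the divergence is bounded by 0 (identical distributions) and 1 (non-overlapping distributions), which is consistent with the 0% to 100% range described above.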
We can also see that Dyspnea=True changes from 80.80% when Has Bronchitis=True to 13.16% when Has Bronchitis=False, which is a large change.
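The two Dyspnea percentages above can be reproduced by brute-force enumeration, assuming the sample network uses the commonly published Asia parameters. That is an assumption, since the tutorial does not list the probability tables. The function names are illustrative only, and this is not how Bayes Server performs inference.

```python
from itertools import product

def bern(p_true, value):
    """Probability of a Boolean value given P(value is True)."""
    return p_true if value else 1.0 - p_true

def joint(asia, smoker, tub, cancer, bronchitis, dyspnea):
    """Joint probability with the X-ray node summed out (assumed textbook Asia CPTs)."""
    either = tub or cancer  # deterministic 'Tuberculosis or Cancer' node
    p = bern(0.01, asia) * bern(0.50, smoker)
    p *= bern(0.05 if asia else 0.01, tub)
    p *= bern(0.10 if smoker else 0.01, cancer)
    p *= bern(0.60 if smoker else 0.30, bronchitis)
    p_dyspnea = {
        (True, True): 0.9, (True, False): 0.7,
        (False, True): 0.8, (False, False): 0.1,
    }[(either, bronchitis)]
    return p * bern(p_dyspnea, dyspnea)

def p_dyspnea_given_bronchitis(bronchitis):
    """P(Dyspnea=True | Has Bronchitis=bronchitis) by summing over all other variables."""
    numerator = denominator = 0.0
    for asia, smoker, tub, cancer, dyspnea in product([False, True], repeat=5):
        p = joint(asia, smoker, tub, cancer, bronchitis, dyspnea)
        denominator += p
        if dyspnea:
            numerator += p
    return numerator / denominator

p_true = p_dyspnea_given_bronchitis(True)
p_false = p_dyspnea_given_bronchitis(False)
print(f"{p_true:.2%} {p_false:.2%}")
```

Enumeration is only practical for very small networks such as this one; it is used here purely to make the comparison transparent.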
The Pattern Analysis tool also supports evidence that is already set on the network. Note that the Auto Insight tool, which is similar to the Pattern Analysis tool, supports dynamic setting of evidence (drill-down).
The Pattern Analysis tool provides valuable insight into how one state differs from all the others.
Now we will use the Pattern Analysis tool on a different network, which is an example of a mixture model.
- Open the Iris sample network included with Bayes Server, either from the Start Page or by clicking Open on the File menu.
The Iris mixture model contains a latent variable called Cluster. This variable has a number of states (clusters). We will use the Pattern Analysis tool to see how each cluster differs from the others.
- Click on the Cluster node, without setting evidence on it.
- From the Analyze menu click Insight and then Pattern Analysis.
The Pattern Analysis dialog is displayed.
- Confirm that Cluster has automatically been selected as the Target (Hypothesis) variable (since it was selected when we launched the Pattern Analysis tool). If not, select it in the drop down.
- Change Top N variables to 4
The options page should look like the following:
- Click Run
The results page should look like the following:
We can see that Cluster 2 is differentiated from the other clusters predominantly by Petal Length (99.82%) and Petal Width (98.23%), and to a lesser extent by Sepal Length (63.76%).
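For continuous variables such as Petal Length, a comparable divergence can be approximated numerically. The sketch below integrates a base-2 Jensen-Shannon divergence between two one-dimensional Gaussian densities with the midpoint rule. All means and standard deviations are hypothetical, and the tutorial does not describe how Bayes Server computes strength for continuous variables, so this is only an illustration of why well-separated densities score close to 100%.

```python
import math

def normal_pdf(x, mean, std):
    """Density of a univariate Gaussian."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))

def js_divergence_1d(pdf_p, pdf_q, lo, hi, steps=20000):
    """Approximate base-2 JS divergence between two densities on [lo, hi] (midpoint rule)."""
    dx = (hi - lo) / steps
    total = 0.0
    for i in range(steps):
        x = lo + (i + 0.5) * dx
        p, q = pdf_p(x), pdf_q(x)
        m = 0.5 * (p + q)
        if p > 0.0:
            total += 0.5 * p * math.log2(p / m)
        if q > 0.0:
            total += 0.5 * q * math.log2(q / m)
    return total * dx

# Hypothetical densities: one cluster versus the rest, for a single variable.
cluster = lambda x: normal_pdf(x, 1.5, 0.2)
others_far = lambda x: normal_pdf(x, 5.0, 0.8)    # barely overlaps with the cluster
others_near = lambda x: normal_pdf(x, 1.7, 0.2)   # overlaps heavily with the cluster

separated = js_divergence_1d(cluster, others_far, -5.0, 15.0)
overlapping = js_divergence_1d(cluster, others_near, -5.0, 15.0)
print(f"{separated:.2%} {overlapping:.2%}")
```

In practice the "rest" density would be a mixture of the other clusters rather than a single Gaussian; a single Gaussian is used here only to keep the sketch short.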
We used the mesh query tool with the following axes, to generate the image below (left as an optional exercise):
- X axis: Petal Length
- Y axis: Petal Width
- Z axis: Likelihood
Cluster 2 is on the far left, verifying that it is indeed very different from the other clusters in the dimensions identified by the Pattern Analysis tool.