Classification is the process of using a model, such as a Bayesian network, to predict unknown values (output variables) from a number of known values (input variables).
For example, we could use a classification model to diagnose a disease given a number of symptoms, or to detect a fault in a mechanical system based on readings from sensors.
Tip |
---|
Typically, the term classification refers to models that predict discrete variables. Regression is the term used for models predicting continuous variables. |
Classification using Bayesian networks requires that we model the relationship between the input variables and the output variables we are predicting.
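In probabilistic terms, this relationship is a joint distribution over the inputs and outputs, and prediction amounts to conditioning on the observed inputs. As a sketch, with $Y$ denoting a single discrete output and $x_1, \dots, x_n$ the observed input values (notation introduced here for illustration):

$$
P(Y = y \mid x_1, \dots, x_n) = \frac{P(y, x_1, \dots, x_n)}{\sum_{y'} P(y', x_1, \dots, x_n)}, \qquad \hat{y} = \arg\max_{y} P(Y = y \mid x_1, \dots, x_n)
$$

The predicted class $\hat{y}$ is the most probable state, although the full distribution is often worth keeping as an indication of confidence.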
If we have a sufficiently large dataset containing both inputs and outputs, we can train a Bayesian network and then use it to predict outputs for data containing only inputs.
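As a minimal sketch of this train-then-predict workflow, the Python below fits the simplest Bayesian network classifier, a naive Bayes structure in which the output node links to each input, by counting over a tiny labelled dataset. The dataset, the variable names (`fever`, `cough`, `flu`), and the helpers `train` and `predict` are hypothetical and chosen only for illustration; a real project would more likely rely on a Bayesian network library.

```python
from collections import Counter, defaultdict

# Hypothetical training data: each row has two inputs and one output.
ROWS = [
    {"fever": "yes", "cough": "yes", "flu": "yes"},
    {"fever": "yes", "cough": "no",  "flu": "yes"},
    {"fever": "yes", "cough": "yes", "flu": "yes"},
    {"fever": "no",  "cough": "yes", "flu": "no"},
    {"fever": "no",  "cough": "no",  "flu": "no"},
    {"fever": "no",  "cough": "no",  "flu": "no"},
]
INPUTS = ["fever", "cough"]
OUTPUT = "flu"


def train(rows, inputs, output, alpha=1.0):
    """Estimate P(output) and P(input | output) with Laplace smoothing."""
    class_counts = Counter(row[output] for row in rows)
    # States observed for each input variable.
    states = {name: sorted({row[name] for row in rows}) for name in inputs}
    # value_counts[name][label][state] -> number of matching rows
    value_counts = {name: defaultdict(Counter) for name in inputs}
    for row in rows:
        for name in inputs:
            value_counts[name][row[output]][row[name]] += 1

    total = len(rows)
    prior = {label: n / total for label, n in class_counts.items()}
    likelihood = {}
    for name in inputs:
        likelihood[name] = {}
        for label, n in class_counts.items():
            denom = n + alpha * len(states[name])
            likelihood[name][label] = {
                s: (value_counts[name][label][s] + alpha) / denom
                for s in states[name]
            }
    return prior, likelihood


def predict(model, evidence):
    """Return P(output | evidence) for a row containing only inputs."""
    prior, likelihood = model
    scores = {}
    for label, p in prior.items():
        for name, state in evidence.items():
            p *= likelihood[name][label][state]
        scores[label] = p
    norm = sum(scores.values())
    return {label: p / norm for label, p in scores.items()}


model = train(ROWS, INPUTS, OUTPUT)
# New data contains inputs only; the output is what we predict.
posterior = predict(model, {"fever": "yes", "cough": "no"})
prediction = max(posterior, key=posterior.get)
print(prediction, posterior)
```

The smoothing constant `alpha` is a design choice in this sketch: it keeps a state that never appeared alongside a given output from forcing the posterior to zero.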
Tip |
---|
Expert opinion (i.e. manually specifying the model) can also be used to build or enhance a model. |
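To illustrate manual specification, here is a hedged sketch in which every probability is entered by hand rather than learned from data. The network (a `Fault` node linked to two sensor readings), its states, and all of the numbers are hypothetical stand-ins for values an expert might supply.

```python
# Hand-specified model: Fault -> Vibration, Fault -> Temperature.
# All probabilities are hypothetical stand-ins for expert estimates.
P_FAULT = {"none": 0.85, "bearing": 0.10, "overheat": 0.05}

P_VIBRATION = {  # P(Vibration | Fault)
    "none":     {"low": 0.95, "high": 0.05},
    "bearing":  {"low": 0.20, "high": 0.80},
    "overheat": {"low": 0.70, "high": 0.30},
}
P_TEMPERATURE = {  # P(Temperature | Fault)
    "none":     {"normal": 0.90, "hot": 0.10},
    "bearing":  {"normal": 0.60, "hot": 0.40},
    "overheat": {"normal": 0.10, "hot": 0.90},
}


def diagnose(vibration=None, temperature=None):
    """Return P(Fault | evidence); a reading left as None is marginalised out."""
    scores = {}
    for fault, p in P_FAULT.items():
        if vibration is not None:
            p *= P_VIBRATION[fault][vibration]
        if temperature is not None:
            p *= P_TEMPERATURE[fault][temperature]
        scores[fault] = p
    norm = sum(scores.values())
    return {fault: p / norm for fault, p in scores.items()}


posterior = diagnose(vibration="high", temperature="normal")
likely = max(posterior, key=posterior.get)
print(likely, posterior)
```

Because both readings are leaf nodes in this structure, skipping an unobserved reading is equivalent to summing over its states, so calling `diagnose()` with no evidence simply returns the prior.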
Tip |
---|
Classification is a type of supervised learning, because a model is trained specifically for the purpose of predicting the output variable. |
Key features supported by Bayesian network classifiers
- Multiple inputs (multiple factors)
- Multiple outputs (e.g. different faults)
- Links can be from input->output or output->input
- Intermediate nodes (e.g. logical nodes, divorcing)
- Links between inputs
- Links between outputs
- Pre-input nodes (e.g. mixtures)
- Post-output nodes (e.g. aggregate information)
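Several of the structural options listed above can be sketched as a plain list of directed links. The node names and the helpers `parents_of` and `link_kind` are hypothetical; the only library call is the standard `graphlib.TopologicalSorter`, used here to confirm that the structure is acyclic, as any Bayesian network must be.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical structure; each pair is a directed link (parent, child).
LINKS = [
    ("Cluster", "Load"),                # pre-input node (e.g. a mixture)
    ("Load", "Temperature"),            # link between inputs
    ("BearingFault", "Vibration"),      # output -> input
    ("Temperature", "OverheatFault"),   # input -> output
    ("BearingFault", "OverheatFault"),  # link between outputs
    ("BearingFault", "AnyFault"),       # post-output node aggregating outputs
    ("OverheatFault", "AnyFault"),
]
INPUTS = {"Load", "Temperature", "Vibration"}
OUTPUTS = {"BearingFault", "OverheatFault"}


def parents_of(links):
    """Map every node to the set of its parents."""
    parents = {}
    for parent, child in links:
        parents.setdefault(parent, set())
        parents.setdefault(child, set()).add(parent)
    return parents


def link_kind(parent, child):
    """Describe a link by the roles of its two endpoints."""
    def role(node):
        if node in INPUTS:
            return "input"
        if node in OUTPUTS:
            return "output"
        return "other"
    return f"{role(parent)}->{role(child)}"


parents = parents_of(LINKS)
# static_order() yields parents before children and raises
# graphlib.CycleError if the links contain a cycle.
order = list(TopologicalSorter(parents).static_order())
kinds = [link_kind(p, c) for p, c in LINKS]
print(order)
print(kinds)
```

Intermediate nodes are not shown in this sketch; they would appear as additional nodes, in neither `INPUTS` nor `OUTPUTS`, placed on a path between an input and an output.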
The Bayesian network classifier shown below demonstrates different design methods.