Package com.bayesserver.data.discovery
Class Clustering
java.lang.Object
    com.bayesserver.data.discovery.Clustering
All Implemented Interfaces:
    Discretize
public final class Clustering extends Object implements Discretize
Discretizes continuous data into bins, using a probabilistic clustering algorithm.
Constructor Summary
Constructor    Description
Clustering()
Method Summary
List<DiscretizationInfo> discretize(DataReaderCommand dataReaderCommand, List<DiscretizationColumn> dataColumns, DiscretizationAlgoOptions options)
    Discretizes one or more data columns that may contain missing (null) values.
List<Interval<Double>> discretize(Iterable<Double> unsortedData, DiscretizationOptions options, String dataColumn)
    Discretizes unsorted continuous data that may contain missing (null) values.
List<Interval<Double>> discretizeWeighted(Iterable<WeightedValue> unsortedData, DiscretizationOptions options, String dataColumn)
    Discretizes unsorted weighted continuous data that may contain missing (null) values.
DiscretizeProgress getProgress()
    Gets the instance that receives progress notifications.
void setProgress(DiscretizeProgress value)
    Sets the instance that receives progress notifications.
Method Detail
-
getProgress
public DiscretizeProgress getProgress()
Gets the instance that receives progress notifications.
Specified by:
getProgress in interface Discretize
-
setProgress
public void setProgress(DiscretizeProgress value)
Sets the instance that receives progress notifications.
Specified by:
setProgress in interface Discretize
-
discretize
public List<Interval<Double>> discretize(Iterable<Double> unsortedData, DiscretizationOptions options, String dataColumn)
Discretizes unsorted continuous data that may contain missing (null) values.
Specified by:
discretize in interface Discretize
Parameters:
unsortedData - The data to discretize.
options - Options that affect how discretization is performed, such as the algorithm to use.
dataColumn - The name of the source column. This is only used for error reporting.
Returns:
A number of bins, each identified by an interval.
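To make the return value concrete, the sketch below shows how a list of interval-delimited bins might be used to assign values to bins while skipping missing (null) entries. It deliberately does not call the Bayes Server API: the `Bin` class and the `findBin` helper are hypothetical stand-ins for `Interval<Double>`, and the inclusive-lower, exclusive-upper endpoint convention is an assumption that this page does not specify.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class BinLookupSketch {

    /** Hypothetical stand-in for Interval<Double>; lower bound inclusive, upper bound exclusive (assumed). */
    static final class Bin {
        final double min;
        final double max;

        Bin(double min, double max) {
            this.min = min;
            this.max = max;
        }

        boolean contains(double v) {
            return v >= min && v < max;
        }
    }

    /** Returns the index of the bin containing value, or -1 if value is missing (null) or outside every bin. */
    static int findBin(List<Bin> bins, Double value) {
        if (value == null) {
            return -1; // missing value: no bin assigned
        }
        for (int i = 0; i < bins.size(); i++) {
            if (bins.get(i).contains(value)) {
                return i;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        // Illustrative bins only; in practice the intervals would come from discretize.
        List<Bin> bins = List.of(
                new Bin(Double.NEGATIVE_INFINITY, 1.5),
                new Bin(1.5, 4.0),
                new Bin(4.0, Double.POSITIVE_INFINITY));

        // Arrays.asList permits null elements, unlike List.of.
        List<Double> data = Arrays.asList(0.2, null, 3.9, 4.0, 10.0);

        List<Integer> assigned = new ArrayList<>();
        for (Double v : data) {
            assigned.add(findBin(bins, v));
        }
        System.out.println(assigned); // prints [0, -1, 1, 2, 2]
    }
}
```

Using open-ended outer bins, as here, means every non-missing value lands in some bin; whether the library's intervals behave this way should be checked against the `Interval` documentation.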
-
discretizeWeighted
public List<Interval<Double>> discretizeWeighted(Iterable<WeightedValue> unsortedData, DiscretizationOptions options, String dataColumn)
Discretizes unsorted weighted continuous data that may contain missing (null) values.
Specified by:
discretizeWeighted in interface Discretize
Parameters:
unsortedData - The weighted data to discretize.
options - Options that affect how discretization is performed, such as the algorithm to use.
dataColumn - The name of the source column. This is only used for error reporting.
Returns:
A number of bins, each identified by an interval.
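One common way to read weighted data is to treat each weight as a fractional count for its value. The sketch below illustrates that reading by summing the weight that falls into each bin and ignoring missing values. It is not the library's algorithm and does not use the Bayes Server classes: `Weighted` is a hypothetical stand-in for `WeightedValue`, and both the cut points and the rule that a value equal to a cut point goes to the upper bin are assumptions.

```java
import java.util.Arrays;
import java.util.List;

public class WeightedHistogramSketch {

    /** Hypothetical stand-in for WeightedValue: a possibly missing value plus a weight. */
    static final class Weighted {
        final Double value;   // null means the value is missing
        final double weight;

        Weighted(Double value, double weight) {
            this.value = value;
            this.weight = weight;
        }
    }

    /**
     * Sums the weight falling into each bin. cutPoints holds the interior
     * boundaries in ascending order, so n cut points define n + 1 bins.
     * A value equal to a cut point is placed in the upper bin (assumed convention).
     */
    static double[] weightPerBin(double[] cutPoints, List<Weighted> data) {
        double[] totals = new double[cutPoints.length + 1];
        for (Weighted w : data) {
            if (w.value == null) {
                continue; // missing values contribute no weight
            }
            int bin = 0;
            while (bin < cutPoints.length && w.value >= cutPoints[bin]) {
                bin++;
            }
            totals[bin] += w.weight;
        }
        return totals;
    }

    public static void main(String[] args) {
        double[] cutPoints = {1.5, 4.0}; // illustrative boundaries only

        List<Weighted> data = Arrays.asList(
                new Weighted(0.2, 1.0),
                new Weighted(null, 5.0),
                new Weighted(3.9, 0.5),
                new Weighted(4.0, 2.0),
                new Weighted(10.0, 0.25));

        System.out.println(Arrays.toString(weightPerBin(cutPoints, data))); // prints [1.0, 0.5, 2.25]
    }
}
```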
-
discretize
public List<DiscretizationInfo> discretize(DataReaderCommand dataReaderCommand, List<DiscretizationColumn> dataColumns, DiscretizationAlgoOptions options)
Discretizes one or more data columns that may contain missing (null) values.
Specified by:
discretize in interface Discretize
Parameters:
dataReaderCommand - The data reader command that allows iteration over the data.
dataColumns - The data columns that should be discretized, along with the options for each column.
options - Options governing the overall discretization algorithm. Each data column also has its own options.
Returns:
For each data column, a number of bins, each identified by an interval.