Interface Discretize

    • Method Detail

      • getProgress

        DiscretizeProgress getProgress()
        Gets the instance that receives progress notifications.
      • setProgress

        void setProgress(DiscretizeProgress value)
        Sets the instance that receives progress notifications.
      • discretize

        List<DiscretizationInfo> discretize(DataReaderCommand dataReaderCommand,
                                            List<DiscretizationColumn> dataColumns,
                                            DiscretizationAlgoOptions options)
        Discretizes one or more data columns that may contain missing (null) values.
        Parameters:
        dataReaderCommand - The data reader command used to iterate over the data.
        dataColumns - The data columns to discretize, together with the options for each column.
        options - Options governing the overall discretization algorithm. Each data column also has its own options.
        Returns:
        For each data column, a number of bins, each identified by an interval.
      • discretize

        List<Interval<Double>> discretize(Iterable<Double> unsortedData,
                                          DiscretizationOptions options,
                                          String dataColumn)
        Discretizes unsorted continuous data that may contain missing (null) values.
        Parameters:
        unsortedData - The data to discretize.
        options - Options that affect how discretization is performed, such as the algorithm to use.
        dataColumn - The name of the source column. This is only used for error reporting.
        Returns:
        A number of bins, each identified by an interval.
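
        To illustrate what discretizing unsorted data with missing values involves, the following is a minimal, self-contained sketch of equal-width binning. It does not use or reproduce this interface: the names BinSketch and EqualWidthSketch, the binCount parameter, and the choice of equal-width binning are assumptions made for illustration only. The actual algorithm is selected through DiscretizationOptions.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for Interval<Double>; not the library type.
final class BinSketch {
    final double min;
    final double max;

    BinSketch(double min, double max) {
        this.min = min;
        this.max = max;
    }
}

// Illustrative equal-width binning; an assumed technique, not the library's implementation.
final class EqualWidthSketch {

    // Splits the range of the non-null values into binCount bins of equal width.
    // dataColumn is used only to build error messages, mirroring the documented parameter.
    static List<BinSketch> discretize(Iterable<Double> unsortedData, int binCount, String dataColumn) {
        if (binCount < 1) {
            throw new IllegalArgumentException("binCount must be >= 1 for column " + dataColumn);
        }

        // Single pass to find the range; missing (null) values are skipped.
        double lo = Double.POSITIVE_INFINITY;
        double hi = Double.NEGATIVE_INFINITY;
        for (Double v : unsortedData) {
            if (v == null) {
                continue;
            }
            lo = Math.min(lo, v);
            hi = Math.max(hi, v);
        }
        if (lo > hi) {
            throw new IllegalArgumentException("No non-null values in column " + dataColumn);
        }

        double width = (hi - lo) / binCount;
        List<BinSketch> bins = new ArrayList<>();
        for (int i = 0; i < binCount; i++) {
            double start = lo + i * width;
            // Pin the last bin's upper bound to the observed maximum to avoid rounding drift.
            double end = (i == binCount - 1) ? hi : lo + (i + 1) * width;
            bins.add(new BinSketch(start, end));
        }
        return bins;
    }
}
```

        Because the input is only iterated, not indexed, the sketch accepts any Iterable<Double> and never needs to sort it.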
      • discretizeWeighted

        List<Interval<Double>> discretizeWeighted(Iterable<WeightedValue> unsortedData,
                                                  DiscretizationOptions options,
                                                  String dataColumn)
        Discretizes unsorted weighted continuous data that may contain missing (null) values.
        Parameters:
        unsortedData - The weighted data to discretize.
        options - Options that affect how discretization is performed, such as the algorithm to use.
        dataColumn - The name of the source column. This is only used for error reporting.
        Returns:
        A number of bins, each identified by an interval.
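
        As a rough illustration of how weights can influence bin boundaries, the sketch below computes equal-frequency cut points by accumulated weight rather than by row count. It is an assumption-laden example, not this interface's implementation: WeightedSketch is a hypothetical stand-in for WeightedValue, and binCount and the equal-frequency technique are chosen purely for illustration.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

// Hypothetical stand-in for WeightedValue: a nullable value with a non-negative weight.
final class WeightedSketch {
    final Double value;
    final double weight;

    WeightedSketch(Double value, double weight) {
        this.value = value;
        this.weight = weight;
    }
}

// Illustrative weighted equal-frequency binning; an assumed technique, not the library's implementation.
final class WeightedEqualFrequencySketch {

    // Returns up to binCount - 1 interior cut points so that each bin holds
    // roughly the same total weight. Values <= a cut point fall in the lower bin.
    static List<Double> cutPoints(Iterable<WeightedSketch> unsortedData, int binCount) {
        List<WeightedSketch> present = new ArrayList<>();
        double total = 0.0;
        for (WeightedSketch w : unsortedData) {
            if (w == null || w.value == null) {
                continue; // missing values carry no weight in this sketch
            }
            present.add(w);
            total += w.weight;
        }

        // The input is unsorted, so order the non-missing entries by value first.
        present.sort(Comparator.comparingDouble(w -> w.value));

        List<Double> cuts = new ArrayList<>();
        double cumulative = 0.0;
        int nextCut = 1;
        for (WeightedSketch w : present) {
            cumulative += w.weight;
            // A single heavy value can cross several thresholds, producing repeated cut points.
            while (nextCut < binCount && cumulative >= nextCut * total / binCount) {
                cuts.add(w.value);
                nextCut++;
            }
        }
        return cuts;
    }
}
```

        With unit weights this reduces to ordinary equal-frequency binning, so the weighted variant can be seen as a generalization of the unweighted one.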