Sampling procedures for testing models (testing)

class Orange.evaluation.testing.Results(data=None, nmethods=0, *, learners=None, train_data=None, nrows=None, nclasses=None, store_data=False, store_models=False, domain=None, actual=None, row_indices=None, predicted=None, probabilities=None, preprocessor=None, callback=None, n_jobs=1)
Class for storing predictions in model testing.

data
Data used for testing. When data is stored, this is typically not a copy but a reference.
 Type
Optional[Table]

models
A list of induced models.
 Type
Optional[List[Model]]

row_indices
Indices of rows in data that were used in testing, stored as a numpy vector of length nrows. The values actual[i], predicted[:, i] and probabilities[:, i] all refer to the instance data[row_indices[i]].
 Type
np.ndarray
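
The alignment between row_indices and the result arrays can be illustrated with a minimal sketch. It uses plain Python lists as hypothetical stand-ins for the Orange table and the numpy arrays; the values are made up for illustration.

```python
# Hypothetical stand-ins for the attributes described above
# (plain lists instead of an Orange Table and numpy arrays).
data_targets = ["a", "b", "a", "c", "b"]  # target value of each row in data
row_indices = [3, 0, 4]                   # rows of data that were tested

# actual[i] is the target value of data[row_indices[i]].
actual = [data_targets[i] for i in row_indices]

# One method, three tested rows: shape (number of methods, nrows).
predicted = [["c", "a", "a"]]

# Entry i of every result array describes the same row of data.
for i, row in enumerate(row_indices):
    print(f"row {row}: actual={actual[i]} predicted={predicted[0][i]}")
```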

actual
Actual values of target variable; a numpy vector of length nrows and of the same type as data (or np.float32 if the type of data cannot be determined).
 Type
np.ndarray

predicted
Predicted values of target variable; a numpy array of shape (number-of-methods, nrows) and of the same type as data (or np.float32 if the type of data cannot be determined).
 Type
np.ndarray

probabilities
Predicted probabilities (for discrete target variables); a numpy array of shape (number-of-methods, nrows, number-of-classes) of type np.float32.
 Type
Optional[np.ndarray]
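
To make the three-dimensional layout concrete, here is a small sketch using nested lists in place of a numpy array. The helper most_probable_class is hypothetical, and deriving predictions by taking the most probable class is an assumption made for illustration, not a statement about how Orange fills predicted.

```python
# probabilities[m][i][c]: probability that method m assigns class c
# to the i-th tested row. Values are invented for illustration.
probabilities = [
    [[0.7, 0.2, 0.1], [0.1, 0.3, 0.6]],  # method 0
    [[0.4, 0.5, 0.1], [0.2, 0.2, 0.6]],  # method 1
]

n_methods = len(probabilities)
nrows = len(probabilities[0])
n_classes = len(probabilities[0][0])


def most_probable_class(dist):
    """Return the index of the largest probability in dist."""
    return max(range(len(dist)), key=dist.__getitem__)


# A (number of methods, nrows) structure, matching the shape of predicted.
predicted = [[most_probable_class(d) for d in method] for method in probabilities]
print((n_methods, nrows, n_classes), predicted)
```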

folds
A list of indices (or slice objects) corresponding to rows of each fold.
 Type
List[Slice or List[int]]
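
Because each fold is either a slice or a list of indices into the result arrays, per-fold scores can be computed by selecting the matching entries. The following sketch uses plain lists and a hypothetical helper, fold_accuracy; it is not part of Orange.

```python
# Invented results for one method over six tested rows.
actual = [0, 1, 1, 0, 1, 0]
predicted = [[0, 1, 0, 0, 1, 1]]

# A fold may be a slice or an explicit list of indices.
folds = [slice(0, 3), [3, 4, 5]]


def fold_accuracy(actual, predicted_row, fold):
    """Share of correct predictions among the rows selected by fold."""
    if isinstance(fold, slice):
        truth, guess = actual[fold], predicted_row[fold]
    else:
        truth = [actual[i] for i in fold]
        guess = [predicted_row[i] for i in fold]
    return sum(t == g for t, g in zip(truth, guess)) / len(truth)


accuracies = [fold_accuracy(actual, predicted[0], fold) for fold in folds]
print(accuracies)
```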

get_augmented_data(model_names, include_attrs=True, include_predictions=True, include_probabilities=True)
Return the data, augmented with predictions, probabilities (if the task is classification) and fold information. Predictions, probabilities and folds are inserted as meta attributes.
 Parameters
model_names (list) – A list of strings containing learners’ names.
include_attrs (bool) – Flag that tells whether to include original attributes.
include_predictions (bool) – Flag that tells whether to include predictions.
include_probabilities (bool) – Flag that tells whether to include probabilities.
 Returns
Data augmented with predictions, (probabilities) and (fold).
 Return type

fit(train_data, test_data=None)
Fits self.learners using folds sampled from the provided data.


class Orange.evaluation.testing.CrossValidation(data, learners, k=10, stratified=True, random_state=0, store_data=False, store_models=False, preprocessor=None, callback=None, warnings=None, n_jobs=1)
K-fold cross-validation.
If the constructor is given the data and a list of learning algorithms, it runs cross-validation and returns an instance of Results containing the predicted values and probabilities.

k
The number of folds.

random_state
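
The idea behind k-fold splitting can be sketched in plain Python. This is not Orange's implementation: the function k_fold_indices is hypothetical, it ignores stratification, and it only produces row indices rather than fitted models.

```python
import random


def k_fold_indices(n_rows, k=10, random_state=0):
    """Yield (train, test) index lists for k folds over n_rows rows.

    A simplified, unstratified sketch: rows are shuffled once and
    dealt into k folds whose sizes differ by at most one.
    """
    rng = random.Random(random_state)
    indices = list(range(n_rows))
    rng.shuffle(indices)
    test_folds = [indices[j::k] for j in range(k)]
    for test in test_folds:
        held_out = set(test)
        train = [i for i in indices if i not in held_out]
        yield train, test


splits = list(k_fold_indices(10, k=3))
for train, test in splits:
    print(len(train), "training rows,", len(test), "test rows")
```

Every row appears in exactly one test fold, so each instance is predicted once by a model that did not see it during training.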


class Orange.evaluation.testing.CrossValidationFeature(data, learners, feature, store_data=False, store_models=False, preprocessor=None, callback=None, n_jobs=1)
Cross-validation with folds according to values of a feature.

feature
The feature defining the folds.
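
A rough sketch of feature-defined folds follows, assuming that each distinct value of the feature marks one fold. The function folds_by_feature is hypothetical and works on a plain list of feature values rather than an Orange table.

```python
def folds_by_feature(feature_values):
    """Yield (value, train, test) with one fold per distinct feature value.

    Rows sharing a value form the test set; all other rows form the
    training set.
    """
    groups = {}
    for row, value in enumerate(feature_values):
        groups.setdefault(value, []).append(row)
    for value, test in groups.items():
        train = [
            row for row in range(len(feature_values))
            if feature_values[row] != value
        ]
        yield value, train, test


feature_folds = list(folds_by_feature(["x", "y", "x", "z"]))
for value, train, test in feature_folds:
    print(value, "train:", train, "test:", test)
```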


class Orange.evaluation.testing.LeaveOneOut(data, learners, store_data=False, store_models=False, preprocessor=None, callback=None, n_jobs=1)
Leave-one-out testing.
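
Leave-one-out can be viewed as cross-validation in which every fold holds a single row. A minimal index-level sketch, with a hypothetical helper name:

```python
def leave_one_out(n_rows):
    """Yield (train, test) index lists where each test set has one row."""
    for held_out in range(n_rows):
        train = [i for i in range(n_rows) if i != held_out]
        yield train, [held_out]


loo_splits = list(leave_one_out(4))
for train, test in loo_splits:
    print("train:", train, "test:", test)
```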

class Orange.evaluation.testing.ShuffleSplit(data, learners, n_resamples=10, train_size=None, test_size=0.1, stratified=True, random_state=0, store_data=False, store_models=False, preprocessor=None, callback=None, n_jobs=1)
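
The parameter names suggest repeated random train/test splits, in the spirit of scikit-learn's ShuffleSplit. The sketch below illustrates that idea under this assumption; the function shuffle_split is hypothetical, it ignores stratification and train_size, and rounding the test set size up is a choice made here for illustration.

```python
import math
import random


def shuffle_split(n_rows, n_resamples=10, test_size=0.1, random_state=0):
    """Yield n_resamples independent random (train, test) index splits."""
    rng = random.Random(random_state)
    n_test = math.ceil(test_size * n_rows)
    for _ in range(n_resamples):
        indices = list(range(n_rows))
        rng.shuffle(indices)
        # The first n_test shuffled rows are held out; the rest are used for training.
        yield indices[n_test:], indices[:n_test]


resamples = list(shuffle_split(20, n_resamples=4, test_size=0.25))
for train, test in resamples:
    print(len(train), "training rows,", len(test), "test rows")
```

Unlike k-fold cross-validation, the test sets of different resamples may overlap, and some rows may never be tested.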

class Orange.evaluation.testing.TestOnTestData(train_data, test_data, learners, store_data=False, store_models=False, preprocessor=None, callback=None, n_jobs=1)
Test on a separate test dataset.

class Orange.evaluation.testing.TestOnTrainingData(data, learners, store_data=False, store_models=False, preprocessor=None, callback=None, n_jobs=1)
Trains and tests on the same data.

Orange.evaluation.testing.sample(table, n=0.7, stratified=False, replace=False, random_state=None)
Samples data instances from a data table. Returns the sample and a dataset containing the instances of the input table that are not in the sample. Uses several sampling functions from scikit-learn.
 table : data table
A data table from which to sample.
 n : float, int (default = 0.7)
If float, should be between 0.0 and 1.0 and represents the proportion of data instances in the resulting sample. If int, n is the number of data instances in the resulting sample.
 stratified : bool, optional (default = False)
If True, sampling tries to match the distribution of class values in the train and test subsets.
 replace : bool, optional (default = False)
If True, sample with replacement.
 random_state : int or RandomState
Pseudo-random number generator state used for random sampling.
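
The meaning of n (a proportion when it is a float, a count when it is an int) and the two returned parts can be sketched at the level of row indices. The function sample_indices is a hypothetical illustration in plain Python; it does not call Orange or scikit-learn, omits stratification, and rounds the proportional size to the nearest integer as an assumption.

```python
import random


def sample_indices(n_rows, n=0.7, replace=False, random_state=None):
    """Return (sampled, remainder) row indices for a table with n_rows rows."""
    rng = random.Random(random_state)
    # A float is read as a proportion of rows, an int as an absolute count.
    size = round(n * n_rows) if isinstance(n, float) else n
    rows = list(range(n_rows))
    if replace:
        sampled = [rng.choice(rows) for _ in range(size)]
    else:
        sampled = rng.sample(rows, size)
    picked = set(sampled)
    # The remainder holds every row that was not drawn into the sample.
    remainder = [i for i in rows if i not in picked]
    return sampled, remainder


by_proportion = sample_indices(10, n=0.7, random_state=1)
by_count = sample_indices(10, n=4, random_state=1)
print(len(by_proportion[0]), len(by_proportion[1]))
print(len(by_count[0]), len(by_count[1]))
```

With replace=True the sample can contain repeated rows, so the remainder may hold more than n_rows minus the sample size.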