This is documentation for Orange 2.7. For the latest documentation, see Orange 3.
Imputation (imputation)
Imputation replaces missing feature values with appropriate substitutes. The example below replaces the missing values with each variable's average:
import Orange

# Load the "bridges" data set, which contains missing values.
bridges = Orange.data.Table("bridges")

# Build an imputed copy of the table using the average-based constructor.
imputed_bridges = Orange.data.imputation.ImputeTable(bridges,
    method=Orange.feature.imputation.AverageConstructor())

print "Original data set:"
for e in bridges[:3]:
    print e

print "Imputed data set:"
for e in imputed_bridges[:3]:
    print e
The output of this code is:
Original data set:
['M', 1818, 'HIGHWAY', ?, 2, 'N', 'THROUGH', 'WOOD', 'SHORT', 'S', 'WOOD']
['A', 1819, 'HIGHWAY', 1037, 2, 'N', 'THROUGH', 'WOOD', 'SHORT', 'S', 'WOOD']
['A', 1829, 'AQUEDUCT', ?, 1, 'N', 'THROUGH', 'WOOD', '?', 'S', 'WOOD']
Imputed data set:
['M', 1818, 'HIGHWAY', 1300, 2, 'N', 'THROUGH', 'WOOD', 'SHORT', 'S', 'WOOD']
['A', 1819, 'HIGHWAY', 1037, 2, 'N', 'THROUGH', 'WOOD', 'SHORT', 'S', 'WOOD']
['A', 1829, 'AQUEDUCT', 1300, 1, 'N', 'THROUGH', 'WOOD', 'MEDIUM', 'S', 'WOOD']
ImputeTable uses the feature imputation methods from Orange.feature.imputation and applies them to the entire data set. The supported methods are:
- imputation of the minimal, maximal, or average value (uses Orange.feature.imputation.Defaults),
- imputation of a random value (uses Orange.feature.imputation.Random),
- imputation based on a predictive model (uses Orange.feature.imputation.Model),
- imputation where a missing value is treated as a value in its own right (uses Orange.feature.imputation.AsValue).
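To make the idea of applying one feature-level method to every column concrete, here is a minimal plain-Python sketch. It does not use Orange and is not Orange's implementation. The names impute_column, impute_table, MISSING and the toy rows are hypothetical, and the choice of the most frequent value as the "average" of a discrete feature is an assumption suggested by the output above. Model-based imputation is left out.

```python
import random
from collections import Counter

MISSING = None  # stand-in for a missing value; an assumption of this sketch


def impute_column(column, strategy, rng=random):
    """Return a copy of `column` with MISSING entries filled in."""
    known = [v for v in column if v is not MISSING]
    if not known:
        return list(column)  # nothing to derive a replacement from

    if strategy == "as_value":
        # Treat "missing" as one more legitimate value of the feature.
        return ["missing" if v is MISSING else v for v in column]
    if strategy == "random":
        # Draw an independent replacement for every missing entry.
        return [rng.choice(known) if v is MISSING else v for v in column]

    if strategy == "minimal":
        fill = min(known)
    elif strategy == "maximal":
        fill = max(known)
    elif strategy == "average":
        if all(isinstance(v, (int, float)) for v in known):
            fill = sum(known) / float(len(known))
        else:
            # Assumed behavior for discrete features: most frequent value.
            fill = Counter(known).most_common(1)[0][0]
    else:
        raise ValueError("unknown strategy: %r" % (strategy,))
    return [fill if v is MISSING else v for v in column]


def impute_table(rows, strategy, rng=random):
    """Apply one column-level strategy to every column of a list of rows."""
    columns = [impute_column(list(col), strategy, rng) for col in zip(*rows)]
    return [list(row) for row in zip(*columns)]


# Toy data (not taken from the bridges data set).
toy_rows = [
    ["a", 1.0, None],
    ["b", None, "short"],
    [None, 3.0, "short"],
    ["b", 5.0, "long"],
]
imputed_rows = impute_table(toy_rows, "average")
```

The sketch returns a new list of rows and leaves the input untouched, mirroring how the example above keeps bridges and imputed_bridges as separate tables.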