This is documentation for Orange 2.7. For the latest documentation, see Orange 3.
Utilities (utils)
Value transformers
Value transformers perform simple transformations of values. Discretization, for instance, creates a transformer that converts continuous values into discrete ones, while continuizers do the opposite. Classification trees use transformers for binarization, in which the values of discrete attributes are converted into binary ones.
These objects are most often constructed by other classes and only seldom manually. See information on Data discretization (discretization) and Continuization (continuization).
- class TransformValue
The abstract root of the hierarchy of transformers, which provides the call operator and chaining of transformers.
- subtransformer
The transformation that takes place prior to this. This way, transformations can be chained.
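The chaining described above can be illustrated without Orange. The following is a minimal pure-Python sketch, not Orange's implementation: the class `ChainedTransformer` and its `func` argument are hypothetical, and only the `subtransformer` attribute mirrors the description above.

```python
class ChainedTransformer(object):
    """Hypothetical stand-in for a transformer; not part of Orange."""

    def __init__(self, func, subtransformer=None):
        self.func = func
        self.subtransformer = subtransformer

    def __call__(self, value):
        # The subtransformer, if any, is applied before this transformer.
        if self.subtransformer is not None:
            value = self.subtransformer(value)
        return self.func(value)


# First convert an integer index to a float, then rescale it.
to_float = ChainedTransformer(float)
halve = ChainedTransformer(lambda v: v * 0.5, subtransformer=to_float)
result = halve(2)
```

Calling `halve` runs `to_float` first because it is set as the subtransformer, which is the ordering the attribute description implies.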
- class Ordinal2Continuous
Converts ordinal values to continuous ones. For example, the values small, medium, large and extra large (if given in that order) would, by default, be converted to 0.0, 1.0, 2.0 and 3.0. The values can also be multiplied by a factor. If the factor in the above case were 0.3333, the values would be converted to 0, 0.3333, 0.6666 and 0.9999.
- factor
The factor by which the values are multiplied.
    import Orange.classification
    import Orange.data
    import Orange.feature

    lenses = Orange.data.Table("lenses")
    age = lenses.domain["age"]

    # age_c: the ordinal index of age as a continuous value (default factor)
    age_c = Orange.feature.Continuous("age_c")
    age_c.getValueFrom = Orange.classification.ClassifierFromVar(whichVar=age)
    age_c.getValueFrom.transformer = Orange.data.utils.Ordinal2Continuous()

    # age_cn: the same conversion, with each value multiplied by 0.5
    age_cn = Orange.feature.Continuous("age_cn")
    age_cn.getValueFrom = Orange.classification.ClassifierFromVar(whichVar=age)
    age_cn.getValueFrom.transformer = Orange.data.utils.Ordinal2Continuous()
    age_cn.getValueFrom.transformer.factor = 0.5

    newDomain = Orange.data.Domain([age, age_c, age_cn], lenses.domain.classVar)
    newData = Orange.data.Table(newDomain, lenses)
The values of the attribute age (young, pre-presbyopic and presbyopic) are transformed to 0.0, 1.0 and 2.0 in age_c, and to 0.0, 0.5 and 1.0 in age_cn.
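The arithmetic behind this conversion can be checked without Orange. The sketch below is an illustration only: the helper `ordinal_to_continuous` is hypothetical and assumes that the transformer simply multiplies the integer index of the value by `factor`, as the description above suggests.

```python
def ordinal_to_continuous(index, factor=1.0):
    """Hypothetical helper: scale the integer index of an ordinal value."""
    return float(index) * factor


age_values = ["young", "pre-presbyopic", "presbyopic"]

# Default factor: indices become 0.0, 1.0, 2.0.
age_c = [ordinal_to_continuous(i) for i in range(len(age_values))]

# Factor 0.5: the same indices are halved.
age_cn = [ordinal_to_continuous(i, factor=0.5) for i in range(len(age_values))]
```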
- class Discrete2Continuous
Converts a discrete value to a continuous one so that some chosen value is converted to 1.0 and all others to 0.0 or -1.0, depending on the settings.
- value
The value that is converted to 1.0; others are converted to 0.0 or -1.0, depending on zero_based. The value needs to be specified by an integer index.
- zero_based
Decides whether the other values will be transformed to 0.0 (True, default) or -1.0 (False). When False, undefined values are transformed to 0.0; otherwise, undefined values yield an error.
- invert
If True (default is False), the transformations are reversed: the selected value becomes 0.0 (or -1.0) and the others 1.0.
The following script loads the Monks 1 data set and constructs a new attribute e1 that will indicate whether e is 1 or not.
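    import Orange.classification
    import Orange.data
    import Orange.feature

    monks = Orange.data.Table("monks-1")

    # e1 is computed from e through a Discrete2Continuous transformer;
    # value, zero_based and invert are left at their defaults here
    e1 = Orange.feature.Continuous("e=1")
    e1.getValueFrom = Orange.classification.ClassifierFromVar(whichVar=monks.domain["e"])
    e1.getValueFrom.transformer = Orange.data.utils.Discrete2Continuous()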
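How value, zero_based and invert interact can be sketched in plain Python. This is an illustration of the documented behaviour, not Orange's code: the function `discrete_to_continuous` is hypothetical, undefined values are represented by `None`, and index 0 is assumed to stand for the value 1 of e.

```python
def discrete_to_continuous(index, value, zero_based=True, invert=False):
    """Hypothetical helper mirroring the documented settings.

    index is the integer index of the discrete value, or None if undefined.
    """
    if index is None:
        if zero_based:
            raise ValueError("undefined value with zero_based=True")
        return 0.0
    low = 0.0 if zero_based else -1.0
    selected = (index == value)
    if invert:
        selected = not selected
    return 1.0 if selected else low


# Indicator for "e is 1", assuming index 0 is the value 1.
indicator = [discrete_to_continuous(i, value=0) for i in range(4)]
```

With zero_based set to False the unselected values map to -1.0 instead, which leaves 0.0 free to represent undefined values.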
- class NormalizeContinuous
Normalizes continuous values by subtracting the average and dividing the difference by half of the span.
- average
The value that is subtracted from the original.
- span
The span of the values; the difference is divided by half of it.
The following script “normalizes” all attributes in the Iris dataset by subtracting the average value and dividing by half of the deviation.
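    import Orange

    # Assumed setup: load the data and compute basic per-attribute statistics
    iris = Orange.data.Table("iris")
    domstat = Orange.statistics.basic.Domain(iris)
    newattrs = []

    for attr in iris.domain.features:
        # Each new attribute is computed from the original one by a transformer
        attr_c = Orange.feature.Continuous(attr.name + "_n")
        attr_c.getValueFrom = Orange.classification.ClassifierFromVar(whichVar=attr)
        transformer = Orange.data.utils.NormalizeContinuous()
        attr_c.getValueFrom.transformer = transformer
        transformer.average = domstat[attr].avg
        transformer.span = domstat[attr].dev
        newattrs.append(attr_c)

    newDomain = Orange.data.Domain(newattrs, iris.domain.classVar)
    newData = Orange.data.Table(newDomain, iris)
    for ex in newData[:5]:
        print ex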
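The normalization formula itself is small enough to check by hand. The sketch below assumes the transformer computes (x - average) / (span / 2), as the class description states; the function `normalize_continuous` and the numbers used are hypothetical and chosen only for illustration.

```python
def normalize_continuous(x, average, span):
    """Hypothetical helper: subtract the average, divide by half the span."""
    return (x - average) / (span / 2.0)


# Illustrative values only: average 5.0 and span 4.0.
normalized = [normalize_continuous(x, average=5.0, span=4.0)
              for x in (3.0, 5.0, 7.0)]
```

Under that assumption, a value lying half a span below the average maps to -1.0, the average maps to 0.0, and a value half a span above it maps to 1.0.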
- class MapIntValue
A discrete-to-discrete transformer that changes values according to the given mapping. MapIntValue is used for binarization in decision trees.
- mapping
A mapping that determines the new value: v = mapping[v]. Undefined values remain undefined. The elements of the mapping are integer indices of values.
The following script transforms the values of age in the dataset lenses so that ‘young’ remains ‘young’, while ‘pre-presbyopic’ and ‘presbyopic’ both become ‘old’.
    import Orange

    lenses = Orange.data.Table("lenses")
    age = lenses.domain["age"]

    age_b = Orange.feature.Discrete("age_c", values=['young', 'old'])
    age_b.getValueFrom = Orange.classification.ClassifierFromVar(whichVar=age)
    age_b.getValueFrom.transformer = Orange.data.utils.MapIntValue()
    age_b.getValueFrom.transformer.mapping = [0, 1, 1]

    newDomain = Orange.data.Domain([age_b, age], lenses.domain.classVar)
    newData = Orange.data.Table(newDomain, lenses)
The mapping specifies that the 0th value of age maps to the 0th value of age_b, and that the 1st and 2nd values of age map to the 1st value of age_b.
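The index lookup v = mapping[v] can be traced in plain Python. This is a sketch of the documented rule rather than Orange's implementation: the helper `map_int_value` is hypothetical, and undefined values are represented by `None`.

```python
def map_int_value(index, mapping):
    """Hypothetical helper: look up the new index; undefined stays undefined."""
    if index is None:
        return None
    return mapping[index]


age_values = ["young", "pre-presbyopic", "presbyopic"]
age_b_values = ["young", "old"]
mapping = [0, 1, 1]

# Translate every value of age into the corresponding value of age_b.
converted = [age_b_values[map_int_value(i, mapping)]
             for i in range(len(age_values))]
```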