PRTools definitions
Objects and classes
In the context of pattern recognition, an object is any observable entity that is meaningful to the observer, e.g. one that the observer can name or place in an ontology. Automatic pattern recognition tries to reproduce this ability to recognize objects by learning from examples. The aim is to build a system that can learn from a given set of observations of objects (obtained by sensors rather than senses) and that can thereafter classify new objects into one of the previously defined subsets of the examples. A subset of objects that share a name or belong to the same ontological category is called a (pattern) class.
Features
Objects for PRTools are either files on disk containing a sensor measurement of a physical object (e.g. an image, a time signal or a spectrum) or vectors containing a set of values for pre-specified object properties (e.g. size, weight, color, etc.). Such properties are usually called features, and the vector a feature vector.
Feature matrix
A set of objects can now be represented by a feature matrix: a collection of feature vectors, one for every object. PRTools makes use of the Matlab programming concepts that are, confusingly in our context, called 'objects' and 'classes'. They have nothing to do with the objects and classes of pattern recognition, and when we refer to them we will always use 'programming object' and 'programming class'. A programming class is a definition of a type of variables, the programming objects, that have specific properties and for which special operations are defined. These can be new operations, but also existing ones, like multiplication, that are given a special meaning for these variables.
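The idea of a programming class that gives an existing operator a special meaning can be sketched as follows. This is a Python illustration, not Matlab or PRTools code: the class name `FeatureMatrix`, its `shape` method and the choice to let `*` scale all feature values are assumptions made purely for the example.

```python
class FeatureMatrix:
    """Hypothetical programming class: one row per object, one column per feature."""

    def __init__(self, rows):
        rows = [list(r) for r in rows]
        if len({len(r) for r in rows}) > 1:
            raise ValueError("all feature vectors must have the same length")
        self.rows = rows

    def __mul__(self, factor):
        # Existing operator, special meaning: scale every feature value.
        return FeatureMatrix([[v * factor for v in r] for r in self.rows])

    def shape(self):
        # (number of objects, number of features)
        n_objects = len(self.rows)
        n_features = len(self.rows[0]) if self.rows else 0
        return n_objects, n_features


# Three objects, each described by two features (size, weight).
apples = FeatureMatrix([[7.0, 150.0], [8.0, 170.0], [6.5, 140.0]])
doubled = apples * 2  # a new programming object of the same programming class
```

Here `apples` and `doubled` are programming objects; the apples they describe are the objects in the pattern recognition sense.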
Dataset
In order to handle feature matrices, the programming class dataset has been defined. Feature matrices are stored as datasets, and various operations related to learning from sets of feature vectors and classifying them into pattern classes are defined specifically for datasets.
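As a rough illustration of what such a programming class bundles together, the Python sketch below stores a feature matrix with one label per object. The names `Dataset`, `class_names` and `objects_of` are hypothetical and do not reflect the actual PRTools interface.

```python
class Dataset:
    """Hypothetical stand-in for a dataset: a feature matrix plus one label per object."""

    def __init__(self, features, labels):
        if len(features) != len(labels):
            raise ValueError("need exactly one label per feature vector")
        self.features = [list(f) for f in features]
        self.labels = list(labels)

    def class_names(self):
        # Pattern classes present in the dataset, in sorted order.
        return sorted(set(self.labels))

    def objects_of(self, name):
        # Feature vectors of all objects that belong to one pattern class.
        return [f for f, l in zip(self.features, self.labels) if l == name]


fruit = Dataset(
    features=[[7.0, 150.0], [8.0, 170.0], [3.0, 20.0]],
    labels=["apple", "apple", "plum"],
)
```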
Mapping
Operations on datasets always have the character of an operation on a set of feature vectors, i.e. on a cloud of points in a feature space. Such an operation may be a transformation to another space, or the application of a decision function to decide on the class membership of the objects represented by the feature vectors. In PRTools terminology, such an operation is called a mapping.
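The two kinds of mapping mentioned here, a transformation to another space and a decision function, can be sketched with plain Python functions. The feature names, the function names and the threshold of 10.0 are arbitrary assumptions chosen only to make the example concrete.

```python
def to_density_space(points):
    """Fixed mapping: 2-D (size, weight) space -> 1-D (weight per unit size) space."""
    return [[weight / size] for size, weight in points]


def decide(points, threshold=10.0):
    """Decision function: assign a class name to every point in the 1-D space.

    The threshold is an arbitrary value picked for this illustration.
    """
    return ["apple" if p[0] > threshold else "plum" for p in points]


cloud = [[8.0, 160.0], [4.0, 20.0]]   # two points in the original feature space
mapped = to_density_space(cloud)      # the same objects in another space
labels = decide(mapped)               # class membership per object
```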
Classification
Mappings have their own programming class, as operations on mappings themselves are also needed, e.g. to optimize them for a given dataset (called training) or to combine several mappings. This is needed because pattern recognition is usually not achieved by a single mapping but by a series of them, in which objects are gradually represented in simpler (or, better, more 'normalized') feature spaces and finally classified. A classification is also a mapping, as it maps the final feature vector to a label, i.e. the class name or class number estimated for the object to be classified.
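A minimal Python sketch of training and combining mappings is given below. It is not PRTools code: the classes `Scale`, `NearestMean` and `Chain` and their `train`/`apply` methods are hypothetical, and the nearest-mean rule is just one simple classifier chosen for the illustration.

```python
import math


class Scale:
    """Trainable mapping: divide each feature by its largest absolute training value."""

    def train(self, vectors, labels=None):
        n = len(vectors[0])
        # Fall back to 1.0 for a feature that is zero everywhere.
        self.max_abs = [max(abs(v[j]) for v in vectors) or 1.0 for j in range(n)]
        return self

    def apply(self, vectors):
        return [[x / m for x, m in zip(v, self.max_abs)] for v in vectors]


class NearestMean:
    """Trainable mapping from a feature vector to a label (a classifier)."""

    def train(self, vectors, labels):
        self.means = {}
        for name in set(labels):
            members = [v for v, l in zip(vectors, labels) if l == name]
            self.means[name] = [sum(col) / len(members) for col in zip(*members)]
        return self

    def apply(self, vectors):
        # Label of the class whose mean is closest to each vector.
        return [min(self.means, key=lambda c: math.dist(v, self.means[c]))
                for v in vectors]


class Chain:
    """Combined mapping: train and apply a series of mappings in order."""

    def __init__(self, *steps):
        self.steps = steps

    def train(self, vectors, labels):
        for step in self.steps:
            step.train(vectors, labels)
            vectors = step.apply(vectors)  # the next step is trained on this output
        return self

    def apply(self, vectors):
        for step in self.steps:
            vectors = step.apply(vectors)
        return vectors


train_x = [[7.0, 150.0], [8.0, 170.0], [3.0, 20.0], [3.5, 25.0]]
train_y = ["apple", "apple", "plum", "plum"]
system = Chain(Scale(), NearestMean()).train(train_x, train_y)
predicted = system.apply([[7.5, 160.0], [3.2, 22.0]])
```

The last step of the chain maps each normalized feature vector to a label, so the chain as a whole behaves as a single mapping from raw feature vectors to class names.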
Datafile
Mappings can also be used at the beginning of the chain, in the construction of feature vectors. Raw data such as images may be stored on disk, and properties that can serve as features may be measured by image processing and image analysis. In PRTools, the definition of raw data items is enabled by the programming class called datafile. Datafiles are a kind of pre-stage of datasets: a dataset may be constructed from a datafile by mappings that define the proper preprocessing. With subsequent mappings, classifiers can be trained and applied, resulting in an overall recognition system.
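The step from raw files on disk to feature vectors might look roughly like the Python sketch below. Everything in it is an assumption made for illustration: the text-file format with one sample per line, the two measured features (mean and peak), the convention that the label is the part of the file name before the first underscore, and the function names. It does not show how the PRTools datafile class works internally.

```python
import os
import tempfile


def measure_features(path):
    """Preprocessing mapping: raw signal file -> feature vector (mean, peak)."""
    with open(path) as handle:
        samples = [float(line) for line in handle if line.strip()]
    return [sum(samples) / len(samples), max(samples)]


def build_dataset(folder):
    """Turn every raw file in `folder` into a feature vector and a label.

    Assumed convention: the label is the file name up to the first underscore.
    """
    features, labels = [], []
    for name in sorted(os.listdir(folder)):
        features.append(measure_features(os.path.join(folder, name)))
        labels.append(name.split("_")[0])
    return features, labels


# Create two tiny raw "sensor measurements" on disk for the demonstration.
with tempfile.TemporaryDirectory() as folder:
    raw = {"low_1.txt": [1.0, 2.0, 3.0], "high_1.txt": [10.0, 20.0, 30.0]}
    for name, samples in raw.items():
        with open(os.path.join(folder, name), "w") as handle:
            handle.write("\n".join(str(s) for s in samples))
    features, labels = build_dataset(folder)
```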
R.P.W. Duin, January 28, 2013