A basic PRTools program
We will develop here a very basic but complete PRTools-based recognition system. It is meant to give the reader a first flavor of the concept. More worked-out examples, as well as a proper description of the commands, will follow. The main point to take from the following lines is that both the application of a mapping to a dataset or datafile and the concatenation of mappings are written in PRTools with the overloaded * operator.
The total recognition chain in PRTools terms consists of the following steps:
1. A datafile A pointing to the raw data.
2. A mapping W_preproc for an appropriate preprocessing and analysis of the datafile: B = A*W_preproc
3. A feature reduction mapping W_featred applied to B: C = B*W_featred
4. A classifier W_classf that assigns the labels: labels = C*W_classf = B*W_featred*W_classf = A*W_preproc*W_featred*W_classf
As the mappings W_preproc, W_featred and W_classf are stored in variables, and as the concatenation of a sequence of mappings is defined in PRTools, the entire recognition system can be stored in a single variable: W_recsys = W_preproc*W_featred*W_classf. New objects, e.g. images stored on disk as a datafile A, can now be classified by labels = A*W_recsys.
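That applying the mappings one by one and applying the stored system give the same result can be checked on a small example. The following is a minimal sketch, not part of the original program. It assumes that PRTools is on the MATLAB path, uses the generated dataset gendatb as a stand-in for preprocessed, labeled data (so W_preproc is left out), and uses the PRTools mapping labeld to convert classifier outcomes to labels. Newer PRTools releases may use different names for some commands.

```matlab
% Sketch under assumptions: PRTools is on the path; the gendatb data is a
% stand-in for a preprocessed, labeled dataset (not the author's data).
A = gendatb([50 50]);                 % two-class, two-dimensional example set
W_featred = pca(A,1);                 % feature reduction trained on A
W_classf  = fisherc(A*W_featred);     % classifier trained in the reduced space
W_recsys  = W_featred*W_classf;       % concatenation stored in one variable

lab_steps  = A*W_featred*W_classf*labeld;   % mappings applied one by one
lab_system = A*W_recsys*labeld;             % stored system applied at once
disp(isequal(lab_steps,lab_system))         % both routes should agree
```

Storing the chain in one variable means that later code only has to know W_recsys, not the individual steps it was built from.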
In this example three mappings have to be specified by the user. The first, W_preproc, is usually based entirely on the user's background knowledge of the type of images to be classified. The other two, the feature reduction and the classifier, have to be derived from data, either by optimizing a cost function or by estimating parameters under a model assumption. In pattern recognition terms, these mappings are the result of training. Datasets are needed for this, based on the same preprocessing and representation as the data to be classified later. PRTools offers many routines for training mappings and classifiers; they are in fact the core of the toolbox.
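Training can be written in the same operator notation. The sketch below assumes the PRTools convention that a classifier routine called without data returns an untrained mapping, which is trained when a dataset is multiplied with it. The dataset generated by gendatb is only a stand-in for real labeled data and is not part of the original program.

```matlab
% Sketch under assumptions: PRTools is on the path; gendatb is example data.
A  = gendatb([50 50]);     % labeled two-class example dataset
W1 = fisherc(A);           % training by a direct call
W2 = A*fisherc;            % training by applying the untrained mapping to A
disp(isequal(A*W1*labeld, A*W2*labeld))   % both classifiers should label A identically
```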
Consequently we distinguish two sets of objects: a training set with given labels (class memberships), used for designing the system, and an unlabeled set for which the class memberships have to be found. The first step of the program is to define these sets such that they can be handled by PRTools. Let us assume that the raw data has been stored in two directories, 'directory_1' and 'directory_2':
A_labeled = datafile('directory_1');
A_unlabeled = datafile('directory_2');
It will be described later how the labels of A_labeled have to be supplied and how they are stored. The first mapping has to define features for the objects. A simple choice is to use histograms, which can be specified by the following mapping:
W_preproc = histm([],[1:256]);
The preprocessing of the two datafiles and their conversion to datasets is performed by
B_labeled = dataset(A_labeled*W_preproc);
B_unlabeled = dataset(A_unlabeled*W_preproc);
Let us assume that a feature reduction to 5 features by PCA is required. It has to be derived from the preprocessed data, of course.
W_featred = pca(B_labeled,5);
Suppose that finally the Fisher classifier is used. It has to be found in the reduced feature space:
W_classf = fisherc(B_labeled*W_featred);
The labels for B_unlabeled can now be estimated by
labels = B_unlabeled*W_featred*W_classf*labeld;
in which labeld is a standard PRTools mapping that maps classifier outcomes to labels. The classification system can also be stored in a single variable W_class_sys:
W_class_sys = W_preproc*W_featred*W_classf*labeld;
labels = A_unlabeled*W_class_sys; % W_class_sys includes W_preproc, so it is applied to the datafile
In the next subsections some worked-out examples are presented.
R.P.W. Duin, January 28, 2013