Programming trainable mappings and classifiers
-> Be aware: this section is under construction !! <-
Classifiers are a special case of trainable mappings: they map input vectors to class confidences or distances. A user-supplied trainable mapping routine, here called mymapping, may be called in one of the following ways:
    % mapping definition, generating an 'untrained mapping'
    mymapping_def = mymapping([],parameters);

    % mapping definition with default parameters, generating an 'untrained mapping'
    mymapping_def = mymapping

    % call to train the mapping directly by a training set
    mytrainedmapping = mymapping(mydataset_train,parameters)

    % call to train a predefined untrained mapping by a training set
    mytrainedmapping = mydataset_train*mymapping_def
    % PRTools transforms this into
    mytrainedmapping = mymapping(mydataset_train,getdata(mymapping_def))

    % call to apply a trained mapping on a test set
    mydataset_out = mydataset_test*mytrainedmapping
    % PRTools transforms this into
    mydataset_out = mymapping(mydataset_test,mytrainedmapping)
In these calls mydataset_train and mydataset_test should be proper PRTools datasets. The call that trains a mapping on a training set and the call that applies it to a test set have the same form. They can be told apart only because a trainable mapping is always in one of two states: 'untrained' or 'trained'. In the code lines above, mymapping_def is 'untrained' and mytrainedmapping is 'trained'.
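The role of the two states can be illustrated with a short sketch. It assumes PRTools is on the MATLAB path and uses the existing PRTools routines gendatb, scalem, isuntrained and istrained in place of the hypothetical mymapping; it is an illustration, not part of the original text.

```matlab
% Sketch only: requires PRTools on the MATLAB path.
A = gendatb([50 50]);          % example two-class dataset generated by PRTools
U = scalem([],'variance');     % definition call: no dataset given, so U is untrained
W = A*U;                       % training call: dataset times untrained mapping
D = A*W;                       % execution call: dataset times trained mapping

% The same A*... syntax did two different things; the mapping state decided which.
disp(isuntrained(U))           % state of the definition
disp(istrained(W))             % state after training
```

The multiplication in the third and fourth line is written identically. Only the state of the right-hand operand tells PRTools whether to train or to execute.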
All parameters that define the training procedure have to be stored in the definition of the untrained mapping (mymapping_def). They should be stored in the data-field as a cell array. PRTools unpacks this cell array when the untrained mapping is called for training.
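What unpacking a cell array of parameters amounts to can be shown in plain MATLAB, without PRTools. The variable names and the second parameter ('variance') are hypothetical and chosen only for illustration; how PRTools performs the unpacking internally is not shown here.

```matlab
% Parameters packed in a cell array, as they would sit in the data-field
stored = {[1 3], 'variance'};   % hypothetical: feature indices and a scaling mode

% Expand the cell array into a comma-separated list and assign each element
% to its own variable; the number of outputs must equal numel(stored)
[PAR, MODE] = deal(stored{:});

fprintf('%d features selected, mode = %s\n', numel(PAR), MODE);
```

The order of the elements in the cell array is the only link between what is stored and what is retrieved, so the packing and unpacking statements have to be kept in step.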
All data needed to apply a trained mapping to a test set also have to be stored in the data-field of the trained mapping (mytrainedmapping). Here a cell array, a structure, or even a single array of doubles is possible, depending on the routine that executes the mapping. The name of that routine has to be stored in the mapping_file field. The table below lists some routines that may be used for this purpose. Very commonly, however, the execution of the mapping is defined in the same routine that defines the mapping. The code for training and execution, as well as for storing and unpacking the training results, is then in one file. An example is shown in the skeleton of a trainable mapping below. Routines may also share their execution parts, e.g. nusvc calls svc for execution.
Routines for executing special mappings

| Routine | Description |
|---|---|
| | All affine and linear transformations. |
| normal_map | Computes the density of test objects for a defined Gaussian distribution or a given mixture of Gaussians. Alternatively it may be used to classify objects based on such densities. |
| knn_map | Executes a k-NN (k-Nearest Neighbor) classifier. |
| parzen_map | Computes the density of test objects for a defined Parzen distribution (a set of objects and their kernel widths). Alternatively it may be used to classify objects based on such densities. |
Here is the skeleton of a routine that users may use to define their own trainable mapping. In this example we make use of some higher level routines to improve the readability of the code.
    %MYTRAINABLEMAPPING Skeleton for a user supplied trainable mapping
    %
    %   U = MYTRAINABLEMAPPING([],PAR)
    %   U = MYTRAINABLEMAPPING
    %   W = TRAINSET*U
    %   W = MYTRAINABLEMAPPING(TRAINSET,PAR)
    %   D = TESTSET*W
    %
    % INPUT
    %   TRAINSET  Dataset used for training the mapping
    %   TESTSET   Dataset used for testing (evaluating) the mapping
    %   PAR       Parameter(s)
    %
    % OUTPUT
    %   U         Untrained mapping
    %   W         Trained mapping
    %   D         Mapping result of TESTSET
    %
    % DESCRIPTION
    % This is an example routine just offering the skeleton of a user supplied
    % trainable mapping. By changing the lines indicated in the source by %%
    % and renaming the routine a new mapping can be created. The routine as it
    % is just selects the features (columns of the dataset) as defined in PAR
    % and normalizes them by shifting and scaling: mean in the origin and
    % variances equal to one, as determined from the training set. This
    % transformation, using the shift and scaling determined by training, is
    % applied to the testset on execution.
    %
    % SEE ALSO
    % DATASETS, MAPPINGS, MYFIXEDMAPPING

    % Copyright: R.P.W. Duin, r.p.w.duin@prtools.org
    % Faculty EWI, Delft University of Technology
    % P.O. Box 5031, 2600 GA Delft, The Netherlands

    function OUT = mytrainablemapping(varargin)

    % define first the default values for all parameters. Defaults might be
    % empty ([]), but the routine should test on that if the parameter
    % is used.
    % default here is []: to be interpreted as: use all features
    % (one default per input, so that argin always holds two elements)
    argin = setdefaults(varargin,[],[]);                                  %%

    % determine the type (task) of the call
    if mapping_task(argin,'definition')
      % we are here if the routine is called by W = mytrainablemapping, or by
      % W = mytrainablemapping([],PAR)

      % define the name of the mapping (just used for annotation).
      name = 'Trainable Mapping Skeleton';                                %%
      % define the mapping
      OUT = define_mapping(argin,'untrained',name);

    elseif mapping_task(argin,'training')
      % we are here if the routine is called by
      % W = mytrainablemapping(TRAINSET,PAR), or by
      % W = TRAINSET*mytrainablemapping([],PAR)

      % Retrieve the input parameters from argin
      % For readability, this should correspond to the description in the
      % help block of this function
      [TRAINSET,PAR] = deal(argin{:});                                    %%

      % Check the user supplied inputs
      % Replace all tests in this section by the appropriate ones for
      % the chosen input parameters
      isdataset(TRAINSET); % returns an error if TRAINSET is not a dataset %%
      % note that PRTools datafiles are formally, due to the Matlab class
      % definitions, also datasets. A call like isa(TRAINSET,'dataset') will
      % thereby generate no error in case TRAINSET is a datafile.

      if isempty(PAR)                                                     %%
        % select all features by default                                  %%
        PAR = 1:size(TRAINSET,2);                                         %%
      end                                                                 %%
      if any(PAR < 1) || any(PAR > size(TRAINSET,2))                      %%
        % Features to be selected should be in a proper range             %%
        error('Feature number out of range')                              %%
      end                                                                 %%

      % Compute what is needed to create the desired mapping
      X = TRAINSET(:,PAR);        % Reduced feature size of training set  %%
      V = scalem(X,'variance');   % Shift and scaling                     %%

      % create a trained mapping, store necessary parameters as cell array
      OUT = trained_mapping(TRAINSET,{PAR,V}); % (1) should correspond to (2) %%

    elseif mapping_task(argin,'execution')
      % we are here if the trained mapping, possibly constructed by
      % W = mytrainablemapping(TRAINSET,PAR);
      % is applied to a testset by D = TESTSET*W
      % This is converted by PRTools to D = mytrainablemapping(TESTSET,W)
      % neglecting the original meaning of the function parameters.

      % Retrieve the input parameters from argin.
      % the following statement should always look like this
      [TESTSET,W] = deal(argin{:});

      % Retrieve the parameters stored by the user in W
      % This should correspond to the trained mapping definition above
      [PAR,V] = getdata(W);       % (2) should correspond to (1)          %%

      % Now we are ready to execute the function on the TESTSET.
      % Replace the next two lines by how your mapping would compute an
      % output dataset OUT from TESTSET and the parameters retrieved from
      % training.
      X = TESTSET(:,PAR);                                                 %%
      OUT = X*V;                                                          %%

    else
      error('Illegal call')
    end

    return
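To make the numerical effect of the skeleton concrete, the following sketch reproduces its training and execution steps in plain MATLAB, on ordinary matrices instead of PRTools datasets. The data values and variable names are hypothetical. The sketch divides by the standard deviation estimated with N-1 in the denominator, which is an assumption; scalem may normalize the variance estimate differently.

```matlab
% Hypothetical data: rows are objects, columns are features
train = [1 10 100; 2 20 200; 3 30 300; 4 40 400];
test  = [2.5 25 250; 5 50 500];
PAR   = [1 3];                       % features to keep

% "Training": estimate shift and scale from the training set only
Xtr = train(:,PAR);
mu  = mean(Xtr,1);                   % per-feature mean
sd  = std(Xtr,0,1);                  % per-feature standard deviation (N-1)

% "Execution": apply the stored shift and scale to the test set
Xte = test(:,PAR);
n   = size(Xte,1);
out = (Xte - repmat(mu,n,1)) ./ repmat(sd,n,1);

disp(out)
```

The point of the design is that mu and sd are computed once, from the training set, and then reused unchanged on any test set. This is what storing {PAR,V} in the trained mapping achieves in the skeleton.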
R.P.W. Duin, January 28, 2013