PRTools structural indexing
The fields of variables of the PRTools programming classes dataset, datafile and mapping can be retrieved by get-commands, e.g. getdata or getlabels, and can be set by the corresponding set-commands, such as setdata and setlabels. A different way to do almost, but not exactly, the same is to use substructures, e.g.
    a = dataset(randn(250,5))
    a.featsize   % 5
    a.objsize    % 250
These commands show the same information as can be retrieved by getfeatsize(a) or getobjsize(a). The substructure construct can also be used for assignment:
    a.featsize = 6;
    a.objsize = 100;
The significant difference from the set- and get-commands is that substructures give direct access to the fields of the variables without any error checking. This is evident from the above example: after the assignment, the size of the data field (250 by 5) conflicts with featsize and objsize.
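The conflict can be illustrated without PRTools. The following is a minimal sketch that uses a plain MATLAB struct as a hypothetical stand-in for a dataset; the struct, its field layout and the consistency test are assumptions made for illustration and are not the PRTools implementation.

```matlab
% Plain struct standing in for a dataset (hypothetical, not the PRTools class)
s = struct('data', randn(250,5), 'featsize', 5, 'objsize', 250);

% Unchecked assignment, comparable to a.featsize = 6: nothing verifies the value
s.featsize = 6;

% The kind of consistency test a checked setter could apply (assumed logic)
is_consistent = isequal([s.objsize s.featsize], size(s.data));
if ~is_consistent
    fprintf('featsize %d does not match the %d data columns\n', ...
        s.featsize, size(s.data,2));
end
```

A checked set-command would be expected to refuse or repair such an assignment, whereas the direct field assignment silently leaves the variable in an inconsistent state.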
Substructure handling is implemented only for maintenance by the PRTools designers, who are aware of all underlying constructions. Users are strongly discouraged from using substructure retrieval and assignment as described in this section, certainly in their own code. Changes in PRTools may affect such operations, and they might become incompatible.
An exception is that direct access to the structure fields can be very useful for debugging from the command line. Here is an example that is only useful for advanced users who are able and prepared to study the PRTools sources. It shows how the data field of a combined classifier that includes some preprocessing is inspected.
    A = gendatb;
    U1 = parzenc([],1);
    U2 = treec([],'maxcrit');
    U3 = qdc;
    U = scalem([],'variance')*[U1 U2 U3]*maxc;
    W = A*U
    % Minimum combiner, 2 to 2 trained mapping --> fixedcc
The last line shows that the final classifier is named 'Minimum combiner' and is executed by the routine fixedcc. This is the general routine that executes all fixed combiners.
    W.data
    % [2x6 mapping] 'max' 'Maximum combiner' []
The data field of W stores a cell array with the above 4 elements. The second, 'max', is the type of combiner, the third is the name, and the fourth is empty because the max combiner has no parameters. The first element is inspected by:
    W.data{1}
    % unit-var+, 2 to 6 trained mapping --> sequential
So this field stores a mapping named unit-var+, to be executed by the procedure sequential. It is a 2 by 6 classifier, which fits the fact that there are 3 two-class base classifiers in 2 dimensions: each contributes 2 outputs, giving 6 in total. The data field of this sequential classifier contains:
    W.data{1}.data
    % [2x2 mapping] [2x6 mapping]
The two mappings that are combined by sequential are:
    W.data{1}.data{1}
    % unit-var, 2 to 2 trained mapping --> affine
    W.data{1}.data{2}
    % 2 to 6 trained mapping --> stacked
These are, apparently, an affine transform and a stacked combination. Details can be found by
    W.data{1}.data{1}.data
    % rot: [0.2143 0.3535]
    % offset: [0.5279 0.8172]
    % lablist_in: [2x1 double]
which is the result of the feature normalization by scalem([],'variance'), and
    W.data{1}.data{2}.data
    % [2x2 mapping] [2x2 mapping] [2x2 mapping]
    W.data{1}.data{2}.data{1}
    % Parzen Classifier, 2 to 2 trained mapping --> parzen_map
    W.data{1}.data{2}.data{2}
    % Decision Tree, 2 to 2 trained mapping --> tree_map
    W.data{1}.data{2}.data{3}
    % Bayes-Normal-2, 2 to 2 trained mapping --> normal_map
These are the three base classifiers. The final details can be found by the last steps, e.g.
    W.data{1}.data{2}.data{1}.data
    % [100x2 dataset] [2x2 double]
    W.data{1}.data{2}.data{1}.data{2}
    % 1 1
    % 1 1
showing that the Parzen classifier stores the entire training dataset of size [100,2] and the smoothing parameters for both classes and both features, all 1.
A quick summary of the above analysis is produced by the command parsc, which breaks down the data fields of mappings recursively:
    parsc(W)
    % Minimum combiner, 2 to 2 trained mapping --> fixedcc
    % unit-var+, 2 to 6 trained mapping --> sequential
    % unit-var, 2 to 2 trained mapping --> affine
    % 2 to 6 trained mapping --> stacked
    % Parzen Classifier, 2 to 2 trained mapping --> parzen_map
    % Decision Tree, 2 to 2 trained mapping --> tree_map
    % Bayes-Normal-2, 2 to 2 trained mapping --> normal_map
It shows that fixedcc operates on a sequential combination of an affine transform (defined by scalem) and the stacked combination of three classifiers.
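The idea of such a recursive breakdown can be sketched in plain MATLAB, without PRTools. The example below is an illustration only and is not how parsc is implemented: the mappings are replaced by hypothetical structs with just a name and a data field, and the traversal uses an explicit stack so that the script needs no helper function.

```matlab
% Hypothetical stand-ins for mappings: structs with a 'name' and a 'data' field
p  = struct('name', 'Parzen Classifier', 'data', {{}});
t  = struct('name', 'Decision Tree',     'data', {{}});
st = struct('name', 'stacked',           'data', {{p, t}});
sc = struct('name', 'unit-var',          'data', {{}});
sq = struct('name', 'unit-var+',         'data', {{sc, st}});
w  = struct('name', 'Minimum combiner',  'data', {{sq, 'max'}});

% Depth-first walk; each row of the stack holds {node, depth}
stack = {w, 0};
names = {};
while ~isempty(stack)
    node  = stack{end,1};
    depth = stack{end,2};
    stack(end,:) = [];                       % pop the last row
    fprintf('%s%s\n', repmat('  ', 1, depth), node.name);
    names{end+1} = node.name;                % record the visiting order
    if iscell(node.data)
        for k = numel(node.data):-1:1        % push in reverse to visit in order
            child = node.data{k};
            if isstruct(child)               % skip non-mapping entries such as 'max'
                stack(end+1,:) = {child, depth+1};
            end
        end
    end
end
```

Each nested element is printed one level deeper than its parent, and entries of the data field that are not structs (here the string 'max') are skipped.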
R.P.W. Duin, January 28, 2013