PRTools contents

PRTools manual

gendat

GENDAT

Random sampling of datasets for training and testing

   [A,B,IA,IB] = GENDAT(X,N)
    A = X*GENDAT([],N)
   [A,B,IA,IB] = GENDAT(X)
   [A,B,IA,IB] = GENDAT(X,ALF)
    A = X*GENDAT([],ALF)

Input
 X Dataset
 N,ALF Number or fraction of objects to be selected (optional; default: bootstrapping)

Output
 A,B Datasets
 IA,IB Original indices from the dataset X

Description

Generates N objects from dataset X. They are stored in dataset A; the remaining objects are stored in dataset B. IA and IB are the indices of the objects selected from X for A and B. The random object generation follows the class prior probabilities: if the prior probability of a class is PA, then in expectation PA*N objects are selected from that class. If N is large, or if one of the classes has too few objects in X, the number of generated objects may be less than N.

If N is a vector of sizes, exactly N(i) objects are generated for class i. Classes are ordered using RENUMLAB(GETLAB(X)).
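The two forms of N can be compared in a short usage sketch. It assumes PRTools is on the MATLAB path; GENDATB (banana-shaped example data) and CLASSSIZES are other PRTools routines that are not described on this page, and the variable names A2 and B2 are only illustrative.

```matlab
% Assumes PRTools is on the MATLAB path.
X = gendatb([100 100]);        % example two-class dataset, 100 objects per class

% Scalar N: up to 40 objects in total, divided over the classes
% according to the class priors (the per-class counts are random).
[A,B,IA,IB] = gendat(X,40);

% Vector N: a fixed number of objects per class,
% here 10 from class 1 and 30 from class 2.
[A2,B2] = gendat(X,[10 30]);
disp(classsizes(A2));          % per-class sizes of the selected set
```

With a scalar N only the expected class sizes are controlled; use the vector form when the exact number of objects per class matters.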

If the function is called without specifying N, the dataset X is bootstrapped and the result is stored in A. The objects that were not selected are stored in B.
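For readers unfamiliar with bootstrapping, the idea can be sketched in plain MATLAB without any PRTools calls. This is a generic illustration, not GENDAT's implementation: it ignores class labels, and the names n, ia and ib are hypothetical.

```matlab
% Generic bootstrap sketch in plain MATLAB (no PRTools calls).
n  = 10;                     % hypothetical number of objects in X
ia = randi(n, n, 1);         % n indices drawn with replacement (repeats allowed)
ib = setdiff((1:n)', ia);    % indices that were never drawn (out-of-bag objects)
```

Here ia plays the role of IA and ib the role of IB: because ia is drawn with replacement, some objects appear more than once and the objects that are never drawn end up in the complementary set.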

ALF should be a scalar < 1. For each class, a fraction ALF of the objects is selected for A, and the remaining objects are stored in B.
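A typical use of ALF is a per-class train/test split. The sketch below assumes PRTools is on the MATLAB path; GENDATB and CLASSSIZES are other PRTools routines not described on this page, and the variable name T is only illustrative.

```matlab
% Assumes PRTools is on the MATLAB path.
X = gendatb([60 40]);          % example dataset with 60 and 40 objects per class

[A,B,IA,IB] = gendat(X,0.5);   % half of each class goes to A, the rest to B
T = X*gendat([],0.5);          % a new random split, using the mapping-style call

disp(classsizes(A));           % per-class sizes of the selected set
disp(classsizes(B));           % per-class sizes of the remaining set
```

Because the fraction is applied per class, the class proportions of X carry over to both A and B.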

If X is a cell array of datasets, the command is executed for each dataset separately and the results are stored in cell arrays. The random seed is reset for each dataset, so the generated sets are aligned if the sets in X were aligned.
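Why a seed reset produces aligned selections can be shown in plain MATLAB. This is only an illustration of the principle: how GENDAT resets the seed internally is not shown on this page, and the seed value 42 and the names n, i1 and i2 are hypothetical.

```matlab
% Plain MATLAB sketch: resetting the seed makes two random selections identical.
n = 8;                           % hypothetical number of objects in each dataset
rng(42); i1 = randperm(n, 4);    % selection for the first dataset
rng(42); i2 = randperm(n, 4);    % same seed again, so the same indices are drawn
assert(isequal(i1, i2));         % both datasets get objects at the same positions
```

If two datasets describe the same objects in the same order (for example, different feature sets of one sample), identical index selections keep the resulting subsets in correspondence.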

Example(s)

prex_plotc

See also

datasets, gensubsets
